Re: [PATCH 0/2] mmap: Avoid the sanitizer configure check failure

2024-04-09 Thread Alan Modra
On Tue, Apr 09, 2024 at 07:24:33AM -0700, H.J. Lu wrote:
> Define GCC_AC_FUNC_MMAP with export ASAN_OPTIONS=detect_leaks=0 to avoid
> the sanitizer configure check failure.

OK for binutils.  (I just fixed my local copy of autoconf so I
wouldn't run into this again.)  The proper fix of course is to update
autotools to something more recent.

-- 
Alan Modra
Australia Development Lab, IBM


Re: CREL relocation format for ELF (was: RELLEB)

2024-03-28 Thread Alan Modra via Gcc
On Fri, Mar 22, 2024 at 06:51:41PM -0700, Fangrui Song wrote:
> On Thu, Mar 14, 2024 at 5:16 PM Fangrui Song  wrote:
> > I propose RELLEB, a new format offering significant file size
> > reductions: 17.2% (x86-64), 16.5% (aarch64), and even 32.4% (riscv64)!
> >
> > Your thoughts on RELLEB are welcome!

Does anyone really care about relocatable object file size?  If they
do, wouldn't they be better off using a compressed file system?

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RFC,patch] Linker plugin - extend API for offloading corner case (aka: LDPT_REGISTER_CLAIM_FILE_HOOK_V2 linker plugin hook [GCC PR109128])

2023-05-10 Thread Alan Modra via Gcc-patches
On Thu, May 04, 2023 at 11:02:25AM +, Richard Biener via Binutils wrote:
> So since we expect the linker to use the host side table is there a way
> for the plugin to exactly query that (the set of symbols the linker
> uses from the object passed to the plugin)?

That would be possible and relatively easy to implement, but might be
slow.

>  Because if the linker
> uses something from the file but _not_ the host side offload table
> (-ffunction-sections -fdata-sections) then things would still go
> wrong, right?

> Is there a way to connect both in a way that the linker discards
> either if the other isn't present?

No, or at least I do not want to even think about implementing such a
linker "feature".  The problem is that after you have modified the
global linker symbol table after adding an object's symbols, it is
virtually impossible to restore the state of symbols to what they
would be without that object.  (Yes, we do that sort of thing for
as-needed shared libraries, but the restoration happens immediately
after adding the symbols.  I also regret implementing it the way I
did.)

The patch posted is OK from the linker side of things.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RFC][top-level] Add configure test-case

2022-11-07 Thread Alan Modra via Gcc
On Mon, Nov 07, 2022 at 06:23:45PM +, Joseph Myers wrote:
> On Mon, 7 Nov 2022, Alan Modra via Binutils wrote:
> 
> > a) that top-level binutils/gdb patches don't get applied to the gcc
> >git repository in a timely manner, or
> 
> If a toplevel patch is approved for either repository, I think you should 
> treat it as approved for the other one without needing separate review.

Thanks Joseph, that's how I see it too.  Of course with the
understanding that binutils-gdb can't be used as a back door way of
sneaking in a gcc-specific change.

Can I get agreement among the gcc build maintainers that such a
policy is acceptable?

-- 
Alan Modra
Australia Development Lab, IBM


Correct spelling of DW_AT_namelist_item

2022-05-31 Thread Alan Modra via Gcc-patches
This typo was fixed a little while ago in binutils-gdb with commit
e951225303.  I noticed the difference today when importing libiberty
from gcc.  Committed as obvious.

include/
* dwarf2.def: Correct spelling of DW_AT_namelist_item.
gcc/
* dwarf2out.cc (gen_namelist_decl): Adjust to suit correct
spelling of DW_AT_namelist_item.

diff --git a/gcc/dwarf2out.cc b/gcc/dwarf2out.cc
index fccf59e8ec3..29f32ec6939 100644
--- a/gcc/dwarf2out.cc
+++ b/gcc/dwarf2out.cc
@@ -27479,7 +27479,7 @@ gen_namelist_decl (tree name, dw_die_ref scope_die, 
tree item_decls)
nml_item_ref_die = force_decl_die (value);
 
   nml_item_die = new_die (DW_TAG_namelist_item, nml_die, NULL);
-  add_AT_die_ref (nml_item_die, DW_AT_namelist_items, nml_item_ref_die);
+  add_AT_die_ref (nml_item_die, DW_AT_namelist_item, nml_item_ref_die);
 }
   return nml_die;
 }
diff --git a/include/dwarf2.def b/include/dwarf2.def
index 4214c80907a..530c6f849f9 100644
--- a/include/dwarf2.def
+++ b/include/dwarf2.def
@@ -289,7 +289,7 @@ DW_AT (DW_AT_frame_base, 0x40)
 DW_AT (DW_AT_friend, 0x41)
 DW_AT (DW_AT_identifier_case, 0x42)
 DW_AT (DW_AT_macro_info, 0x43)
-DW_AT (DW_AT_namelist_items, 0x44)
+DW_AT (DW_AT_namelist_item, 0x44)
 DW_AT (DW_AT_priority, 0x45)
 DW_AT (DW_AT_segment, 0x46)
 DW_AT (DW_AT_specification, 0x47)

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] libiberty: remove FINAL and OVERRIDE from ansidecl.h

2022-05-24 Thread Alan Modra via Gcc-patches
On Mon, May 23, 2022 at 07:42:29PM -0400, David Malcolm via Binutils wrote:
> Any objections, or is there a reason to keep these macros that I'm
> not aware of?  (and did I send this to all the pertinent lists?)

No objection from me.  These macros are not used anywhere in
binutils-gdb.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] libiberty: stop using PTR macro.

2022-05-10 Thread Alan Modra via Gcc-patches
On Tue, May 10, 2022 at 10:56:22AM +0200, Martin Liška wrote:

> diff --git a/libiberty/hashtab.c b/libiberty/hashtab.c

> @@ -457,15 +457,15 @@ htab_empty (htab_t htab)
>else if (htab->free_with_arg_f != NULL)
>   (*htab->free_with_arg_f) (htab->alloc_arg, htab->entries);
>if (htab->alloc_with_arg_f != NULL)
> - htab->entries = (PTR *) (*htab->alloc_with_arg_f) (htab->alloc_arg, 
> nsize,
> -sizeof (PTR *));
> + htab->entries = (void **) (*htab->alloc_with_arg_f) (htab->alloc_arg, 
> nsize,
> +sizeof (void **));

Here, and below, the code should really be using "sizeof (void *)".
You may as well fix that nit while you're at it.  Also, indentation
looks wrong.

>else
> - htab->entries = (PTR *) (*htab->alloc_f) (nsize, sizeof (PTR *));
> + htab->entries = (void **) (*htab->alloc_f) (nsize, sizeof (void **));
>   htab->size = nsize;
>   htab->size_prime_index = nindex;
>  }
>else
> -memset (entries, 0, size * sizeof (PTR));
> +memset (entries, 0, size * sizeof (void *));
>htab->n_deleted = 0;
>htab->n_elements = 0;
>  }

> @@ -543,10 +543,10 @@ htab_expand (htab_t htab)
>  }
>  
>if (htab->alloc_with_arg_f != NULL)
> -nentries = (PTR *) (*htab->alloc_with_arg_f) (htab->alloc_arg, nsize,
> -   sizeof (PTR *));
> +nentries = (void **) (*htab->alloc_with_arg_f) (htab->alloc_arg, nsize,
> +   sizeof (void **));
>else
> -nentries = (PTR *) (*htab->alloc_f) (nsize, sizeof (PTR *));
> +nentries = (void **) (*htab->alloc_f) (nsize, sizeof (void **));
>if (nentries == NULL)
>  return 0;
>htab->entries = nentries;

Here too.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] Remove non-ANSI C macros in ansidecl.h.

2022-05-10 Thread Alan Modra via Gcc-patches
On Tue, May 10, 2022 at 10:57:47AM +0200, Martin Liška wrote:
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> @Alan: Are you fine with the change from binutils/gdb perspective?

I'm fine if this isn't going into the binutils-gdb repo immediately.
There are occurrences of PTR in the cgen generated parts of opcodes,
sim, and even gdb.  I have a few patches I haven't yet committed.

-- 
Alan Modra
Australia Development Lab, IBM


Make opcodes configure depend on bfd configure

2021-11-12 Thread Alan Modra via Gcc-patches
The idea is for opcodes to be able to see whether bfd is compiled
for 64-bit.  A lot of --enable-targets=all libopcodes is wasted space
if bfd can't load 64-bit target object files.

* Makefile.def (configure-opcodes): Depend on configure-bfd.
* Makefile.in: Regenerate.

Applied to gcc.  I'll be importing the gcc versions over to binutils.

diff --git a/Makefile.def b/Makefile.def
index 0abc42b1a1b..a504192e6d7 100644
--- a/Makefile.def
+++ b/Makefile.def
@@ -4,7 +4,7 @@ AutoGen definitions Makefile.tpl;
 // Makefile.in is generated from Makefile.tpl by 'autogen Makefile.def'.
 // This file was originally written by Nathanael Nerode.
 //
-//   Copyright 2002-2019 Free Software Foundation
+//   Copyright 2002-2021 Free Software Foundation
 //
 // This file is free software; you can redistribute it and/or modify
 // it under the terms of the GNU General Public License as published by
@@ -493,6 +493,7 @@ dependencies = { module=install-strip-ld; 
on=install-strip-bfd; };
 dependencies = { module=install-strip-ld; on=install-strip-libctf; };
 
 // libopcodes depends on libbfd
+dependencies = { module=configure-opcodes; on=configure-bfd; hard=true; };
 dependencies = { module=install-opcodes; on=install-bfd; };
 dependencies = { module=install-strip-opcodes; on=install-strip-bfd; };
 
diff --git a/Makefile.in b/Makefile.in
index d13f6c353ee..860cf8f067b 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -62717,6 +62717,16 @@ install-ld: maybe-install-libctf
 install-strip-libctf: maybe-install-strip-bfd
 install-strip-ld: maybe-install-strip-bfd
 install-strip-ld: maybe-install-strip-libctf
+configure-opcodes: configure-bfd
+configure-stage1-opcodes: configure-stage1-bfd
+configure-stage2-opcodes: configure-stage2-bfd
+configure-stage3-opcodes: configure-stage3-bfd
+configure-stage4-opcodes: configure-stage4-bfd
+configure-stageprofile-opcodes: configure-stageprofile-bfd
+configure-stagetrain-opcodes: configure-stagetrain-bfd
+configure-stagefeedback-opcodes: configure-stagefeedback-bfd
+configure-stageautoprofile-opcodes: configure-stageautoprofile-bfd
+configure-stageautofeedback-opcodes: configure-stageautofeedback-bfd
 install-opcodes: maybe-install-bfd
 install-strip-opcodes: maybe-install-strip-bfd
 configure-gas: maybe-configure-intl

-- 
Alan Modra
Australia Development Lab, IBM


obstack.h __PTR_ALIGN vs. ubsan

2021-09-01 Thread Alan Modra via Gcc-patches
Current ubsan complains on every use of __PTR_ALIGN (when ptrdiff_t is
as large as a pointer), due to making calculations relative to a NULL
pointer.  This patch avoids the problem by extracting out and
simplifying __BPTR_ALIGN for the usual case.  I've continued to use
ptrdiff_t here, where it might be better to throw away __BPTR_ALIGN
entirely and just assume uintptr_t exists.

OK to apply for gcc?

* obstack.h (__PTR_ALIGN): Expand and simplify __BPTR_ALIGN
rather than calculating relative to a NULL pointer.

diff --git a/include/obstack.h b/include/obstack.h
index a6eb6c95587..0d8746f835b 100644
--- a/include/obstack.h
+++ b/include/obstack.h
@@ -137,9 +137,9 @@
relative to B.  Otherwise, use the faster strategy of computing the
alignment relative to 0.  */
 
-#define __PTR_ALIGN(B, P, A) \
-  __BPTR_ALIGN (sizeof (ptrdiff_t) < sizeof (void *) ? (B) : (char *) 0,  \
-P, A)
+#define __PTR_ALIGN(B, P, A)   \
+  (sizeof (ptrdiff_t) < sizeof (void *) ? __BPTR_ALIGN (B, P, A)   \
+   : (char *) (((ptrdiff_t) (P) + (A)) & ~(A)))
 
 #ifndef __attribute_pure__
 # if defined __GNUC_MINOR__ && __GNUC__ * 1000 + __GNUC_MINOR__ >= 2096

-- 
Alan Modra
Australia Development Lab, IBM


Re: [POWER10] __morestack calls from pcrel code

2021-07-21 Thread Alan Modra via Gcc-patches
On Wed, Jul 21, 2021 at 08:59:04AM -0400, David Edelsohn wrote:
> On Wed, Jul 21, 2021 at 4:29 AM Alan Modra  wrote:
> >
> > On Wed, Jul 14, 2021 at 08:24:16PM -0400, David Edelsohn wrote:
> > > > > > * config/rs6000/morestack.S (R2_SAVE): Define.
> > > > > > (__morestack): Save and restore r2.  Set up r2 for called
> > > > > > functions.
> > >
> > > This patch is okay.
> >
> > Thanks David, the patch is needed on gcc-11 and gcc-10 too.
> > OK for the branches too?
> 
> Backports are fine, but I believe that Richi is planning to cut GCC 11
> RC today, so you really should check with him about a backport at the
> last minute.

Hi Richard,
Is this patch OK at this late stage for the gcc-11 branch?
https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573978.html

The impacts of the bug are segfaults and other undesirable behaviour
with Go (or more generally -fsplit-stack) on power10 when libgcc is
not power10 pcrel.  A non-pcrel libgcc is very likely how distros
will ship gcc.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 2/4 REVIEW] libtool.m4: fix nm BSD flag detection

2021-07-21 Thread Alan Modra via Gcc-patches
On Wed, Jul 07, 2021 at 08:03:45PM +0100, Nick Alcock via Gcc-patches wrote:
> On 7 Jul 2021, Nick Clifton told this:
> 
> > Hi Nick,
> >
> >> Ping?
> >
> > Oops.
> 
> I sent a bunch of pings out at the same time, to a bunch of different
> projects. You are the only person to respond, so thank you!
> 
> >>>   PR libctf/27482
> >>>   * libtool.m4 (LT_PATH_NM): Try BSDization flags with a user-provided
> >
> > Changes to libtool need to be posted to the libtool project:
> >
> >   https://www.gnu.org/software/libtool/
> 
> I considered this, but there is *serious* divergence between the
> libtool.m4 in our tree and upstream. Fixing this divergence looks to be
> a fairly major project in and of itself :( the last real sync looked
> like being all the way back in 2008.

Yes, I looked at doing a sync myself a few years ago..
I'll OK the two libtool changes for binutils.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [POWER10] __morestack calls from pcrel code

2021-07-14 Thread Alan Modra via Gcc-patches
On Wed, Jun 30, 2021 at 05:06:30PM -0300, Tulio Magno Quites Machado Filho 
wrote:
> Alan Modra via Gcc-patches  writes:
> 
> > Compiling gcc/testsuite/gcc.dg/split-*.c and others with -mcpu=power10
> > and linking with a non-pcrel libgcc results in crashes due to the
> > power10 pcrel code not having r2 set for the generic-morestack.c
> > functions called from __morestack.  There is also a problem when
> > non-pcrel code calls a pcrel libgcc.  See the patch comments.
> >
> > A similar situation theoretically occurs with ELFv1 multi-toc
> > executables, when __morestack might be located in a different toc
> > group to its caller.  This patch makes no attempt to fix that, since
> > the gold linker does not support multi-toc (gold is needed for proper
> > support of -fsplit-stack code) nor does gcc emit __morestack calls
> > that support multi-toc.
> >
> > Bootstrapped and regression tested power64le-linux with both
> > -mcpu=power10 and -mcpu=power9.  OK for mainline and backporting to
> > gcc-11 and gcc-10?
> >
> > * config/rs6000/morestack.S (R2_SAVE): Define.
> > (__morestack): Save and restore r2.  Set up r2 for called
> > functions.
> 
> Thanks! This patch solved the issue I was seeing.
> 
> If it gets merged, can this patch be backported to GCC 10 and 11, please?
> 
> -- 
> Tulio Magno

https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573978.html

This patch has now been unreviewed for over two weeks.  I expected a
rubber stamp style approval;  This assembly file is all mine, I know
the ABI and how .eh_frame driven exception handling works on powerpc.
So I'm going to claim the patch is obvious enough to someone with a
good understanding of what is going on in morestack.S and commit under
the "obvious" rule after allowing a few more days for comment.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] Adjust testcases for power10 instructions

2021-07-01 Thread Alan Modra via Gcc-patches
On Thu, Jul 01, 2021 at 04:47:21PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Jul 01, 2021 at 10:59:15PM +0930, Alan Modra wrote:
> > * lib/target-supports.exp (check_effective_target_has_arch_pwr10): New.
> 
> Mike added this already, please make sure to not add it twice :-)

Yup, rebasing took it out of my patch and a little edit took it out of
my changelog.

> [...]
> > gcc.target/powerpc/pr86731-fwrapv-longlong.c: Match power10 insns.
> 
> (It still allows older as well, so "Also match" maybe?)

OK.

> Did you make sure all of these are correct and expected?

Yes, they still are.  I checked that there was a corresponding
testsuite regression fix for each change too.

>  Are the
> testcases still strict enough.

I think so.

, or should you add -mno-pcrel to the
> options, instead? Or maybe test both -mpcrel and -mno-pcrel?  Etc.

I think adding -mno-pcrel would be a bad idea, since it would reduce
power10 code coverage, and you'll get both by simply running the
testsuite on power10 and say, power9.

> 
> > * gcc.target/powerpc/lvsl-lvsr.c: Avoid file name match.
> 
> You also add a "p?", is that expected?  Should be in the changelog
> then :-)

It was in the changelog..  I mentioned lvsl-lvsr.c twice (which I
suppose might fall foul of the changelog commit checking).  Changing
to

* gcc.target/powerpc/lvsl-lvsr.c: Likewise.  Avoid file name match.

> > -/* { dg-final { scan-assembler-times {\mlxvd2x\M|\mlxv\M} 2 } } */
> > +/* { dg-final { scan-assembler-times {\mlxvd2x\M|\mp?lxv\M} 2 } } */
> 
> 
> > @@ -1,12 +1,12 @@
> >  /* Test expected code generation for lvsl and lvsr on little endian.
> > -   Note that lvsl and lvsr are each produced once, but the filename
> > -   causes them to appear twice in the file.  */
> > +   Note that \s is used in the lvsl/lvsr matches so we don't match
> > +   on '.file "lvsl-lvsr.c"'.  */
> 
> Even better is to not put the instruction names in the filename, but
> heh, maybe that would be too simple ;-)
> 
> 
> Segher

-- 
Alan Modra
Australia Development Lab, IBM


[RS6000] Adjust testcases for power10 instructions

2021-07-01 Thread Alan Modra via Gcc-patches
-final { scan-assembler-times {\maddic\M} 3 { target { has_arch_pwr10 } 
} } } */
+/* { dg-final { scan-assembler-times {\msetbcr\M} 1 { target { has_arch_pwr10 
} } } } */
+/* { dg-final { scan-assembler-times {\maddze\M} 3 } } */
 
 long ne0(long a)
 {
diff --git a/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c 
b/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c
index 1269fe635c6..72b83ef9d08 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c
@@ -30,5 +30,5 @@ vector signed long long splats4(void)
 
 /* { dg-final { scan-assembler-times {\mvspltis[bhw]\M} 0 } } */
 /* { dg-final { scan-assembler-times {\mvsl[bhwd]\M} 0 } } */
-/* { dg-final { scan-assembler-times {\mlvx\M|\mlxv\M|\mlxvd2x\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mlvx\M|\mp?lxv\M|\mlxvd2x\M} 2 } } */
 
diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index 7f78c5593ac..789723fb287 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -6127,6 +6127,16 @@ proc check_effective_target_has_arch_pwr9 { } {
}]
 }
 
+proc check_effective_target_has_arch_pwr10 { } {
+   return [check_no_compiler_messages arch_pwr10 assembly {
+   #ifndef _ARCH_PWR10
+   #error does not have power10 support.
+   #else
+   /* "has power10 support" */
+   #endif
+   }]
+}
+
 # Return 1 if this is a PowerPC target supporting -mcpu=power10.
 # Limit this to 64-bit linux systems for now until other targets support
 # power10.

-- 
Alan Modra
Australia Development Lab, IBM


[POWER10] __morestack calls from pcrel code

2021-06-30 Thread Alan Modra via Gcc-patches
Compiling gcc/testsuite/gcc.dg/split-*.c and others with -mcpu=power10
and linking with a non-pcrel libgcc results in crashes due to the
power10 pcrel code not having r2 set for the generic-morestack.c
functions called from __morestack.  There is also a problem when
non-pcrel code calls a pcrel libgcc.  See the patch comments.

A similar situation theoretically occurs with ELFv1 multi-toc
executables, when __morestack might be located in a different toc
group to its caller.  This patch makes no attempt to fix that, since
the gold linker does not support multi-toc (gold is needed for proper
support of -fsplit-stack code) nor does gcc emit __morestack calls
that support multi-toc.

Bootstrapped and regression tested power64le-linux with both
-mcpu=power10 and -mcpu=power9.  OK for mainline and backporting to
gcc-11 and gcc-10?

* config/rs6000/morestack.S (R2_SAVE): Define.
(__morestack): Save and restore r2.  Set up r2 for called
functions.

diff --git a/libgcc/config/rs6000/morestack.S b/libgcc/config/rs6000/morestack.S
index 4a07de927c5..a2e255e5c21 100644
--- a/libgcc/config/rs6000/morestack.S
+++ b/libgcc/config/rs6000/morestack.S
@@ -31,6 +31,7 @@
 #define PARAMS 48
 #endif
 #define MORESTACK_FRAMESIZE(PARAMS+96)
+#define R2_SAVE-MORESTACK_FRAMESIZE+PARAMS-8
 #define PARAMREG_SAVE  -MORESTACK_FRAMESIZE+PARAMS+0
 #define STATIC_CHAIN_SAVE  -MORESTACK_FRAMESIZE+PARAMS+64
 #define R29_SAVE   -MORESTACK_FRAMESIZE+PARAMS+72
@@ -143,6 +144,17 @@ ENTRY0(__morestack_non_split)
 # cr7 must also be preserved.
 
 ENTRY0(__morestack)
+
+#if _CALL_ELF == 2
+# Functions with localentry bits of zero cannot make calls if those
+# calls might change r2.  This is true generally, and also true for
+# __morestack with its special calling convention.  When __morestack's
+# caller is non-pcrel but libgcc is pcrel, the functions called here
+# might modify r2.  r2 must be preserved on exit, and also restored
+# for the call back to our caller.
+   std %r2,R2_SAVE(%r1)
+#endif
+
 # Save parameter passing registers, our arguments, lr, r29
 # and use r29 as a frame pointer.
std %r3,PARAMREG_SAVE+0(%r1)
@@ -161,10 +173,24 @@ ENTRY0(__morestack)
std %r12,LINKREG_SAVE(%r1)
std %r3,NEWSTACKSIZE_SAVE(%r1)  # new stack size
mr %r29,%r1
+#if _CALL_ELF == 2
+   .cfi_offset %r2,R2_SAVE
+#endif
.cfi_offset %r29,R29_SAVE
.cfi_def_cfa_register %r29
stdu %r1,-MORESTACK_FRAMESIZE(%r1)
 
+#if _CALL_ELF == 2 && !defined __PCREL__
+# If this isn't a pcrel libgcc then the functions we call here will
+# require r2 to be valid.  If __morestack is called from pcrel code r2
+# won't be valid.  Set it up.
+   bcl 20,31,1f
+1:
+   mflr %r12
+   addis %r2,%r12,.TOC.-1b@ha
+   addi %r2,%r2,.TOC.-1b@l
+#endif
+
# void __morestack_block_signals (void)
bl JUMP_TARGET(__morestack_block_signals)
 
@@ -199,6 +225,9 @@ ENTRY0(__morestack)
 # instructions after __morestack's return address.
 #
ld %r12,LINKREG_SAVE(%r29)
+#if _CALL_ELF == 2
+   ld %r2,R2_SAVE(%r29)
+#endif
ld %r3,PARAMREG_SAVE+0(%r29)# restore arg regs
ld %r4,PARAMREG_SAVE+8(%r29)
ld %r5,PARAMREG_SAVE+16(%r29)
@@ -228,6 +257,15 @@ ENTRY0(__morestack)
std %r10,PARAMREG_SAVE+56(%r29)
 #endif
 
+#if _CALL_ELF == 2 && !defined __PCREL__
+# r2 was restored for calling back into our caller.  Set it up again.
+   bcl 20,31,1f
+1:
+   mflr %r12
+   addis %r2,%r12,.TOC.-1b@ha
+   addi %r2,%r2,.TOC.-1b@l
+#endif
+
bl JUMP_TARGET(__morestack_block_signals)
 
# void *__generic_releasestack (size_t *pavailable)
@@ -249,6 +287,9 @@ ENTRY0(__morestack)
 # Restore return value regs, and return.
ld %r0,LINKREG_SAVE(%r29)
mtlr %r0
+#if _CALL_ELF == 2
+   ld %r2,R2_SAVE(%r29)
+#endif
ld %r3,PARAMREG_SAVE+0(%r29)
ld %r4,PARAMREG_SAVE+8(%r29)
ld %r5,PARAMREG_SAVE+16(%r29)

-- 
Alan Modra
Australia Development Lab, IBM


Re: PowerPC64 ELFv2 -fpatchable-function-entry

2021-05-19 Thread Alan Modra via Gcc-patches
On Tue, May 18, 2021 at 05:48:40AM -0500, Segher Boessenkool wrote:
> I wrote a bit later:
> 
> > > Can you make this less hacky please?  Changing the generic code is just
> > > fine as well, it needs some love.
> 
> In effect making a callback / hook without making that explicit is bad
> for maintainability.  We are in stage 1, now is a good time for
> infrastructure changes.

Yes, perhaps I could do that, and possibly even write a patch that is
accepted.  I'm not going to though, because I made a decision a little
while ago that I'm not going to contribute new gcc work.  This one was
just flushing some patches that I wrote a while ago.

-- 
Alan Modra
Australia Development Lab, IBM


Re: PowerPC64 ELFv2 -fpatchable-function-entry

2021-05-18 Thread Alan Modra via Gcc-patches
On Mon, May 10, 2021 at 04:39:55PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, May 07, 2021 at 12:19:52PM +0930, Alan Modra wrote:
> > PowerPC64 ELFv2 dual entry point functions have a couple of problems
> > with -fpatchable-function-entry.  One is that the nops added after the
> > global entry land in the global entry code which is constrained to be
> > a power of two number of instructions, and zero global entry code has
> > special meaning for linkage.  The other is that the global entry code
> > isn't always used on function entry, and some uses of
> > -fpatchable-function-entry might want to affect all entries to the
> > function.  So this patch arranges to put one batch of nops before the
> > global entry, and the other after the local entry point.
> 
> Wow ugly.  Not worse than patchable-funtion-enbtry itself I suppose :-)
> 
> > * config/rs6000/rs6000-logue.c: Include targhooks.h.
> 
> Huh, did it not already do that?!  Hrm, all the other hooks seem to be
> called via rs6000.c currently.  But you do the same, so why do you need
> to include the header in rs6000-logue.c?

For the new call to default_print_patchable_function_entry in
rs6000_output_function_prologue.

> 
> > +/* Write NOPs into the asm outfile FILE around a function entry.  This
> > +   routine may be called twice per function to put NOPs before and after
> > +   the function entry.  If RECORD_P is true the location of the NOPs will
> > +   be recorded by default_print_patchable_function_entry in a special
> > +   object section called "__patchable_function_entries".  Disable output
> > +   of any NOPs for the second call.  Those, if any, are output by
> > +   rs6000_output_function_prologue.  This means that for ELFv2 any NOPs
> > +   after the function entry are placed after the local entry point, not
> > +   the global entry point.  NOPs after the entry may be found at
> > +   record_loc + nops_before * 4 + local_entry_offset.  This holds true
> > +   when nops_before is zero.  */
> > +
> > +static void
> > +rs6000_print_patchable_function_entry (FILE *file,
> > +  unsigned HOST_WIDE_INT patch_area_size 
> > ATTRIBUTE_UNUSED,
> > +  bool record_p)
> 
> It is not a predicate.  Do not call it _p please.  Yes, I know the
> generic code calls is this.  That needs fixing.
> 
> A better name (from the viewpoint of callers, which is what matters!)
> might be first_time or similar?  Or something that really says what it
> *does*?
> 
> > +{
> > +  /* Always call default_print_patchable_function_entry when RECORD_P in
> > + order to output the location of the NOPs, but use the size of the
> > + area before the entry on both possible calls.  If RECORD_P is true
> > + on the second call then the area before the entry was zero size and
> > + thus no NOPs will be output.  */
> > +  if (record_p)
> > +default_print_patchable_function_entry (file, crtl->patch_area_entry,
> > +   record_p);
> > +}
> 
> That is trickiness that will break later.

There is a fine line between trickiness and elegance.  Given the
current available crtl variables dealing with the patch area and the
current patchable_function_entry parameters, any supposedly "less
hacky" but correct expression will simplify down to what I wrote
here.

-- 
Alan Modra
Australia Development Lab, IBM


Re: PowerPC64 ELFv1 -fpatchable-function-entry

2021-05-18 Thread Alan Modra via Gcc-patches
On Fri, May 07, 2021 at 07:24:02PM -0500, Segher Boessenkool wrote:
> Looks good with those things tweaked.

Except that the patch chose the wrong section to emit the label,
because the decl is wrong.  But of course I was using the same decl as
used in existing SHF_LINK_ORDER support, so it was already broken.
You can see this on x86_64 gcc/testsuite/g++.dg/pr93195a.C output for
bar().

.globl  _Z3barv
.type   _Z3barv, @function
_Z3barv:
.LFB1:
.cfi_startproc
.section__patchable_function_entries,"awo",@progbits,_Z3foov
.align 8
.quad   .LPFE2
.text
.LPFE2:
nop
pushq   %rbp

Note the reference to _Z3foov!  Tracking down why this happens isn't
hard.  The first get_section ("__patchable_function_entries", ...)
saves the decl for foo, which is the one then used on all subsequent
calls.  Oops.  Jakub, commit 134afa38f0be didn't quite do the right
thing.

This problem can be fixed with a simplification of the patch I posted,
relying on the fact that currently in the only place that wants to
emit SHF_LINK_ORDER sections we've already switched to the correct
function code section.

Regression testing in progress.

PR target/98125
* varasm.c (default_elf_asm_named_section): Use a function
code label rather than the function symbol as the "o" argument.

diff --git a/gcc/varasm.c b/gcc/varasm.c
index 97c1e6fff25..6f10ac7762e 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -6866,6 +6866,15 @@ default_elf_asm_named_section (const char *name, 
unsigned int flags,
   *f = '\0';
 }
 
+  char func_label[256];
+  if (flags & SECTION_LINK_ORDER)
+{
+  static int func_code_labelno;
+  func_code_labelno++;
+  ASM_GENERATE_INTERNAL_LABEL (func_label, "LPFC", func_code_labelno);
+  ASM_OUTPUT_LABEL (asm_out_file, func_label);
+}
+
   fprintf (asm_out_file, "\t.section\t%s,\"%s\"", name, flagchars);
 
   /* default_section_type_flags (above) knows which flags need special
@@ -6893,11 +6902,8 @@ default_elf_asm_named_section (const char *name, 
unsigned int flags,
fprintf (asm_out_file, ",%d", flags & SECTION_ENTSIZE);
   if (flags & SECTION_LINK_ORDER)
{
- tree id = DECL_ASSEMBLER_NAME (decl);
- ultimate_transparent_alias_target ();
- const char *name = IDENTIFIER_POINTER (id);
- name = targetm.strip_name_encoding (name);
- fprintf (asm_out_file, ",%s", name);
+ fprintf (asm_out_file, ",");
+ assemble_name_raw (asm_out_file, func_label);
}
   if (HAVE_COMDAT_GROUP && (flags & SECTION_LINKONCE))
{

-- 
Alan Modra
Australia Development Lab, IBM


PowerPC64 ELFv2 -fpatchable-function-entry

2021-05-06 Thread Alan Modra via Gcc-patches
PowerPC64 ELFv2 dual entry point functions have a couple of problems
with -fpatchable-function-entry.  One is that the nops added after the
global entry land in the global entry code which is constrained to be
a power of two number of instructions, and zero global entry code has
special meaning for linkage.  The other is that the global entry code
isn't always used on function entry, and some uses of
-fpatchable-function-entry might want to affect all entries to the
function.  So this patch arranges to put one batch of nops before the
global entry, and the other after the local entry point.

PR target/98125
* config/rs6000/rs6000.c (rs6000_print_patchable_function_entry): New
function.
(TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY): Define.
* config/rs6000/rs6000-logue.c: Include targhooks.h.
(rs6000_output_function_prologue): Handle nops for
-fpatchable-function-entry after local entry point.

diff --git a/gcc/config/rs6000/rs6000-logue.c b/gcc/config/rs6000/rs6000-logue.c
index b0ac183ceff..ffa3bb3dcf1 100644
--- a/gcc/config/rs6000/rs6000-logue.c
+++ b/gcc/config/rs6000/rs6000-logue.c
@@ -51,6 +51,7 @@
 #include "gstab.h"  /* for N_SLINE */
 #include "dbxout.h" /* dbxout_ */
 #endif
+#include "targhooks.h"
 
 static int rs6000_ra_ever_killed (void);
 static void is_altivec_return_reg (rtx, void *);
@@ -3991,6 +3992,10 @@ rs6000_output_function_prologue (FILE *file)
   fputs (",1\n", file);
 }
 
+  int nops_after_entry = crtl->patch_area_size - crtl->patch_area_entry;
+  if (nops_after_entry > 0)
+default_print_patchable_function_entry (file, nops_after_entry, false);
+
   /* Output -mprofile-kernel code.  This needs to be done here instead of
  in output_function_profile since it must go after the ELFv2 ABI
  local entry point.  */
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index d43c36e7f1a..97f1b3e0674 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -1404,6 +1404,10 @@ static const struct attribute_spec 
rs6000_attribute_table[] =
 #undef TARGET_ASM_FUNCTION_EPILOGUE
 #define TARGET_ASM_FUNCTION_EPILOGUE rs6000_output_function_epilogue
 
+#undef TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY
+#define TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY \
+  rs6000_print_patchable_function_entry
+
 #undef TARGET_ASM_OUTPUT_ADDR_CONST_EXTRA
 #define TARGET_ASM_OUTPUT_ADDR_CONST_EXTRA rs6000_output_addr_const_extra
 
@@ -14953,6 +14957,33 @@ rs6000_assemble_visibility (tree decl, int vis)
 }
 #endif
 
+/* Write NOPs into the asm outfile FILE around a function entry.  This
+   routine may be called twice per function to put NOPs before and after
+   the function entry.  If RECORD_P is true the location of the NOPs will
+   be recorded by default_print_patchable_function_entry in a special
+   object section called "__patchable_function_entries".  Disable output
+   of any NOPs for the second call.  Those, if any, are output by
+   rs6000_output_function_prologue.  This means that for ELFv2 any NOPs
+   after the function entry are placed after the local entry point, not
+   the global entry point.  NOPs after the entry may be found at
+   record_loc + nops_before * 4 + local_entry_offset.  This holds true
+   when nops_before is zero.  */
+
+static void
+rs6000_print_patchable_function_entry (FILE *file,
+  unsigned HOST_WIDE_INT patch_area_size 
ATTRIBUTE_UNUSED,
+  bool record_p)
+{
+  /* Always call default_print_patchable_function_entry when RECORD_P in
+ order to output the location of the NOPs, but use the size of the
+ area before the entry on both possible calls.  If RECORD_P is true
+ on the second call then the area before the entry was zero size and
+ thus no NOPs will be output.  */
+  if (record_p)
+default_print_patchable_function_entry (file, crtl->patch_area_entry,
+   record_p);
+}
+
 enum rtx_code
 rs6000_reverse_condition (machine_mode mode, enum rtx_code code)
 {


Revert "rs6000: Avoid -fpatchable-function-entry* regressions on powerpc64 be [PR98125]"

2021-05-06 Thread Alan Modra via Gcc-patches
This reverts commit b680b9049737198d010e49cf434704c6a6ed2b3f now
that the PowerPC64 ELFv1 regression is fixed properly.

PR testsuite/98125
* targhooks.h (default_print_patchable_function_entry_1): Delete.
* targhooks.c (default_print_patchable_function_entry_1): Delete.
(default_print_patchable_function_entry): Expand above.
* config/rs6000/rs6000.c (TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY):
Don't define.
(rs6000_print_patchable_function_entry): Delete.
* testsuite/g++.dg/pr93195a.C: Revert 2021-04-03 change.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 663eed4f055..d43c36e7f1a 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -1362,10 +1362,6 @@ static const struct attribute_spec 
rs6000_attribute_table[] =
 #define TARGET_ASM_ASSEMBLE_VISIBILITY rs6000_assemble_visibility
 #endif
 
-#undef TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY
-#define TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY \
-  rs6000_print_patchable_function_entry
-
 #undef TARGET_SET_UP_BY_PROLOGUE
 #define TARGET_SET_UP_BY_PROLOGUE rs6000_set_up_by_prologue
 
@@ -14957,30 +14953,6 @@ rs6000_assemble_visibility (tree decl, int vis)
 }
 #endif
 
-/* Write PATCH_AREA_SIZE NOPs into the asm outfile FILE around a function
-   entry.  If RECORD_P is true and the target supports named sections,
-   the location of the NOPs will be recorded in a special object section
-   called "__patchable_function_entries".  This routine may be called
-   twice per function to put NOPs before and after the function
-   entry.  */
-
-void
-rs6000_print_patchable_function_entry (FILE *file,
-  unsigned HOST_WIDE_INT patch_area_size,
-  bool record_p)
-{
-  unsigned int flags = SECTION_WRITE | SECTION_RELRO;
-  /* When .opd section is emitted, the function symbol
- default_print_patchable_function_entry_1 is emitted into the .opd section
- while the patchable area is emitted into the function section.
- Don't use SECTION_LINK_ORDER in that case.  */
-  if (!(TARGET_64BIT && DEFAULT_ABI != ABI_ELFv2)
-  && HAVE_GAS_SECTION_LINK_ORDER)
-flags |= SECTION_LINK_ORDER;
-  default_print_patchable_function_entry_1 (file, patch_area_size, record_p,
-   flags);
-}
-
 enum rtx_code
 rs6000_reverse_condition (machine_mode mode, enum rtx_code code)
 {
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index 952fad422eb..d69c9a2d819 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -1832,15 +1832,17 @@ default_compare_by_pieces_branch_ratio (machine_mode)
   return 1;
 }
 
-/* Helper for default_print_patchable_function_entry and other
-   print_patchable_function_entry hook implementations.  */
+/* Write PATCH_AREA_SIZE NOPs into the asm outfile FILE around a function
+   entry.  If RECORD_P is true and the target supports named sections,
+   the location of the NOPs will be recorded in a special object section
+   called "__patchable_function_entries".  This routine may be called
+   twice per function to put NOPs before and after the function
+   entry.  */
 
 void
-default_print_patchable_function_entry_1 (FILE *file,
- unsigned HOST_WIDE_INT
- patch_area_size,
- bool record_p,
- unsigned int flags)
+default_print_patchable_function_entry (FILE *file,
+   unsigned HOST_WIDE_INT patch_area_size,
+   bool record_p)
 {
   const char *nop_templ = 0;
   int code_num;
@@ -1862,6 +1864,9 @@ default_print_patchable_function_entry_1 (FILE *file,
   patch_area_number++;
   ASM_GENERATE_INTERNAL_LABEL (buf, "LPFE", patch_area_number);
 
+  unsigned int flags = SECTION_WRITE | SECTION_RELRO;
+  if (HAVE_GAS_SECTION_LINK_ORDER)
+   flags |= SECTION_LINK_ORDER;
   switch_to_section (get_section ("__patchable_function_entries",
  flags, current_function_decl));
   assemble_align (POINTER_SIZE);
@@ -1878,25 +1883,6 @@ default_print_patchable_function_entry_1 (FILE *file,
 output_asm_insn (nop_templ, NULL);
 }
 
-/* Write PATCH_AREA_SIZE NOPs into the asm outfile FILE around a function
-   entry.  If RECORD_P is true and the target supports named sections,
-   the location of the NOPs will be recorded in a special object section
-   called "__patchable_function_entries".  This routine may be called
-   twice per function to put NOPs before and after the function
-   entry.  */
-
-void
-default_print_patchable_function_entry (FILE *file,
-   unsigned HOST_WIDE_INT patch_area_size,
-   bool record_p)
-{
-  unsigned int flags = SECTION_WRITE | SECTION_RELRO;
-  if 

PowerPC64 ELFv1 -fpatchable-function-entry

2021-05-06 Thread Alan Modra via Gcc-patches
On PowerPC64 ELFv1 function symbols are defined on function
descriptors in an .opd section rather than in the function code.
.opd is not split up by the PowerPC64 backend for comdat groups or
other situations where per-function sections are required.  Thus
SECTION_LINK_ORDER can't use the function name to reference a suitable
section for ordering:  The .opd section might contain many other
function descriptors and they may be in a different order to the final
function code placement.  This patch arranges to use a code label
instead of the function name symbol.

I chose to emit the label inside default_elf_asm_named_section,
immediately before the .section directive using the label, and in case
someone uses .previous or the like, need to save and restore the
current section when switching to the function code section to emit
the label.  That requires a tweak to switch_to_section in order to get
the current section.  I checked all the TARGET_ASM_NAMED_SECTION
functions and unnamed.callback functions and it appears none will be
affected by that tweak.

PR target/98125
* varasm.c (default_elf_asm_named_section): Use a function
code label rather than the function symbol as the "o" argument.
(switch_to_section): Don't set in_section until section
directive has been emitted.

diff --git a/gcc/varasm.c b/gcc/varasm.c
index 97c1e6fff25..5f95f8cfa75 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -6866,6 +6866,26 @@ default_elf_asm_named_section (const char *name, 
unsigned int flags,
   *f = '\0';
 }
 
+  char func_label[256];
+  if (flags & SECTION_LINK_ORDER)
+{
+  static int recur;
+  if (recur)
+   gcc_unreachable ();
+  else
+   {
+ ++recur;
+ section *save_section = in_section;
+ static int func_code_labelno;
+ switch_to_section (function_section (decl));
+ ++func_code_labelno;
+ ASM_GENERATE_INTERNAL_LABEL (func_label, "LPFC", func_code_labelno);
+ ASM_OUTPUT_LABEL (asm_out_file, func_label);
+ switch_to_section (save_section);
+ --recur;
+   }
+}
+
   fprintf (asm_out_file, "\t.section\t%s,\"%s\"", name, flagchars);
 
   /* default_section_type_flags (above) knows which flags need special
@@ -6893,11 +6913,8 @@ default_elf_asm_named_section (const char *name, 
unsigned int flags,
fprintf (asm_out_file, ",%d", flags & SECTION_ENTSIZE);
   if (flags & SECTION_LINK_ORDER)
{
- tree id = DECL_ASSEMBLER_NAME (decl);
- ultimate_transparent_alias_target ();
- const char *name = IDENTIFIER_POINTER (id);
- name = targetm.strip_name_encoding (name);
- fprintf (asm_out_file, ",%s", name);
+ fputc (',', asm_out_file);
+ assemble_name_raw (asm_out_file, func_label);
}
   if (HAVE_COMDAT_GROUP && (flags & SECTION_LINKONCE))
{
@@ -7821,11 +7838,6 @@ switch_to_section (section *new_section, tree decl)
   else if (in_section == new_section)
 return;
 
-  if (new_section->common.flags & SECTION_FORGET)
-in_section = NULL;
-  else
-in_section = new_section;
-
   switch (SECTION_STYLE (new_section))
 {
 case SECTION_NAMED:
@@ -7843,6 +7855,11 @@ switch_to_section (section *new_section, tree decl)
   break;
 }
 
+  if (new_section->common.flags & SECTION_FORGET)
+in_section = NULL;
+  else
+in_section = new_section;
+
   new_section->common.flags |= SECTION_DECLARED;
 }
 


PR98125, PowerPC64 -fpatchable-function-entry

2021-05-06 Thread Alan Modra via Gcc-patches
This series of patches fixes -fpatchable-function-entry on PowerPC64
ELFv1 so that SECTION_LINK_ORDER (.section 'o' arg) is now supported,
and on PowerPC64 ELFv2 to not break the global entry code.

Bootstrapped powerpc64le-linux and x86_64-linux all langs.  I did see
one regression on both targets, libgo runtime/pprof.  It's unclear to
me what that means.

Alan Modra (3):
  PowerPC64 ELFv1 -fpatchable-function-entry
  Revert "rs6000: Avoid -fpatchable-function-entry* regressions on
powerpc64 be [PR98125]"
  PowerPC64 ELFv2 -fpatchable-function-entry

 gcc/config/rs6000/rs6000-logue.c |  5 
 gcc/config/rs6000/rs6000.c   | 47 +---
 gcc/targhooks.c  | 38 --
 gcc/targhooks.h  |  3 --
 gcc/testsuite/g++.dg/pr93195a.C  |  1 -
 gcc/varasm.c | 37 ++---
 6 files changed, 69 insertions(+), 62 deletions(-)



Re: RFC: Changing AC_PROG_CC to AC_PROG_CC_C99 in top level configure

2021-05-05 Thread Alan Modra via Gcc-patches
On Wed, May 05, 2021 at 08:05:29AM +0100, Iain Sandoe wrote:
> Alan Modra via Gcc-patches  wrote:
> 
> > On 2021-05-04 8:42 a.m., Nick Clifton wrote:
> > > Hi Guys,
> > > 
> > > On 4/30/21 7:36 PM, Simon Marchi wrote:
> > > > I think this fix is obvious enough, I encourage you to push it,
> > > 
> > > OK - I have pushed the patch to the mainline branches of both
> > > the gcc and binutils-gdb repositories.
> > 
> > Thanks Nick!  Incidentally, I checked the AC_PROG_CC_C99 change on
> > both binutils and gcc mainline using gcc-4.9.
> > 
> > To build gcc on x86_64 I found the following patch necessary to avoid
> > lots of
> > error: uninitialized const member ‘stringop_algs::stringop_strategy::max’
> > error: uninitialized const member ‘stringop_algs::stringop_strategy::alg’
> > when compiling config/i386/i386-options.c.  These can't be cured by
> > configuring with --disable-stage1-checking.
> > 
> > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> > index 97d6f3863cb..cc3b1b6d666 100644
> > --- a/gcc/config/i386/i386.h
> > +++ b/gcc/config/i386/i386.h
> > @@ -73,8 +73,8 @@ struct stringop_algs
> > {
> >   const enum stringop_alg unknown_size;
> >   const struct stringop_strategy {
> > -const int max;
> > -const enum stringop_alg alg;
> > +int max;
> > +enum stringop_alg alg;
> > int noalign;
> >   } size [MAX_STRINGOP_ALGS];
> > };
> 
> does this relate to / fix PR 100246 (which seems to fire for some GCC
> versions as well
> as older clang)?

Yes, looks like the same issue.  I started making a similar fix to the
one you attached to the PR, then laziness kicked in after noticing the
errors were only given on the const elements.

-- 
Alan Modra
Australia Development Lab, IBM


Re: RFC: Changing AC_PROG_CC to AC_PROG_CC_C99 in top level configure

2021-05-04 Thread Alan Modra via Gcc-patches
On 2021-05-04 8:42 a.m., Nick Clifton wrote:
> Hi Guys,
> 
> On 4/30/21 7:36 PM, Simon Marchi wrote:
>> I think this fix is obvious enough, I encourage you to push it,
> 
> OK - I have pushed the patch to the mainline branches of both
> the gcc and binutils-gdb repositories.

Thanks Nick!  Incidentally, I checked the AC_PROG_CC_C99 change on
both binutils and gcc mainline using gcc-4.9.

To build gcc on x86_64 I found the following patch necessary to avoid
lots of
error: uninitialized const member ‘stringop_algs::stringop_strategy::max’
error: uninitialized const member ‘stringop_algs::stringop_strategy::alg’
when compiling config/i386/i386-options.c.  These can't be cured by
configuring with --disable-stage1-checking.

diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 97d6f3863cb..cc3b1b6d666 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -73,8 +73,8 @@ struct stringop_algs
 {
   const enum stringop_alg unknown_size;
   const struct stringop_strategy {
-const int max;
-const enum stringop_alg alg;
+int max;
+enum stringop_alg alg;
 int noalign;
   } size [MAX_STRINGOP_ALGS];
 };


-- 
Alan Modra
Australia Development Lab, IBM


Re: RFC: Changing AC_PROG_CC to AC_PROG_CC_C99 in top level configure

2021-05-03 Thread Alan Modra via Gcc-patches
On Mon, May 03, 2021 at 10:47:15AM -0400, Simon Marchi wrote:
> > Yes, I prefer the configure fix too.  If we state we require C99 in
> > binutils then we ought to be able to use C99..
> > 
> > Nick, does the configure.ac change also need to go in all subdirs, to
> > support people running make in say ld/ rather than running make in the
> > top build dir?
> 
> For GDB, it's not supported to run gdb/configure directly, you need to
> use the top-level configure.  Is it supported from some of the other
> projects in the repo?
> 
> I just tried with ld, it doesn't work since it depends on bfd also being
> built.  I tried with just bfd, it doesn't work (with the default
> configure options at least) because it requires zlib being built.

I wasn't talking about running configure, I was talking about running
make.  For example, you configure and make binutils as usual, then
after making a change to ld/ files, run make in the ld build dir.  I
don't tend to do that myself but I do run "make check" sometimes in a
subdir expecting to get the same results in that subdir as if "make
check" was run from the top level.

But I should have just tried it myself rather than asking.  CC, CPP
and others are inherited from the top level and appear with -std=gnu99
in the subdir Makefiles.  So it seems all the AC_PROG_CC in subdir
configure.ac can stay as they are.

> 
> So if all projects need to go through the top-level configure script
> anyway, and C99 is a baseline for all projects, then having the check
> only in the top-level makes sense to me.  Projects that have more
> specific requirements can have their own checks.  For example, sim/
> requires C11 now.  Unless the C99 check at top-level somehow does not
> play well with the C11 check in sim/?  Like if that would cause CC to be
> set to "gcc -std=gnu99 -std=gnu11" or something like that.
> 
> Simon

-- 
Alan Modra
Australia Development Lab, IBM


Re: RFC: Changing AC_PROG_CC to AC_PROG_CC_C99 in top level configure

2021-05-03 Thread Alan Modra via Gcc-patches
On Fri, Apr 30, 2021 at 03:48:00PM -0600, Jeff Law via Gcc-patches wrote:
> 
> On 4/30/2021 12:36 PM, Simon Marchi via Gcc-patches wrote:
> > On 2021-04-26 7:32 a.m., Nick Clifton via Gdb-patches wrote:> Hi Guys,
> > >Given that gcc, gdb and now binutils are all now requiring C99 as a
> > >minimum version of C, are there any objections to updating
> > >configure.ac to reflect this ?
> > > 
> > > Cheers
> > >Nick
> > > 
> > > diff --git a/configure.ac b/configure.ac
> > > index a721316d07b..59b4194fb24 100644
> > > --- a/configure.ac
> > > +++ b/configure.ac
> > > @@ -1278,7 +1278,7 @@ else
> > > WINDMC_FOR_BUILD="\$(WINDMC)"
> > >   fi
> > > 
> > > -AC_PROG_CC
> > > +AC_PROG_CC_C99
> > >   AC_PROG_CXX
> > > 
> > >   # We must set the default linker to the linker used by gcc for the 
> > > correct
> > Hi Nick,
> > 
> > I think this fix is obvious enough, I encourage you to push it, that
> > will fix the build failure many people get in opcodes/ppc-dis.c.  We'll
> > just remove the line later when we upgrade to Autoconf 2.71, as simple
> > as that.  For now we use 2.69.  If that matters, you have my OK for the
> > GDB side of things.
> 
> That works for me.  I'd just sent Alan the trivial patch to make ppc-dis.c
> compile again with C89, but if we're going to update configure.ac
> appropriately, then it wouldn't be needed.

Yes, I prefer the configure fix too.  If we state we require C99 in
binutils then we ought to be able to use C99..

Nick, does the configure.ac change also need to go in all subdirs, to
support people running make in say ld/ rather than running make in the
top build dir?

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] Add -fgnu-retain/-fno-gnu-retain

2021-02-18 Thread Alan Modra via Gcc-patches
On Wed, Feb 17, 2021 at 11:23:12AM +, Jozef Lawrynowicz wrote:
> In previous discussions, it seemed to me that there was general support
> for the "used" attribute implying SHF_GNU_RETAIN and saving symbols from
> linker garbage collection[1]. Of course, the current implementation
> results in undesirable behavior - the thought that all linker scripts
> not supporting uniquely named sections would need to be updated is quite
> alarming.
> 
> It's a shame that all this extra complication is required, just because
> we cannot have a ".retain ", directive.

Is that true?  Isn't the problem really that retained sections are
named as if -ffunction-sections/-fdata-sections were in force?  And
that is due to wanting separate sections so that garbage collection
works, since SHF_GNU_RETAIN is all about linker garbage collection.

I don't see how having a ".retain " would help much.

> My preferred vision for this functionality was:
>   - SHF_GNU_RETAIN section flag indicates a section should be saved
> from linker garbage collection.
>   - ".retain " assembler directive applies SHF_GNU_RETAIN
> to the section containing .
>   - GCC "used" attribute emits a ".retain " directive for
> the symbol declaration is is applied to.  Applying the "used"
> attribute to a symbol declaration does not change the structure of
> the object file, beyond applying SHF_GNU_RETAIN to the section
> containing that symbol.

That description seems to say that a ".retain foo" would mean
everything in foo's section is kept.  If foo's section was the usual
.data, you've kept virtually everything from garbage collection.
Surely you don't expect ".retain foo" to create a separate .data
section for foo?  If you do, I'm strongly against that idea.

Note that gas indeed supports multiple sections named .data that can
serve the same purpose as -fdata-sections.  See the gas doc for the
optional .section field "unique".  That might be the best way to avoid
an under-the-hood -ffunction-sections/-fdata-sections.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 4/8] intl: turn LIBINTL into -L / -l form

2021-02-10 Thread Alan Modra via Gcc-patches
On Mon, Feb 08, 2021 at 11:16:31AM +, Nick Alcock via Binutils wrote:
> intl/ChangeLog
> 2021-02-04  Nick Alcock  
> 
>   * configure.ac (LIBINTL): Transform into -L/-lintl form.
>   * configure: Regenerate.

OK for binutils.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 3/8] intl: always picify

2021-02-10 Thread Alan Modra via Gcc-patches
On Mon, Feb 08, 2021 at 11:16:30AM +, Nick Alcock via Binutils wrote:
> intl/ChangeLog
> 2021-02-02  Nick Alcock  
> 
>   * aclocal.m4: include picflag.m4.
>   * configure.ac (PICFLAG): Add and substitute.
>   * Makefile.in (PICFLAG): New.
>   (COMPILE): Use it.
>   * configure: Regenerate.

OK for binutils.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH, rs6000] Optimization for PowerPC 64bit constant generation [PR94395]

2021-02-02 Thread Alan Modra via Gcc-patches
On Fri, Jan 29, 2021 at 11:11:23AM +0800, HAO CHEN GUI via Gcc-patches wrote:
>    This patch tries to optimize PowerPC 64 bit constant generation when the
> constant can be transformed from a 32 bit or 16 bit constant by rotating,
> shifting and mask AND.

All and more of what you are doing here for rotated 16-bit constants
is covered by
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555760.html

That patch is still waiting on review.  Hmm, I see my local copy of
that patch has one extra line in
gcc/testsuite/gcc.target/powerpc/rot_cst2.c
+/* { dg-additional-options "-mno-prefixed" { target { lp64 } } } */
in order to keep scan-assembler-times counts correct for power10.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 5/8] [RS6000] rs6000_rtx_costs cost IOR

2021-01-31 Thread Alan Modra via Gcc-patches
On Mon, Jan 25, 2021 at 04:51:43PM -0600, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Oct 08, 2020 at 09:27:57AM +1030, Alan Modra wrote:
> > * config/rs6000/rs6000.c (rotate_insert_cost): New function.
> > (rs6000_rtx_costs): Cost IOR.
> > 
> > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> > index 383d2901c9f..15a806fe307 100644
> > --- a/gcc/config/rs6000/rs6000.c
> > +++ b/gcc/config/rs6000/rs6000.c
> > @@ -21206,6 +21206,91 @@ rs6000_cannot_copy_insn_p (rtx_insn *insn)
> >  && get_attr_cannot_copy (insn);
> >  }
> >  
> > +/* Handle rtx_costs for scalar integer rotate and insert insns.  */
> 
> You need to document here what the return value means, and what the
> preconditions for "left" (and "right") are.

Done, and I moved the preconditions on "left" into the new function.

> > +static bool
> > +rotate_insert_cost (rtx left, rtx right, machine_mode mode, bool speed,
> > +   int *total)
> > +{
> > +  if (GET_CODE (right) == AND
> 
> ... because you never check the CODE of "left".
> 
> > +  && CONST_INT_P (XEXP (right, 1))
> > +  && UINTVAL (XEXP (left, 1)) + UINTVAL (XEXP (right, 1)) + 1 == 0)
> 
> HOST_WIDE_INT is always exactly 64 bits now, so you could do "== -1".

Yes, but this is exactly the way the expression occurs in rotl*_insert*
instruction patterns.  I think it's better to keep them the same.

> > +{
> > +  rtx leftop = XEXP (left, 0);
> > +  rtx rightop = XEXP (right, 0);
> > +
> > +  /* rotlsi3_insert_5.  */
> > +  if (REG_P (leftop)
> > + && REG_P (rightop)
> > + && mode == SImode
> > + && UINTVAL (XEXP (left, 1)) != 0
> > + && UINTVAL (XEXP (right, 1)) != 0
> > + && rs6000_is_valid_mask (XEXP (left, 1), NULL, NULL, mode))
> > +   return true;
> 
> Empty line after return please.

Done, here and elsewhere.

> > +  /* rotldi3_insert_6.  */
> > +  if (REG_P (leftop)
> > + && REG_P (rightop)
> > + && mode == DImode
> > + && exact_log2 (-UINTVAL (XEXP (left, 1))) > 0)
> > +   return true;
> > +  /* rotldi3_insert_7.  */
> > +      if (REG_P (leftop)
> > + && REG_P (rightop)
> > + && mode == DImode
> > + && exact_log2 (-UINTVAL (XEXP (right, 1))) > 0)
> > +   return true;
> 
> Those could just use rs6000_is_valid_mask as well?

This is taken straight from rotldi3_insert_7, so it really ought to
stay that way.

> 
> Please wait this until stage 1.  Sorry.

OK, I'll leave all the rs6000_rtx_costs changes until then.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 3/8] [RS6000] rs6000_rtx_costs tidy AND

2021-01-31 Thread Alan Modra via Gcc-patches
On Mon, Jan 25, 2021 at 04:37:21PM -0600, Segher Boessenkool wrote:
> I still do not see what this improves, I only see possible obvious
> regressions :-(

You asked me to break the patch series into small pieces, for ease of
review and to separate tidies from functional changes.  Well OK, fair
enough.  This is one of the tidies.  The idea being to make
rs6000_rtx_costs a little more self-consistent, to not have someone
look at this code in the future and wonder why AND was treated
differently to other operations.

The only part of this patch that I can imagine you see as a possible
regression is the "Don't avoid recursion on const_int shift count"
part.  That is there only because you wanted it that way in new code.
I think you said something about premature optimisation when I made
the new code special case const_int and reg to stop recursion, like
AND.  So for consistency I made the change in old code too.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] Adjust testcases for power10 instructions V3

2021-01-21 Thread Alan Modra via Gcc-patches
Ping.

On Tue, Jan 12, 2021 at 02:03:18PM +1030, Alan Modra wrote:
> Ping
> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/557587.html
> 
> On Fri, Oct 30, 2020 at 07:00:14PM +1030, Alan Modra wrote:
> > And now waking up to what you meant by the lvsl-lvsr.c \s comment,
> > plus a revised ppc-ne0-1.c scan-assembler.
> > 
> > I think this covers all previous review corrections.  Regression tested
> > powerpc64-linux power7 and powerpc64le-linux power10.  OK?
> > 
> > * lib/target-supports.exp (check_effective_target_has_arch_pwr10): New.
> > * gcc.dg/pr56727-2.c,
> > gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c,
> > gcc.target/powerpc/fold-vec-load-builtin_vec_xl-char.c,
> > gcc.target/powerpc/fold-vec-load-builtin_vec_xl-double.c,
> > gcc.target/powerpc/fold-vec-load-builtin_vec_xl-float.c,
> > gcc.target/powerpc/fold-vec-load-builtin_vec_xl-int.c,
> > gcc.target/powerpc/fold-vec-load-builtin_vec_xl-longlong.c,
> > gcc.target/powerpc/fold-vec-load-builtin_vec_xl-short.c,
> > gcc.target/powerpc/fold-vec-load-vec_vsx_ld-char.c,
> > gcc.target/powerpc/fold-vec-load-vec_vsx_ld-double.c,
> > gcc.target/powerpc/fold-vec-load-vec_vsx_ld-float.c,
> > gcc.target/powerpc/fold-vec-load-vec_vsx_ld-int.c,
> > gcc.target/powerpc/fold-vec-load-vec_vsx_ld-longlong.c,
> > gcc.target/powerpc/fold-vec-load-vec_vsx_ld-short.c,
> > gcc.target/powerpc/fold-vec-load-vec_xl-char.c,
> > gcc.target/powerpc/fold-vec-load-vec_xl-double.c,
> > gcc.target/powerpc/fold-vec-load-vec_xl-float.c,
> > gcc.target/powerpc/fold-vec-load-vec_xl-int.c,
> > gcc.target/powerpc/fold-vec-load-vec_xl-longlong.c,
> > gcc.target/powerpc/fold-vec-load-vec_xl-short.c,
> > gcc.target/powerpc/fold-vec-splat-floatdouble.c,
> > gcc.target/powerpc/fold-vec-splat-longlong.c,
> > gcc.target/powerpc/fold-vec-store-builtin_vec_xst-char.c,
> > gcc.target/powerpc/fold-vec-store-builtin_vec_xst-double.c,
> > gcc.target/powerpc/fold-vec-store-builtin_vec_xst-float.c,
> > gcc.target/powerpc/fold-vec-store-builtin_vec_xst-int.c,
> > gcc.target/powerpc/fold-vec-store-builtin_vec_xst-short.c,
> > gcc.target/powerpc/fold-vec-store-vec_vsx_st-char.c,
> > gcc.target/powerpc/fold-vec-store-vec_vsx_st-double.c,
> > gcc.target/powerpc/fold-vec-store-vec_vsx_st-float.c,
> > gcc.target/powerpc/fold-vec-store-vec_vsx_st-int.c,
> > gcc.target/powerpc/fold-vec-store-vec_vsx_st-longlong.c,
> > gcc.target/powerpc/fold-vec-store-vec_vsx_st-short.c,
> > gcc.target/powerpc/fold-vec-store-vec_xst-char.c,
> > gcc.target/powerpc/fold-vec-store-vec_xst-double.c,
> > gcc.target/powerpc/fold-vec-store-vec_xst-float.c,
> > gcc.target/powerpc/fold-vec-store-vec_xst-int.c,
> > gcc.target/powerpc/fold-vec-store-vec_xst-longlong.c,
> > gcc.target/powerpc/fold-vec-store-vec_xst-short.c,
> > gcc.target/powerpc/lvsl-lvsr.c,
> > gcc.target/powerpc/ppc-eq0-1.c,
> > gcc.target/powerpc/ppc-ne0-1.c,
> > gcc.target/powerpc/pr86731-fwrapv-longlong.c: Match power10 insns.
> > * gcc.target/powerpc/lvsl-lvsr.c: Avoid file name match.
> > 
> > diff --git a/gcc/testsuite/gcc.dg/pr56727-2.c 
> > b/gcc/testsuite/gcc.dg/pr56727-2.c
> > index c54369ed25e..f055116772a 100644
> > --- a/gcc/testsuite/gcc.dg/pr56727-2.c
> > +++ b/gcc/testsuite/gcc.dg/pr56727-2.c
> > @@ -18,4 +18,4 @@ void h ()
> >  
> >  /* { dg-final { scan-assembler "@(PLT|plt)" { target i?86-*-* x86_64-*-* } 
> > } } */
> >  /* { dg-final { scan-assembler "@(PLT|plt)" { target { powerpc*-*-linux* 
> > && ilp32 } } } } */
> > -/* { dg-final { scan-assembler "bl f\n\\s*nop" { target { 
> > powerpc*-*-linux* && lp64 } } } } */
> > +/* { dg-final { scan-assembler {bl f(\n\s*nop|@notoc\n)} { target { 
> > powerpc*-*-linux* && lp64 } } } } */
> > diff --git 
> > a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c 
> > b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c
> > index 246f38fa6d1..1cff4550f28 100644
> > --- a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c
> > +++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c
> > @@ -25,6 +25,6 @@ main1 (void)
> > with no word loads (lw, lwu, lwz, lwzu, or their indexed forms)
> > or word stores (stw, stwu, stwx, stwux, or their indexed forms).  */
> >  
> > -/* { dg-final { scan-assembler "

Re: [PATCH 8/8] [RS6000] rs6000_rtx_costs for !speed

2021-01-21 Thread Alan Modra via Gcc-patches
Ping.

On Tue, Jan 12, 2021 at 02:02:36PM +1030, Alan Modra wrote:
> Ping
> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555759.html
> 
> On Thu, Oct 08, 2020 at 09:28:00AM +1030, Alan Modra wrote:
> > When optimizing for size we shouldn't be using metrics based on speed
> > or vice-versa.  rtlanal.c:get_full_rtx_cost wants both speed and size
> > metric from rs6000_rtx_costs independent of the global optimize_size.
> > 
> > Note that the patch changes param_simultaneous_prefetches,
> > param_l1_cache_size, param_l1_cache_line_size and param_l2_cache_size,
> > which were previously all set to zero for optimize_size.  I think that
> > was a bug.  Those params are a function of the processor.
> > 
> > * config/rs6000/rs6000.h (rs6000_cost): Don't declare.
> > (struct processor_costs): Move to..
> > * config/rs6000/rs6000.c: ..here.
> > (rs6000_cost): Make static.
> > (rs6000_option_override_internal): Ignore optimize_size when
> > setting up rs6000_cost.
> > (rs6000_insn_cost): Take into account optimize_size here
> > instead.
> > (rs6000_emit_parity): Likewise.
> > (rs6000_rtx_costs): Don't use rs6000_cost when !speed.
> > 
> > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> > index d455aa52427..14ecbad5df4 100644
> > --- a/gcc/config/rs6000/rs6000.c
> > +++ b/gcc/config/rs6000/rs6000.c
> > @@ -497,7 +497,26 @@ rs6000_store_data_bypass_p (rtx_insn *out_insn, 
> > rtx_insn *in_insn)
> >  
> >  /* Processor costs (relative to an add) */
> >  
> > -const struct processor_costs *rs6000_cost;
> > +struct processor_costs {
> > +  const int mulsi;   /* cost of SImode multiplication.  */
> > +  const int mulsi_const;  /* cost of SImode multiplication by constant.  */
> > +  const int mulsi_const9; /* cost of SImode mult by short constant.  */
> > +  const int muldi;   /* cost of DImode multiplication.  */
> > +  const int divsi;   /* cost of SImode division.  */
> > +  const int divdi;   /* cost of DImode division.  */
> > +  const int fp;  /* cost of simple SFmode and DFmode insns.  */
> > +  const int dmul;/* cost of DFmode multiplication (and fmadd).  */
> > +  const int sdiv;/* cost of SFmode division (fdivs).  */
> > +  const int ddiv;/* cost of DFmode division (fdiv).  */
> > +  const int cache_line_size;/* cache line size in bytes. */
> > +  const int l1_cache_size; /* size of l1 cache, in kilobytes.  */
> > +  const int l2_cache_size; /* size of l2 cache, in kilobytes.  */
> > +  const int simultaneous_prefetches; /* number of parallel prefetch
> > +   operations.  */
> > +  const int sfdf_convert;  /* cost of SF->DF conversion.  */
> > +};
> > +
> > +static const struct processor_costs *rs6000_cost;
> >  
> >  /* Instruction size costs on 32bit processors.  */
> >  static const
> > @@ -4618,131 +4637,128 @@ rs6000_option_override_internal (bool 
> > global_init_p)
> >  }
> >  
> >/* Initialize rs6000_cost with the appropriate target costs.  */
> > -  if (optimize_size)
> > -rs6000_cost = TARGET_POWERPC64 ? _cost : _cost;
> > -  else
> > -switch (rs6000_tune)
> > -  {
> > -  case PROCESSOR_RS64A:
> > -   rs6000_cost = _cost;
> > -   break;
> > +  switch (rs6000_tune)
> > +{
> > +case PROCESSOR_RS64A:
> > +  rs6000_cost = _cost;
> > +  break;
> >  
> > -  case PROCESSOR_MPCCORE:
> > -   rs6000_cost = _cost;
> > -   break;
> > +case PROCESSOR_MPCCORE:
> > +  rs6000_cost = _cost;
> > +  break;
> >  
> > -  case PROCESSOR_PPC403:
> > -   rs6000_cost = _cost;
> > -   break;
> > +case PROCESSOR_PPC403:
> > +  rs6000_cost = _cost;
> > +  break;
> >  
> > -  case PROCESSOR_PPC405:
> > -   rs6000_cost = _cost;
> > -   break;
> > +case PROCESSOR_PPC405:
> > +  rs6000_cost = _cost;
> > +  break;
> >  
> > -  case PROCESSOR_PPC440:
> > -   rs6000_cost = _cost;
> > -   break;
> > +case PROCESSOR_PPC440:
> > +  rs6000_cost = _cost;
> > +  break;
> >  
> > -  case PROCESSOR_PPC476:
> > -   rs6000_cost = _cost;
> > -   break;
> > +case PROCESSOR_PPC476:
> > +  rs6000_cost = _cost;
> > +  break;
> >  
> > -  case PROCESSOR_PPC601:
> > -   rs6000_cost = _cost;
> > -   break;
> > +case P

Re: [PATCH 7/8] [RS6000] rs6000_rtx_costs reduce cost for SETs

2021-01-21 Thread Alan Modra via Gcc-patches
Ping.

On Tue, Jan 12, 2021 at 02:02:27PM +1030, Alan Modra wrote:
> Ping
> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555758.html
> 
> On Thu, Oct 08, 2020 at 09:27:59AM +1030, Alan Modra wrote:
> > The aim of this patch is to make rtx_costs for SETs closer to
> > insn_cost for SETs.  One visible effect on powerpc code is increased
> > if-conversion.
> > 
> > * config/rs6000/rs6000.c (rs6000_rtx_costs): Reduce cost of SET
> > operands.
> > 
> > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> > index 76aedbfae6f..d455aa52427 100644
> > --- a/gcc/config/rs6000/rs6000.c
> > +++ b/gcc/config/rs6000/rs6000.c
> > @@ -21684,6 +21684,35 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int 
> > outer_code,
> > }
> >return false;
> >  
> > +case SET:
> > +  /* On entry the value in *TOTAL is the number of general purpose
> > +regs being set, multiplied by COSTS_N_INSNS (1).  Handle
> > +costing of set operands specially since in most cases we have
> > +an instruction rather than just a piece of RTL and should
> > +return a cost comparable to insn_cost.  That's a little
> > +complicated because in some cases the cost of SET operands is
> > +non-zero, see point 5 above and cost of PLUS for example, and
> > +in others it is zero, for example for (set (reg) (reg)).
> > +But (set (reg) (reg)) has the same insn_cost as
> > +(set (reg) (plus (reg) (reg))).  Hack around this by
> > +subtracting COSTS_N_INSNS (1) from the operand cost in cases
> > +were we add at least COSTS_N_INSNS (1) for some operation.
> > +However, don't do so for constants.  Constants might cost
> > +more than zero when they require more than one instruction,
> > +and we do want the cost of extra instructions.  */
> > +  {
> > +   rtx_code src_code = GET_CODE (SET_SRC (x));
> > +   if (src_code == CONST_INT
> > +   || src_code == CONST_DOUBLE
> > +   || src_code == CONST_WIDE_INT)
> > + return false;
> > +   int set_cost = (rtx_cost (SET_SRC (x), mode, SET, 1, speed)
> > +   + rtx_cost (SET_DEST (x), mode, SET, 0, speed));
> > +   if (set_cost >= COSTS_N_INSNS (1))
> > + *total += set_cost - COSTS_N_INSNS (1);
> > +   return true;
> > +  }
> > +
> >  default:
> >return false;
> >  }

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 5/8] [RS6000] rs6000_rtx_costs cost IOR

2021-01-21 Thread Alan Modra via Gcc-patches
Ping.

On Tue, Jan 12, 2021 at 02:02:18PM +1030, Alan Modra wrote:
> Ping
> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555756.html
> 
> On Thu, Oct 08, 2020 at 09:27:57AM +1030, Alan Modra wrote:
> > * config/rs6000/rs6000.c (rotate_insert_cost): New function.
> > (rs6000_rtx_costs): Cost IOR.
> > 
> > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> > index 383d2901c9f..15a806fe307 100644
> > --- a/gcc/config/rs6000/rs6000.c
> > +++ b/gcc/config/rs6000/rs6000.c
> > @@ -21206,6 +21206,91 @@ rs6000_cannot_copy_insn_p (rtx_insn *insn)
> >  && get_attr_cannot_copy (insn);
> >  }
> >  
> > +/* Handle rtx_costs for scalar integer rotate and insert insns.  */
> > +
> > +static bool
> > +rotate_insert_cost (rtx left, rtx right, machine_mode mode, bool speed,
> > +   int *total)
> > +{
> > +  if (GET_CODE (right) == AND
> > +  && CONST_INT_P (XEXP (right, 1))
> > +  && UINTVAL (XEXP (left, 1)) + UINTVAL (XEXP (right, 1)) + 1 == 0)
> > +{
> > +  rtx leftop = XEXP (left, 0);
> > +  rtx rightop = XEXP (right, 0);
> > +
> > +  /* rotlsi3_insert_5.  */
> > +  if (REG_P (leftop)
> > + && REG_P (rightop)
> > + && mode == SImode
> > + && UINTVAL (XEXP (left, 1)) != 0
> > + && UINTVAL (XEXP (right, 1)) != 0
> > + && rs6000_is_valid_mask (XEXP (left, 1), NULL, NULL, mode))
> > +   return true;
> > +  /* rotldi3_insert_6.  */
> > +  if (REG_P (leftop)
> > + && REG_P (rightop)
> > + && mode == DImode
> > + && exact_log2 (-UINTVAL (XEXP (left, 1))) > 0)
> > +   return true;
> > +  /* rotldi3_insert_7.  */
> > +  if (REG_P (leftop)
> > + && REG_P (rightop)
> > + && mode == DImode
> > + && exact_log2 (-UINTVAL (XEXP (right, 1))) > 0)
> > +   return true;
> > +
> > +  rtx mask = 0;
> > +  rtx shift = leftop;
> > +  rtx_code shift_code = GET_CODE (shift);
> > +  /* rotl3_insert.  */
> > +  if (shift_code == ROTATE
> > + || shift_code == ASHIFT
> > + || shift_code == LSHIFTRT)
> > +   mask = right;
> > +  else
> > +   {
> > + shift = rightop;
> > + shift_code = GET_CODE (shift);
> > + /* rotl3_insert_2.  */
> > + if (shift_code == ROTATE
> > + || shift_code == ASHIFT
> > + || shift_code == LSHIFTRT)
> > +   mask = left;
> > +   }
> > +  if (mask
> > + && CONST_INT_P (XEXP (shift, 1))
> > + && rs6000_is_valid_insert_mask (XEXP (mask, 1), shift, mode))
> > +   {
> > + *total += rtx_cost (XEXP (shift, 0), mode, shift_code, 0, speed);
> > + *total += rtx_cost (XEXP (mask, 0), mode, AND, 0, speed);
> > + return true;
> > +   }
> > +}
> > +  /* rotl3_insert_3.  */
> > +  if (GET_CODE (right) == ASHIFT
> > +  && CONST_INT_P (XEXP (right, 1))
> > +  && (INTVAL (XEXP (right, 1))
> > + == exact_log2 (UINTVAL (XEXP (left, 1)) + 1)))
> > +{
> > +  *total += rtx_cost (XEXP (left, 0), mode, AND, 0, speed);
> > +  *total += rtx_cost (XEXP (right, 0), mode, ASHIFT, 0, speed);
> > +  return true;
> > +}
> > +  /* rotl3_insert_4.  */
> > +  if (GET_CODE (right) == LSHIFTRT
> > +  && CONST_INT_P (XEXP (right, 1))
> > +  && mode == SImode
> > +  && (INTVAL (XEXP (right, 1))
> > + + exact_log2 (-UINTVAL (XEXP (left, 1 == 32)
> > +{
> > +  *total += rtx_cost (XEXP (left, 0), mode, AND, 0, speed);
> > +  *total += rtx_cost (XEXP (right, 0), mode, LSHIFTRT, 0, speed);
> > +  return true;
> > +}
> > +  return false;
> > +}
> > +
> >  /* Compute a (partial) cost for rtx X.  Return true if the complete
> > cost has been computed, and false if subexpressions should be
> > scanned.  In either case, *TOTAL contains the cost result.
> > @@ -21253,7 +21338,7 @@ static bool
> >  rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
> >   int opno ATTRIBUTE_UNUSED, int *total, bool speed)
> >  {
> > -  rtx right;
> > +  rtx left, right;
> >int code = GET_CODE (x);
> >  
> >switch (code)
> > @@ -21435,7 +21520,7 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int 
> > outer_code,
> >right = XEXP (x, 1);
> >if (CONST_INT_P (right))
> > {
> > - rtx left = XEXP (x, 0);
> > + left = XEXP (x, 0);
> >   rtx_code left_code = GET_CODE (left);
> >  
> >   /* rotate-and-mask: 1 insn.  */
> > @@ -21452,9 +21537,16 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int 
> > outer_code,
> >return false;
> >  
> >  case IOR:
> > -  /* FIXME */
> >*total = COSTS_N_INSNS (1);
> > -  return true;
> > +  left = XEXP (x, 0);
> > +  if (GET_CODE (left) == AND
> > + && CONST_INT_P (XEXP (left, 1)))
> > +   {
> > + right = XEXP (x, 1);
> > + if (rotate_insert_cost (left, right, mode, speed, total))
> > +   return true;
> > +   }
> > +  return false;
> >  
> >  case CLZ:
> >  case XOR:

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 4/8] [RS6000] rs6000_rtx_costs tidy break/return

2021-01-21 Thread Alan Modra via Gcc-patches
Ping.

On Tue, Jan 12, 2021 at 02:02:09PM +1030, Alan Modra wrote:
> Ping
> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555755.html
> 
> On Thu, Oct 08, 2020 at 09:27:56AM +1030, Alan Modra wrote:
> > Most cases use "return false" rather than breaking out of the switch.
> > Do so in all cases.
> > 
> > * config/rs6000/rs6000.c (rs6000_rtx_costs): Tidy break/return.
> > 
> > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> > index bc5e51aa5ce..383d2901c9f 100644
> > --- a/gcc/config/rs6000/rs6000.c
> > +++ b/gcc/config/rs6000/rs6000.c
> > @@ -21371,7 +21371,7 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int 
> > outer_code,
> > *total = rs6000_cost->fp;
> >else
> > *total = rs6000_cost->dmul;
> > -  break;
> > +  return false;
> >  
> >  case DIV:
> >  case MOD:
> > @@ -21539,7 +21539,7 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int 
> > outer_code,
> >   *total = rs6000_cost->fp;
> >   return false;
> > }
> > -  break;
> > +  return false;
> >  
> >  case NE:
> >  case EQ:
> > @@ -21577,13 +21577,11 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int 
> > outer_code,
> >   *total = 0;
> >   return true;
> > }
> > -  break;
> > +  return false;
> >  
> >  default:
> > -  break;
> > +  return false;
> >  }
> > -
> > -  return false;
> >  }
> >  
> >  /* Debug form of r6000_rtx_costs that is selected if -mdebug=cost.  */

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 3/8] [RS6000] rs6000_rtx_costs tidy AND

2021-01-21 Thread Alan Modra via Gcc-patches
Ping.

On Tue, Jan 12, 2021 at 02:01:57PM +1030, Alan Modra wrote:
> Ping
> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555754.html
> 
> On Thu, Oct 08, 2020 at 09:27:55AM +1030, Alan Modra wrote:
> > * config/rs6000/rs6000.c (rs6000_rtx_costs): Tidy AND code.
> > Don't avoid recursion on const_int shift count.
> > 
> > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> > index e870ba0039a..bc5e51aa5ce 100644
> > --- a/gcc/config/rs6000/rs6000.c
> > +++ b/gcc/config/rs6000/rs6000.c
> > @@ -21253,6 +21253,7 @@ static bool
> >  rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
> >   int opno ATTRIBUTE_UNUSED, int *total, bool speed)
> >  {
> > +  rtx right;
> >int code = GET_CODE (x);
> >  
> >switch (code)
> > @@ -21430,7 +21431,9 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int 
> > outer_code,
> >return false;
> >  
> >  case AND:
> > -  if (CONST_INT_P (XEXP (x, 1)))
> > +  *total = COSTS_N_INSNS (1);
> > +  right = XEXP (x, 1);
> > +  if (CONST_INT_P (right))
> > {
> >   rtx left = XEXP (x, 0);
> >   rtx_code left_code = GET_CODE (left);
> > @@ -21439,17 +21442,13 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int 
> > outer_code,
> >   if ((left_code == ROTATE
> >|| left_code == ASHIFT
> >|| left_code == LSHIFTRT)
> > - && rs6000_is_valid_shift_mask (XEXP (x, 1), left, mode))
> > + && rs6000_is_valid_shift_mask (right, left, mode))
> > {
> > - *total = rtx_cost (XEXP (left, 0), mode, left_code, 0, speed);
> > - if (!CONST_INT_P (XEXP (left, 1)))
> > -   *total += rtx_cost (XEXP (left, 1), SImode, left_code, 1, 
> > speed);
> > - *total += COSTS_N_INSNS (1);
> > + *total += rtx_cost (XEXP (left, 0), mode, left_code, 0, speed);
> > + *total += rtx_cost (XEXP (left, 1), SImode, left_code, 1, speed);
> >   return true;
> > }
> > }
> > -
> > -  *total = COSTS_N_INSNS (1);
> >return false;
> >  
> >  case IOR:

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] binuitils: Check if AR is usable for LTO build

2021-01-11 Thread Alan Modra via Gcc-patches
On Mon, Jan 11, 2021 at 02:52:43PM -0800, H.J. Lu wrote:
> On Mon, Jan 11, 2021 at 1:20 PM Alan Modra  wrote:
> >
> > On Mon, Jan 11, 2021 at 11:53:15AM -0800, H.J. Lu via Binutils wrote:
> > > Check if AR is usable for LTO build with --enable-pgo-build=lto:
> > >
> > > checking for -plugin option... ar: no operation specified
> > > Failed: ar --plugin 
> > > /usr/gcc-11.0.0-x32/libexec/gcc/x86_64-pc-linux-gnu/11.0.0/liblto_plugin.so
> > >  rc
> > > no
> > > configure: error: AR with --plugin and rc is required for LTO build
> > >
> > > instead of build failure later.
> > >
> > >   PR binutils/26766
> > >   * configure.ac:
> > >   * configure: Regenerated.
> >
> > See pr27173 too.  The problem isn't a matter of finding an "ar" that
> > supports --plugin, we have versions of GNU ar (2.30 to 2.32?) that
> > accept --plugin but then don't parse the "rc" or other command
> > options.  I don't think this patch will help.
> 
> PR 27173 patches are at
> 
> https://sourceware.org/pipermail/binutils/2021-January/114879.html

After that one is committed, this patch is OK too (with any needed
modifications).

-- 
Alan Modra
Australia Development Lab, IBM


Re: V2 [PATCH 1/2] GCC: Check if AR works with --plugin and rc

2021-01-11 Thread Alan Modra via Gcc-patches
On Mon, Jan 11, 2021 at 04:07:22PM -0800, H.J. Lu wrote:
> These are not fatal errors.  Here is the updated patch to use
> AC_MSG_WARN instead.  OK for master?

OK by me.  Please squash the two patches.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 1/2] GCC: Check if AR works with --plugin and rc

2021-01-11 Thread Alan Modra via Gcc-patches
On Mon, Jan 11, 2021 at 08:57:05AM -0800, H.J. Lu via Binutils wrote:
> diff --git a/config/gcc-plugin.m4 b/config/gcc-plugin.m4
> index c5b72e9a13d..798a2054edd 100644
> --- a/config/gcc-plugin.m4
> +++ b/config/gcc-plugin.m4
> @@ -145,6 +145,18 @@ for plugin in $plugin_names; do
>  break
>fi
>  done
> +dnl Check if ${AR} $plugin_option rc works.
> +AC_CHECK_TOOL(AR, ar)
> +if test "${AR}" = "" ; then
> +  AC_MSG_ERROR([Required archive tool 'ar' not found on PATH.])
> +fi
> +touch conftest.c
> +${AR} $plugin_option rc conftest.a conftest.c
> +if test "$?" != 0; then
> +  echo "Failed: ${AR} $plugin_option rc"

Use AC_MSG_ERROR rather than echo.

> +  plugin_option=
> +fi
> +rm -f conftest.*
>  if test -n "$plugin_option"; then
>$1="$plugin_option"
>AC_MSG_RESULT($plugin_option)

> diff --git a/libtool.m4 b/libtool.m4
> index 3672e9516e2..150971974c1 100644
> --- a/libtool.m4
> +++ b/libtool.m4
> @@ -1340,7 +1340,14 @@ AC_CHECK_TOOL(AR, ar, false)
>  test -z "$AR" && AR=ar
>  if test -n "$plugin_option"; then
>if $AR --help 2>&1 | grep -q "\--plugin"; then
> -AR="$AR $plugin_option"
> +touch conftest.c
> +$AR $plugin_option rc conftest.a conftest.c
> +if test "$?" != 0; then
> +  echo "Failed: $AR $plugin_option rc"

AC_MSG_ERROR again.

> +else
> +  AR="$AR $plugin_option"
> +fi
> +rm -f conftest.*
>fi
>  fi
>  test -z "$AR_FLAGS" && AR_FLAGS=cru

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 0/2] Check if AR works with --plugin and rc

2021-01-11 Thread Alan Modra via Gcc-patches
On Mon, Jan 11, 2021 at 08:57:04AM -0800, H.J. Lu via Binutils wrote:
> Check if AR works with --plugin and rc before passing --plugin to AR and
> RANLIB.

Thanks for looking at this, but next time please assign the bug to
yourself.  I fixed the bug too this morning, before seeing your
email.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 0/8] [RS6000] rs6000_rtx_costs V2

2021-01-11 Thread Alan Modra via Gcc-patches
On Sat, Dec 05, 2020 at 07:42:07PM +1030, Alan Modra wrote:
> Hi Segher,
> I've been holding off pinging these knowing you had a lot of other
> review work, but maybe that's settling down now?  You already OK'd
> 1/8, 2/8 and 6/8.

Ping.

> [PATCH 3/8] [RS6000] rs6000_rtx_costs tidy AND
> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555754.html
> 
> [PATCH 4/8] [RS6000] rs6000_rtx_costs tidy break/return
> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555755.html
> 
> [PATCH 5/8] [RS6000] rs6000_rtx_costs cost IOR
> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555756.html
> 
> [PATCH 7/8] [RS6000] rs6000_rtx_costs reduce cost for SETs
> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555758.html
> 
> [PATCH 8/8] [RS6000] rs6000_rtx_costs for !speed
> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555759.html
> 
> [RS6000] rotate and mask constants
> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555760.html
> 
> [RS6000] Adjust testcases for power10 instructions V3
> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/557587.html

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] binuitils: Check if AR is usable for LTO build

2021-01-11 Thread Alan Modra via Gcc-patches
On Mon, Jan 11, 2021 at 11:53:15AM -0800, H.J. Lu via Binutils wrote:
> Check if AR is usable for LTO build with --enable-pgo-build=lto:
> 
> checking for -plugin option... ar: no operation specified
> Failed: ar --plugin 
> /usr/gcc-11.0.0-x32/libexec/gcc/x86_64-pc-linux-gnu/11.0.0/liblto_plugin.so rc
> no
> configure: error: AR with --plugin and rc is required for LTO build
> 
> instead of build failure later.
> 
>   PR binutils/26766
>   * configure.ac:
>   * configure: Regenerated.

See pr27173 too.  The problem isn't a matter of finding an "ar" that
supports --plugin, we have versions of GNU ar (2.30 to 2.32?) that
accept --plugin but then don't parse the "rc" or other command
options.  I don't think this patch will help.

> ---
>  configure| 4 
>  configure.ac | 4 
>  2 files changed, 8 insertions(+)
> 
> diff --git a/configure b/configure
> index c44184f72ff..84285addafe 100755
> --- a/configure
> +++ b/configure
> @@ -10240,6 +10240,10 @@ if test -n "$PLUGIN_OPTION"; then
>if $RANLIB --help 2>&1 | grep -q "\--plugin"; then
>  RANLIB_PLUGIN_OPTION="$PLUGIN_OPTION"
>fi
> +else
> +  if test "$enable_pgo_build" != "no"; then
> +as_fn_error $? "AR with --plugin and rc is required for LTO build" 
> "$LINENO" 5
> +  fi
>  fi
>  
>  
> diff --git a/configure.ac b/configure.ac
> index 9dd51c36e5a..d39019d7093 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -3452,6 +3452,10 @@ if test -n "$PLUGIN_OPTION"; then
>if $RANLIB --help 2>&1 | grep -q "\--plugin"; then
>  RANLIB_PLUGIN_OPTION="$PLUGIN_OPTION"
>fi
> +else
> +  if test "$enable_pgo_build" != "no"; then
> +AC_MSG_ERROR([AR with --plugin and rc is required for LTO build])
> +  fi
>  fi
>  AC_SUBST(AR_PLUGIN_OPTION)
>  AC_SUBST(RANLIB_PLUGIN_OPTION)
> -- 
> 2.29.2

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH toplevel] libctf: new testsuite

2021-01-05 Thread Alan Modra via Gcc-patches
On Tue, Jan 05, 2021 at 03:25:10PM +, Nick Alcock wrote:
> This enables 'make libctf-check', used by a new libctf testsuite in
> binutils.
> 
> 2021-01-05  Nick Alcock  
> 
>   * Makefile.def (libctf): No longer no_check.  Checking depends on
>   all-ld.
>   * Makefile.in: Regenerated.
> 
> ---
> 
>  Makefile.def  |   4 +-
>  Makefile.in   |  13 +
> 
> This is a stripped-down top-level-only subset of commit 
> c59e30ed1727135f8efb79890f2c458f73709757 in binutils-gdb.git.  (Because
> it is identical to what has already landed in binutils, it should apply
> without trouble in syncs back to there.)
> 
> I don't have permission to push this: Alan has offered to do so.

It doesn't apply due to gcc missing binutils 87279e3cef5b2c5 changes
too.  I could fix that easily enough but I'm going to ask that you
post a combined patch to bring the gcc repo up to date with any libctf
changes.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH, powerpc] testsuite update tests for powerpc power10 target codegen.

2020-12-08 Thread Alan Modra via Gcc-patches
On Mon, Dec 07, 2020 at 05:49:05PM -0600, will schmidt via Gcc-patches wrote:
> [PATCH, powerpc] testsuite update tests for powerpc power10 target codegen.

Appears to duplicate work I did earlier,
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/557587.html

Except I omitted fold-vec-store-builtin_vec_xst-longlong.c, due to
-mdejagnu-cpu=power8 in that test meaning we don't see any power10
insns.

-- 
Alan Modra
Australia Development Lab, IBM


Re: testsuite: Adjust target requirements for sad-vectorize and signbit

2020-12-05 Thread Alan Modra via Gcc-patches
On Thu, Oct 29, 2020 at 10:10:58PM +1030, Alan Modra wrote:
> Fixes
> FAIL: gcc.target/powerpc/signbit-1.c scan-assembler-not stxvd2x
> FAIL: gcc.target/powerpc/signbit-1.c scan-assembler-times mfvsrd 3
> FAIL: gcc.target/powerpc/signbit-1.c scan-assembler-times srdi 3
> FAIL: gcc.target/powerpc/signbit-2.c scan-assembler-times ld 1
> FAIL: gcc.target/powerpc/signbit-2.c scan-assembler-times srdi 1
> on powerpc-linux (or powerpc64-linux biarch -m32).

David,
This patch is fixing a regression caused by one of your testsuite
patches.  Please review.

https://gcc.gnu.org/pipermail/gcc-patches/2020-October/557443.html

> signbit-1.c is quite obviously a 64-bit only testcase given the
> scan-assembler directives, and the purpose of the testcase to verify
> the 64-bit only UNSPEC_SIGNBIT patterns.  It could be made to pass for
> -m32 by adding -mpowerpc64, but that option that isn't very effective
> when bi-arch testing and results in errors on rs6000-aix.  And it is
> pointless to match -m32 stores to the stack followed by loads, which
> is what we do at the moment.
> 
> signbit-2.c on the other hand has more reasonable 32-bit output.
> 
> Regression tested powerpc64-linux biarch.
> 
>   * gcc.target/powerpc/signbit-1.c: Reinstate lp64 condition.
>   * gcc.target/powerpc/signbit-2.c: Match 32-bit output too.
> 
> diff --git a/gcc/testsuite/gcc.target/powerpc/signbit-1.c 
> b/gcc/testsuite/gcc.target/powerpc/signbit-1.c
> index eb4f53e397d..1642bf46d7a 100644
> --- a/gcc/testsuite/gcc.target/powerpc/signbit-1.c
> +++ b/gcc/testsuite/gcc.target/powerpc/signbit-1.c
> @@ -1,4 +1,5 @@
>  /* { dg-do compile } */
> +/* { dg-require-effective-target lp64 } */
>  /* { dg-require-effective-target ppc_float128_sw } */
>  /* { dg-require-effective-target powerpc_p8vector_ok } */
>  /* { dg-options "-mdejagnu-cpu=power8 -O2 -mfloat128" } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/signbit-2.c 
> b/gcc/testsuite/gcc.target/powerpc/signbit-2.c
> index ff6af963dda..1b792916eba 100644
> --- a/gcc/testsuite/gcc.target/powerpc/signbit-2.c
> +++ b/gcc/testsuite/gcc.target/powerpc/signbit-2.c
> @@ -13,5 +13,7 @@ int do_signbit_kf (__float128 *a) { return 
> __builtin_signbit (*a); }
>  /* { dg-final { scan-assembler-not   "lxvw4x"   } } */
>  /* { dg-final { scan-assembler-not   "lxsd" } } */
>  /* { dg-final { scan-assembler-not   "lxsdx"} } */
> -/* { dg-final { scan-assembler-times "ld" 1 } } */
> -/* { dg-final { scan-assembler-times "srdi"   1 } } */
> +/* { dg-final { scan-assembler-times "ld" 1 { target lp64 } } } */
> +/* { dg-final { scan-assembler-times "srdi"   1 { target lp64 } } } */
> +/* { dg-final { scan-assembler-times "lwz"1 { target ilp32 } } } */
> +/* { dg-final { scan-assembler-times "rlwinm" 1 { target ilp32 } } } */

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 0/8] [RS6000] rs6000_rtx_costs V2

2020-12-05 Thread Alan Modra via Gcc-patches
Hi Segher,
I've been holding off pinging these knowing you had a lot of other
review work, but maybe that's settling down now?  You already OK'd
1/8, 2/8 and 6/8.

[PATCH 3/8] [RS6000] rs6000_rtx_costs tidy AND
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555754.html

[PATCH 4/8] [RS6000] rs6000_rtx_costs tidy break/return
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555755.html

[PATCH 5/8] [RS6000] rs6000_rtx_costs cost IOR
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555756.html

[PATCH 7/8] [RS6000] rs6000_rtx_costs reduce cost for SETs
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555758.html

[PATCH 8/8] [RS6000] rs6000_rtx_costs for !speed
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555759.html

[RS6000] rotate and mask constants
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555760.html

[RS6000] Adjust testcases for power10 instructions V3
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/557587.html

-- 
Alan Modra
Australia Development Lab, IBM


[RS6000] Use LIB2_SIDITI_CONV_FUNCS in place of ppc64-fp.c

2020-11-13 Thread Alan Modra via Gcc-patches
This patch retires ppc64-fp.c in favour of using
"LIB2_SIDITI_CONV_FUNCS = yes", which is a lot better solution than
having a copy of selected libgcc2.c functions.

So for powerpc64-linux we see these changes in libgcc files (plus
corresponding _s.o variants).
+_fixdfti.o
+_fixsfti.o
+_fixtfti.o
+_fixunsdfti.o
+_fixunssfti.o
+_fixunstfti.o
+_floattidf.o
+_floattisf.o
+_floattitf.o
+_floatuntidf.o
+_floatuntisf.o
+_floatuntitf.o
-ppc64-fp.o

with these empty objects also appearing (plus _s.o variants).
+_fixunsxfti.o
+_fixxfti.o
+_floattixf.o
+_floatuntixf.o

In reality we aren't getting new TI mode conversions as it might seem,
because the old *di*.o files corresponding to the above files
contained TI mode conversions, whereas now they contain DI mode
conversions.  Those match the functions provided in ppc64-fp.o, and
the set of dynamic libgcc_s.so.1 symbols is identical, apart from 
values, to before this patch.

For ppc32 we get a whole lot more empty objects replacing the empty
ppc64-fp.o.  Again the set of global symbol in libgcc.a and dynamic
symbols in libgcc_s.so.1 are unchanged.

Bootstrapped and regression tested powerpc64-linux, powerpc64le-linux,
powerpc-linux and powerpc-ibm-aix7.2.4.0.  OK?

* config/rs6000/t-ppc64-fp (LIB2ADD): Delete.
(LIB2_SIDITI_CONV_FUNCS): Define.
* config/rs6000/ppc64-fp.c: Delete file.

diff --git a/libgcc/config/rs6000/t-ppc64-fp b/libgcc/config/rs6000/t-ppc64-fp
index 26d1730bcdb..999679fc3cb 100644
--- a/libgcc/config/rs6000/t-ppc64-fp
+++ b/libgcc/config/rs6000/t-ppc64-fp
@@ -1,2 +1 @@
-# Can be used unconditionally, wrapped in __powerpc64__ || __64BIT__ __ppc64__.
-LIB2ADD += $(srcdir)/config/rs6000/ppc64-fp.c
+LIB2_SIDITI_CONV_FUNCS = yes

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 1/3] Refactor copying decl section names

2020-11-10 Thread Alan Modra via Gcc-patches
On Tue, Nov 10, 2020 at 09:19:37PM -0700, Jeff Law wrote:
> I think the const char * is fine.  That should force resolution to the
> same routine we were using earlier.  Do you want to commit that fix or
> shall I?

Commited 693a79a355e1.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 1/3] Refactor copying decl section names

2020-11-10 Thread Alan Modra via Gcc-patches
On Tue, Nov 10, 2020 at 11:45:37AM -0700, Jeff Law wrote:
> 
> On 11/12/19 11:28 PM, Strager Neds wrote:
> > * gcc/cgraph.h (symtab_node::get_section): Constify.
> > (symtab_node::set_section): Declare new overload.
> > * gcc/symtab.c (symtab_node::set_section): Define new overload.
> > (symtab_node::copy_visibility_from): Use new overload of
> > symtab_node::set_section.
> > (symtab_node::resolve_alias): Same.
> > * gcc/tree.h (set_decl_section_name): Declare new overload.
> > * gcc/tree.c (set_decl_section_name): Define new overload.
> > * gcc/c/c-decl.c (merge_decls): Use new overload of
> > set_decl_section_name.
> > * gcc/cp/decl.c (duplicate_decls): Same.
> > * gcc/cp/method.c (use_thunk): Same.
> > * gcc/cp/optimize.c (maybe_clone_body): Same.
> > * gcc/d/decl.cc (finish_thunk): Same.
> > * gcc/tree-emutls.c (get_emutls_init_templ_addr): Same.
> > * gcc/cgraphclones.c (cgraph_node::create_virtual_clone): Use new
> > overload of symtab_node::set_section.
> > (cgraph_node::create_version_clone_with_body): Same.
> > * gcc/trans-mem.c (ipa_tm_create_version): Same.
> 
> I adjusted the ChangeLog, added an entry for the coroutines.cc addition
> I made and pushed this to the trunk.

/home/alan/src/gcc/gcc/go/go-gcc.cc:2759:44: error: call of overloaded 
‘set_decl_section_name(tree_node*&, std::nullptr_t)’ is ambiguous
   set_decl_section_name (var_decl, NULL);
^
In file included from /home/alan/src/gcc/gcc/go/go-gcc.cc:27:0:
/home/alan/src/gcc/gcc/tree.h:4283:13: note: candidate: void 
set_decl_section_name(tree, const char*)
 extern void set_decl_section_name (tree, const char *);
 ^
/home/alan/src/gcc/gcc/tree.h:4284:13: note: candidate: void 
set_decl_section_name(tree, const_tree)
 extern void set_decl_section_name (tree, const_tree);
 ^

I'm using the obvious to me "(const char *) NULL" in the call to fix
this, but you might like a different style C++ fix instead.

-- 
Alan Modra
Australia Development Lab, IBM


[PATCH 2/2] [RS6000] REG_PARM_STACK_SPACE check V2

2020-11-02 Thread Alan Modra via Gcc-patches
On PowerPC we can tail call if the callee has less or equal
REG_PARM_STACK_SPACE than the caller, as demonstrated by the
testcase.  So we should use

  /* If reg parm stack space increases, we cannot sibcall.  */
  if (REG_PARM_STACK_SPACE (decl ? decl : fntype)
  > INCOMING_REG_PARM_STACK_SPACE (current_function_decl))

and note the change to use INCOMING_REG_PARM_STACK_SPACE.
REG_PARM_STACK_SPACE has always been wrong there for PowerPC.  See
https://gcc.gnu.org/pipermail/gcc-patches/2014-May/389867.html for why
if you're curious.  Not that it matters, because PowerPC can do
without this check entirely, relying on a stack slot test in generic
code.

a) The generic code checks that arg passing stack in the callee is not
   greater than that in the caller, and,
b) ELFv2 only allocates reg_parm_stack_space when some parameter is
   passed on the stack.
Point (b) means that zero reg_parm_stack_space implies zero stack
space, and non-zero reg_parm_stack_space implies non-zero stack
space.  So the case of 0 reg_parm_stack_space in the caller and 64 in
the callee will be caught by (a).

Bootstrapped and regression tested powerpc64le-linux and biarch
powerpc64-linux. OK?

PR middle-end/97267
gcc/
* config/rs6000/rs6000-logue.c (rs6000_function_ok_for_sibcall):
Remove code checking REG_PARM_STACK_SPACE.
testsuite/
* gcc.target/powerpc/pr97267.c: New test.

diff --git a/gcc/config/rs6000/rs6000-logue.c b/gcc/config/rs6000/rs6000-logue.c
index 61eb7ce7ade..d90cd5736e1 100644
--- a/gcc/config/rs6000/rs6000-logue.c
+++ b/gcc/config/rs6000/rs6000-logue.c
@@ -30,7 +30,6 @@
 #include "df.h"
 #include "tm_p.h"
 #include "ira.h"
-#include "calls.h"
 #include "print-tree.h"
 #include "varasm.h"
 #include "explow.h"
@@ -1134,19 +1133,6 @@ rs6000_function_ok_for_sibcall (tree decl, tree exp)
   else
 fntype = TREE_TYPE (TREE_TYPE (CALL_EXPR_FN (exp)));
 
-  /* If outgoing reg parm stack space changes, we cannot do sibcall.  */
-  if ((OUTGOING_REG_PARM_STACK_SPACE (fntype)
-   != OUTGOING_REG_PARM_STACK_SPACE (TREE_TYPE (current_function_decl)))
-  || (REG_PARM_STACK_SPACE (decl ? decl : fntype)
- != REG_PARM_STACK_SPACE (current_function_decl)))
-{
-  maybe_complain_about_tail_call (exp,
- "inconsistent size of stack space"
- " allocated for arguments which are"
- " passed in registers");
-  return false;
-}
-
   /* We can't do it if the called function has more vector parameters
  than the current function; there's nowhere to put the VRsave code.  */
   if (TARGET_ALTIVEC_ABI
diff --git a/gcc/testsuite/gcc.target/powerpc/pr97267.c 
b/gcc/testsuite/gcc.target/powerpc/pr97267.c
new file mode 100644
index 000..cab46245fc9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr97267.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+static int __attribute__ ((__noclone__, __noinline__))
+reg_args (int j1, int j2, int j3, int j4, int j5, int j6, int j7, int j8)
+{
+  return j1 + j2 + j3 + j4 + j5 + j6 + j7 + j8;
+}
+
+int __attribute__ ((__noclone__, __noinline__))
+stack_args (int j1, int j2, int j3, int j4, int j5, int j6, int j7, int j8,
+   int j9)
+{
+  if (j9 == 0)
+return 0;
+  return reg_args (j1, j2, j3, j4, j5, j6, j7, j8);
+}
+
+/* { dg-final { scan-assembler-not {(?n)^\s+bl\s} } } */

-- 
Alan Modra
Australia Development Lab, IBM


[PATCH 1/2] can_implement_as_sibling_call_p REG_PARM_STACK_SPACE check V2

2020-11-02 Thread Alan Modra via Gcc-patches
On Sat, Oct 31, 2020 at 12:10:35AM +1030, Alan Modra wrote:
> Would it be better if I post the patches again, restructuring them as
> 1) completely no functional change just moving the existing condition
>to the powerpc and i386 target hooks, and
> 2) twiddling the powerpc target hook?

The no function change patch.

This moves an #ifdef block of code from calls.c to
targetm.function_ok_for_sibcall.  Only two targets, x86 and rs6000,
define REG_PARM_STACK_SPACE or OUTGOING_REG_PARM_STACK_SPACE macros
that might vary depending on the called function.  Macros like
UNITS_PER_WORD don't change over a function boundary, nor does the
MIPS ABI, nor does TARGET_64BIT on PA-RISC.  Other targets are even
more trivially proven to not need the calls.c code.

Besides cleaning up a small piece of #ifdef code, the motivation for
this patch is to allow tail calls on PowerPC for functions that
require less reg_parm_stack_space than their caller.  The original
code in calls.c only permitted tail calls when exactly equal.  That
will be done in a followup patch (removing the code added to
rs6000-logue.c in this patch).

Bootstrapped and regression tested powerpc64le-linux, powerpc64-linux,
and x86_64-linux.  OK?

PR middle-end/97267
* calls.h (maybe_complain_about_tail_call): Declare.
* calls.c (maybe_complain_about_tail_call): Make global.
(can_implement_as_sibling_call_p): Delete reg_parm_stack_space
param.  Adjust caller.  Move REG_PARM_STACK_SPACE check to..
* config/i386/i386.c (ix86_function_ok_for_sibcall): ..here, and..
* config/rs6000/rs6000-logue.c (rs6000_function_ok_for_sibcall): ..
here.

diff --git a/gcc/calls.c b/gcc/calls.c
index a8f459632f2..1a7632d2d48 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -1922,7 +1922,7 @@ maybe_warn_nonstring_arg (tree fndecl, tree exp)
 /* Issue an error if CALL_EXPR was flagged as requiring
tall-call optimization.  */
 
-static void
+void
 maybe_complain_about_tail_call (tree call_expr, const char *reason)
 {
   gcc_assert (TREE_CODE (call_expr) == CALL_EXPR);
@@ -3525,7 +3525,6 @@ static bool
 can_implement_as_sibling_call_p (tree exp,
 rtx structure_value_addr,
 tree funtype,
-int reg_parm_stack_space ATTRIBUTE_UNUSED,
 tree fndecl,
 int flags,
 tree addr,
@@ -3550,20 +3549,6 @@ can_implement_as_sibling_call_p (tree exp,
   return false;
 }
 
-#ifdef REG_PARM_STACK_SPACE
-  /* If outgoing reg parm stack space changes, we cannot do sibcall.  */
-  if (OUTGOING_REG_PARM_STACK_SPACE (funtype)
-  != OUTGOING_REG_PARM_STACK_SPACE (TREE_TYPE (current_function_decl))
-  || (reg_parm_stack_space != REG_PARM_STACK_SPACE 
(current_function_decl)))
-{
-  maybe_complain_about_tail_call (exp,
- "inconsistent size of stack space"
- " allocated for arguments which are"
- " passed in registers");
-  return false;
-}
-#endif
-
   /* Check whether the target is able to optimize the call
  into a sibcall.  */
   if (!targetm.function_ok_for_sibcall (fndecl, exp))
@@ -4088,7 +4073,6 @@ expand_call (tree exp, rtx target, int ignore)
 try_tail_call = can_implement_as_sibling_call_p (exp,
 structure_value_addr,
 funtype,
-reg_parm_stack_space,
 fndecl,
 flags, addr, args_size);
 
diff --git a/gcc/calls.h b/gcc/calls.h
index f32b6308b58..b20d24bb888 100644
--- a/gcc/calls.h
+++ b/gcc/calls.h
@@ -133,6 +133,7 @@ extern bool reference_callee_copied (CUMULATIVE_ARGS *,
 extern void maybe_warn_alloc_args_overflow (tree, tree, tree[2], int[2]);
 extern tree get_attr_nonstring_decl (tree, tree * = NULL);
 extern bool maybe_warn_nonstring_arg (tree, tree);
+extern void maybe_complain_about_tail_call (tree, const char *);
 enum size_range_flags
   {
/* Set to consider zero a valid range.  */
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 502d24057b5..809c145b638 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -939,6 +939,19 @@ ix86_function_ok_for_sibcall (tree decl, tree exp)
   decl_or_type = type;
 }
 
+  /* If outgoing reg parm stack space changes, we cannot do sibcall.  */
+  if ((OUTGOING_REG_PARM_STACK_SPACE (type)
+   != OUTGOING_REG_PARM_STACK_SPACE (TREE_TYPE (current_function_decl)))
+  || (REG_PARM_STACK_SPACE (decl_or_type)
+ != REG_PARM_STACK_SPACE (current_function_decl)))
+{

Re: [PATCH] PowerPC: PR libgcc/97543, build libgcc with -mno-gnu-attribute

2020-10-31 Thread Alan Modra via Gcc-patches
s6000/t-linux | 18 +-
>  1 file changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/libgcc/config/rs6000/t-linux b/libgcc/config/rs6000/t-linux
> index ed821947b66..b2a079c6b54 100644
> --- a/libgcc/config/rs6000/t-linux
> +++ b/libgcc/config/rs6000/t-linux
> @@ -1,6 +1,22 @@
>  SHLIB_MAPFILES += $(srcdir)/config/rs6000/libgcc-glibc.ver
>  
> -HOST_LIBGCC2_CFLAGS += -mlong-double-128
> +# On the modules that deal with IBM 128-bit values, we need to make sure that
> +# TFmode uses the IBM extended double format.
> +IBM128_OBJS  = ibm-ldouble$(objext) _powitf2$(objext) ppc64-fp$(objext) \
> +   _divtc3$(object) _multc3$(object) \
> +   _fixtfdi$(object) _fixunstfdi$(object) \
> +   _floatditf$(objext) _floatunsditf$(objext)
> +
> +IBM128_S_OBJS= $(patsubst %$(objext),%_s$(objext),$(IBM128_OBJS))
> +IBM128_ALL_OBJS  = $(IBM128_OBJS) $(IBM128_S_OBJS)
> +
> +IBM128_CFLAGS= -mlong-double-128 -Wno-psabi -mabi=ibmlongdouble
> +
> +$(IBM128_ALL_OBJS) : INTERNAL_CFLAGS += $(IBM128_CFLAGS)
> +
> +# Turn off gnu attributes for the whole library.  This allows us to build
> +# libgcc that supports the different long double formats.
> +HOST_LIBGCC2_CFLAGS += -mno-gnu-attribute

It would be good if you could pass -mno-gnu-attribute just for the
libgcc_s.so objects.  That way you might save users from making
mistakes due to using the wrong libgcc.a somehow.  I think you may be
able to do that with

gcc_s_compile += -mno-gnu-attribute

And, yes, I completely agree that libgcc_s.so should not be marked
with .gnu_attribute.

>  # This is a way of selecting -mcmodel=small for ppc64, which gives
>  # smaller and faster libgcc code.  Directly specifying -mcmodel=small
> -- 
> 2.22.0
> 
> 
> -- 
> Michael Meissner, IBM
> IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
> email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] calls.c:can_implement_as_sibling_call_p REG_PARM_STACK_SPACE check

2020-10-30 Thread Alan Modra via Gcc-patches
On Fri, Oct 30, 2020 at 09:21:09AM +, Richard Sandiford wrote:
> Alan Modra via Gcc-patches  writes:
> > This moves an #ifdef block of code from calls.c to
> > targetm.function_ok_for_sibcall.  Only two targets, x86 and rs6000,
> > define REG_PARM_STACK_SPACE or OUTGOING_REG_PARM_STACK_SPACE macros
> > that might vary depending on the called function.  Macros like
> > UNITS_PER_WORD don't change over a function boundary, nor does the
> > MIPS ABI, nor does TARGET_64BIT on PA-RISC.  Other targets are even
> > more trivially seen to not need the calls.c code.
> >
> > Besides cleaning up a small piece of #ifdef code, the motivation for
> > this patch is to allow tail calls on PowerPC for functions that
> > require less reg_parm_stack_space than their caller.  The original
> > code in calls.c only permitted tail calls when exactly equal.
> 
> Is there something PowerPC-specific that makes the relaxation safe
> for that target while not being safe on x86?

It is quite possible that x86 can relax this condition too, I'm just
not familiar enough with all the x86 ABIs know with any certainty.  By
moving the test to the target hook we allow target maintainers to have
full say in the matter.

> I take your point about x86 and PowerPC being the only two affected
> targets.  But the interface does still take an fndecl on all targets,
> so I think the target-independent assumption should be that the value
> might vary depending on function.  So I guess an alternative would be
> to relax the target-independent condition and make the x86 hook enforce
> the stricter condition (if it really is needed).

Yes, except that actually the REG_PARM_STACK_SPACE condition for
PowerPC can be removed entirely.  I agree that doing as you suggest
would be OK for PowerPC, it would just mean we continue to do some
unnecessary work in the non-trivial rs6000_function_parms_need_stack.

Would it be better if I post the patches again, restructuring them as
1) completely no functional change just moving the existing condition
   to the powerpc and i386 target hooks, and
2) twiddling the powerpc target hook?

Thanks for your time spent reviewing, and comments!

-- 
Alan Modra
Australia Development Lab, IBM


[RS6000] Adjust testcases for power10 instructions V3

2020-10-30 Thread Alan Modra via Gcc-patches
addic\M} 4 { target { ! has_arch_pwr10 
} } } } */
+/* { dg-final { scan-assembler-times {\msubfe\M} 1 { target { ! has_arch_pwr10 
} } } } */
+/* { dg-final { scan-assembler-times {\maddic\M} 3 { target { has_arch_pwr10 } 
} } } */
+/* { dg-final { scan-assembler-times {\msetbcr\M} 1 { target { has_arch_pwr10 
} } } } */
+/* { dg-final { scan-assembler-times {\maddze\M} 3 } } */
 
 long ne0(long a)
 {
diff --git a/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c 
b/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c
index 1269fe635c6..72b83ef9d08 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c
@@ -30,5 +30,5 @@ vector signed long long splats4(void)
 
 /* { dg-final { scan-assembler-times {\mvspltis[bhw]\M} 0 } } */
 /* { dg-final { scan-assembler-times {\mvsl[bhwd]\M} 0 } } */
-/* { dg-final { scan-assembler-times {\mlvx\M|\mlxv\M|\mlxvd2x\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mlvx\M|\mp?lxv\M|\mlxvd2x\M} 2 } } */
 
diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index 8439720baea..7e2dd358640 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -5903,6 +5903,16 @@ proc check_effective_target_has_arch_pwr9 { } {
}]
 }
 
+proc check_effective_target_has_arch_pwr10 { } {
+   return [check_no_compiler_messages arch_pwr10 assembly {
+   #ifndef _ARCH_PWR10
+   #error does not have power10 support.
+   #else
+   /* "has power10 support" */
+   #endif
+   }]
+}
+
 # Return 1 if this is a PowerPC target supporting -mcpu=power10.
 # Limit this to 64-bit linux systems for now until other targets support
 # power10.

-- 
Alan Modra
Australia Development Lab, IBM


Re: testsuite: Adjust target requirements for sad-vectorize and signbit

2020-10-29 Thread Alan Modra via Gcc-patches
Fixes
FAIL: gcc.target/powerpc/signbit-1.c scan-assembler-not stxvd2x
FAIL: gcc.target/powerpc/signbit-1.c scan-assembler-times mfvsrd 3
FAIL: gcc.target/powerpc/signbit-1.c scan-assembler-times srdi 3
FAIL: gcc.target/powerpc/signbit-2.c scan-assembler-times ld 1
FAIL: gcc.target/powerpc/signbit-2.c scan-assembler-times srdi 1
on powerpc-linux (or powerpc64-linux biarch -m32).

signbit-1.c is quite obviously a 64-bit only testcase given the
scan-assembler directives, and the purpose of the testcase to verify
the 64-bit only UNSPEC_SIGNBIT patterns.  It could be made to pass for
-m32 by adding -mpowerpc64, but that option that isn't very effective
when bi-arch testing and results in errors on rs6000-aix.  And it is
pointless to match -m32 stores to the stack followed by loads, which
is what we do at the moment.

signbit-2.c on the other hand has more reasonable 32-bit output.

Regression tested powerpc64-linux biarch.

* gcc.target/powerpc/signbit-1.c: Reinstate lp64 condition.
* gcc.target/powerpc/signbit-2.c: Match 32-bit output too.

diff --git a/gcc/testsuite/gcc.target/powerpc/signbit-1.c 
b/gcc/testsuite/gcc.target/powerpc/signbit-1.c
index eb4f53e397d..1642bf46d7a 100644
--- a/gcc/testsuite/gcc.target/powerpc/signbit-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/signbit-1.c
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target lp64 } */
 /* { dg-require-effective-target ppc_float128_sw } */
 /* { dg-require-effective-target powerpc_p8vector_ok } */
 /* { dg-options "-mdejagnu-cpu=power8 -O2 -mfloat128" } */
diff --git a/gcc/testsuite/gcc.target/powerpc/signbit-2.c 
b/gcc/testsuite/gcc.target/powerpc/signbit-2.c
index ff6af963dda..1b792916eba 100644
--- a/gcc/testsuite/gcc.target/powerpc/signbit-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/signbit-2.c
@@ -13,5 +13,7 @@ int do_signbit_kf (__float128 *a) { return __builtin_signbit 
(*a); }
 /* { dg-final { scan-assembler-not   "lxvw4x"   } } */
 /* { dg-final { scan-assembler-not   "lxsd" } } */
 /* { dg-final { scan-assembler-not   "lxsdx"} } */
-/* { dg-final { scan-assembler-times "ld" 1 } } */
-/* { dg-final { scan-assembler-times "srdi"   1 } } */
+/* { dg-final { scan-assembler-times "ld" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "srdi"   1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "lwz"1 { target ilp32 } } } */
+/* { dg-final { scan-assembler-times "rlwinm" 1 { target ilp32 } } } */

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] float128-type-2.c unsupported

2020-10-28 Thread Alan Modra via Gcc-patches
On Wed, Oct 28, 2020 at 11:35:07PM -0400, David Edelsohn wrote:
> Alan,
> 
> It is disrespectful for you to ignore the review of a maintainer and
> your colleague.  You may not pick and choose amongst maintainers.  And
> Segher should not be so disrespectful as to contradict his colleague
> and co-maintainer.

I'm sorry you see this as a matter of respect.  I didn't see it that
way at all.  Segher disagreed with your review, and gave sufficient
technical reason for me to commit the patch.

> I replied no to your patch and requested a different solution -- one
> that does not require significant effort.  Please fix this testcase
> the way that I requested.

I am not going to do as you demand.  Those tests are clearly
commented as "VSX Linux 64-bit" tests.  Ignoring that comment and
changing them to make them run for AIX and Darwin plainly requires
that I test those changes, which often results in significant effort
building cross-compilers.

If you wish them to run for AIX targets then that is your job as an
AIX maintainer.  Don't expect me to do the work.

Incidentally, this is the result of your recent commit 122f0db2793 on
powerpc64-linux biarch.
+FAIL: gcc.target/powerpc/signbit-1.c scan-assembler-not stxvd2x
+FAIL: gcc.target/powerpc/signbit-1.c scan-assembler-times mfvsrd 3
+FAIL: gcc.target/powerpc/signbit-1.c scan-assembler-times srdi 3
+FAIL: gcc.target/powerpc/signbit-2.c scan-assembler-times ld 1
+FAIL: gcc.target/powerpc/signbit-2.c scan-assembler-times srdi 1

I'm not complaining about that, or demanding that you fix those fails;
In fact I'm in the middle of fixing them.  But they do quite perfectly
illustrate that tweaking the testsuite is never simple.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] float128-type-2.c unsupported

2020-10-28 Thread Alan Modra via Gcc-patches
On Wed, Oct 28, 2020 at 01:44:54PM -0500, Segher Boessenkool wrote:
> On Wed, Oct 28, 2020 at 09:18:35PM +1030, Alan Modra wrote:
> > >From e7ce33cef478a826a2fe4e110b43b49586ef2438 Mon Sep 17 00:00:00 2001
> > From: Alan Modra 
> > Date: Wed, 28 Oct 2020 15:57:57 +1030
> > Subject: 
> > 
> > I noticed this test is unsupported on power10 when looking through
> > test logs.  There seems no reason why that should be the case, ie.
> > likely the target test was meant to be powerpc64*-*-linux*.  And that
> > simplifies down further.
> 
> The target name does not tell you if you are doing a -m32 or a -m64
> build; both powerpc-linux and powerpc64-linux can build both 32-bit and
> 64-bit just fine (and hopefully identically).  Having target powerpc64*
> is basically always wrong.

Yes.  Even le/be selection should really be done with { target le }
for example rather than { target powerpc*le-*-* }.  One day we might
want to test compilers with multi-endian support.

> Your patch is fine though, modulo what David said.  If there is some
> selector you can use (or you can make one) that is much preferred.  But
> since this patch is strictly an improvement already, it is okay for
> trunk (if the 2nd works on powerpc64le-linux of course ;-) )  Thanks!
> 
> (Improving it to test on exactly the right targets would be nice :-) )

Thank you, I committed it "as is".  An incremental improvement is
better than no improvement.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] Don't be too clever with dg-do run and dg-do compile

2020-10-28 Thread Alan Modra via Gcc-patches
On Wed, Oct 28, 2020 at 12:16:00PM -0500, Segher Boessenkool wrote:
> On Wed, Oct 28, 2020 at 09:20:56PM +1030, Alan Modra wrote:
> > Otherwise some versions of dejagnu go ahead and run the vsx tests
> > below when they should not.  To best cope with older dejagnu, put
> > "run" before "compile", the idea being that if the second dg-do always
> > wins then that won't cause fails.
> 
> If they are mutually exclusive, does the order still matter?  (Just FMI.)

Yes, it does.  Older dejagnu takes the last action regardless of the
target selector success.  Newer dejagnu uses the last action of a line
with a successful target selector.

commit 569f8718b534a2cd9511a7d640352eb0126ff492
Author: Dominik Vogt 
Date:   Mon Mar 28 17:31:07 2016 +1100

* dg.exp (dg-do): Do not change the previously selected action if
a de-selected dg-do is encountered.

Signed-off-by: Ben Elliston 

-- 
Alan Modra
Australia Development Lab, IBM


[RS6000] Don't be too clever with dg-do run and dg-do compile

2020-10-28 Thread Alan Modra via Gcc-patches
 a/gcc/testsuite/gcc.target/powerpc/vsx-load-element-extend-short.c 
b/gcc/testsuite/gcc.target/powerpc/vsx-load-element-extend-short.c
index 68fdcdcea37..837ba79c9ab 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx-load-element-extend-short.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-load-element-extend-short.c
@@ -2,8 +2,9 @@
Test of vec_xl_sext and vec_xl_zext (load into rightmost
vector element and zero/sign extend). */
 
-/* { dg-do compile {target power10_ok} } */
-/* { dg-do run {target power10_hw} } */
+/* { dg-do run { target power10_hw } } */
+/* { dg-do compile { target { ! power10_hw } } } */
+/* { dg-require-effective-target power10_ok } */
 /* { dg-require-effective-target int128 } */
 
 /* Deliberately set optization to zero for this test to confirm
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-char.c 
b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-char.c
index 45c49547d66..3049b1c2c28 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-char.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-char.c
@@ -1,8 +1,9 @@
 /*
Test of vec_xst_trunc (truncate and store rightmost vector element) */
 
-/* { dg-do compile {target power10_ok} } */
-/* { dg-do run {target power10_hw} } */
+/* { dg-do run { target power10_hw } } */
+/* { dg-do compile { target { ! power10_hw } } } */
+/* { dg-require-effective-target power10_ok } */
 /* { dg-require-effective-target int128 } */
 /* Deliberately set optization to zero for this test to confirm
the stxvr*x instruction is generated. At higher optimization levels
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-int.c 
b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-int.c
index f263e3d5cc9..7cc7699f8eb 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-int.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-int.c
@@ -1,8 +1,9 @@
 /*
Test of vec_xst_trunc (truncate and store rightmost vector element) */
 
-/* { dg-do compile {target power10_ok} } */
-/* { dg-do run {target power10_hw} } */
+/* { dg-do run { target power10_hw } } */
+/* { dg-do compile { target { ! power10_hw } } } */
+/* { dg-require-effective-target power10_ok } */
 /* { dg-require-effective-target int128 } */
 /* Deliberately set optization to zero for this test to confirm
the stxvr*x instruction is generated. At higher optimization levels
diff --git 
a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-longlong.c 
b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-longlong.c
index 0eeef5e6ba9..e1bd0216611 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-longlong.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-longlong.c
@@ -1,8 +1,9 @@
 /*
Test of vec_xst_trunc (truncate and store rightmost vector element) */
 
-/* { dg-do compile {target power10_ok} } */
-/* { dg-do run {target power10_hw} } */
+/* { dg-do run { target power10_hw } } */
+/* { dg-do compile { target { ! power10_hw } } } */
+/* { dg-require-effective-target power10_ok } */
 /* { dg-require-effective-target int128 } */
 
 /* Deliberately set optization to zero for this test to confirm
diff --git 
a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-short.c 
b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-short.c
index 0186ddc552f..b173b36dbda 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-short.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-short.c
@@ -1,8 +1,9 @@
 /*
Test of vec_xst_trunc (truncate and store rightmost vector element) */
 
-/* { dg-do compile {target power10_ok} } */
-/* { dg-do run {target power10_hw} } */
+/* { dg-do run { target power10_hw } } */
+/* { dg-do compile { target { ! power10_hw } } } */
+/* { dg-require-effective-target power10_ok } */
 /* { dg-require-effective-target int128 } */
 
 /* Deliberately set optization to zero for this test to confirm

-- 
Alan Modra
Australia Development Lab, IBM


[RS6000] float128-type-2.c unsupported

2020-10-28 Thread Alan Modra via Gcc-patches
>From e7ce33cef478a826a2fe4e110b43b49586ef2438 Mon Sep 17 00:00:00 2001
From: Alan Modra 
Date: Wed, 28 Oct 2020 15:57:57 +1030
Subject: 

I noticed this test is unsupported on power10 when looking through
test logs.  There seems no reason why that should be the case, ie.
likely the target test was meant to be powerpc64*-*-linux*.  And that
simplifies down further.

Regression tested powerpc64le-linux.  OK?

* gcc.target/powerpc/float128-type-1.c: Simplify target test.
* gcc.target/powerpc/float128-type-2.c: Likewise.

diff --git a/gcc/testsuite/gcc.target/powerpc/float128-type-1.c 
b/gcc/testsuite/gcc.target/powerpc/float128-type-1.c
index 13152ac7c26..53f9e357535 100644
--- a/gcc/testsuite/gcc.target/powerpc/float128-type-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/float128-type-1.c
@@ -1,4 +1,4 @@
-/* { dg-do compile { target { powerpc64*-*-linux* && lp64 } } } */
+/* { dg-do compile { target { *-*-linux* && lp64 } } } */
 /* { dg-require-effective-target powerpc_p8vector_ok } */
 /* { dg-options "-mdejagnu-cpu=power8 -O2 -mno-float128" } */
 
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-type-2.c 
b/gcc/testsuite/gcc.target/powerpc/float128-type-2.c
index 5644281c3d4..02dbad1fa4f 100644
--- a/gcc/testsuite/gcc.target/powerpc/float128-type-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/float128-type-2.c
@@ -1,4 +1,4 @@
-/* { dg-do compile { target { powerpc64-*-linux* && lp64 } } } */
+/* { dg-do compile { target { *-*-linux* && lp64 } } } */
 /* { dg-require-effective-target powerpc_p9vector_ok } */
 /* { dg-options "-mdejagnu-cpu=power9 -O2 -mno-float128" } */
 

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] Do not define builtins that overload disabled builtins

2020-10-28 Thread Alan Modra via Gcc-patches
commit 25ffd3d34e means we no longer define an overloaded
__builtin_byte_in_set for -m32, so the more informative
"__builtin_byte_in_set is not supported in this compiler
configuration" is not reported.

Regression tested powerpc64-linux biarch.  OK?

PR bootstrap/92661
* gcc.target/powerpc/byte-in-set-2.c: Update expected error.

diff --git a/gcc/testsuite/gcc.target/powerpc/byte-in-set-2.c 
b/gcc/testsuite/gcc.target/powerpc/byte-in-set-2.c
index 9a80c27fe26..34ab50e25ba 100644
--- a/gcc/testsuite/gcc.target/powerpc/byte-in-set-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/byte-in-set-2.c
@@ -11,5 +11,5 @@
 int
 test_byte_in_set (unsigned char b, unsigned long long set_members)
 {
-  return __builtin_byte_in_set (b, set_members); /* { dg-error 
"'__builtin_byte_in_set' is not supported in this compiler configuration" } */
+  return __builtin_byte_in_set (b, set_members); /* { dg-warning "implicit 
declaration of function" } */
 }

-- 
Alan Modra
Australia Development Lab, IBM


Re: testsuite: Enable and adjust powerpc fold-vec-extract/insert testcases

2020-10-28 Thread Alan Modra via Gcc-patches
git a/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-int-p9.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-int-p9.c
index 81ac1f1a00a..a851fd6b8dc 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-int-p9.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-int-p9.c
@@ -58,8 +58,6 @@ testui2_cst(unsigned int x, vector unsigned int v)
 /* { dg-final { scan-assembler-times {\mmtvsrwz\M} 4 { target lp64 } } } */
 /* { dg-final { scan-assembler-times {\mxxinsertw\M} 4 { target lp64 } } } */
 
-
-/* { dg-final { scan-assembler-times {\maddi\M} 8 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times {\mstw\M} 8 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times {\mlxv\M} 8 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times {\mlvewx\M} 4 { target ilp32 } } } */

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] dg-do !compile and scan-assembler

2020-10-27 Thread Alan Modra via Gcc-patches
On Tue, Oct 27, 2020 at 05:34:45PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Tue, Oct 27, 2020 at 09:26:10PM +1030, Alan Modra wrote:
> > These tests never checked assembly, because .s files were not
> > produced.  One test was looking for the wrong instructions.
> 
>  -/* { dg-final { scan-assembler-times {\msldbi\M} 6 } } */
> > -/* { dg-final { scan-assembler-times {\msrdbi\M} 6 } } */
> > -
> > -
> > +/* { dg-final { scan-assembler-times {\mvsldbi\M} 8 } } */
> > +/* { dg-final { scan-assembler-times {\mvsrdbi\M} 8 } } */
> 
> You also changed the instruction count here; did you check 8 is actually
> what is expected here, or is that just what you saw come out?

Yes, 8 is the correct count.  And in this instance I amazingly changed
the count before running the test and finding it "wrong".  :-)

-- 
Alan Modra
Australia Development Lab, IBM


[RS6000] power10 scan-assembler tests

2020-10-27 Thread Alan Modra via Gcc-patches
ot;-mdejagnu-cpu=power10 -O0 -save-temps" } */
  
 /* { dg-final { scan-assembler-times {\mstxvrwx\M} 2 } } */
 /* { dg-final { scan-assembler-times {\mstwx\M} 0 } } */
diff --git 
a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-longlong.c 
b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-longlong.c
index c137ce2d19f..0eeef5e6ba9 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-longlong.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-longlong.c
@@ -9,7 +9,7 @@
the stxvr*x instruction is generated. At higher optimization levels
the instruction we are looking for is sometimes replaced by other
store instructions. */
-/* { dg-options "-mdejagnu-cpu=power10 -O0" } */
+/* { dg-options "-mdejagnu-cpu=power10 -O0 -save-temps" } */
 
 /* { dg-final { scan-assembler-times {\mstxvrdx\M} 2 } } */
 /* { dg-final { scan-assembler-times {\mstwx\M} 0 } } */
diff --git 
a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-short.c 
b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-short.c
index 7d856e7c3eb..0186ddc552f 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-short.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-short.c
@@ -9,7 +9,7 @@
the stxvr*x instruction is generated. At higher optimization levels
the instruction we are looking for is sometimes replaced by other
store instructions.  */
-/* { dg-options "-mdejagnu-cpu=power10 -O0" } */
+/* { dg-options "-mdejagnu-cpu=power10 -O0 -save-temps" } */
 
 /* { dg-final { scan-assembler-times {\mstxvrhx\M} 2 } } */
 /* { dg-final { scan-assembler-times {\msthx\M} 0 } } */

-- 
Alan Modra
Australia Development Lab, IBM


[RS6000] dg-do !compile and scan-assembler

2020-10-27 Thread Alan Modra via Gcc-patches
>From 6c1817cece47ce2cb36df1f57b533b9d2385f0a5 Mon Sep 17 00:00:00 2001
From: Alan Modra 
Date: Tue, 27 Oct 2020 17:32:13 +1030
Subject: 

These tests never checked assembly, because .s files were not
produced.  One test was looking for the wrong instructions.

A typical error log
PASS: gcc.target/powerpc/vec-permute-ext-runnable.c (test for excess errors)
gcc.target/powerpc/vec-permute-ext-runnable.c output file does not exist
UNRESOLVED: gcc.target/powerpc/vec-permute-ext-runnable.c scan-assembler-times 
\\mpermx\\M 10

Bootstrapped and regression tested powerpc64le-linux power8.
powerpc64le-linux power10 and powerpc64-linux power7 biarch regtest
still in progress.  OK?

* gcc.target/powerpc/vec-blend-runnable.c: Add save-temps.
* gcc.target/powerpc/vec-insert-word-runnable.c: Likewise.
* gcc.target/powerpc/vec-permute-ext-runnable.c: Likewise.
* gcc.target/powerpc/vec-replace-word-runnable.c: Likewise.
* gcc.target/powerpc/vec-splati-runnable.c: Likewise.
* gcc.target/powerpc/vec-ternarylogic-3.c: Likewise.
* gcc.target/powerpc/vec-ternarylogic-9.c: Likewise.
* gcc.target/powerpc/vsx_mask-count-runnable.c: Likewise.
* gcc.target/powerpc/vsx_mask-expand-runnable.c: Likewise.
* gcc.target/powerpc/vsx_mask-extract-runnable.c: Likewise.
* gcc.target/powerpc/vsx_mask-move-runnable.c: Likewise.
* gcc.target/powerpc/vec-shift-double-runnable.c: Likewise,
and correct assembly match.

diff --git a/gcc/testsuite/gcc.target/powerpc/vec-blend-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/vec-blend-runnable.c
index 774960bbcd3..0f4b2130351 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-blend-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-blend-runnable.c
@@ -1,7 +1,7 @@
 /* { dg-do run { target { power10_hw } } } */
 /* { dg-do link { target { ! power10_hw } } } */
 /* { dg-require-effective-target power10_ok } */
-/* { dg-options "-mdejagnu-cpu=power10" } */
+/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */
 #include 
 
 #define DEBUG 0
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-insert-word-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/vec-insert-word-runnable.c
index 55ca1c4b35d..be45182a6be 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-insert-word-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-insert-word-runnable.c
@@ -1,7 +1,7 @@
 /* { dg-do run { target { power10_hw } } } */
 /* { dg-do link { target { ! power10_hw } } } */
 /* { dg-require-effective-target power10_ok } */
-/* { dg-options "-mdejagnu-cpu=power10" } */
+/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */
 #include 
 
 #define DEBUG 0
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-permute-ext-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/vec-permute-ext-runnable.c
index 2626d876d3d..3e3f9a77ecb 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-permute-ext-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-permute-ext-runnable.c
@@ -1,7 +1,7 @@
 /* { dg-do run { target { power10_hw } } } */
 /* { dg-do link { target { ! power10_hw } } } */
 /* { dg-require-effective-target power10_ok } */
-/* { dg-options "-mdejagnu-cpu=power10" } */
+/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */
 #include 
 
 #define DEBUG 0
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-replace-word-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/vec-replace-word-runnable.c
index 413b9048eca..162968316bc 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-replace-word-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-replace-word-runnable.c
@@ -1,7 +1,7 @@
 /* { dg-do run { target { power10_hw } } } */
 /* { dg-do link { target { ! power10_hw } } } */
 /* { dg-require-effective-target power10_ok } */
-/* { dg-options "-mdejagnu-cpu=power10" } */
+/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */
 
 #include 
 
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable.c
index b8478f5c32d..128da2ad42b 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable.c
@@ -1,7 +1,7 @@
 /* { dg-do run { target { power10_hw } } } */
 /* { dg-do link { target { ! power10_hw } } } */
 /* { dg-require-effective-target power10_ok } */
-/* { dg-options "-mdejagnu-cpu=power10" } */
+/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */
 #include 
 
 #define DEBUG 0
@@ -379,7 +379,5 @@ main (int argc, char *argv [])
   return 0;
 }
 
-/* { dg-final { scan-assembler-times {\msldbi\M} 6 } } */
-/* { dg-final { scan-assembler-times {\msrdbi\M} 6 } } */
-
-
+/* { dg-final { scan-assembler-times {\mvsldbi\M} 8 } } */
+/* { dg-final { scan-assembler-times {\mvsrdbi\M} 8 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c 
b/gcc/testsuite/gcc.ta

[RS6000] Separate dg-require-effective-target options

2020-10-26 Thread Alan Modra via Gcc-patches
Subject was "[RS6000] Tests that use int128_t and -m32"
I meant to make this change before committing too.  Pushed.

* gcc.target/powerpc/vsx_mask-count-runnable.c: Separate options
passed to dg-require-effective-target.
* gcc.target/powerpc/vsx_mask-expand-runnable.c: Likewise.
* gcc.target/powerpc/vsx_mask-extract-runnable.c: Likewise.
* gcc.target/powerpc/vsx_mask-move-runnable.c: Likewise.

diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c
index 6aa165c675c..28aa7da9d1f 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c
@@ -1,7 +1,8 @@
 /* { dg-do run { target { power10_hw } } } */
 /* { dg-do link { target { ! power10_hw } } } */
 /* { dg-options "-mdejagnu-cpu=power10 -O2" } */
-/* { dg-require-effective-target { int128 && power10_ok } } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-require-effective-target int128 } */
 
 /* Check that the expected 128-bit instructions are generated if the processor
supports the 128-bit integer instructions. */
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c
index 9fdfa4a8b82..68c1c3f1c9a 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c
@@ -1,7 +1,8 @@
 /* { dg-do run { target { power10_hw } } } */
 /* { dg-do link { target { ! power10_hw } } } */
 /* { dg-options "-mdejagnu-cpu=power10 -O2" } */
-/* { dg-require-effective-target { int128 && power10_ok } } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-require-effective-target int128 } */
 
 /* Check that the expected 128-bit instructions are generated if the processor
supports the 128-bit integer instructions. */
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c
index a038e56c9cd..4664807a69e 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c
@@ -1,7 +1,8 @@
 /* { dg-do run { target { power10_hw } } } */
 /* { dg-do link { target { ! power10_hw } } } */
 /* { dg-options "-mdejagnu-cpu=power10 -O2" } */
-/* { dg-require-effective-target { int128 && power10_ok } } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-require-effective-target int128 } */
 
 /* Check that the expected 128-bit instructions are generated if the processor
supports the 128-bit integer instructions. */
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c
index 6f87e60ea41..58954dc5fc9 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c
@@ -1,7 +1,8 @@
 /* { dg-do run { target { power10_hw } } } */
 /* { dg-do link { target { ! power10_hw } } } */
 /* { dg-options "-mdejagnu-cpu=power10 -O2" } */
-/* { dg-require-effective-target { int128 && power10_ok } } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-require-effective-target int128 } */
 
 /* Check that the expected 128-bit instructions are generated if the processor
supports the 128-bit integer instructions. */


-- 
Alan Modra
Australia Development Lab, IBM


[RS6000] Adjust testcases for power10 instructions V2

2020-10-26 Thread Alan Modra via Gcc-patches
0644
--- a/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c
@@ -30,5 +30,5 @@ vector signed long long splats4(void)
 
 /* { dg-final { scan-assembler-times {\mvspltis[bhw]\M} 0 } } */
 /* { dg-final { scan-assembler-times {\mvsl[bhwd]\M} 0 } } */
-/* { dg-final { scan-assembler-times {\mlvx\M|\mlxv\M|\mlxvd2x\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mlvx\M|\mp?lxv\M|\mlxvd2x\M} 2 } } */
 

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] Unsupported test options for -m32

2020-10-26 Thread Alan Modra via Gcc-patches
On Mon, Oct 26, 2020 at 04:28:20PM +, Iain Sandoe wrote:
> David Edelsohn via Gcc-patches  wrote:
> 
> > FAIL: gcc.target/powerpc/swaps-p8-22.c (test for excess errors)
> > Excess errors:
> > cc1: error: '-mcmodel' not supported in this configuration
> > 
> > * gcc.target/powerpc/swaps-p8-22.c: Disable for -m32.
> > 
> > diff --git a/gcc/testsuite/gcc.target/powerpc/swaps-p8-22.c
> > b/gcc/testsuite/gcc.target/powerpc/swaps-p8-22.c
> > index 83f6ab3a1c0..bceada41b75 100644
> > --- a/gcc/testsuite/gcc.target/powerpc/swaps-p8-22.c
> > +++ b/gcc/testsuite/gcc.target/powerpc/swaps-p8-22.c
> > @@ -1,5 +1,5 @@
> > /* { dg-do compile } */
> > -/* { dg-require-effective-target powerpc_p8vector_ok } */
> > +/* { dg-require-effective-target { lp64 && powerpc_p8vector_ok } } */
> > /* { dg-options "-O2 -mdejagnu-cpu=power8 -maltivec -mcmodel=large" } */
> > 
> > /* The expansion for vector character multiply introduces a vperm
> > operation.
> > 
> > 
> > Please don't fix the failure this way.  This is incorrect.  -m32 means
> > more than Linux.  This reverts my hard work to run more of the powerpc
> > testsuite on AIX.  AIX also is -m32.

Ah, David was the culprit that broke the test on linux and darwin.
;-)

> Darwin also is (powerpc-darwin) and has an m32 multilib (powerpc64-darwin)
> so not reliable there either.
> > 
> > This probably should be fixed with
> > 
> > { dg-additional-options "-mcmodel=large" { target { lp64 ||
> > !powerpc*-*-linux* } } }
> > 
> > or whatever the appropriate incantation to omit only ppc32 linux.  Or maybe
> > 
> > { dg-do compile { target { lp64 || !powerpc*-*-linux* } } }
> 
> mcmodel will also break for powerpc64 and powerpc / m64 Darwin, so if this is
> meant to be Linux-specific, that seems to be the thing to mention.

$ grep mcmodel gcc/config/rs6000/*.opt
gcc/config/rs6000/aix64.opt:mcmodel=
gcc/config/rs6000/aix64.opt:Known code models (for use with the -mcmodel= 
option):
gcc/config/rs6000/linux64.opt:mcmodel=
gcc/config/rs6000/linux64.opt:Known code models (for use with the -mcmodel= 
option):

aix64.opt is used for all rs6000-aix targets, linux64.opt for 64-bit
and biarch powerpc linux targets.  powerpc linux errors for -mcmodel
when -m32 (yes, even when biarch Segher).  So the proper test is a
little more complicated than any suggestion given so far.

This looks correct to me, bearing in mind that the test is a duplicate
of swaps-p8-21.c aimed specifically at testing -mcmodel=large.

* gcc.target/powerpc/swaps-p8-22.c: Enable only for aix and
-m64 linux.

diff --git a/gcc/testsuite/gcc.target/powerpc/swaps-p8-22.c 
b/gcc/testsuite/gcc.target/powerpc/swaps-p8-22.c
index 83f6ab3a1c0..847aebccca8 100644
--- a/gcc/testsuite/gcc.target/powerpc/swaps-p8-22.c
+++ b/gcc/testsuite/gcc.target/powerpc/swaps-p8-22.c
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do compile { target { *-*-aix* || { *-*-linux* && lp64 } } } } */
 /* { dg-require-effective-target powerpc_p8vector_ok } */
 /* { dg-options "-O2 -mdejagnu-cpu=power8 -maltivec -mcmodel=large" } */
 

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] Tests that use int128_t and -m32

2020-10-26 Thread Alan Modra via Gcc-patches
On Mon, Oct 26, 2020 at 07:33:49AM -0500, Segher Boessenkool wrote:
> On Sun, Oct 25, 2020 at 09:50:01PM +1030, Alan Modra wrote:
> > All these tests fail with -m32 due to lack of int128 support,
> 
> Is there any good reason __int128 is not enabled for rs6000 -m32, btw?

Lack of addti3 and subti3 perhaps?

> > in some
> > cases with what I thought was not the best error message.  For example
> > vsx_mask-move-runnable.c:34:3: error: unknown type name 'vector'
> > is misleading.  The problem isn't "vector" but "vector __uint128_t".
> 
> Ouch, yes.  Do you see a simple way to fix that?

I haven't looked.  The only reason I commented on the error was in the
hope that someone who knows gcc intimately enough to fix this without
much effort would do so.  :-)

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] Non-pcrel tests when power10

2020-10-26 Thread Alan Modra via Gcc-patches
On Mon, Oct 26, 2020 at 03:18:39PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Oct 22, 2020 at 05:28:17PM +1030, Alan Modra wrote:
> > These tests require -mno-pcrel because they are testing features
> > of the non-pcrel ABI.
> 
> > --- a/gcc/testsuite/gcc.target/powerpc/cprophard.c
> > +++ b/gcc/testsuite/gcc.target/powerpc/cprophard.c
> > @@ -1,6 +1,6 @@
> >  /* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
> 
> Make this { target lp64 } if you want?

OK, changed.

> > --- a/gcc/testsuite/gcc.target/powerpc/pr79439-1.c
> > +++ b/gcc/testsuite/gcc.target/powerpc/pr79439-1.c
> > @@ -1,5 +1,5 @@
> 
> (another)

But not this one.

> > --- a/gcc/testsuite/gcc.target/powerpc/pr79439-2.c
> > +++ b/gcc/testsuite/gcc.target/powerpc/pr79439-2.c
> > @@ -1,5 +1,5 @@
> >  /* { dg-do compile { target { powerpc*-*-linux* && lp64 } } } */
> 
> (wow there are many)

Or this.  The tests are linux specific scan-assembler tests.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] Tests that use int128_t and -m32

2020-10-26 Thread Alan Modra via Gcc-patches
On Sun, Oct 25, 2020 at 10:43:12AM -0400, David Edelsohn wrote:
> On Sun, Oct 25, 2020 at 7:20 AM Alan Modra  wrote:
> >
> > All these tests fail with -m32 due to lack of int128 support, in some
> > cases with what I thought was not the best error message.  For example
> > vsx_mask-move-runnable.c:34:3: error: unknown type name 'vector'
> > is misleading.  The problem isn't "vector" but "vector __uint128_t".
> >
> > * gcc.target/powerpc/vsx-load-element-extend-char.c: Require int128.
> > * gcc.target/powerpc/vsx-load-element-extend-int.c: Likewise.
> > * gcc.target/powerpc/vsx-load-element-extend-longlong.c: Likewise.
> > * gcc.target/powerpc/vsx-load-element-extend-short.c: Likewise.
> > * gcc.target/powerpc/vsx-store-element-truncate-char.c: Likewise.
> > * gcc.target/powerpc/vsx-store-element-truncate-int.c: Likewise.
> > * gcc.target/powerpc/vsx-store-element-truncate-longlong.c: 
> > Likewise.
> > * gcc.target/powerpc/vsx-store-element-truncate-short.c: Likewise.
> > * gcc.target/powerpc/vsx_mask-count-runnable.c: Likewise.
> > * gcc.target/powerpc/vsx_mask-expand-runnable.c: Likewise.
> > * gcc.target/powerpc/vsx_mask-extract-runnable.c: Likewise.
> > * gcc.target/powerpc/vsx_mask-move-runnable.c: Likewise.
> 
> Good catch.
> 
> Another problem with all of the vsx_mask test cases is that they use
> -mcpu=power10 instead of  -mdejagnu-cpu=power10.  Can you follow up
> with that fix or do you want me to?

Sure, I can do that if you're pre-approving the patch.
gcc.target/powerpc/pr93122.c too.

-- 
Alan Modra
Australia Development Lab, IBM


[RS6000] biarch test fail

2020-10-25 Thread Alan Modra via Gcc-patches
I thought this one was worth at least commenting as to why it fails
when biarch testing.  OK?

* gcc.target/powerpc/bswap64-4.c: Comment.

diff --git a/gcc/testsuite/gcc.target/powerpc/bswap64-4.c 
b/gcc/testsuite/gcc.target/powerpc/bswap64-4.c
index a3c05539652..11787000409 100644
--- a/gcc/testsuite/gcc.target/powerpc/bswap64-4.c
+++ b/gcc/testsuite/gcc.target/powerpc/bswap64-4.c
@@ -7,6 +7,12 @@
 /* { dg-final { scan-assembler-times "ldbrx" 1 { target has_arch_pwr7 } } } */
 /* { dg-final { scan-assembler-times "stdbrx" 1 { target has_arch_pwr7 } } } */
 
+/* This test will fail when biarch testing with
+   "RUNTESTFLAGS=--target_board=unix'{-m64,-m32}'" because the -m32 is
+   added on the command line after the dg-options -mpowerpc64, and
+   common/config/rs6000/rs6000-common.c:rs6000_handle_option disables
+   -mpowerpc64 for -m32.  */
+
 long long swap_load (long long *a) { return __builtin_bswap64 (*a); }
 long long swap_reg (long long a) { return __builtin_bswap64 (a); }
 void swap_store (long long *a, long long b) { *a = __builtin_bswap64 (b); }

-- 
Alan Modra
Australia Development Lab, IBM


[RS6000] Remove -mpcrel from tests

2020-10-25 Thread Alan Modra via Gcc-patches
When running with -m32
FAIL: gcc.target/powerpc/pr94740.c (test for excess errors)
Excess errors:
cc1: error: '-mpcrel' requires '-mcmodel=medium'

The others don't run for -m32, but remove the unnecessary -mpcrel
anyway.

* gcc.target/powerpc/localentry-1.c: Remove -mpcrel from options.
* gcc.target/powerpc/notoc-direct-1.c: Likewise.
* gcc.target/powerpc/pr94740.c: Likewise.

diff --git a/gcc/testsuite/gcc.target/powerpc/localentry-1.c 
b/gcc/testsuite/gcc.target/powerpc/localentry-1.c
index c3c51680cfe..1343df2b8f1 100644
--- a/gcc/testsuite/gcc.target/powerpc/localentry-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/localentry-1.c
@@ -1,11 +1,10 @@
 /* { dg-do compile } */
-/* { dg-options "-mdejagnu-cpu=power10 -O2 -mpcrel" } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
 /* { dg-require-effective-target powerpc_elfv2 } */
 /* { dg-require-effective-target power10_ok } */
 
-/* Ensure we generate ".localentry fn,1" for both leaf and non-leaf functions.
-   At present, -mcpu=power10 does not enable pc-relative mode, so make sure we
-   enable it to be able to check for .localentry.  */
+/* Ensure we generate ".localentry fn,1" for both leaf and non-leaf
+   functions.  */
 
 extern int y (int);
 
diff --git a/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c 
b/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c
index 74187e1d5dc..8fa09b03f4f 100644
--- a/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c
@@ -1,11 +1,10 @@
 /* { dg-do compile } */
-/* { dg-options "-mdejagnu-cpu=power10 -O2 -mpcrel" } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
 /* { dg-require-effective-target powerpc_elfv2 } */
 /* { dg-require-effective-target power10_ok } */
 
-/* Test that calls generated from PC-relative code are annotated with @notoc.
-   At present, -mcpu=power10 does not enable pc-relative mode.  Enable it here
-   explicitly until it is turned on by default.  */
+/* Test that calls generated from PC-relative code are annotated with
+   @notoc.  */
 
 extern int yy0 (int);
 extern void yy1 (int);
@@ -40,4 +39,3 @@ int ww (void)
 /* { dg-final { scan-assembler {yy0@notoc} } } */
 /* { dg-final { scan-assembler {zz1@notoc} } } */
 /* { dg-final { scan-assembler {zz0@notoc} } } */
-
diff --git a/gcc/testsuite/gcc.target/powerpc/pr94740.c 
b/gcc/testsuite/gcc.target/powerpc/pr94740.c
index 9c2b4644701..09decc38c2c 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr94740.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr94740.c
@@ -1,7 +1,7 @@
 /* PR rtl-optimization/94740 */
 /* { dg-do compile } */
 /* { dg-require-effective-target power10_ok } */
-/* { dg-options "-O2 -mdejagnu-cpu=power10 -mpcrel" } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
 
 int array[8];
 int

-- 
Alan Modra
Australia Development Lab, IBM


[RS6000] Unsupported test options for -m32

2020-10-25 Thread Alan Modra via Gcc-patches
FAIL: gcc.target/powerpc/swaps-p8-22.c (test for excess errors)
Excess errors:
cc1: error: '-mcmodel' not supported in this configuration

* gcc.target/powerpc/swaps-p8-22.c: Disable for -m32.

diff --git a/gcc/testsuite/gcc.target/powerpc/swaps-p8-22.c 
b/gcc/testsuite/gcc.target/powerpc/swaps-p8-22.c
index 83f6ab3a1c0..bceada41b75 100644
--- a/gcc/testsuite/gcc.target/powerpc/swaps-p8-22.c
+++ b/gcc/testsuite/gcc.target/powerpc/swaps-p8-22.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-require-effective-target { lp64 && powerpc_p8vector_ok } } */
 /* { dg-options "-O2 -mdejagnu-cpu=power8 -maltivec -mcmodel=large" } */
 
 /* The expansion for vector character multiply introduces a vperm operation.

-- 
Alan Modra
Australia Development Lab, IBM


[RS6000] Tests that use int128_t and -m32

2020-10-25 Thread Alan Modra via Gcc-patches
-git 
a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-longlong.c 
b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-longlong.c
index 7fce6a44d4f..465fbeaf6ab 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-longlong.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-longlong.c
@@ -3,6 +3,7 @@
 
 /* { dg-do compile {target power10_ok} } */
 /* { dg-do run {target power10_hw} } */
+/* { dg-require-effective-target { int128 } } */
 
 /* Deliberately set optization to zero for this test to confirm
the stxvr*x instruction is generated. At higher optimization levels
diff --git 
a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-short.c 
b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-short.c
index 17925c87732..f87256921bf 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-short.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-short.c
@@ -3,6 +3,7 @@
 
 /* { dg-do compile {target power10_ok} } */
 /* { dg-do run {target power10_hw} } */
+/* { dg-require-effective-target { int128 } } */
 
 /* Deliberately set optization to zero for this test to confirm
the stxvr*x instruction is generated. At higher optimization levels
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c
index 5862517eae9..6ac4ed2173f 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c
@@ -1,7 +1,7 @@
 /* { dg-do run { target { power10_hw } } } */
 /* { dg-do link { target { ! power10_hw } } } */
 /* { dg-options "-mcpu=power10 -O2" } */
-/* { dg-require-effective-target power10_ok } */
+/* { dg-require-effective-target { int128 && power10_ok } } */
 
 /* Check that the expected 128-bit instructions are generated if the processor
supports the 128-bit integer instructions. */
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c
index 13b4c8afd4f..05fedf77eb9 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c
@@ -1,7 +1,7 @@
 /* { dg-do run { target { power10_hw } } } */
 /* { dg-do link { target { ! power10_hw } } } */
 /* { dg-options "-mcpu=power10 -O2" } */
-/* { dg-require-effective-target power10_ok } */
+/* { dg-require-effective-target { int128 && power10_ok } } */
 
 /* Check that the expected 128-bit instructions are generated if the processor
supports the 128-bit integer instructions. */
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c
index d58a6b0b682..6e952695905 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c
@@ -1,7 +1,7 @@
 /* { dg-do run { target { power10_hw } } } */
 /* { dg-do link { target { ! power10_hw } } } */
 /* { dg-options "-mcpu=power10 -O2" } */
-/* { dg-require-effective-target power10_ok } */
+/* { dg-require-effective-target { int128 && power10_ok } } */
 
 /* Check that the expected 128-bit instructions are generated if the processor
supports the 128-bit integer instructions. */
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c
index 9147d67c9d1..c2eb53d3bb2 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c
@@ -1,7 +1,7 @@
 /* { dg-do run { target { power10_hw } } } */
 /* { dg-do link { target { ! power10_hw } } } */
 /* { dg-options "-mcpu=power10 -O2" } */
-/* { dg-require-effective-target power10_ok } */
+/* { dg-require-effective-target { int128 && power10_ok } } */
 
 /* Check that the expected 128-bit instructions are generated if the processor
supports the 128-bit integer instructions. */

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] VSX_MM_SUFFIX

2020-10-24 Thread Alan Modra via Gcc-patches
On Sat, Oct 24, 2020 at 12:31:43PM -0500, Segher Boessenkool wrote:
> On Sat, Oct 24, 2020 at 02:59:34PM +1030, Alan Modra wrote:
> > Those instructions aren't generated, we don't see them anywhere on a
> > power10 all-lang bootstrap except in one testcase designed to exercise
> > them.
> 
> But that is completely not relevant :-)
> 
> If you use a macro that doesn't exist, the compiler simply does not
> build!

My empirical evidence to the contrary says your theoretical arguments
are invalid.  :-)

$ gcc/xgcc -Bgcc/ -S 
~/src/gcc/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c -O2 
-mcpu=power10
$ grep VSX_MM vsx_mask-count-runnable.s
vcntmb 9,0,1
vcntmb 9,0,1
vcntmb 9,0,1
vcntmb 9,0,1

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] VSX_MM_SUFFIX

2020-10-23 Thread Alan Modra via Gcc-patches
On Fri, Oct 23, 2020 at 04:06:57PM -0500, Segher Boessenkool wrote:
> On Fri, Oct 23, 2020 at 09:41:30AM +1030, Alan Modra wrote:
> > On Thu, Oct 22, 2020 at 11:03:14AM -0500, Segher Boessenkool wrote:
> > > On Thu, Oct 22, 2020 at 08:58:10AM -0700, Carl Love wrote:
> > > > OK, looks like VSX_MM_SUFFIX doesn't exist anymore.  Don't know what
> > > > happed to it.  Thanks.
> > > 
> > > It never existed on trunk.  Please always regstrap patches on trunk
> > > before committing.  It feels unnecessary at times, but now you know
> > > why :-)
> > 
> > Regression testing wouldn't have caught the problem, unless Carl had
> > done so on power10.
> 
> "regstrap" is "bootstrap + regression check".  It won't bootstrap (won't
> cross build even afaics?)

Those instructions aren't generated, we don't see them anywhere on a
power10 all-lang bootstrap except in one testcase designed to exercise
them.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] altivec style lvx/stvx addresses vs power10

2020-10-23 Thread Alan Modra via Gcc-patches
Hi Segher,

On Fri, Oct 23, 2020 at 11:31:02AM -0500, Segher Boessenkool wrote:
> On Fri, Oct 23, 2020 at 05:15:08PM +1030, Alan Modra wrote:
> > The problem starts with fwprop creating
> > (insn 9 4 0 2 (set (mem:V8HI (and:DI (plus:DI (reg/v/f:DI 121 [ vpp ])
> > (const_int 12 [0xc]))
> > (const_int -16 [0xfff0])) [0 MEM  > short int> [(void *)_4 & -16B]+0 S16 A128])
> > (reg/v:V8HI 120 [ vp1 ])) "pixel.c":6:10 1237 {vsx_movv8hi_64bit}
> > which is finally thrown out as invalid by lra.  lra of course does that
> > by reloading the entire address.
> 
> While it could/should just reload that 12 into a reg :-(  Could you
> investigate doing that?  Is there any reason we should not?

My first thought was to write the necessary reloads too, but 
I think it's better to not generate the invalid addressing in the
first place.  Hiding the +12 in an invalid memory address means it
isn't exposed to optimization passes.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 2/8] [RS6000] rs6000_rtx_costs for AND

2020-10-23 Thread Alan Modra via Gcc-patches
On Fri, Oct 23, 2020 at 07:18:44PM -0400, Hans-Peter Nilsson wrote:
> On Thu, 22 Oct 2020, Alan Modra via Gcc-patches wrote:
> >   /* (reg) is costed at zero by rtlanal.c:rtx_cost.  That sets a
> >  baseline for rtx costs:  If a constant is valid in an insn,
> >  it is free.  */
> 
> >From where is this quote?  My "git grep" fails to find it for me
> (on master).  It seems like a port-specific commment so I'd
> have expected to find it somewhere in config/rs6000.

It's in a patch in the rs6000_rtx_costs series that hasn't yet been
committed.
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555757.html

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] Adjust testcases for power10 instructions

2020-10-23 Thread Alan Modra via Gcc-patches
On Fri, Oct 23, 2020 at 01:22:31PM -0500, Segher Boessenkool wrote:
> On Fri, Oct 23, 2020 at 04:45:29PM +1030, Alan Modra wrote:
> > Revised patch, removing changes to
> > gcc.target/powerpc/fold-vec-st-double.c,
> > gcc.target/powerpc/fold-vec-st-longlong.c,
> > gcc.target/powerpc/fold-vec-st-pixel.c.  Fixing fails on those three
> > tests will be the subject of another patch.
> 
> Okido.
> 
> > Most of these changes are fairly obvious.  Duplicated setbcr in
> > +/* { dg-final { scan-assembler-times {\maddic\M|\msetbcr\M} 4 } } */
> > +/* { dg-final { scan-assembler-times {\msubfe\M|\msetbcr\M} 1 } } */
> > is due to addic;subfe being replaced in one function with setbcr.
> 
> But that won't really work.  If there is more than one addic replaced by
> setbcr, that second scan fails (because it matches at least two times
> then).

Sure it works.  I did test the patch!  If future code gen changes, the
test will need updating at that point.  scan-assembler tests require
maintenance..

> > * gcc.dg/pr56727-2.c,
> ...
> > * gcc.target/powerpc/ppc-eq0-1.c,
> > * gcc.target/powerpc/ppc-ne0-1.c,
> > * gcc.target/powerpc/pr86731-fwrapv-longlong.c: Match power10 insns.
> 
> This should all be behind only one "*" (so delete it on all but the
> first line here).

Huh, ok.

> > --- a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c
> > +++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c
> > @@ -25,6 +25,6 @@ main1 (void)
> > with no word loads (lw, lwu, lwz, lwzu, or their indexed forms)
> > or word stores (stw, stwu, stwx, stwux, or their indexed forms).  */
> >  
> > -/* { dg-final { scan-assembler "\t(lvx|lxv|lvsr|stxv)" } } */
> > +/* { dg-final { scan-assembler "\t(lvx|lxv|lvsr|stxv|plxv|pstxv)" } } */
> 
> /* { dg-final { scan-assembler "\t(lvx|p?lxv|lvsr|p?stxv)" } } */
> might be more readable/maintainable/extensible?

OK.

> > --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-load-builtin_vec_xl-char.c
> > +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-load-builtin_vec_xl-char.c
> > @@ -36,4 +36,4 @@ BUILD_VAR_TEST( test10, vector unsigned char, signed long 
> > long, vector unsigned
> >  BUILD_VAR_TEST( test11, vector unsigned char, signed int, vector unsigned 
> > char);
> >  BUILD_CST_TEST( test12, vector unsigned char, 8, vector unsigned char);
> >  
> > -/* { dg-final { scan-assembler-times 
> > {\mlxvw4x\M|\mlxvd2x\M|\mlxvx\M|\mlvx\M} 12 } } */
> > +/* { dg-final { scan-assembler-times 
> > {\mlxvw4x\M|\mlxvd2x\M|\mlxvx\M|\mlvx\M|\mplxv\M} 12 } } */
> 
> Here, it did not allow lxv before.  Should it?
> 
> (in many files)

We don't generate lxv due to the insn only supporting DQ offsets.  For
example, on power9 test3_cst from this test is
li 9,2
lxvx 34,3,9
blr
On power10
plxv 34,2(3)
blr

> Have you verified the p10 code generation actually makes sense?

Yes, I did.

> > --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-splat-longlong.c
> > +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-splat-longlong.c
> > @@ -25,7 +25,7 @@ vector signed long long test_sll () { const vector signed 
> > long long y = {34, 45}
> >  vector unsigned long long test_ull () { const vector unsigned long long y 
> > = {56, 67}; return vec_splat (y, 0b00010); }
> >  
> >  /* Assorted load instructions for the initialization with known constants. 
> > */
> > -/* { dg-final { scan-assembler-times {\mlvx\M|\mlxvd2x\M|\mlxv\M} 3 } } */
> > +/* { dg-final { scan-assembler-times 
> > {\mlvx\M|\mlxvd2x\M|\mlxv\M|\mplxv\M|\mplxv\M} 3 } } */
> 
> You have plxv twice here.

Oops.

> > --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-store-builtin_vec_xst-char.c
> > +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-store-builtin_vec_xst-char.c
> > @@ -36,4 +36,4 @@ BUILD_VAR_TEST( test10,  vector unsigned char, signed 
> > long long, vector unsigned
> >  BUILD_VAR_TEST( test11,  vector unsigned char, signed int, vector unsigned 
> > char );
> >  BUILD_CST_TEST( test12,  vector unsigned char, 12, vector unsigned char );
> >  
> > -/* { dg-final { scan-assembler-times 
> > {\mstxvw4x\M|\mstxvd2x\M|\mstxvx\M|\mstvx\M} 12 } } */
> > +/* { dg-final { scan-assembler-times 
> > {\mstxvw4x\M|\mstxvd2x\M|\mstxvx\M|\mstvx\M|\mpstxv\M} 12 } } */
> 
> Similarly, should it have plain stxv as well?

No.  Same reason as to why lxv isn't generated.

> > --- a/gcc/testsuite/gcc.target/powerpc/lvsl-lvsr.c
> > +++ b/gcc/testsuite/gcc.target/powerpc/lvsl-lvsr.c
> > @@ -1,12 +1,10 @@
>

[RS6000] Power10 ICE running gcc.target/powerpc/ppc-ne0-1.c

2020-10-23 Thread Alan Modra via Gcc-patches
rs6000_emit_int_cmove generates isel so the condition below needs
fixing for power10.  Bootstrapped and regression tested
powerpc64le-linux power10 and power8.  OK?

* config/rs6000/rs6000.md (cstore4): Don't call
rs6000_emit_int_cmove for power10 when -mno-isel.

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 2bf5df41a9b..cd41b759f71 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -11538,7 +11538,7 @@
   ""
 {
   /* Everything is best done with setbc[r] if available.  */
-  if (TARGET_POWER10)
+  if (TARGET_POWER10 && TARGET_ISEL)
 rs6000_emit_int_cmove (operands[0], operands[1], const1_rtx, const0_rtx);
 
   /* Expanding EQ and NE directly to some machine instructions does not help

-- 
Alan Modra
Australia Development Lab, IBM


[RS6000] altivec style lvx/stvx addresses vs power10

2020-10-23 Thread Alan Modra via Gcc-patches
gcc.target/powerpc/fold-vec-st-pixel.c and other testcases fail on
power10, generating
addi 9,5,12
rldicr 9,9,0,59
stxv 34,0(9)
rather than
addi 5,5,12
stvx 2,0,5
for an altivec lvx/stvx style address.

The problem starts with fwprop creating
(insn 9 4 0 2 (set (mem:V8HI (and:DI (plus:DI (reg/v/f:DI 121 [ vpp ])
(const_int 12 [0xc]))
(const_int -16 [0xfff0])) [0 MEM  [(void *)_4 & -16B]+0 S16 A128])
(reg/v:V8HI 120 [ vp1 ])) "pixel.c":6:10 1237 {vsx_movv8hi_64bit}
which is finally thrown out as invalid by lra.  lra of course does that
by reloading the entire address.

fwprop creates the invalid address due to rs6000_legitimate_address_p
trimming off the outer AND of altivec style addresses before applying
other predicates.  address_is_prefixed then allows the inner address.

Now at the time the AND stripping was added (git commit 850e8d3d56d),
rs6000_legitimate_address looked a lot simpler.  This patch allows
through just those addresses that were legitimate in those simpler
days.  (legitimate_small_data_p, legitimate_constant_pool_address_p,
legitimate_lo_sum_address_p, and legitimate_offset_address_p did get
to look at the inside of an AND address, but I'm fairly certain
small_data, constant_pool and lo_sum addresses would never be wrapped
in an AND, and offset_address_p of that time excluded altivec modes.)

Regstrapped powerpc64le-linux power8 and power10, with tests on
powerpc64-linux -m64/-m32 still running.  OK assuming they pass?

* config/rs6000/rs6000.c (rs6000_legitimate_address_p): Limit
AND addressing to just lvx/stvx style addresses.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 2f0ca7af030..7fd80fb75ff 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -9128,15 +9128,21 @@ rs6000_legitimate_address_p (machine_mode mode, rtx x, 
bool reg_ok_strict)
   bool reg_offset_p = reg_offset_addressing_ok_p (mode);
   bool quad_offset_p = mode_supports_dq_form (mode);
 
-  /* If this is an unaligned stvx/ldvx type address, discard the outer AND.  */
+  if (TARGET_ELF && RS6000_SYMBOL_REF_TLS_P (x))
+return 0;
+
+  /* Handle unaligned altivec lvx/stvx type addresses.  */
   if (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)
   && GET_CODE (x) == AND
   && CONST_INT_P (XEXP (x, 1))
   && INTVAL (XEXP (x, 1)) == -16)
-x = XEXP (x, 0);
+{
+  x = XEXP (x, 0);
+  return (legitimate_indirect_address_p (x, reg_ok_strict)
+ || legitimate_indexed_address_p (x, reg_ok_strict)
+ || virtual_stack_registers_memory_p (x));
+}
 
-  if (TARGET_ELF && RS6000_SYMBOL_REF_TLS_P (x))
-return 0;
   if (legitimate_indirect_address_p (x, reg_ok_strict))
 return 1;
   if (TARGET_UPDATE

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] Adjust testcases for power10 instructions

2020-10-23 Thread Alan Modra via Gcc-patches
On Thu, Oct 22, 2020 at 11:25:09PM +1030, Alan Modra wrote:
> Some of these are wrong, sorry.  I need to go over and check them
> thoroughly.  Please consider the patch withdrawn.

Revised patch, removing changes to
gcc.target/powerpc/fold-vec-st-double.c,
gcc.target/powerpc/fold-vec-st-longlong.c,
gcc.target/powerpc/fold-vec-st-pixel.c.  Fixing fails on those three
tests will be the subject of another patch.

Most of these changes are fairly obvious.  Duplicated setbcr in
+/* { dg-final { scan-assembler-times {\maddic\M|\msetbcr\M} 4 } } */
+/* { dg-final { scan-assembler-times {\msubfe\M|\msetbcr\M} 1 } } */
is due to addic;subfe being replaced in one function with setbcr.

* gcc.dg/pr56727-2.c,
* gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c,
* gcc.target/powerpc/fold-vec-load-builtin_vec_xl-char.c,
* gcc.target/powerpc/fold-vec-load-builtin_vec_xl-double.c,
* gcc.target/powerpc/fold-vec-load-builtin_vec_xl-float.c,
* gcc.target/powerpc/fold-vec-load-builtin_vec_xl-int.c,
* gcc.target/powerpc/fold-vec-load-builtin_vec_xl-longlong.c,
* gcc.target/powerpc/fold-vec-load-builtin_vec_xl-short.c,
* gcc.target/powerpc/fold-vec-load-vec_vsx_ld-char.c,
* gcc.target/powerpc/fold-vec-load-vec_vsx_ld-double.c,
* gcc.target/powerpc/fold-vec-load-vec_vsx_ld-float.c,
* gcc.target/powerpc/fold-vec-load-vec_vsx_ld-int.c,
* gcc.target/powerpc/fold-vec-load-vec_vsx_ld-longlong.c,
* gcc.target/powerpc/fold-vec-load-vec_vsx_ld-short.c,
* gcc.target/powerpc/fold-vec-load-vec_xl-char.c,
* gcc.target/powerpc/fold-vec-load-vec_xl-double.c,
* gcc.target/powerpc/fold-vec-load-vec_xl-float.c,
* gcc.target/powerpc/fold-vec-load-vec_xl-int.c,
* gcc.target/powerpc/fold-vec-load-vec_xl-longlong.c,
* gcc.target/powerpc/fold-vec-load-vec_xl-short.c,
* gcc.target/powerpc/fold-vec-splat-floatdouble.c,
* gcc.target/powerpc/fold-vec-splat-longlong.c,
* gcc.target/powerpc/fold-vec-store-builtin_vec_xst-char.c,
* gcc.target/powerpc/fold-vec-store-builtin_vec_xst-double.c,
* gcc.target/powerpc/fold-vec-store-builtin_vec_xst-float.c,
* gcc.target/powerpc/fold-vec-store-builtin_vec_xst-int.c,
* gcc.target/powerpc/fold-vec-store-builtin_vec_xst-short.c,
* gcc.target/powerpc/fold-vec-store-vec_vsx_st-char.c,
* gcc.target/powerpc/fold-vec-store-vec_vsx_st-double.c,
* gcc.target/powerpc/fold-vec-store-vec_vsx_st-float.c,
* gcc.target/powerpc/fold-vec-store-vec_vsx_st-int.c,
* gcc.target/powerpc/fold-vec-store-vec_vsx_st-longlong.c,
* gcc.target/powerpc/fold-vec-store-vec_vsx_st-short.c,
* gcc.target/powerpc/fold-vec-store-vec_xst-char.c,
* gcc.target/powerpc/fold-vec-store-vec_xst-double.c,
* gcc.target/powerpc/fold-vec-store-vec_xst-float.c,
* gcc.target/powerpc/fold-vec-store-vec_xst-int.c,
* gcc.target/powerpc/fold-vec-store-vec_xst-longlong.c,
* gcc.target/powerpc/fold-vec-store-vec_xst-short.c,
* gcc.target/powerpc/lvsl-lvsr.c,
* gcc.target/powerpc/ppc-eq0-1.c,
* gcc.target/powerpc/ppc-ne0-1.c,
* gcc.target/powerpc/pr86731-fwrapv-longlong.c: Match power10 insns.
* gcc.target/powerpc/lvsl-lvsr.c: Avoid file name match.

diff --git a/gcc/testsuite/gcc.dg/pr56727-2.c b/gcc/testsuite/gcc.dg/pr56727-2.c
index c54369ed25e..f055116772a 100644
--- a/gcc/testsuite/gcc.dg/pr56727-2.c
+++ b/gcc/testsuite/gcc.dg/pr56727-2.c
@@ -18,4 +18,4 @@ void h ()
 
 /* { dg-final { scan-assembler "@(PLT|plt)" { target i?86-*-* x86_64-*-* } } } 
*/
 /* { dg-final { scan-assembler "@(PLT|plt)" { target { powerpc*-*-linux* && 
ilp32 } } } } */
-/* { dg-final { scan-assembler "bl f\n\\s*nop" { target { powerpc*-*-linux* && 
lp64 } } } } */
+/* { dg-final { scan-assembler {bl f(\n\s*nop|@notoc\n)} { target { 
powerpc*-*-linux* && lp64 } } } } */
diff --git 
a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c 
b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c
index 246f38fa6d1..d9f173b521e 100644
--- a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c
@@ -25,6 +25,6 @@ main1 (void)
with no word loads (lw, lwu, lwz, lwzu, or their indexed forms)
or word stores (stw, stwu, stwx, stwux, or their indexed forms).  */
 
-/* { dg-final { scan-assembler "\t(lvx|lxv|lvsr|stxv)" } } */
+/* { dg-final { scan-assembler "\t(lvx|lxv|lvsr|stxv|plxv|pstxv)" } } */
 /* { dg-final { scan-assembler-not "\tlwz?u?x? " { xfail { powerpc-ibm-aix* } 
} } } */
 /* { dg-final { scan-assembler-not "\tstwu?x? " } } */
diff --git 
a/gcc/testsuite/gcc.target/powerpc/fold-vec-load-builtin_vec_xl-char.c 
b

Re: [RS6000] Power10 vec-splati-runnable multiple test failures

2020-10-22 Thread Alan Modra via Gcc-patches
On Thu, Oct 22, 2020 at 09:08:58AM -0700, Carl Love wrote:
> I see the error print statement you changed so that it would not wrap. 
> I have always been told it is best not to break the print statement
> across two lines.  The argument is it makes it harder to find it in the
> code when using grep.  In this case, it should be clear what file the
> error statement is in.  What is your take in general about breaking or
> not breaking the body of an error print statement across lines?

Submit to https://gcc.gnu.org/codingconventions.html which says

"Formatting Conventions
 Line Length

 Lines shall be at most 80 columns."

Seriously, it doesn't matter what you or I think about the formatting
rules, they are in place for good reason.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] VSX_MM_SUFFIX

2020-10-22 Thread Alan Modra via Gcc-patches
On Thu, Oct 22, 2020 at 11:03:14AM -0500, Segher Boessenkool wrote:
> On Thu, Oct 22, 2020 at 08:58:10AM -0700, Carl Love wrote:
> > OK, looks like VSX_MM_SUFFIX doesn't exist anymore.  Don't know what
> > happed to it.  Thanks.
> 
> It never existed on trunk.  Please always regstrap patches on trunk
> before committing.  It feels unnecessary at times, but now you know
> why :-)

Regression testing wouldn't have caught the problem, unless Carl had
done so on power10.  "dg_do compile" tests don't run the assembler,
and "dg_do run" tests with "dg_require_effective_target power10_hw" do
nothing at all if not on power10 hardware.  The single test that might
have caught the problem, vec-splati-runnable.c wasn't even compiled.
That's why I submitted
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/556757.html

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] Adjust testcases for power10 instructions

2020-10-22 Thread Alan Modra via Gcc-patches
On Thu, Oct 22, 2020 at 05:33:46PM +1030, Alan Modra wrote:
>   * gcc.dg/pr56727-2.c,
>   * gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c,
>   * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-char.c,
>   * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-double.c,
>   * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-float.c,
>   * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-int.c,
>   * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-longlong.c,
>   * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-short.c,
>   * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-char.c,
>   * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-double.c,
>   * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-float.c,
>   * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-int.c,
>   * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-longlong.c,
>   * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-short.c,
>   * gcc.target/powerpc/fold-vec-load-vec_xl-char.c,
>   * gcc.target/powerpc/fold-vec-load-vec_xl-double.c,
>   * gcc.target/powerpc/fold-vec-load-vec_xl-float.c,
>   * gcc.target/powerpc/fold-vec-load-vec_xl-int.c,
>   * gcc.target/powerpc/fold-vec-load-vec_xl-longlong.c,
>   * gcc.target/powerpc/fold-vec-load-vec_xl-short.c,
>   * gcc.target/powerpc/fold-vec-splat-floatdouble.c,
>   * gcc.target/powerpc/fold-vec-splat-longlong.c,
>   * gcc.target/powerpc/fold-vec-st-double.c,
>   * gcc.target/powerpc/fold-vec-st-longlong.c,
>   * gcc.target/powerpc/fold-vec-st-pixel.c,
>   * gcc.target/powerpc/fold-vec-store-builtin_vec_xst-char.c,
>   * gcc.target/powerpc/fold-vec-store-builtin_vec_xst-double.c,
>   * gcc.target/powerpc/fold-vec-store-builtin_vec_xst-float.c,
>   * gcc.target/powerpc/fold-vec-store-builtin_vec_xst-int.c,
>   * gcc.target/powerpc/fold-vec-store-builtin_vec_xst-short.c,
>   * gcc.target/powerpc/fold-vec-store-vec_vsx_st-char.c,
>   * gcc.target/powerpc/fold-vec-store-vec_vsx_st-double.c,
>   * gcc.target/powerpc/fold-vec-store-vec_vsx_st-float.c,
>   * gcc.target/powerpc/fold-vec-store-vec_vsx_st-int.c,
>   * gcc.target/powerpc/fold-vec-store-vec_vsx_st-longlong.c,
>   * gcc.target/powerpc/fold-vec-store-vec_vsx_st-short.c,
>   * gcc.target/powerpc/fold-vec-store-vec_xst-char.c,
>   * gcc.target/powerpc/fold-vec-store-vec_xst-double.c,
>   * gcc.target/powerpc/fold-vec-store-vec_xst-float.c,
>   * gcc.target/powerpc/fold-vec-store-vec_xst-int.c,
>   * gcc.target/powerpc/fold-vec-store-vec_xst-longlong.c,
>   * gcc.target/powerpc/fold-vec-store-vec_xst-short.c,
>   * gcc.target/powerpc/lvsl-lvsr.c,
>   * gcc.target/powerpc/ppc-eq0-1.c,
>   * gcc.target/powerpc/ppc-ne0-1.c,
>   * gcc.target/powerpc/pr86731-fwrapv-longlong.c: Match power10 insns.
>   * gcc.target/powerpc/lvsl-lvsr.c: Avoid file name match.
> 
> Regstrapped powerpc64le-linux power10 and power8.  OK?

Some of these are wrong, sorry.  I need to go over and check them
thoroughly.  Please consider the patch withdrawn.

> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-st-pixel.c 
> b/gcc/testsuite/gcc.target/powerpc/fold-vec-st-pixel.c
> index 5b95cc73d8d..0a3aaec6d8d 100644
> --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-st-pixel.c
> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-st-pixel.c
> @@ -19,4 +19,4 @@ testst_cst1 (vector pixel vp1, int i1, vector pixel * vpp)
>   return vec_st(vp1, 12, vpp);
>  }
>  
> -/* { dg-final { scan-assembler-times {\mstvx\M}  2 } } */
> +/* { dg-final { scan-assembler-times {\mstvx\M|\mpstxv\M}  2 } } */

For example, this one.  We don't get two stvx insns here on power10,
as we should, but we do need altivec style addressing (ie. & -16).  So
pstxv should not be a pass.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] calls.c:can_implement_as_sibling_call_p REG_PARM_STACK_SPACE check

2020-10-22 Thread Alan Modra via Gcc-patches
On Mon, Oct 12, 2020 at 10:37:05PM +1030, Alan Modra wrote:
> Ping?
> 
> On Fri, Oct 02, 2020 at 05:03:50PM +0930, Alan Modra wrote:
> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555309.html

Ping^2

-- 
Alan Modra
Australia Development Lab, IBM


[RS6000] Adjust testcases for power10 instructions

2020-10-22 Thread Alan Modra via Gcc-patches
ian.  */
 
 /* { dg-do compile { target { powerpc64le-*-* } } } */
 /* { dg-options "-O0 -Wno-deprecated" } */
-/* { dg-final { scan-assembler-times "lvsl" 2 } } */
-/* { dg-final { scan-assembler-times "lvsr" 2 } } */
-/* { dg-final { scan-assembler-times {\mlxvd2x\M|\mlxv\M} 2 } } */
+/* { dg-final { scan-assembler-times {\slvsl\s} 1 } } */
+/* { dg-final { scan-assembler-times {\slvsr\s} 1 } } */
+/* { dg-final { scan-assembler-times {\mlxvd2x\M|\mlxv\M|\mplxv\M} 2 } } */
 /* { dg-final { scan-assembler-times {\m(?:v|xx)permr?\M} 2 } } */
 
 
diff --git a/gcc/testsuite/gcc.target/powerpc/ppc-eq0-1.c 
b/gcc/testsuite/gcc.target/powerpc/ppc-eq0-1.c
index 496a6e340c0..63799a6c8a8 100644
--- a/gcc/testsuite/gcc.target/powerpc/ppc-eq0-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/ppc-eq0-1.c
@@ -7,4 +7,4 @@ int foo(int x)
   return x == 0;
 }
 
-/* { dg-final { scan-assembler "cntlzw|isel" } } */
+/* { dg-final { scan-assembler {\mcntlzw\M|\misel\M|\msetbc\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/ppc-ne0-1.c 
b/gcc/testsuite/gcc.target/powerpc/ppc-ne0-1.c
index 63c4b6087df..31627b192ab 100644
--- a/gcc/testsuite/gcc.target/powerpc/ppc-ne0-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/ppc-ne0-1.c
@@ -2,9 +2,9 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -mno-isel" } */
 
-/* { dg-final { scan-assembler-times "addic" 4 } } */
-/* { dg-final { scan-assembler-times "subfe" 1 } } */
-/* { dg-final { scan-assembler-times "addze" 3 } } */
+/* { dg-final { scan-assembler-times {\maddic\M|\msetbcr\M} 4 } } */
+/* { dg-final { scan-assembler-times {\msubfe\M|\msetbcr\M} 1 } } */
+/* { dg-final { scan-assembler-times {\maddze\M} 3 } } */
 
 long ne0(long a)
 {
diff --git a/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c 
b/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c
index 1269fe635c6..b2cbc420922 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c
@@ -30,5 +30,5 @@ vector signed long long splats4(void)
 
 /* { dg-final { scan-assembler-times {\mvspltis[bhw]\M} 0 } } */
 /* { dg-final { scan-assembler-times {\mvsl[bhwd]\M} 0 } } */
-/* { dg-final { scan-assembler-times {\mlvx\M|\mlxv\M|\mlxvd2x\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mlvx\M|\mlxv\M|\mlxvd2x\M|\mplxv\M} 2 } 
} */
 

-- 
Alan Modra
Australia Development Lab, IBM


[RS6000] Link power10 testcases

2020-10-22 Thread Alan Modra via Gcc-patches
-git a/gcc/testsuite/gcc.target/powerpc/vec-ternarylogic-7.c 
b/gcc/testsuite/gcc.target/powerpc/vec-ternarylogic-7.c
index b754195c2dc..45936f36869 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-ternarylogic-7.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-ternarylogic-7.c
@@ -1,5 +1,6 @@
-/* { dg-do run } */
-/* { dg-require-effective-target power10_hw } */
+/* { dg-do run { target { power10_hw } } } */
+/* { dg-do link { target { ! power10_hw } } } */
+/* { dg-require-effective-target power10_ok } */
 /* { dg-options "-mdejagnu-cpu=power10" } */
 
 #include 
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-ternarylogic-9.c 
b/gcc/testsuite/gcc.target/powerpc/vec-ternarylogic-9.c
index 0d9998e6341..71e52ee880b 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-ternarylogic-9.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-ternarylogic-9.c
@@ -1,5 +1,6 @@
-/* { dg-do run } */
-/* { dg-require-effective-target power10_hw } */
+/* { dg-do run { target { power10_hw } } } */
+/* { dg-do link { target { ! power10_hw } } } */
+/* { dg-require-effective-target power10_ok } */
 /* { dg-require-effective-target int128 } */
 /* { dg-options "-mdejagnu-cpu=power10" } */
 
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c
index f1e3860ee43..5862517eae9 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c
@@ -1,6 +1,7 @@
-/* { dg-do run } */
+/* { dg-do run { target { power10_hw } } } */
+/* { dg-do link { target { ! power10_hw } } } */
 /* { dg-options "-mcpu=power10 -O2" } */
-/* { dg-require-effective-target power10_hw } */
+/* { dg-require-effective-target power10_ok } */
 
 /* Check that the expected 128-bit instructions are generated if the processor
supports the 128-bit integer instructions. */
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c
index 0c5695e4807..13b4c8afd4f 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c
@@ -1,6 +1,7 @@
-/* { dg-do run } */
+/* { dg-do run { target { power10_hw } } } */
+/* { dg-do link { target { ! power10_hw } } } */
 /* { dg-options "-mcpu=power10 -O2" } */
-/* { dg-require-effective-target power10_hw } */
+/* { dg-require-effective-target power10_ok } */
 
 /* Check that the expected 128-bit instructions are generated if the processor
supports the 128-bit integer instructions. */
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c
index 93c3c720246..d58a6b0b682 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c
@@ -1,6 +1,7 @@
-/* { dg-do run } */
+/* { dg-do run { target { power10_hw } } } */
+/* { dg-do link { target { ! power10_hw } } } */
 /* { dg-options "-mcpu=power10 -O2" } */
-/* { dg-require-effective-target power10_hw } */
+/* { dg-require-effective-target power10_ok } */
 
 /* Check that the expected 128-bit instructions are generated if the processor
supports the 128-bit integer instructions. */
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c
index 41dee583e59..9147d67c9d1 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c
@@ -1,6 +1,7 @@
-/* { dg-do run } */
+/* { dg-do run { target { power10_hw } } } */
+/* { dg-do link { target { ! power10_hw } } } */
 /* { dg-options "-mcpu=power10 -O2" } */
-/* { dg-require-effective-target power10_hw } */
+/* { dg-require-effective-target power10_ok } */
 
 /* Check that the expected 128-bit instructions are generated if the processor
supports the 128-bit integer instructions. */
diff --git a/gcc/testsuite/gcc.target/powerpc/xxgenpc-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/xxgenpc-runnable.c
index 244c57365d4..d4040ea8b70 100644
--- a/gcc/testsuite/gcc.target/powerpc/xxgenpc-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/xxgenpc-runnable.c
@@ -1,6 +1,7 @@
-/* { dg-do run } */
+/* { dg-do run { target { power10_hw } } } */
+/* { dg-do link { target { ! power10_hw } } } */
 /* { dg-options "-mdejagnu-cpu=power10 -O2" } */
-/* { dg-require-effective-target power10_hw } */
+/* { dg-require-effective-target power10_ok } */
 
 #include 
 

-- 
Alan Modra
Australia Development Lab, IBM


[RS6000] dimode_off.c test

2020-10-22 Thread Alan Modra via Gcc-patches
This tests behaviour near the limit of 16-bit signed offsets.  If
power10 prefix instructions are enabled, no such testing occurs.

* gcc.target/powerpc/dimode_off.c: Add -mno-prefixed to options.

Regstrapped powerpc64le-linux power10 and power8.  OK?

diff --git a/gcc/testsuite/gcc.target/powerpc/dimode_off.c 
b/gcc/testsuite/gcc.target/powerpc/dimode_off.c
index 19ca40c508b..12718eafdd3 100644
--- a/gcc/testsuite/gcc.target/powerpc/dimode_off.c
+++ b/gcc/testsuite/gcc.target/powerpc/dimode_off.c
@@ -1,5 +1,5 @@
 /* { dg-do assemble } */
-/* { dg-options "-O2 -fno-align-functions -fno-asynchronous-unwind-tables 
-mtraceback=no -save-temps" } */
+/* { dg-options "-O2 -fno-align-functions -fno-asynchronous-unwind-tables 
-mtraceback=no -mno-prefixed -save-temps" } */
 
 void w1 (void *x, long long y) { *(long long *) (x + 32767) = y; }
 void w2 (void *x, long long y) { *(long long *) (x + 32766) = y; }

-- 
Alan Modra
Australia Development Lab, IBM


[RS6000] Non-pcrel tests when power10

2020-10-22 Thread Alan Modra via Gcc-patches
These tests require -mno-pcrel because they are testing features
of the non-pcrel ABI.

* gcc.target/powerpc/cprophard.c: Add -mno-pcrel to options.
* gcc.target/powerpc/float128-hw3.c: Likewise.
* gcc.target/powerpc/pr79439-1.c: Likewise.
* gcc.target/powerpc/pr79439-2.c: Likewise.
* gcc.target/powerpc/r2_shrink-wrap.c: Likewise.

Regstrapped powerpc64le-linux power10 and power8.  OK?

diff --git a/gcc/testsuite/gcc.target/powerpc/cprophard.c 
b/gcc/testsuite/gcc.target/powerpc/cprophard.c
index f93081f0cb5..3c548495192 100644
--- a/gcc/testsuite/gcc.target/powerpc/cprophard.c
+++ b/gcc/testsuite/gcc.target/powerpc/cprophard.c
@@ -1,6 +1,6 @@
 /* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
 /* { dg-skip-if "" { powerpc*-*-darwin* } } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -mno-pcrel" } */
 /* { dg-final { scan-assembler {ld 2,(24|40)\(1\)} } } */
 
 /* From a linux kernel mis-compile of net/core/skbuff.c.  */
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-hw3.c 
b/gcc/testsuite/gcc.target/powerpc/float128-hw3.c
index b3bbeb25678..630c93dfcb7 100644
--- a/gcc/testsuite/gcc.target/powerpc/float128-hw3.c
+++ b/gcc/testsuite/gcc.target/powerpc/float128-hw3.c
@@ -1,7 +1,7 @@
 /* { dg-do compile { target lp64 } } */
 /* { dg-require-effective-target powerpc_p9vector_ok } */
 /* { dg-require-effective-target float128 } */
-/* { dg-options "-mpower9-vector -O2 -ffast-math -std=c11" } */
+/* { dg-options "-mpower9-vector -O2 -ffast-math -std=c11 -mno-pcrel" } */
 
 /* Test to make sure the compiler calls the external function instead of doing
the built-in processing for _Float128 functions that have hardware support
diff --git a/gcc/testsuite/gcc.target/powerpc/pr79439-1.c 
b/gcc/testsuite/gcc.target/powerpc/pr79439-1.c
index 539c96f571e..8eb08a4e762 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr79439-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr79439-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target { powerpc*-*-linux* && lp64 } } } */
-/* { dg-options "-O2 -fpic -fno-reorder-blocks -fno-inline-functions" } */
+/* { dg-options "-O2 -fpic -fno-reorder-blocks -fno-inline-functions 
-mno-pcrel" } */
 
 /* On the Linux 64-bit ABIs, we eliminate NOP in the 'rec' call even if
-fpic is used.  The recursive call should call the local alias.  The
diff --git a/gcc/testsuite/gcc.target/powerpc/pr79439-2.c 
b/gcc/testsuite/gcc.target/powerpc/pr79439-2.c
index b53af445265..9ebcf2579ab 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr79439-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr79439-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target { powerpc*-*-linux* && lp64 } } } */
-/* { dg-options "-O2 -fpic -fno-reorder-blocks" } */
+/* { dg-options "-O2 -fpic -fno-reorder-blocks -mno-pcrel" } */
 
 /* On the Linux 64-bit ABIs, we should not eliminate NOP in the 'rec' call if
-fpic is used because rec can be interposed at link time (since it has an
diff --git a/gcc/testsuite/gcc.target/powerpc/r2_shrink-wrap.c 
b/gcc/testsuite/gcc.target/powerpc/r2_shrink-wrap.c
index b81b9b1c2ee..a74da38740b 100644
--- a/gcc/testsuite/gcc.target/powerpc/r2_shrink-wrap.c
+++ b/gcc/testsuite/gcc.target/powerpc/r2_shrink-wrap.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target lp64 } } */
-/* { dg-options "-O2 -fdump-rtl-pro_and_epilogue" } */
+/* { dg-options "-O2 -fdump-rtl-pro_and_epilogue -mno-pcrel" } */
 
 /* Verify we move the prologue past the TOC reference of 'j' and shrink-wrap
the function. */

-- 
Alan Modra
Australia Development Lab, IBM


[RS6000] Power10 vec-splati-runnable multiple test failures

2020-10-22 Thread Alan Modra via Gcc-patches
FAIL: gcc.target/powerpc/vec-splati-runnable.c 1 blank line(s) in output
FAIL: gcc.target/powerpc/vec-splati-runnable.c (test for excess errors)
Excess errors:
rs6000_emit_xxspltidp_v2df called ...

and running the test fails.  As the comment says
  /* Although the instruction says the results are not defined, it does seem
 to work, at least on Mambo.  But no guarentees!  */
So the simulator works but not real hardware.

Regstrapped powerpc64le-linux on power10 and power8.  OK?

gcc/
* config/rs6000/rs6000.c (rs6000_emit_xxspltidp_v2df): Delete
debug printf.  Remove trailing ".\n" from inform message.
Break long line.
gcc/testsuite/
* gcc.target/powerpc/vec-splati-runnable.c: Don't abort on
undefined output.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index c0adc56387f..2f0ca7af030 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -27485,11 +27485,10 @@ rs6000_const_f32_to_i32 (rtx operand)
 void
 rs6000_emit_xxspltidp_v2df (rtx dst, long value)
 {
-  printf("rs6000_emit_xxspltidp_v2df called %ld\n", value);
-  printf("rs6000_emit_xxspltidp_v2df called 0x%lx\n", value);
   if (((value & 0x7F80) == 0) && ((value & 0x7F) != 0))
 inform (input_location,
-   "the result for the xxspltidp instruction is undefined for 
subnormal input values.\n");
+   "the result for the xxspltidp instruction "
+   "is undefined for subnormal input values");
   emit_insn( gen_xxspltidp_v2df_inst (dst, GEN_INT (value)));
 }
 
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
index afb0bfdef3a..e5a4935644f 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c
@@ -5,7 +5,7 @@
 
 #define DEBUG 0
 
-#ifdef DEBUG
+#if DEBUG
 #include 
 #endif
 
@@ -100,7 +100,7 @@ main (int argc, char *argv [])
   printf(" vresult_d[%i] = %e, expected_vresult_d[%i] = %e\n",
 i, vresult_d[i], i, expected_vresult_d[i]);
 #else
-abort();
+;
 #endif
   }
 

-- 
Alan Modra
Australia Development Lab, IBM


[RS6000] VSX_MM_SUFFIX

2020-10-22 Thread Alan Modra via Gcc-patches
gcc.target/powerpc/vsx_mask-count-runnable.c and others
Assembler messages:
Error: unrecognized opcode: `vcntmb'

I'm applying this one as obvious.  Ref
https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549757.html

* config/rs6000/vsx.md (vec_cntmb_, vec_extract_),
(vec_expand_): Replace  with .

diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index c023bc0baaa..d96269367bf 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -6035,7 +6035,7 @@
 (match_operand:QI 2 "const_0_to_1_operand" "n")]
 UNSPEC_VCNTMB))]
   "TARGET_POWER10"
-  "vcntmb %0,%1,%2"
+  "vcntmb %0,%1,%2"
   [(set_attr "type" "vecsimple")])
 
 (define_insn "vec_extract_"
@@ -6043,7 +6043,7 @@
(unspec:SI [(match_operand:VSX_MM 1 "altivec_register_operand" "v")]
UNSPEC_VEXTRACT))]
   "TARGET_POWER10"
-  "vextractm %0,%1"
+  "vextractm %0,%1"
   [(set_attr "type" "vecsimple")])
 
 (define_insn "vec_expand_"
@@ -6051,5 +6051,5 @@
 (unspec:VSX_MM [(match_operand:VSX_MM 1 "vsx_register_operand" "v")]
 UNSPEC_VEXPAND))]
   "TARGET_POWER10"
-  "vexpandm %0,%1"
+  "vexpandm %0,%1"
   [(set_attr "type" "vecsimple")])

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 2/8] [RS6000] rs6000_rtx_costs for AND

2020-10-21 Thread Alan Modra via Gcc-patches
On Wed, Oct 21, 2020 at 03:29:11PM -0500, Segher Boessenkool wrote:
> Anyway:
> 
> +  || (outer_code == AND
> +  && rs6000_is_valid_2insn_and (x, mode)))
> {
>   *total = COSTS_N_INSNS (1);
>   return true;
> 
> It should return COSTS_N_INSNS (2) for that?

No, it should not!

  /* (reg) is costed at zero by rtlanal.c:rtx_cost.  That sets a
 baseline for rtx costs:  If a constant is valid in an insn,
 it is free.  */

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 2/8] [RS6000] rs6000_rtx_costs for AND

2020-10-20 Thread Alan Modra via Gcc-patches
On Tue, Oct 20, 2020 at 01:55:56PM -0500, Segher Boessenkool wrote:
> On Thu, Oct 08, 2020 at 09:27:54AM +1030, Alan Modra wrote:
> > The existing "case AND" in this function is not sufficient for
> > optabs.c:avoid_expensive_constant usage, where the AND is passed in
> > outer_code.  We'd like to cost AND of rs6000_is_valid_and_mask
> > or rs6000_is_valid_2insn_and variety there, so that those masks aren't
> > seen as expensive (ie. better to load to a reg then AND).
> > 
> > * config/rs6000/rs6000.c (rs6000_rtx_costs): Combine CONST_INT
> > AND handling with IOR/XOR.  Move costing for AND with
> > rs6000_is_valid_and_mask or rs6000_is_valid_2insn_and to
> > CONST_INT.
> 
> Sorry this took so long to review :-(
> 
> On 64-bit BE this leads to *bigger* code, and closer observation shows
> that some common sequences degrade on all configs.  This seems to mostly
> be about "andc" (and its dot form).  It wasn't costed properly before,
> but after your patch, a single instruction is replaced by three.
> 
> Could you look into this?

~/build/gcc-alan/gcc$ for z in *.o; do if test `objdump -dr $z | grep andc | wc 
-l` != `objdump -dr ../../gcc/gcc/$z | grep andc | wc -l`; then echo $z; fi; 
done
gimplify.o
insn-emit.o
insn-opinit.o
insn-recog.o
rs6000-string.o

All of these are exactly the case I talked about in
https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553919.html

"Sometimes correct insn cost leads to unexpected results.  For
example:

extern unsigned bar (void);
unsigned
f1 (unsigned a)
{
  if ((a & 0x01000200) == 0x01000200)
return bar ();
  return 0;
}

emits for a & 0x01000200
 (set (reg) (and (reg) (const_int 0x01000200)))
at expand time (two rlwinm insns) rather than the older
 (set (reg) (const_int 0x01000200))
 (set (reg) (and (reg) (reg)))
which is three insns.  However, since 0x01000200 is needed later the
older code after optimisation is smaller."

Things have changed slightly since I wrote the above, with the two
rlwinm insns being emitted at expand time, so you see
 (set (reg) (and (reg) (const_int 0xff0003ff)))
 (set (reg) (and (reg) (const_int 0x01fffe00)))
but of course that doesn't change anything regarding the cost of
"a & 0x01000200".

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] calls.c:can_implement_as_sibling_call_p REG_PARM_STACK_SPACE check

2020-10-12 Thread Alan Modra via Gcc-patches
Ping?

On Fri, Oct 02, 2020 at 05:03:50PM +0930, Alan Modra wrote:
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555309.html

-- 
Alan Modra
Australia Development Lab, IBM


[PATCH 6/8] [RS6000] rs6000_rtx_costs multi-insn constants

2020-10-07 Thread Alan Modra via Gcc-patches
This small patch to rs6000_rtx_costs considerably improves code
generated for large constants in 64-bit code, teaching gcc that it is
better to load a constant from memory than to generate a sequence of
up to five dependent instructions.  Note that the rs6000 backend does
generate large constants as loads from memory at expand time but
optimisation passes replace them with SETs of the value due to not
having correct costs.

PR 94393
* config/rs6000/rs6000.c (rs6000_rtx_costs): Cost multi-insn
constants.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 15a806fe307..76aedbfae6f 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -21343,7 +21343,9 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int 
outer_code,
 
   switch (code)
 {
-  /* On the RS/6000, if it is valid in the insn, it is free.  */
+  /* (reg) is costed at zero by rtlanal.c:rtx_cost.  That sets a
+baseline for rtx costs:  If a constant is valid in an insn,
+it is free.  */
 case CONST_INT:
   if (((outer_code == SET
|| outer_code == PLUS
@@ -21404,6 +21406,17 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int 
outer_code,
 
 case CONST_DOUBLE:
 case CONST_WIDE_INT:
+  /* Subtract one insn here for consistency with the above code
+that returns one less than the actual number of insns for
+SETs.  Don't subtract one for other than SETs, because other
+operations will require the constant to be loaded to a
+register before performing the operation.  All special cases
+for codes other than SET must be handled before we get
+here.  */
+  *total = COSTS_N_INSNS (num_insns_constant (x, mode)
+ - (outer_code == SET ? 1 : 0));
+  return true;
+
 case CONST:
 case HIGH:
 case SYMBOL_REF:


  1   2   3   4   5   6   7   8   9   10   >