Re: [PATCH 0/2] mmap: Avoid the sanitizer configure check failure
On Tue, Apr 09, 2024 at 07:24:33AM -0700, H.J. Lu wrote: > Define GCC_AC_FUNC_MMAP with export ASAN_OPTIONS=detect_leaks=0 to avoid > the sanitizer configure check failure. OK for binutils. (I just fixed my local copy of autoconf so I wouldn't run into this again.) The proper fix of course is to update autotools to something more recent. -- Alan Modra Australia Development Lab, IBM
Re: CREL relocation format for ELF (was: RELLEB)
On Fri, Mar 22, 2024 at 06:51:41PM -0700, Fangrui Song wrote: > On Thu, Mar 14, 2024 at 5:16 PM Fangrui Song wrote: > > I propose RELLEB, a new format offering significant file size > > reductions: 17.2% (x86-64), 16.5% (aarch64), and even 32.4% (riscv64)! > > > > Your thoughts on RELLEB are welcome! Does anyone really care about relocatable object file size? If they do, wouldn't they be better off using a compressed file system? -- Alan Modra Australia Development Lab, IBM
Re: [RFC,patch] Linker plugin - extend API for offloading corner case (aka: LDPT_REGISTER_CLAIM_FILE_HOOK_V2 linker plugin hook [GCC PR109128])
On Thu, May 04, 2023 at 11:02:25AM +, Richard Biener via Binutils wrote: > So since we expect the linker to use the host side table is there a way > for the plugin to exactly query that (the set of symbols the linker > uses from the object passed to the plugin)? That would be possible and relatively easy to implement, but might be slow. > Because if the linker > uses something from the file but _not_ the host side offload table > (-ffunction-sections -fdata-sections) then things would still go > wrong, right? > Is there a way to connect both in a way that the linker discards > either if the other isn't present? No, or at least I do not want to even think about implementing such a linker "feature". The problem is that after you have modified the global linker symbol table after adding an object's symbols, it is virtually impossible to restore the state of symbols to what they would be without that object. (Yes, we do that sort of thing for as-needed shared libraries, but the restoration happens immediately after adding the symbols. I also regret implementing it the way I did.) The patch posted is OK from the linker side of things. -- Alan Modra Australia Development Lab, IBM
Re: [RFC][top-level] Add configure test-case
On Mon, Nov 07, 2022 at 06:23:45PM +, Joseph Myers wrote: > On Mon, 7 Nov 2022, Alan Modra via Binutils wrote: > > > a) that top-level binutils/gdb patches don't get applied to the gcc > >git repository in a timely manner, or > > If a toplevel patch is approved for either repository, I think you should > treat it as approved for the other one without needing separate review. Thanks Joseph, that's how I see it too. Of course with the understanding that binutils-gdb can't be used as a back door way of sneaking in a gcc-specific change. Can I get agreement among the gcc build maintainers that such a policy is acceptable? -- Alan Modra Australia Development Lab, IBM
Correct spelling of DW_AT_namelist_item
This typo was fixed a little while ago in binutils-gdb with commit e951225303. I noticed the difference today when importing libiberty from gcc. Committed as obvious. include/ * dwarf2.def: Correct spelling of DW_AT_namelist_item. gcc/ * dwarf2out.cc (gen_namelist_decl): Adjust to suit correct spelling of DW_AT_namelist_item. diff --git a/gcc/dwarf2out.cc b/gcc/dwarf2out.cc index fccf59e8ec3..29f32ec6939 100644 --- a/gcc/dwarf2out.cc +++ b/gcc/dwarf2out.cc @@ -27479,7 +27479,7 @@ gen_namelist_decl (tree name, dw_die_ref scope_die, tree item_decls) nml_item_ref_die = force_decl_die (value); nml_item_die = new_die (DW_TAG_namelist_item, nml_die, NULL); - add_AT_die_ref (nml_item_die, DW_AT_namelist_items, nml_item_ref_die); + add_AT_die_ref (nml_item_die, DW_AT_namelist_item, nml_item_ref_die); } return nml_die; } diff --git a/include/dwarf2.def b/include/dwarf2.def index 4214c80907a..530c6f849f9 100644 --- a/include/dwarf2.def +++ b/include/dwarf2.def @@ -289,7 +289,7 @@ DW_AT (DW_AT_frame_base, 0x40) DW_AT (DW_AT_friend, 0x41) DW_AT (DW_AT_identifier_case, 0x42) DW_AT (DW_AT_macro_info, 0x43) -DW_AT (DW_AT_namelist_items, 0x44) +DW_AT (DW_AT_namelist_item, 0x44) DW_AT (DW_AT_priority, 0x45) DW_AT (DW_AT_segment, 0x46) DW_AT (DW_AT_specification, 0x47) -- Alan Modra Australia Development Lab, IBM
Re: [PATCH] libiberty: remove FINAL and OVERRIDE from ansidecl.h
On Mon, May 23, 2022 at 07:42:29PM -0400, David Malcolm via Binutils wrote: > Any objections, or is there a reason to keep these macros that I'm > not aware of? (and did I send this to all the pertinent lists?) No objection from me. These macros are not used anywhere in binutils-gdb. -- Alan Modra Australia Development Lab, IBM
Re: [PATCH] libiberty: stop using PTR macro.
On Tue, May 10, 2022 at 10:56:22AM +0200, Martin Liška wrote: > diff --git a/libiberty/hashtab.c b/libiberty/hashtab.c > @@ -457,15 +457,15 @@ htab_empty (htab_t htab) >else if (htab->free_with_arg_f != NULL) > (*htab->free_with_arg_f) (htab->alloc_arg, htab->entries); >if (htab->alloc_with_arg_f != NULL) > - htab->entries = (PTR *) (*htab->alloc_with_arg_f) (htab->alloc_arg, > nsize, > -sizeof (PTR *)); > + htab->entries = (void **) (*htab->alloc_with_arg_f) (htab->alloc_arg, > nsize, > +sizeof (void **)); Here, and below, the code should really be using "sizeof (void *)". You may as well fix that nit while you're at it. Also, indentation looks wrong. >else > - htab->entries = (PTR *) (*htab->alloc_f) (nsize, sizeof (PTR *)); > + htab->entries = (void **) (*htab->alloc_f) (nsize, sizeof (void **)); > htab->size = nsize; > htab->size_prime_index = nindex; > } >else > -memset (entries, 0, size * sizeof (PTR)); > +memset (entries, 0, size * sizeof (void *)); >htab->n_deleted = 0; >htab->n_elements = 0; > } > @@ -543,10 +543,10 @@ htab_expand (htab_t htab) > } > >if (htab->alloc_with_arg_f != NULL) > -nentries = (PTR *) (*htab->alloc_with_arg_f) (htab->alloc_arg, nsize, > - sizeof (PTR *)); > +nentries = (void **) (*htab->alloc_with_arg_f) (htab->alloc_arg, nsize, > + sizeof (void **)); >else > -nentries = (PTR *) (*htab->alloc_f) (nsize, sizeof (PTR *)); > +nentries = (void **) (*htab->alloc_f) (nsize, sizeof (void **)); >if (nentries == NULL) > return 0; >htab->entries = nentries; Here too. -- Alan Modra Australia Development Lab, IBM
Re: [PATCH] Remove non-ANSI C macros in ansidecl.h.
On Tue, May 10, 2022 at 10:57:47AM +0200, Martin Liška wrote: > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > @Alan: Are you fine with the change from binutils/gdb perspective? I'm fine if this isn't going into the binutils-gdb repo immediately. There are occurrences of PTR in the cgen generated parts of opcodes, sim, and even gdb. I have a few patches I haven't yet committed. -- Alan Modra Australia Development Lab, IBM
Make opcodes configure depend on bfd configure
The idea is for opcodes to be able to see whether bfd is compiled for 64-bit. A lot of --enable-targets=all libopcodes is wasted space if bfd can't load 64-bit target object files. * Makefile.def (configure-opcodes): Depend on configure-bfd. * Makefile.in: Regenerate. Applied to gcc. I'll be importing the gcc versions over to binutils. diff --git a/Makefile.def b/Makefile.def index 0abc42b1a1b..a504192e6d7 100644 --- a/Makefile.def +++ b/Makefile.def @@ -4,7 +4,7 @@ AutoGen definitions Makefile.tpl; // Makefile.in is generated from Makefile.tpl by 'autogen Makefile.def'. // This file was originally written by Nathanael Nerode. // -// Copyright 2002-2019 Free Software Foundation +// Copyright 2002-2021 Free Software Foundation // // This file is free software; you can redistribute it and/or modify // it under the terms of the GNU General Public License as published by @@ -493,6 +493,7 @@ dependencies = { module=install-strip-ld; on=install-strip-bfd; }; dependencies = { module=install-strip-ld; on=install-strip-libctf; }; // libopcodes depends on libbfd +dependencies = { module=configure-opcodes; on=configure-bfd; hard=true; }; dependencies = { module=install-opcodes; on=install-bfd; }; dependencies = { module=install-strip-opcodes; on=install-strip-bfd; }; diff --git a/Makefile.in b/Makefile.in index d13f6c353ee..860cf8f067b 100644 --- a/Makefile.in +++ b/Makefile.in @@ -62717,6 +62717,16 @@ install-ld: maybe-install-libctf install-strip-libctf: maybe-install-strip-bfd install-strip-ld: maybe-install-strip-bfd install-strip-ld: maybe-install-strip-libctf +configure-opcodes: configure-bfd +configure-stage1-opcodes: configure-stage1-bfd +configure-stage2-opcodes: configure-stage2-bfd +configure-stage3-opcodes: configure-stage3-bfd +configure-stage4-opcodes: configure-stage4-bfd +configure-stageprofile-opcodes: configure-stageprofile-bfd +configure-stagetrain-opcodes: configure-stagetrain-bfd +configure-stagefeedback-opcodes: configure-stagefeedback-bfd +configure-stageautoprofile-opcodes: configure-stageautoprofile-bfd +configure-stageautofeedback-opcodes: configure-stageautofeedback-bfd install-opcodes: maybe-install-bfd install-strip-opcodes: maybe-install-strip-bfd configure-gas: maybe-configure-intl -- Alan Modra Australia Development Lab, IBM
obstack.h __PTR_ALIGN vs. ubsan
Current ubsan complains on every use of __PTR_ALIGN (when ptrdiff_t is as large as a pointer), due to making calculations relative to a NULL pointer. This patch avoids the problem by extracting out and simplifying __BPTR_ALIGN for the usual case. I've continued to use ptrdiff_t here, where it might be better to throw away __BPTR_ALIGN entirely and just assume uintptr_t exists. OK to apply for gcc? * obstack.h (__PTR_ALIGN): Expand and simplify __BPTR_ALIGN rather than calculating relative to a NULL pointer. diff --git a/include/obstack.h b/include/obstack.h index a6eb6c95587..0d8746f835b 100644 --- a/include/obstack.h +++ b/include/obstack.h @@ -137,9 +137,9 @@ relative to B. Otherwise, use the faster strategy of computing the alignment relative to 0. */ -#define __PTR_ALIGN(B, P, A) \ - __BPTR_ALIGN (sizeof (ptrdiff_t) < sizeof (void *) ? (B) : (char *) 0, \ -P, A) +#define __PTR_ALIGN(B, P, A) \ + (sizeof (ptrdiff_t) < sizeof (void *) ? __BPTR_ALIGN (B, P, A) \ + : (char *) (((ptrdiff_t) (P) + (A)) & ~(A))) #ifndef __attribute_pure__ # if defined __GNUC_MINOR__ && __GNUC__ * 1000 + __GNUC_MINOR__ >= 2096 -- Alan Modra Australia Development Lab, IBM
Re: [POWER10] __morestack calls from pcrel code
On Wed, Jul 21, 2021 at 08:59:04AM -0400, David Edelsohn wrote: > On Wed, Jul 21, 2021 at 4:29 AM Alan Modra wrote: > > > > On Wed, Jul 14, 2021 at 08:24:16PM -0400, David Edelsohn wrote: > > > > > > * config/rs6000/morestack.S (R2_SAVE): Define. > > > > > > (__morestack): Save and restore r2. Set up r2 for called > > > > > > functions. > > > > > > This patch is okay. > > > > Thanks David, the patch is needed on gcc-11 and gcc-10 too. > > OK for the branches too? > > Backports are fine, but I believe that Richi is planning to cut GCC 11 > RC today, so you really should check with him about a backport at the > last minute. Hi Richard, Is this patch OK at this late stage for the gcc-11 branch? https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573978.html The impacts of the bug are segfaults and other undesirable behaviour with Go (or more generally -fsplit-stack) on power10 when libgcc is not power10 pcrel. A non-pcrel libgcc is very likely how distros will ship gcc. -- Alan Modra Australia Development Lab, IBM
Re: [PATCH 2/4 REVIEW] libtool.m4: fix nm BSD flag detection
On Wed, Jul 07, 2021 at 08:03:45PM +0100, Nick Alcock via Gcc-patches wrote: > On 7 Jul 2021, Nick Clifton told this: > > > Hi Nick, > > > >> Ping? > > > > Oops. > > I sent a bunch of pings out at the same time, to a bunch of different > projects. You are the only person to respond, so thank you! > > >>> PR libctf/27482 > >>> * libtool.m4 (LT_PATH_NM): Try BSDization flags with a user-provided > > > > Changes to libtool need to be posted to the libtool project: > > > > https://www.gnu.org/software/libtool/ > > I considered this, but there is *serious* divergence between the > libtool.m4 in our tree and upstream. Fixing this divergence looks to be > a fairly major project in and of itself :( the last real sync looked > like being all the way back in 2008. Yes, I looked at doing a sync myself a few years ago.. I'll OK the two libtool changes for binutils. -- Alan Modra Australia Development Lab, IBM
Re: [POWER10] __morestack calls from pcrel code
On Wed, Jun 30, 2021 at 05:06:30PM -0300, Tulio Magno Quites Machado Filho wrote: > Alan Modra via Gcc-patches writes: > > > Compiling gcc/testsuite/gcc.dg/split-*.c and others with -mcpu=power10 > > and linking with a non-pcrel libgcc results in crashes due to the > > power10 pcrel code not having r2 set for the generic-morestack.c > > functions called from __morestack. There is also a problem when > > non-pcrel code calls a pcrel libgcc. See the patch comments. > > > > A similar situation theoretically occurs with ELFv1 multi-toc > > executables, when __morestack might be located in a different toc > > group to its caller. This patch makes no attempt to fix that, since > > the gold linker does not support multi-toc (gold is needed for proper > > support of -fsplit-stack code) nor does gcc emit __morestack calls > > that support multi-toc. > > > > Bootstrapped and regression tested power64le-linux with both > > -mcpu=power10 and -mcpu=power9. OK for mainline and backporting to > > gcc-11 and gcc-10? > > > > * config/rs6000/morestack.S (R2_SAVE): Define. > > (__morestack): Save and restore r2. Set up r2 for called > > functions. > > Thanks! This patch solved the issue I was seeing. > > If it gets merged, can this patch be backported to GCC 10 and 11, please? > > -- > Tulio Magno https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573978.html This patch has now been unreviewed for over two weeks. I expected a rubber stamp style approval; This assembly file is all mine, I know the ABI and how .eh_frame driven exception handling works on powerpc. So I'm going to claim the patch is obvious enough to someone with a good understanding of what is going on in morestack.S and commit under the "obvious" rule after allowing a few more days for comment. -- Alan Modra Australia Development Lab, IBM
Re: [RS6000] Adjust testcases for power10 instructions
On Thu, Jul 01, 2021 at 04:47:21PM -0500, Segher Boessenkool wrote: > Hi! > > On Thu, Jul 01, 2021 at 10:59:15PM +0930, Alan Modra wrote: > > * lib/target-supports.exp (check_effective_target_has_arch_pwr10): New. > > Mike added this already, please make sure to not add it twice :-) Yup, rebasing took it out of my patch and a little edit took it out of my changelog. > [...] > > gcc.target/powerpc/pr86731-fwrapv-longlong.c: Match power10 insns. > > (It still allows older as well, so "Also match" maybe?) OK. > Did you make sure all of these are correct and expected? Yes, they still are. I checked that there was a corresponding testsuite regression fix for each change too. > Are the > testcases still strict enough. I think so. , or should you add -mno-pcrel to the > options, instead? Or maybe test both -mpcrel and -mno-pcrel? Etc. I think adding -mno-pcrel would be a bad idea, since it would reduce power10 code coverage, and you'll get both by simply running the testsuite on power10 and say, power9. > > > * gcc.target/powerpc/lvsl-lvsr.c: Avoid file name match. > > You also add a "p?", is that expected? Should be in the changelog > then :-) It was in the changelog.. I mentioned lvsl-lvsr.c twice (which I suppose might fall foul of the changelog commit checking). Changing to * gcc.target/powerpc/lvsl-lvsr.c: Likewise. Avoid file name match. > > -/* { dg-final { scan-assembler-times {\mlxvd2x\M|\mlxv\M} 2 } } */ > > +/* { dg-final { scan-assembler-times {\mlxvd2x\M|\mp?lxv\M} 2 } } */ > > > > @@ -1,12 +1,12 @@ > > /* Test expected code generation for lvsl and lvsr on little endian. > > - Note that lvsl and lvsr are each produced once, but the filename > > - causes them to appear twice in the file. */ > > + Note that \s is used in the lvsl/lvsr matches so we don't match > > + on '.file "lvsl-lvsr.c"'. */ > > Even better is to not put the instruction names in the filename, but > heh, maybe that would be too simple ;-) > > > Segher -- Alan Modra Australia Development Lab, IBM
[RS6000] Adjust testcases for power10 instructions
-final { scan-assembler-times {\maddic\M} 3 { target { has_arch_pwr10 } } } } */ +/* { dg-final { scan-assembler-times {\msetbcr\M} 1 { target { has_arch_pwr10 } } } } */ +/* { dg-final { scan-assembler-times {\maddze\M} 3 } } */ long ne0(long a) { diff --git a/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c b/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c index 1269fe635c6..72b83ef9d08 100644 --- a/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c +++ b/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c @@ -30,5 +30,5 @@ vector signed long long splats4(void) /* { dg-final { scan-assembler-times {\mvspltis[bhw]\M} 0 } } */ /* { dg-final { scan-assembler-times {\mvsl[bhwd]\M} 0 } } */ -/* { dg-final { scan-assembler-times {\mlvx\M|\mlxv\M|\mlxvd2x\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mlvx\M|\mp?lxv\M|\mlxvd2x\M} 2 } } */ diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index 7f78c5593ac..789723fb287 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -6127,6 +6127,16 @@ proc check_effective_target_has_arch_pwr9 { } { }] } +proc check_effective_target_has_arch_pwr10 { } { + return [check_no_compiler_messages arch_pwr10 assembly { + #ifndef _ARCH_PWR10 + #error does not have power10 support. + #else + /* "has power10 support" */ + #endif + }] +} + # Return 1 if this is a PowerPC target supporting -mcpu=power10. # Limit this to 64-bit linux systems for now until other targets support # power10. -- Alan Modra Australia Development Lab, IBM
[POWER10] __morestack calls from pcrel code
Compiling gcc/testsuite/gcc.dg/split-*.c and others with -mcpu=power10 and linking with a non-pcrel libgcc results in crashes due to the power10 pcrel code not having r2 set for the generic-morestack.c functions called from __morestack. There is also a problem when non-pcrel code calls a pcrel libgcc. See the patch comments. A similar situation theoretically occurs with ELFv1 multi-toc executables, when __morestack might be located in a different toc group to its caller. This patch makes no attempt to fix that, since the gold linker does not support multi-toc (gold is needed for proper support of -fsplit-stack code) nor does gcc emit __morestack calls that support multi-toc. Bootstrapped and regression tested power64le-linux with both -mcpu=power10 and -mcpu=power9. OK for mainline and backporting to gcc-11 and gcc-10? * config/rs6000/morestack.S (R2_SAVE): Define. (__morestack): Save and restore r2. Set up r2 for called functions. diff --git a/libgcc/config/rs6000/morestack.S b/libgcc/config/rs6000/morestack.S index 4a07de927c5..a2e255e5c21 100644 --- a/libgcc/config/rs6000/morestack.S +++ b/libgcc/config/rs6000/morestack.S @@ -31,6 +31,7 @@ #define PARAMS 48 #endif #define MORESTACK_FRAMESIZE(PARAMS+96) +#define R2_SAVE-MORESTACK_FRAMESIZE+PARAMS-8 #define PARAMREG_SAVE -MORESTACK_FRAMESIZE+PARAMS+0 #define STATIC_CHAIN_SAVE -MORESTACK_FRAMESIZE+PARAMS+64 #define R29_SAVE -MORESTACK_FRAMESIZE+PARAMS+72 @@ -143,6 +144,17 @@ ENTRY0(__morestack_non_split) # cr7 must also be preserved. ENTRY0(__morestack) + +#if _CALL_ELF == 2 +# Functions with localentry bits of zero cannot make calls if those +# calls might change r2. This is true generally, and also true for +# __morestack with its special calling convention. When __morestack's +# caller is non-pcrel but libgcc is pcrel, the functions called here +# might modify r2. r2 must be preserved on exit, and also restored +# for the call back to our caller. + std %r2,R2_SAVE(%r1) +#endif + # Save parameter passing registers, our arguments, lr, r29 # and use r29 as a frame pointer. std %r3,PARAMREG_SAVE+0(%r1) @@ -161,10 +173,24 @@ ENTRY0(__morestack) std %r12,LINKREG_SAVE(%r1) std %r3,NEWSTACKSIZE_SAVE(%r1) # new stack size mr %r29,%r1 +#if _CALL_ELF == 2 + .cfi_offset %r2,R2_SAVE +#endif .cfi_offset %r29,R29_SAVE .cfi_def_cfa_register %r29 stdu %r1,-MORESTACK_FRAMESIZE(%r1) +#if _CALL_ELF == 2 && !defined __PCREL__ +# If this isn't a pcrel libgcc then the functions we call here will +# require r2 to be valid. If __morestack is called from pcrel code r2 +# won't be valid. Set it up. + bcl 20,31,1f +1: + mflr %r12 + addis %r2,%r12,.TOC.-1b@ha + addi %r2,%r2,.TOC.-1b@l +#endif + # void __morestack_block_signals (void) bl JUMP_TARGET(__morestack_block_signals) @@ -199,6 +225,9 @@ ENTRY0(__morestack) # instructions after __morestack's return address. # ld %r12,LINKREG_SAVE(%r29) +#if _CALL_ELF == 2 + ld %r2,R2_SAVE(%r29) +#endif ld %r3,PARAMREG_SAVE+0(%r29)# restore arg regs ld %r4,PARAMREG_SAVE+8(%r29) ld %r5,PARAMREG_SAVE+16(%r29) @@ -228,6 +257,15 @@ ENTRY0(__morestack) std %r10,PARAMREG_SAVE+56(%r29) #endif +#if _CALL_ELF == 2 && !defined __PCREL__ +# r2 was restored for calling back into our caller. Set it up again. + bcl 20,31,1f +1: + mflr %r12 + addis %r2,%r12,.TOC.-1b@ha + addi %r2,%r2,.TOC.-1b@l +#endif + bl JUMP_TARGET(__morestack_block_signals) # void *__generic_releasestack (size_t *pavailable) @@ -249,6 +287,9 @@ ENTRY0(__morestack) # Restore return value regs, and return. ld %r0,LINKREG_SAVE(%r29) mtlr %r0 +#if _CALL_ELF == 2 + ld %r2,R2_SAVE(%r29) +#endif ld %r3,PARAMREG_SAVE+0(%r29) ld %r4,PARAMREG_SAVE+8(%r29) ld %r5,PARAMREG_SAVE+16(%r29) -- Alan Modra Australia Development Lab, IBM
Re: PowerPC64 ELFv2 -fpatchable-function-entry
On Tue, May 18, 2021 at 05:48:40AM -0500, Segher Boessenkool wrote: > I wrote a bit later: > > > > Can you make this less hacky please? Changing the generic code is just > > > fine as well, it needs some love. > > In effect making a callback / hook without making that explicit is bad > for maintainability. We are in stage 1, now is a good time for > infrastructure changes. Yes, perhaps I could do that, and possibly even write a patch that is accepted. I'm not going to though, because I made a decision a little while ago that I'm not going to contribute new gcc work. This one was just flushing some patches that I wrote a while ago. -- Alan Modra Australia Development Lab, IBM
Re: PowerPC64 ELFv2 -fpatchable-function-entry
On Mon, May 10, 2021 at 04:39:55PM -0500, Segher Boessenkool wrote: > Hi! > > On Fri, May 07, 2021 at 12:19:52PM +0930, Alan Modra wrote: > > PowerPC64 ELFv2 dual entry point functions have a couple of problems > > with -fpatchable-function-entry. One is that the nops added after the > > global entry land in the global entry code which is constrained to be > > a power of two number of instructions, and zero global entry code has > > special meaning for linkage. The other is that the global entry code > > isn't always used on function entry, and some uses of > > -fpatchable-function-entry might want to affect all entries to the > > function. So this patch arranges to put one batch of nops before the > > global entry, and the other after the local entry point. > > Wow ugly. Not worse than patchable-funtion-enbtry itself I suppose :-) > > > * config/rs6000/rs6000-logue.c: Include targhooks.h. > > Huh, did it not already do that?! Hrm, all the other hooks seem to be > called via rs6000.c currently. But you do the same, so why do you need > to include the header in rs6000-logue.c? For the new call to default_print_patchable_function_entry in rs6000_output_function_prologue. > > > +/* Write NOPs into the asm outfile FILE around a function entry. This > > + routine may be called twice per function to put NOPs before and after > > + the function entry. If RECORD_P is true the location of the NOPs will > > + be recorded by default_print_patchable_function_entry in a special > > + object section called "__patchable_function_entries". Disable output > > + of any NOPs for the second call. Those, if any, are output by > > + rs6000_output_function_prologue. This means that for ELFv2 any NOPs > > + after the function entry are placed after the local entry point, not > > + the global entry point. NOPs after the entry may be found at > > + record_loc + nops_before * 4 + local_entry_offset. This holds true > > + when nops_before is zero. */ > > + > > +static void > > +rs6000_print_patchable_function_entry (FILE *file, > > + unsigned HOST_WIDE_INT patch_area_size > > ATTRIBUTE_UNUSED, > > + bool record_p) > > It is not a predicate. Do not call it _p please. Yes, I know the > generic code calls is this. That needs fixing. > > A better name (from the viewpoint of callers, which is what matters!) > might be first_time or similar? Or something that really says what it > *does*? > > > +{ > > + /* Always call default_print_patchable_function_entry when RECORD_P in > > + order to output the location of the NOPs, but use the size of the > > + area before the entry on both possible calls. If RECORD_P is true > > + on the second call then the area before the entry was zero size and > > + thus no NOPs will be output. */ > > + if (record_p) > > +default_print_patchable_function_entry (file, crtl->patch_area_entry, > > + record_p); > > +} > > That is trickiness that will break later. There is a fine line between trickiness and elegance. Given the current available crtl variables dealing with the patch area and the current patchable_function_entry parameters, any supposedly "less hacky" but correct expression will simplify down to what I wrote here. -- Alan Modra Australia Development Lab, IBM
Re: PowerPC64 ELFv1 -fpatchable-function-entry
On Fri, May 07, 2021 at 07:24:02PM -0500, Segher Boessenkool wrote: > Looks good with those things tweaked. Except that the patch chose the wrong section to emit the label, because the decl is wrong. But of course I was using the same decl as used in existing SHF_LINK_ORDER support, so it was already broken. You can see this on x86_64 gcc/testsuite/g++.dg/pr93195a.C output for bar(). .globl _Z3barv .type _Z3barv, @function _Z3barv: .LFB1: .cfi_startproc .section__patchable_function_entries,"awo",@progbits,_Z3foov .align 8 .quad .LPFE2 .text .LPFE2: nop pushq %rbp Note the reference to _Z3foov! Tracking down why this happens isn't hard. The first get_section ("__patchable_function_entries", ...) saves the decl for foo, which is the one then used on all subsequent calls. Oops. Jakub, commit 134afa38f0be didn't quite do the right thing. This problem can be fixed with a simplification of the patch I posted, relying on the fact that currently in the only place that wants to emit SHF_LINK_ORDER sections we've already switched to the correct function code section. Regression testing in progress. PR target/98125 * varasm.c (default_elf_asm_named_section): Use a function code label rather than the function symbol as the "o" argument. diff --git a/gcc/varasm.c b/gcc/varasm.c index 97c1e6fff25..6f10ac7762e 100644 --- a/gcc/varasm.c +++ b/gcc/varasm.c @@ -6866,6 +6866,15 @@ default_elf_asm_named_section (const char *name, unsigned int flags, *f = '\0'; } + char func_label[256]; + if (flags & SECTION_LINK_ORDER) +{ + static int func_code_labelno; + func_code_labelno++; + ASM_GENERATE_INTERNAL_LABEL (func_label, "LPFC", func_code_labelno); + ASM_OUTPUT_LABEL (asm_out_file, func_label); +} + fprintf (asm_out_file, "\t.section\t%s,\"%s\"", name, flagchars); /* default_section_type_flags (above) knows which flags need special @@ -6893,11 +6902,8 @@ default_elf_asm_named_section (const char *name, unsigned int flags, fprintf (asm_out_file, ",%d", flags & SECTION_ENTSIZE); if (flags & SECTION_LINK_ORDER) { - tree id = DECL_ASSEMBLER_NAME (decl); - ultimate_transparent_alias_target (); - const char *name = IDENTIFIER_POINTER (id); - name = targetm.strip_name_encoding (name); - fprintf (asm_out_file, ",%s", name); + fprintf (asm_out_file, ","); + assemble_name_raw (asm_out_file, func_label); } if (HAVE_COMDAT_GROUP && (flags & SECTION_LINKONCE)) { -- Alan Modra Australia Development Lab, IBM
PowerPC64 ELFv2 -fpatchable-function-entry
PowerPC64 ELFv2 dual entry point functions have a couple of problems with -fpatchable-function-entry. One is that the nops added after the global entry land in the global entry code which is constrained to be a power of two number of instructions, and zero global entry code has special meaning for linkage. The other is that the global entry code isn't always used on function entry, and some uses of -fpatchable-function-entry might want to affect all entries to the function. So this patch arranges to put one batch of nops before the global entry, and the other after the local entry point. PR target/98125 * config/rs6000/rs6000.c (rs6000_print_patchable_function_entry): New function. (TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY): Define. * config/rs6000/rs6000-logue.c: Include targhooks.h. (rs6000_output_function_prologue): Handle nops for -fpatchable-function-entry after local entry point. diff --git a/gcc/config/rs6000/rs6000-logue.c b/gcc/config/rs6000/rs6000-logue.c index b0ac183ceff..ffa3bb3dcf1 100644 --- a/gcc/config/rs6000/rs6000-logue.c +++ b/gcc/config/rs6000/rs6000-logue.c @@ -51,6 +51,7 @@ #include "gstab.h" /* for N_SLINE */ #include "dbxout.h" /* dbxout_ */ #endif +#include "targhooks.h" static int rs6000_ra_ever_killed (void); static void is_altivec_return_reg (rtx, void *); @@ -3991,6 +3992,10 @@ rs6000_output_function_prologue (FILE *file) fputs (",1\n", file); } + int nops_after_entry = crtl->patch_area_size - crtl->patch_area_entry; + if (nops_after_entry > 0) +default_print_patchable_function_entry (file, nops_after_entry, false); + /* Output -mprofile-kernel code. This needs to be done here instead of in output_function_profile since it must go after the ELFv2 ABI local entry point. */ diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index d43c36e7f1a..97f1b3e0674 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -1404,6 +1404,10 @@ static const struct attribute_spec rs6000_attribute_table[] = #undef TARGET_ASM_FUNCTION_EPILOGUE #define TARGET_ASM_FUNCTION_EPILOGUE rs6000_output_function_epilogue +#undef TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY +#define TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY \ + rs6000_print_patchable_function_entry + #undef TARGET_ASM_OUTPUT_ADDR_CONST_EXTRA #define TARGET_ASM_OUTPUT_ADDR_CONST_EXTRA rs6000_output_addr_const_extra @@ -14953,6 +14957,33 @@ rs6000_assemble_visibility (tree decl, int vis) } #endif +/* Write NOPs into the asm outfile FILE around a function entry. This + routine may be called twice per function to put NOPs before and after + the function entry. If RECORD_P is true the location of the NOPs will + be recorded by default_print_patchable_function_entry in a special + object section called "__patchable_function_entries". Disable output + of any NOPs for the second call. Those, if any, are output by + rs6000_output_function_prologue. This means that for ELFv2 any NOPs + after the function entry are placed after the local entry point, not + the global entry point. NOPs after the entry may be found at + record_loc + nops_before * 4 + local_entry_offset. This holds true + when nops_before is zero. */ + +static void +rs6000_print_patchable_function_entry (FILE *file, + unsigned HOST_WIDE_INT patch_area_size ATTRIBUTE_UNUSED, + bool record_p) +{ + /* Always call default_print_patchable_function_entry when RECORD_P in + order to output the location of the NOPs, but use the size of the + area before the entry on both possible calls. If RECORD_P is true + on the second call then the area before the entry was zero size and + thus no NOPs will be output. */ + if (record_p) +default_print_patchable_function_entry (file, crtl->patch_area_entry, + record_p); +} + enum rtx_code rs6000_reverse_condition (machine_mode mode, enum rtx_code code) {
Revert "rs6000: Avoid -fpatchable-function-entry* regressions on powerpc64 be [PR98125]"
This reverts commit b680b9049737198d010e49cf434704c6a6ed2b3f now that the PowerPC64 ELFv1 regression is fixed properly. PR testsuite/98125 * targhooks.h (default_print_patchable_function_entry_1): Delete. * targhooks.c (default_print_patchable_function_entry_1): Delete. (default_print_patchable_function_entry): Expand above. * config/rs6000/rs6000.c (TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY): Don't define. (rs6000_print_patchable_function_entry): Delete. * testsuite/g++.dg/pr93195a.C: Revert 2021-04-03 change. diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 663eed4f055..d43c36e7f1a 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -1362,10 +1362,6 @@ static const struct attribute_spec rs6000_attribute_table[] = #define TARGET_ASM_ASSEMBLE_VISIBILITY rs6000_assemble_visibility #endif -#undef TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY -#define TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY \ - rs6000_print_patchable_function_entry - #undef TARGET_SET_UP_BY_PROLOGUE #define TARGET_SET_UP_BY_PROLOGUE rs6000_set_up_by_prologue @@ -14957,30 +14953,6 @@ rs6000_assemble_visibility (tree decl, int vis) } #endif -/* Write PATCH_AREA_SIZE NOPs into the asm outfile FILE around a function - entry. If RECORD_P is true and the target supports named sections, - the location of the NOPs will be recorded in a special object section - called "__patchable_function_entries". This routine may be called - twice per function to put NOPs before and after the function - entry. */ - -void -rs6000_print_patchable_function_entry (FILE *file, - unsigned HOST_WIDE_INT patch_area_size, - bool record_p) -{ - unsigned int flags = SECTION_WRITE | SECTION_RELRO; - /* When .opd section is emitted, the function symbol - default_print_patchable_function_entry_1 is emitted into the .opd section - while the patchable area is emitted into the function section. - Don't use SECTION_LINK_ORDER in that case. */ - if (!(TARGET_64BIT && DEFAULT_ABI != ABI_ELFv2) - && HAVE_GAS_SECTION_LINK_ORDER) -flags |= SECTION_LINK_ORDER; - default_print_patchable_function_entry_1 (file, patch_area_size, record_p, - flags); -} - enum rtx_code rs6000_reverse_condition (machine_mode mode, enum rtx_code code) { diff --git a/gcc/targhooks.c b/gcc/targhooks.c index 952fad422eb..d69c9a2d819 100644 --- a/gcc/targhooks.c +++ b/gcc/targhooks.c @@ -1832,15 +1832,17 @@ default_compare_by_pieces_branch_ratio (machine_mode) return 1; } -/* Helper for default_print_patchable_function_entry and other - print_patchable_function_entry hook implementations. */ +/* Write PATCH_AREA_SIZE NOPs into the asm outfile FILE around a function + entry. If RECORD_P is true and the target supports named sections, + the location of the NOPs will be recorded in a special object section + called "__patchable_function_entries". This routine may be called + twice per function to put NOPs before and after the function + entry. */ void -default_print_patchable_function_entry_1 (FILE *file, - unsigned HOST_WIDE_INT - patch_area_size, - bool record_p, - unsigned int flags) +default_print_patchable_function_entry (FILE *file, + unsigned HOST_WIDE_INT patch_area_size, + bool record_p) { const char *nop_templ = 0; int code_num; @@ -1862,6 +1864,9 @@ default_print_patchable_function_entry_1 (FILE *file, patch_area_number++; ASM_GENERATE_INTERNAL_LABEL (buf, "LPFE", patch_area_number); + unsigned int flags = SECTION_WRITE | SECTION_RELRO; + if (HAVE_GAS_SECTION_LINK_ORDER) + flags |= SECTION_LINK_ORDER; switch_to_section (get_section ("__patchable_function_entries", flags, current_function_decl)); assemble_align (POINTER_SIZE); @@ -1878,25 +1883,6 @@ default_print_patchable_function_entry_1 (FILE *file, output_asm_insn (nop_templ, NULL); } -/* Write PATCH_AREA_SIZE NOPs into the asm outfile FILE around a function - entry. If RECORD_P is true and the target supports named sections, - the location of the NOPs will be recorded in a special object section - called "__patchable_function_entries". This routine may be called - twice per function to put NOPs before and after the function - entry. */ - -void -default_print_patchable_function_entry (FILE *file, - unsigned HOST_WIDE_INT patch_area_size, - bool record_p) -{ - unsigned int flags = SECTION_WRITE | SECTION_RELRO; - if
PowerPC64 ELFv1 -fpatchable-function-entry
On PowerPC64 ELFv1 function symbols are defined on function descriptors in an .opd section rather than in the function code. .opd is not split up by the PowerPC64 backend for comdat groups or other situations where per-function sections are required. Thus SECTION_LINK_ORDER can't use the function name to reference a suitable section for ordering: The .opd section might contain many other function descriptors and they may be in a different order to the final function code placement. This patch arranges to use a code label instead of the function name symbol. I chose to emit the label inside default_elf_asm_named_section, immediately before the .section directive using the label, and in case someone uses .previous or the like, need to save and restore the current section when switching to the function code section to emit the label. That requires a tweak to switch_to_section in order to get the current section. I checked all the TARGET_ASM_NAMED_SECTION functions and unnamed.callback functions and it appears none will be affected by that tweak. PR target/98125 * varasm.c (default_elf_asm_named_section): Use a function code label rather than the function symbol as the "o" argument. (switch_to_section): Don't set in_section until section directive has been emitted. diff --git a/gcc/varasm.c b/gcc/varasm.c index 97c1e6fff25..5f95f8cfa75 100644 --- a/gcc/varasm.c +++ b/gcc/varasm.c @@ -6866,6 +6866,26 @@ default_elf_asm_named_section (const char *name, unsigned int flags, *f = '\0'; } + char func_label[256]; + if (flags & SECTION_LINK_ORDER) +{ + static int recur; + if (recur) + gcc_unreachable (); + else + { + ++recur; + section *save_section = in_section; + static int func_code_labelno; + switch_to_section (function_section (decl)); + ++func_code_labelno; + ASM_GENERATE_INTERNAL_LABEL (func_label, "LPFC", func_code_labelno); + ASM_OUTPUT_LABEL (asm_out_file, func_label); + switch_to_section (save_section); + --recur; + } +} + fprintf (asm_out_file, "\t.section\t%s,\"%s\"", name, flagchars); /* default_section_type_flags (above) knows which flags need special @@ -6893,11 +6913,8 @@ default_elf_asm_named_section (const char *name, unsigned int flags, fprintf (asm_out_file, ",%d", flags & SECTION_ENTSIZE); if (flags & SECTION_LINK_ORDER) { - tree id = DECL_ASSEMBLER_NAME (decl); - ultimate_transparent_alias_target (); - const char *name = IDENTIFIER_POINTER (id); - name = targetm.strip_name_encoding (name); - fprintf (asm_out_file, ",%s", name); + fputc (',', asm_out_file); + assemble_name_raw (asm_out_file, func_label); } if (HAVE_COMDAT_GROUP && (flags & SECTION_LINKONCE)) { @@ -7821,11 +7838,6 @@ switch_to_section (section *new_section, tree decl) else if (in_section == new_section) return; - if (new_section->common.flags & SECTION_FORGET) -in_section = NULL; - else -in_section = new_section; - switch (SECTION_STYLE (new_section)) { case SECTION_NAMED: @@ -7843,6 +7855,11 @@ switch_to_section (section *new_section, tree decl) break; } + if (new_section->common.flags & SECTION_FORGET) +in_section = NULL; + else +in_section = new_section; + new_section->common.flags |= SECTION_DECLARED; }
PR98125, PowerPC64 -fpatchable-function-entry
This series of patches fixes -fpatchable-function-entry on PowerPC64 ELFv1 so that SECTION_LINK_ORDER (.section 'o' arg) is now supported, and on PowerPC64 ELFv2 to not break the global entry code. Bootstrapped powerpc64le-linux and x86_64-linux all langs. I did see one regression on both targets, libgo runtime/pprof. It's unclear to me what that means. Alan Modra (3): PowerPC64 ELFv1 -fpatchable-function-entry Revert "rs6000: Avoid -fpatchable-function-entry* regressions on powerpc64 be [PR98125]" PowerPC64 ELFv2 -fpatchable-function-entry gcc/config/rs6000/rs6000-logue.c | 5 gcc/config/rs6000/rs6000.c | 47 +--- gcc/targhooks.c | 38 -- gcc/targhooks.h | 3 -- gcc/testsuite/g++.dg/pr93195a.C | 1 - gcc/varasm.c | 37 ++--- 6 files changed, 69 insertions(+), 62 deletions(-)
Re: RFC: Changing AC_PROG_CC to AC_PROG_CC_C99 in top level configure
On Wed, May 05, 2021 at 08:05:29AM +0100, Iain Sandoe wrote: > Alan Modra via Gcc-patches wrote: > > > On 2021-05-04 8:42 a.m., Nick Clifton wrote: > > > Hi Guys, > > > > > > On 4/30/21 7:36 PM, Simon Marchi wrote: > > > > I think this fix is obvious enough, I encourage you to push it, > > > > > > OK - I have pushed the patch to the mainline branches of both > > > the gcc and binutils-gdb repositories. > > > > Thanks Nick! Incidentally, I checked the AC_PROG_CC_C99 change on > > both binutils and gcc mainline using gcc-4.9. > > > > To build gcc on x86_64 I found the following patch necessary to avoid > > lots of > > error: uninitialized const member ‘stringop_algs::stringop_strategy::max’ > > error: uninitialized const member ‘stringop_algs::stringop_strategy::alg’ > > when compiling config/i386/i386-options.c. These can't be cured by > > configuring with --disable-stage1-checking. > > > > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h > > index 97d6f3863cb..cc3b1b6d666 100644 > > --- a/gcc/config/i386/i386.h > > +++ b/gcc/config/i386/i386.h > > @@ -73,8 +73,8 @@ struct stringop_algs > > { > > const enum stringop_alg unknown_size; > > const struct stringop_strategy { > > -const int max; > > -const enum stringop_alg alg; > > +int max; > > +enum stringop_alg alg; > > int noalign; > > } size [MAX_STRINGOP_ALGS]; > > }; > > does this relate to / fix PR 100246 (which seems to fire for some GCC > versions as well > as older clang)? Yes, looks like the same issue. I started making a similar fix to the one you attached to the PR, then laziness kicked in after noticing the errors were only given on the const elements. -- Alan Modra Australia Development Lab, IBM
Re: RFC: Changing AC_PROG_CC to AC_PROG_CC_C99 in top level configure
On 2021-05-04 8:42 a.m., Nick Clifton wrote: > Hi Guys, > > On 4/30/21 7:36 PM, Simon Marchi wrote: >> I think this fix is obvious enough, I encourage you to push it, > > OK - I have pushed the patch to the mainline branches of both > the gcc and binutils-gdb repositories. Thanks Nick! Incidentally, I checked the AC_PROG_CC_C99 change on both binutils and gcc mainline using gcc-4.9. To build gcc on x86_64 I found the following patch necessary to avoid lots of error: uninitialized const member ‘stringop_algs::stringop_strategy::max’ error: uninitialized const member ‘stringop_algs::stringop_strategy::alg’ when compiling config/i386/i386-options.c. These can't be cured by configuring with --disable-stage1-checking. diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h index 97d6f3863cb..cc3b1b6d666 100644 --- a/gcc/config/i386/i386.h +++ b/gcc/config/i386/i386.h @@ -73,8 +73,8 @@ struct stringop_algs { const enum stringop_alg unknown_size; const struct stringop_strategy { -const int max; -const enum stringop_alg alg; +int max; +enum stringop_alg alg; int noalign; } size [MAX_STRINGOP_ALGS]; }; -- Alan Modra Australia Development Lab, IBM
Re: RFC: Changing AC_PROG_CC to AC_PROG_CC_C99 in top level configure
On Mon, May 03, 2021 at 10:47:15AM -0400, Simon Marchi wrote: > > Yes, I prefer the configure fix too. If we state we require C99 in > > binutils then we ought to be able to use C99.. > > > > Nick, does the configure.ac change also need to go in all subdirs, to > > support people running make in say ld/ rather than running make in the > > top build dir? > > For GDB, it's not supported to run gdb/configure directly, you need to > use the top-level configure. Is it supported from some of the other > projects in the repo? > > I just tried with ld, it doesn't work since it depends on bfd also being > built. I tried with just bfd, it doesn't work (with the default > configure options at least) because it requires zlib being built. I wasn't talking about running configure, I was talking about running make. For example, you configure and make binutils as usual, then after making a change to ld/ files, run make in the ld build dir. I don't tend to do that myself but I do run "make check" sometimes in a subdir expecting to get the same results in that subdir as if "make check" was run from the top level. But I should have just tried it myself rather than asking. CC, CPP and others are inherited from the top level and appear with -std=gnu99 in the subdir Makefiles. So it seems all the AC_PROG_CC in subdir configure.ac can stay as they are. > > So if all projects need to go through the top-level configure script > anyway, and C99 is a baseline for all projects, then having the check > only in the top-level makes sense to me. Projects that have more > specific requirements can have their own checks. For example, sim/ > requires C11 now. Unless the C99 check at top-level somehow does not > play well with the C11 check in sim/? Like if that would cause CC to be > set to "gcc -std=gnu99 -std=gnu11" or something like that. > > Simon -- Alan Modra Australia Development Lab, IBM
Re: RFC: Changing AC_PROG_CC to AC_PROG_CC_C99 in top level configure
On Fri, Apr 30, 2021 at 03:48:00PM -0600, Jeff Law via Gcc-patches wrote: > > On 4/30/2021 12:36 PM, Simon Marchi via Gcc-patches wrote: > > On 2021-04-26 7:32 a.m., Nick Clifton via Gdb-patches wrote:> Hi Guys, > > >Given that gcc, gdb and now binutils are all now requiring C99 as a > > >minimum version of C, are there any objections to updating > > >configure.ac to reflect this ? > > > > > > Cheers > > >Nick > > > > > > diff --git a/configure.ac b/configure.ac > > > index a721316d07b..59b4194fb24 100644 > > > --- a/configure.ac > > > +++ b/configure.ac > > > @@ -1278,7 +1278,7 @@ else > > > WINDMC_FOR_BUILD="\$(WINDMC)" > > > fi > > > > > > -AC_PROG_CC > > > +AC_PROG_CC_C99 > > > AC_PROG_CXX > > > > > > # We must set the default linker to the linker used by gcc for the > > > correct > > Hi Nick, > > > > I think this fix is obvious enough, I encourage you to push it, that > > will fix the build failure many people get in opcodes/ppc-dis.c. We'll > > just remove the line later when we upgrade to Autoconf 2.71, as simple > > as that. For now we use 2.69. If that matters, you have my OK for the > > GDB side of things. > > That works for me. I'd just sent Alan the trivial patch to make ppc-dis.c > compile again with C89, but if we're going to update configure.ac > appropriately, then it wouldn't be needed. Yes, I prefer the configure fix too. If we state we require C99 in binutils then we ought to be able to use C99.. Nick, does the configure.ac change also need to go in all subdirs, to support people running make in say ld/ rather than running make in the top build dir? -- Alan Modra Australia Development Lab, IBM
Re: [PATCH] Add -fgnu-retain/-fno-gnu-retain
On Wed, Feb 17, 2021 at 11:23:12AM +, Jozef Lawrynowicz wrote: > In previous discussions, it seemed to me that there was general support > for the "used" attribute implying SHF_GNU_RETAIN and saving symbols from > linker garbage collection[1]. Of course, the current implementation > results in undesirable behavior - the thought that all linker scripts > not supporting uniquely named sections would need to be updated is quite > alarming. > > It's a shame that all this extra complication is required, just because > we cannot have a ".retain ", directive. Is that true? Isn't the problem really that retained sections are named as if -ffunction-sections/-fdata-sections were in force? And that is due to wanting separate sections so that garbage collection works, since SHF_GNU_RETAIN is all about linker garbage collection. I don't see how having a ".retain " would help much. > My preferred vision for this functionality was: > - SHF_GNU_RETAIN section flag indicates a section should be saved > from linker garbage collection. > - ".retain " assembler directive applies SHF_GNU_RETAIN > to the section containing . > - GCC "used" attribute emits a ".retain " directive for > the symbol declaration is is applied to. Applying the "used" > attribute to a symbol declaration does not change the structure of > the object file, beyond applying SHF_GNU_RETAIN to the section > containing that symbol. That description seems to say that a ".retain foo" would mean everything in foo's section is kept. If foo's section was the usual .data, you've kept virtually everything from garbage collection. Surely you don't expect ".retain foo" to create a separate .data section for foo? If you do, I'm strongly against that idea. Note that gas indeed supports multiple sections named .data that can serve the same purpose as -fdata-sections. See the gas doc for the optional .section field "unique". That might be the best way to avoid an under-the-hood -ffunction-sections/-fdata-sections. -- Alan Modra Australia Development Lab, IBM
Re: [PATCH 4/8] intl: turn LIBINTL into -L / -l form
On Mon, Feb 08, 2021 at 11:16:31AM +, Nick Alcock via Binutils wrote: > intl/ChangeLog > 2021-02-04 Nick Alcock > > * configure.ac (LIBINTL): Transform into -L/-lintl form. > * configure: Regenerate. OK for binutils. -- Alan Modra Australia Development Lab, IBM
Re: [PATCH 3/8] intl: always picify
On Mon, Feb 08, 2021 at 11:16:30AM +, Nick Alcock via Binutils wrote: > intl/ChangeLog > 2021-02-02 Nick Alcock > > * aclocal.m4: include picflag.m4. > * configure.ac (PICFLAG): Add and substitute. > * Makefile.in (PICFLAG): New. > (COMPILE): Use it. > * configure: Regenerate. OK for binutils. -- Alan Modra Australia Development Lab, IBM
Re: [PATCH, rs6000] Optimization for PowerPC 64bit constant generation [PR94395]
On Fri, Jan 29, 2021 at 11:11:23AM +0800, HAO CHEN GUI via Gcc-patches wrote: > This patch tries to optimize PowerPC 64 bit constant generation when the > constant can be transformed from a 32 bit or 16 bit constant by rotating, > shifting and mask AND. All and more of what you are doing here for rotated 16-bit constants is covered by https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555760.html That patch is still waiting on review. Hmm, I see my local copy of that patch has one extra line in gcc/testsuite/gcc.target/powerpc/rot_cst2.c +/* { dg-additional-options "-mno-prefixed" { target { lp64 } } } */ in order to keep scan-assembler-times counts correct for power10. -- Alan Modra Australia Development Lab, IBM
Re: [PATCH 5/8] [RS6000] rs6000_rtx_costs cost IOR
On Mon, Jan 25, 2021 at 04:51:43PM -0600, Segher Boessenkool wrote: > Hi! > > On Thu, Oct 08, 2020 at 09:27:57AM +1030, Alan Modra wrote: > > * config/rs6000/rs6000.c (rotate_insert_cost): New function. > > (rs6000_rtx_costs): Cost IOR. > > > > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c > > index 383d2901c9f..15a806fe307 100644 > > --- a/gcc/config/rs6000/rs6000.c > > +++ b/gcc/config/rs6000/rs6000.c > > @@ -21206,6 +21206,91 @@ rs6000_cannot_copy_insn_p (rtx_insn *insn) > > && get_attr_cannot_copy (insn); > > } > > > > +/* Handle rtx_costs for scalar integer rotate and insert insns. */ > > You need to document here what the return value means, and what the > preconditions for "left" (and "right") are. Done, and I moved the preconditions on "left" into the new function. > > +static bool > > +rotate_insert_cost (rtx left, rtx right, machine_mode mode, bool speed, > > + int *total) > > +{ > > + if (GET_CODE (right) == AND > > ... because you never check the CODE of "left". > > > + && CONST_INT_P (XEXP (right, 1)) > > + && UINTVAL (XEXP (left, 1)) + UINTVAL (XEXP (right, 1)) + 1 == 0) > > HOST_WIDE_INT is always exactly 64 bits now, so you could do "== -1". Yes, but this is exactly the way the expression occurs in rotl*_insert* instruction patterns. I think it's better to keep them the same. > > +{ > > + rtx leftop = XEXP (left, 0); > > + rtx rightop = XEXP (right, 0); > > + > > + /* rotlsi3_insert_5. */ > > + if (REG_P (leftop) > > + && REG_P (rightop) > > + && mode == SImode > > + && UINTVAL (XEXP (left, 1)) != 0 > > + && UINTVAL (XEXP (right, 1)) != 0 > > + && rs6000_is_valid_mask (XEXP (left, 1), NULL, NULL, mode)) > > + return true; > > Empty line after return please. Done, here and elsewhere. > > + /* rotldi3_insert_6. */ > > + if (REG_P (leftop) > > + && REG_P (rightop) > > + && mode == DImode > > + && exact_log2 (-UINTVAL (XEXP (left, 1))) > 0) > > + return true; > > + /* rotldi3_insert_7. */ > > + if (REG_P (leftop) > > + && REG_P (rightop) > > + && mode == DImode > > + && exact_log2 (-UINTVAL (XEXP (right, 1))) > 0) > > + return true; > > Those could just use rs6000_is_valid_mask as well? This is taken straight from rotldi3_insert_7, so it really ought to stay that way. > > Please wait this until stage 1. Sorry. OK, I'll leave all the rs6000_rtx_costs changes until then. -- Alan Modra Australia Development Lab, IBM
Re: [PATCH 3/8] [RS6000] rs6000_rtx_costs tidy AND
On Mon, Jan 25, 2021 at 04:37:21PM -0600, Segher Boessenkool wrote: > I still do not see what this improves, I only see possible obvious > regressions :-( You asked me to break the patch series into small pieces, for ease of review and to separate tidies from functional changes. Well OK, fair enough. This is one of the tidies. The idea being to make rs6000_rtx_costs a little more self-consistent, to not have someone look at this code in the future and wonder why AND was treated differently to other operations. The only part of this patch that I can imagine you see as a possible regression is the "Don't avoid recursion on const_int shift count" part. That is there only because you wanted it that way in new code. I think you said something about premature optimisation when I made the new code special case const_int and reg to stop recursion, like AND. So for consistency I made the change in old code too. -- Alan Modra Australia Development Lab, IBM
Re: [RS6000] Adjust testcases for power10 instructions V3
Ping. On Tue, Jan 12, 2021 at 02:03:18PM +1030, Alan Modra wrote: > Ping > https://gcc.gnu.org/pipermail/gcc-patches/2020-October/557587.html > > On Fri, Oct 30, 2020 at 07:00:14PM +1030, Alan Modra wrote: > > And now waking up to what you meant by the lvsl-lvsr.c \s comment, > > plus a revised ppc-ne0-1.c scan-assembler. > > > > I think this covers all previous review corrections. Regression tested > > powerpc64-linux power7 and powerpc64le-linux power10. OK? > > > > * lib/target-supports.exp (check_effective_target_has_arch_pwr10): New. > > * gcc.dg/pr56727-2.c, > > gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c, > > gcc.target/powerpc/fold-vec-load-builtin_vec_xl-char.c, > > gcc.target/powerpc/fold-vec-load-builtin_vec_xl-double.c, > > gcc.target/powerpc/fold-vec-load-builtin_vec_xl-float.c, > > gcc.target/powerpc/fold-vec-load-builtin_vec_xl-int.c, > > gcc.target/powerpc/fold-vec-load-builtin_vec_xl-longlong.c, > > gcc.target/powerpc/fold-vec-load-builtin_vec_xl-short.c, > > gcc.target/powerpc/fold-vec-load-vec_vsx_ld-char.c, > > gcc.target/powerpc/fold-vec-load-vec_vsx_ld-double.c, > > gcc.target/powerpc/fold-vec-load-vec_vsx_ld-float.c, > > gcc.target/powerpc/fold-vec-load-vec_vsx_ld-int.c, > > gcc.target/powerpc/fold-vec-load-vec_vsx_ld-longlong.c, > > gcc.target/powerpc/fold-vec-load-vec_vsx_ld-short.c, > > gcc.target/powerpc/fold-vec-load-vec_xl-char.c, > > gcc.target/powerpc/fold-vec-load-vec_xl-double.c, > > gcc.target/powerpc/fold-vec-load-vec_xl-float.c, > > gcc.target/powerpc/fold-vec-load-vec_xl-int.c, > > gcc.target/powerpc/fold-vec-load-vec_xl-longlong.c, > > gcc.target/powerpc/fold-vec-load-vec_xl-short.c, > > gcc.target/powerpc/fold-vec-splat-floatdouble.c, > > gcc.target/powerpc/fold-vec-splat-longlong.c, > > gcc.target/powerpc/fold-vec-store-builtin_vec_xst-char.c, > > gcc.target/powerpc/fold-vec-store-builtin_vec_xst-double.c, > > gcc.target/powerpc/fold-vec-store-builtin_vec_xst-float.c, > > gcc.target/powerpc/fold-vec-store-builtin_vec_xst-int.c, > > gcc.target/powerpc/fold-vec-store-builtin_vec_xst-short.c, > > gcc.target/powerpc/fold-vec-store-vec_vsx_st-char.c, > > gcc.target/powerpc/fold-vec-store-vec_vsx_st-double.c, > > gcc.target/powerpc/fold-vec-store-vec_vsx_st-float.c, > > gcc.target/powerpc/fold-vec-store-vec_vsx_st-int.c, > > gcc.target/powerpc/fold-vec-store-vec_vsx_st-longlong.c, > > gcc.target/powerpc/fold-vec-store-vec_vsx_st-short.c, > > gcc.target/powerpc/fold-vec-store-vec_xst-char.c, > > gcc.target/powerpc/fold-vec-store-vec_xst-double.c, > > gcc.target/powerpc/fold-vec-store-vec_xst-float.c, > > gcc.target/powerpc/fold-vec-store-vec_xst-int.c, > > gcc.target/powerpc/fold-vec-store-vec_xst-longlong.c, > > gcc.target/powerpc/fold-vec-store-vec_xst-short.c, > > gcc.target/powerpc/lvsl-lvsr.c, > > gcc.target/powerpc/ppc-eq0-1.c, > > gcc.target/powerpc/ppc-ne0-1.c, > > gcc.target/powerpc/pr86731-fwrapv-longlong.c: Match power10 insns. > > * gcc.target/powerpc/lvsl-lvsr.c: Avoid file name match. > > > > diff --git a/gcc/testsuite/gcc.dg/pr56727-2.c > > b/gcc/testsuite/gcc.dg/pr56727-2.c > > index c54369ed25e..f055116772a 100644 > > --- a/gcc/testsuite/gcc.dg/pr56727-2.c > > +++ b/gcc/testsuite/gcc.dg/pr56727-2.c > > @@ -18,4 +18,4 @@ void h () > > > > /* { dg-final { scan-assembler "@(PLT|plt)" { target i?86-*-* x86_64-*-* } > > } } */ > > /* { dg-final { scan-assembler "@(PLT|plt)" { target { powerpc*-*-linux* > > && ilp32 } } } } */ > > -/* { dg-final { scan-assembler "bl f\n\\s*nop" { target { > > powerpc*-*-linux* && lp64 } } } } */ > > +/* { dg-final { scan-assembler {bl f(\n\s*nop|@notoc\n)} { target { > > powerpc*-*-linux* && lp64 } } } } */ > > diff --git > > a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c > > b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c > > index 246f38fa6d1..1cff4550f28 100644 > > --- a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c > > +++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c > > @@ -25,6 +25,6 @@ main1 (void) > > with no word loads (lw, lwu, lwz, lwzu, or their indexed forms) > > or word stores (stw, stwu, stwx, stwux, or their indexed forms). */ > > > > -/* { dg-final { scan-assembler "
Re: [PATCH 8/8] [RS6000] rs6000_rtx_costs for !speed
Ping. On Tue, Jan 12, 2021 at 02:02:36PM +1030, Alan Modra wrote: > Ping > https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555759.html > > On Thu, Oct 08, 2020 at 09:28:00AM +1030, Alan Modra wrote: > > When optimizing for size we shouldn't be using metrics based on speed > > or vice-versa. rtlanal.c:get_full_rtx_cost wants both speed and size > > metric from rs6000_rtx_costs independent of the global optimize_size. > > > > Note that the patch changes param_simultaneous_prefetches, > > param_l1_cache_size, param_l1_cache_line_size and param_l2_cache_size, > > which were previously all set to zero for optimize_size. I think that > > was a bug. Those params are a function of the processor. > > > > * config/rs6000/rs6000.h (rs6000_cost): Don't declare. > > (struct processor_costs): Move to.. > > * config/rs6000/rs6000.c: ..here. > > (rs6000_cost): Make static. > > (rs6000_option_override_internal): Ignore optimize_size when > > setting up rs6000_cost. > > (rs6000_insn_cost): Take into account optimize_size here > > instead. > > (rs6000_emit_parity): Likewise. > > (rs6000_rtx_costs): Don't use rs6000_cost when !speed. > > > > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c > > index d455aa52427..14ecbad5df4 100644 > > --- a/gcc/config/rs6000/rs6000.c > > +++ b/gcc/config/rs6000/rs6000.c > > @@ -497,7 +497,26 @@ rs6000_store_data_bypass_p (rtx_insn *out_insn, > > rtx_insn *in_insn) > > > > /* Processor costs (relative to an add) */ > > > > -const struct processor_costs *rs6000_cost; > > +struct processor_costs { > > + const int mulsi; /* cost of SImode multiplication. */ > > + const int mulsi_const; /* cost of SImode multiplication by constant. */ > > + const int mulsi_const9; /* cost of SImode mult by short constant. */ > > + const int muldi; /* cost of DImode multiplication. */ > > + const int divsi; /* cost of SImode division. */ > > + const int divdi; /* cost of DImode division. */ > > + const int fp; /* cost of simple SFmode and DFmode insns. */ > > + const int dmul;/* cost of DFmode multiplication (and fmadd). */ > > + const int sdiv;/* cost of SFmode division (fdivs). */ > > + const int ddiv;/* cost of DFmode division (fdiv). */ > > + const int cache_line_size;/* cache line size in bytes. */ > > + const int l1_cache_size; /* size of l1 cache, in kilobytes. */ > > + const int l2_cache_size; /* size of l2 cache, in kilobytes. */ > > + const int simultaneous_prefetches; /* number of parallel prefetch > > + operations. */ > > + const int sfdf_convert; /* cost of SF->DF conversion. */ > > +}; > > + > > +static const struct processor_costs *rs6000_cost; > > > > /* Instruction size costs on 32bit processors. */ > > static const > > @@ -4618,131 +4637,128 @@ rs6000_option_override_internal (bool > > global_init_p) > > } > > > >/* Initialize rs6000_cost with the appropriate target costs. */ > > - if (optimize_size) > > -rs6000_cost = TARGET_POWERPC64 ? _cost : _cost; > > - else > > -switch (rs6000_tune) > > - { > > - case PROCESSOR_RS64A: > > - rs6000_cost = _cost; > > - break; > > + switch (rs6000_tune) > > +{ > > +case PROCESSOR_RS64A: > > + rs6000_cost = _cost; > > + break; > > > > - case PROCESSOR_MPCCORE: > > - rs6000_cost = _cost; > > - break; > > +case PROCESSOR_MPCCORE: > > + rs6000_cost = _cost; > > + break; > > > > - case PROCESSOR_PPC403: > > - rs6000_cost = _cost; > > - break; > > +case PROCESSOR_PPC403: > > + rs6000_cost = _cost; > > + break; > > > > - case PROCESSOR_PPC405: > > - rs6000_cost = _cost; > > - break; > > +case PROCESSOR_PPC405: > > + rs6000_cost = _cost; > > + break; > > > > - case PROCESSOR_PPC440: > > - rs6000_cost = _cost; > > - break; > > +case PROCESSOR_PPC440: > > + rs6000_cost = _cost; > > + break; > > > > - case PROCESSOR_PPC476: > > - rs6000_cost = _cost; > > - break; > > +case PROCESSOR_PPC476: > > + rs6000_cost = _cost; > > + break; > > > > - case PROCESSOR_PPC601: > > - rs6000_cost = _cost; > > - break; > > +case P
Re: [PATCH 7/8] [RS6000] rs6000_rtx_costs reduce cost for SETs
Ping. On Tue, Jan 12, 2021 at 02:02:27PM +1030, Alan Modra wrote: > Ping > https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555758.html > > On Thu, Oct 08, 2020 at 09:27:59AM +1030, Alan Modra wrote: > > The aim of this patch is to make rtx_costs for SETs closer to > > insn_cost for SETs. One visible effect on powerpc code is increased > > if-conversion. > > > > * config/rs6000/rs6000.c (rs6000_rtx_costs): Reduce cost of SET > > operands. > > > > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c > > index 76aedbfae6f..d455aa52427 100644 > > --- a/gcc/config/rs6000/rs6000.c > > +++ b/gcc/config/rs6000/rs6000.c > > @@ -21684,6 +21684,35 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int > > outer_code, > > } > >return false; > > > > +case SET: > > + /* On entry the value in *TOTAL is the number of general purpose > > +regs being set, multiplied by COSTS_N_INSNS (1). Handle > > +costing of set operands specially since in most cases we have > > +an instruction rather than just a piece of RTL and should > > +return a cost comparable to insn_cost. That's a little > > +complicated because in some cases the cost of SET operands is > > +non-zero, see point 5 above and cost of PLUS for example, and > > +in others it is zero, for example for (set (reg) (reg)). > > +But (set (reg) (reg)) has the same insn_cost as > > +(set (reg) (plus (reg) (reg))). Hack around this by > > +subtracting COSTS_N_INSNS (1) from the operand cost in cases > > +were we add at least COSTS_N_INSNS (1) for some operation. > > +However, don't do so for constants. Constants might cost > > +more than zero when they require more than one instruction, > > +and we do want the cost of extra instructions. */ > > + { > > + rtx_code src_code = GET_CODE (SET_SRC (x)); > > + if (src_code == CONST_INT > > + || src_code == CONST_DOUBLE > > + || src_code == CONST_WIDE_INT) > > + return false; > > + int set_cost = (rtx_cost (SET_SRC (x), mode, SET, 1, speed) > > + + rtx_cost (SET_DEST (x), mode, SET, 0, speed)); > > + if (set_cost >= COSTS_N_INSNS (1)) > > + *total += set_cost - COSTS_N_INSNS (1); > > + return true; > > + } > > + > > default: > >return false; > > } -- Alan Modra Australia Development Lab, IBM
Re: [PATCH 5/8] [RS6000] rs6000_rtx_costs cost IOR
Ping. On Tue, Jan 12, 2021 at 02:02:18PM +1030, Alan Modra wrote: > Ping > https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555756.html > > On Thu, Oct 08, 2020 at 09:27:57AM +1030, Alan Modra wrote: > > * config/rs6000/rs6000.c (rotate_insert_cost): New function. > > (rs6000_rtx_costs): Cost IOR. > > > > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c > > index 383d2901c9f..15a806fe307 100644 > > --- a/gcc/config/rs6000/rs6000.c > > +++ b/gcc/config/rs6000/rs6000.c > > @@ -21206,6 +21206,91 @@ rs6000_cannot_copy_insn_p (rtx_insn *insn) > > && get_attr_cannot_copy (insn); > > } > > > > +/* Handle rtx_costs for scalar integer rotate and insert insns. */ > > + > > +static bool > > +rotate_insert_cost (rtx left, rtx right, machine_mode mode, bool speed, > > + int *total) > > +{ > > + if (GET_CODE (right) == AND > > + && CONST_INT_P (XEXP (right, 1)) > > + && UINTVAL (XEXP (left, 1)) + UINTVAL (XEXP (right, 1)) + 1 == 0) > > +{ > > + rtx leftop = XEXP (left, 0); > > + rtx rightop = XEXP (right, 0); > > + > > + /* rotlsi3_insert_5. */ > > + if (REG_P (leftop) > > + && REG_P (rightop) > > + && mode == SImode > > + && UINTVAL (XEXP (left, 1)) != 0 > > + && UINTVAL (XEXP (right, 1)) != 0 > > + && rs6000_is_valid_mask (XEXP (left, 1), NULL, NULL, mode)) > > + return true; > > + /* rotldi3_insert_6. */ > > + if (REG_P (leftop) > > + && REG_P (rightop) > > + && mode == DImode > > + && exact_log2 (-UINTVAL (XEXP (left, 1))) > 0) > > + return true; > > + /* rotldi3_insert_7. */ > > + if (REG_P (leftop) > > + && REG_P (rightop) > > + && mode == DImode > > + && exact_log2 (-UINTVAL (XEXP (right, 1))) > 0) > > + return true; > > + > > + rtx mask = 0; > > + rtx shift = leftop; > > + rtx_code shift_code = GET_CODE (shift); > > + /* rotl3_insert. */ > > + if (shift_code == ROTATE > > + || shift_code == ASHIFT > > + || shift_code == LSHIFTRT) > > + mask = right; > > + else > > + { > > + shift = rightop; > > + shift_code = GET_CODE (shift); > > + /* rotl3_insert_2. */ > > + if (shift_code == ROTATE > > + || shift_code == ASHIFT > > + || shift_code == LSHIFTRT) > > + mask = left; > > + } > > + if (mask > > + && CONST_INT_P (XEXP (shift, 1)) > > + && rs6000_is_valid_insert_mask (XEXP (mask, 1), shift, mode)) > > + { > > + *total += rtx_cost (XEXP (shift, 0), mode, shift_code, 0, speed); > > + *total += rtx_cost (XEXP (mask, 0), mode, AND, 0, speed); > > + return true; > > + } > > +} > > + /* rotl3_insert_3. */ > > + if (GET_CODE (right) == ASHIFT > > + && CONST_INT_P (XEXP (right, 1)) > > + && (INTVAL (XEXP (right, 1)) > > + == exact_log2 (UINTVAL (XEXP (left, 1)) + 1))) > > +{ > > + *total += rtx_cost (XEXP (left, 0), mode, AND, 0, speed); > > + *total += rtx_cost (XEXP (right, 0), mode, ASHIFT, 0, speed); > > + return true; > > +} > > + /* rotl3_insert_4. */ > > + if (GET_CODE (right) == LSHIFTRT > > + && CONST_INT_P (XEXP (right, 1)) > > + && mode == SImode > > + && (INTVAL (XEXP (right, 1)) > > + + exact_log2 (-UINTVAL (XEXP (left, 1 == 32) > > +{ > > + *total += rtx_cost (XEXP (left, 0), mode, AND, 0, speed); > > + *total += rtx_cost (XEXP (right, 0), mode, LSHIFTRT, 0, speed); > > + return true; > > +} > > + return false; > > +} > > + > > /* Compute a (partial) cost for rtx X. Return true if the complete > > cost has been computed, and false if subexpressions should be > > scanned. In either case, *TOTAL contains the cost result. > > @@ -21253,7 +21338,7 @@ static bool > > rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code, > > int opno ATTRIBUTE_UNUSED, int *total, bool speed) > > { > > - rtx right; > > + rtx left, right; > >int code = GET_CODE (x); > > > >switch (code) > > @@ -21435,7 +21520,7 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int > > outer_code, > >right = XEXP (x, 1); > >if (CONST_INT_P (right)) > > { > > - rtx left = XEXP (x, 0); > > + left = XEXP (x, 0); > > rtx_code left_code = GET_CODE (left); > > > > /* rotate-and-mask: 1 insn. */ > > @@ -21452,9 +21537,16 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int > > outer_code, > >return false; > > > > case IOR: > > - /* FIXME */ > >*total = COSTS_N_INSNS (1); > > - return true; > > + left = XEXP (x, 0); > > + if (GET_CODE (left) == AND > > + && CONST_INT_P (XEXP (left, 1))) > > + { > > + right = XEXP (x, 1); > > + if (rotate_insert_cost (left, right, mode, speed, total)) > > + return true; > > + } > > + return false; > > > > case CLZ: > > case XOR: -- Alan Modra Australia Development Lab, IBM
Re: [PATCH 4/8] [RS6000] rs6000_rtx_costs tidy break/return
Ping. On Tue, Jan 12, 2021 at 02:02:09PM +1030, Alan Modra wrote: > Ping > https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555755.html > > On Thu, Oct 08, 2020 at 09:27:56AM +1030, Alan Modra wrote: > > Most cases use "return false" rather than breaking out of the switch. > > Do so in all cases. > > > > * config/rs6000/rs6000.c (rs6000_rtx_costs): Tidy break/return. > > > > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c > > index bc5e51aa5ce..383d2901c9f 100644 > > --- a/gcc/config/rs6000/rs6000.c > > +++ b/gcc/config/rs6000/rs6000.c > > @@ -21371,7 +21371,7 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int > > outer_code, > > *total = rs6000_cost->fp; > >else > > *total = rs6000_cost->dmul; > > - break; > > + return false; > > > > case DIV: > > case MOD: > > @@ -21539,7 +21539,7 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int > > outer_code, > > *total = rs6000_cost->fp; > > return false; > > } > > - break; > > + return false; > > > > case NE: > > case EQ: > > @@ -21577,13 +21577,11 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int > > outer_code, > > *total = 0; > > return true; > > } > > - break; > > + return false; > > > > default: > > - break; > > + return false; > > } > > - > > - return false; > > } > > > > /* Debug form of r6000_rtx_costs that is selected if -mdebug=cost. */ -- Alan Modra Australia Development Lab, IBM
Re: [PATCH 3/8] [RS6000] rs6000_rtx_costs tidy AND
Ping. On Tue, Jan 12, 2021 at 02:01:57PM +1030, Alan Modra wrote: > Ping > https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555754.html > > On Thu, Oct 08, 2020 at 09:27:55AM +1030, Alan Modra wrote: > > * config/rs6000/rs6000.c (rs6000_rtx_costs): Tidy AND code. > > Don't avoid recursion on const_int shift count. > > > > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c > > index e870ba0039a..bc5e51aa5ce 100644 > > --- a/gcc/config/rs6000/rs6000.c > > +++ b/gcc/config/rs6000/rs6000.c > > @@ -21253,6 +21253,7 @@ static bool > > rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code, > > int opno ATTRIBUTE_UNUSED, int *total, bool speed) > > { > > + rtx right; > >int code = GET_CODE (x); > > > >switch (code) > > @@ -21430,7 +21431,9 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int > > outer_code, > >return false; > > > > case AND: > > - if (CONST_INT_P (XEXP (x, 1))) > > + *total = COSTS_N_INSNS (1); > > + right = XEXP (x, 1); > > + if (CONST_INT_P (right)) > > { > > rtx left = XEXP (x, 0); > > rtx_code left_code = GET_CODE (left); > > @@ -21439,17 +21442,13 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int > > outer_code, > > if ((left_code == ROTATE > >|| left_code == ASHIFT > >|| left_code == LSHIFTRT) > > - && rs6000_is_valid_shift_mask (XEXP (x, 1), left, mode)) > > + && rs6000_is_valid_shift_mask (right, left, mode)) > > { > > - *total = rtx_cost (XEXP (left, 0), mode, left_code, 0, speed); > > - if (!CONST_INT_P (XEXP (left, 1))) > > - *total += rtx_cost (XEXP (left, 1), SImode, left_code, 1, > > speed); > > - *total += COSTS_N_INSNS (1); > > + *total += rtx_cost (XEXP (left, 0), mode, left_code, 0, speed); > > + *total += rtx_cost (XEXP (left, 1), SImode, left_code, 1, speed); > > return true; > > } > > } > > - > > - *total = COSTS_N_INSNS (1); > >return false; > > > > case IOR: -- Alan Modra Australia Development Lab, IBM
Re: [PATCH] binuitils: Check if AR is usable for LTO build
On Mon, Jan 11, 2021 at 02:52:43PM -0800, H.J. Lu wrote: > On Mon, Jan 11, 2021 at 1:20 PM Alan Modra wrote: > > > > On Mon, Jan 11, 2021 at 11:53:15AM -0800, H.J. Lu via Binutils wrote: > > > Check if AR is usable for LTO build with --enable-pgo-build=lto: > > > > > > checking for -plugin option... ar: no operation specified > > > Failed: ar --plugin > > > /usr/gcc-11.0.0-x32/libexec/gcc/x86_64-pc-linux-gnu/11.0.0/liblto_plugin.so > > > rc > > > no > > > configure: error: AR with --plugin and rc is required for LTO build > > > > > > instead of build failure later. > > > > > > PR binutils/26766 > > > * configure.ac: > > > * configure: Regenerated. > > > > See pr27173 too. The problem isn't a matter of finding an "ar" that > > supports --plugin, we have versions of GNU ar (2.30 to 2.32?) that > > accept --plugin but then don't parse the "rc" or other command > > options. I don't think this patch will help. > > PR 27173 patches are at > > https://sourceware.org/pipermail/binutils/2021-January/114879.html After that one is committed, this patch is OK too (with any needed modifications). -- Alan Modra Australia Development Lab, IBM
Re: V2 [PATCH 1/2] GCC: Check if AR works with --plugin and rc
On Mon, Jan 11, 2021 at 04:07:22PM -0800, H.J. Lu wrote: > These are not fatal errors. Here is the updated patch to use > AC_MSG_WARN instead. OK for master? OK by me. Please squash the two patches. -- Alan Modra Australia Development Lab, IBM
Re: [PATCH 1/2] GCC: Check if AR works with --plugin and rc
On Mon, Jan 11, 2021 at 08:57:05AM -0800, H.J. Lu via Binutils wrote: > diff --git a/config/gcc-plugin.m4 b/config/gcc-plugin.m4 > index c5b72e9a13d..798a2054edd 100644 > --- a/config/gcc-plugin.m4 > +++ b/config/gcc-plugin.m4 > @@ -145,6 +145,18 @@ for plugin in $plugin_names; do > break >fi > done > +dnl Check if ${AR} $plugin_option rc works. > +AC_CHECK_TOOL(AR, ar) > +if test "${AR}" = "" ; then > + AC_MSG_ERROR([Required archive tool 'ar' not found on PATH.]) > +fi > +touch conftest.c > +${AR} $plugin_option rc conftest.a conftest.c > +if test "$?" != 0; then > + echo "Failed: ${AR} $plugin_option rc" Use AC_MSG_ERROR rather than echo. > + plugin_option= > +fi > +rm -f conftest.* > if test -n "$plugin_option"; then >$1="$plugin_option" >AC_MSG_RESULT($plugin_option) > diff --git a/libtool.m4 b/libtool.m4 > index 3672e9516e2..150971974c1 100644 > --- a/libtool.m4 > +++ b/libtool.m4 > @@ -1340,7 +1340,14 @@ AC_CHECK_TOOL(AR, ar, false) > test -z "$AR" && AR=ar > if test -n "$plugin_option"; then >if $AR --help 2>&1 | grep -q "\--plugin"; then > -AR="$AR $plugin_option" > +touch conftest.c > +$AR $plugin_option rc conftest.a conftest.c > +if test "$?" != 0; then > + echo "Failed: $AR $plugin_option rc" AC_MSG_ERROR again. > +else > + AR="$AR $plugin_option" > +fi > +rm -f conftest.* >fi > fi > test -z "$AR_FLAGS" && AR_FLAGS=cru -- Alan Modra Australia Development Lab, IBM
Re: [PATCH 0/2] Check if AR works with --plugin and rc
On Mon, Jan 11, 2021 at 08:57:04AM -0800, H.J. Lu via Binutils wrote: > Check if AR works with --plugin and rc before passing --plugin to AR and > RANLIB. Thanks for looking at this, but next time please assign the bug to yourself. I fixed the bug too this morning, before seeing your email. -- Alan Modra Australia Development Lab, IBM
Re: [PATCH 0/8] [RS6000] rs6000_rtx_costs V2
On Sat, Dec 05, 2020 at 07:42:07PM +1030, Alan Modra wrote: > Hi Segher, > I've been holding off pinging these knowing you had a lot of other > review work, but maybe that's settling down now? You already OK'd > 1/8, 2/8 and 6/8. Ping. > [PATCH 3/8] [RS6000] rs6000_rtx_costs tidy AND > https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555754.html > > [PATCH 4/8] [RS6000] rs6000_rtx_costs tidy break/return > https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555755.html > > [PATCH 5/8] [RS6000] rs6000_rtx_costs cost IOR > https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555756.html > > [PATCH 7/8] [RS6000] rs6000_rtx_costs reduce cost for SETs > https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555758.html > > [PATCH 8/8] [RS6000] rs6000_rtx_costs for !speed > https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555759.html > > [RS6000] rotate and mask constants > https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555760.html > > [RS6000] Adjust testcases for power10 instructions V3 > https://gcc.gnu.org/pipermail/gcc-patches/2020-October/557587.html -- Alan Modra Australia Development Lab, IBM
Re: [PATCH] binuitils: Check if AR is usable for LTO build
On Mon, Jan 11, 2021 at 11:53:15AM -0800, H.J. Lu via Binutils wrote: > Check if AR is usable for LTO build with --enable-pgo-build=lto: > > checking for -plugin option... ar: no operation specified > Failed: ar --plugin > /usr/gcc-11.0.0-x32/libexec/gcc/x86_64-pc-linux-gnu/11.0.0/liblto_plugin.so rc > no > configure: error: AR with --plugin and rc is required for LTO build > > instead of build failure later. > > PR binutils/26766 > * configure.ac: > * configure: Regenerated. See pr27173 too. The problem isn't a matter of finding an "ar" that supports --plugin, we have versions of GNU ar (2.30 to 2.32?) that accept --plugin but then don't parse the "rc" or other command options. I don't think this patch will help. > --- > configure| 4 > configure.ac | 4 > 2 files changed, 8 insertions(+) > > diff --git a/configure b/configure > index c44184f72ff..84285addafe 100755 > --- a/configure > +++ b/configure > @@ -10240,6 +10240,10 @@ if test -n "$PLUGIN_OPTION"; then >if $RANLIB --help 2>&1 | grep -q "\--plugin"; then > RANLIB_PLUGIN_OPTION="$PLUGIN_OPTION" >fi > +else > + if test "$enable_pgo_build" != "no"; then > +as_fn_error $? "AR with --plugin and rc is required for LTO build" > "$LINENO" 5 > + fi > fi > > > diff --git a/configure.ac b/configure.ac > index 9dd51c36e5a..d39019d7093 100644 > --- a/configure.ac > +++ b/configure.ac > @@ -3452,6 +3452,10 @@ if test -n "$PLUGIN_OPTION"; then >if $RANLIB --help 2>&1 | grep -q "\--plugin"; then > RANLIB_PLUGIN_OPTION="$PLUGIN_OPTION" >fi > +else > + if test "$enable_pgo_build" != "no"; then > +AC_MSG_ERROR([AR with --plugin and rc is required for LTO build]) > + fi > fi > AC_SUBST(AR_PLUGIN_OPTION) > AC_SUBST(RANLIB_PLUGIN_OPTION) > -- > 2.29.2 -- Alan Modra Australia Development Lab, IBM
Re: [PATCH toplevel] libctf: new testsuite
On Tue, Jan 05, 2021 at 03:25:10PM +, Nick Alcock wrote: > This enables 'make libctf-check', used by a new libctf testsuite in > binutils. > > 2021-01-05 Nick Alcock > > * Makefile.def (libctf): No longer no_check. Checking depends on > all-ld. > * Makefile.in: Regenerated. > > --- > > Makefile.def | 4 +- > Makefile.in | 13 + > > This is a stripped-down top-level-only subset of commit > c59e30ed1727135f8efb79890f2c458f73709757 in binutils-gdb.git. (Because > it is identical to what has already landed in binutils, it should apply > without trouble in syncs back to there.) > > I don't have permission to push this: Alan has offered to do so. It doesn't apply due to gcc missing binutils 87279e3cef5b2c5 changes too. I could fix that easily enough but I'm going to ask that you post a combined patch to bring the gcc repo up to date with any libctf changes. -- Alan Modra Australia Development Lab, IBM
Re: [PATCH, powerpc] testsuite update tests for powerpc power10 target codegen.
On Mon, Dec 07, 2020 at 05:49:05PM -0600, will schmidt via Gcc-patches wrote: > [PATCH, powerpc] testsuite update tests for powerpc power10 target codegen. Appears to duplicate work I did earlier, https://gcc.gnu.org/pipermail/gcc-patches/2020-October/557587.html Except I omitted fold-vec-store-builtin_vec_xst-longlong.c, due to -mdejagnu-cpu=power8 in that test meaning we don't see any power10 insns. -- Alan Modra Australia Development Lab, IBM
Re: testsuite: Adjust target requirements for sad-vectorize and signbit
On Thu, Oct 29, 2020 at 10:10:58PM +1030, Alan Modra wrote: > Fixes > FAIL: gcc.target/powerpc/signbit-1.c scan-assembler-not stxvd2x > FAIL: gcc.target/powerpc/signbit-1.c scan-assembler-times mfvsrd 3 > FAIL: gcc.target/powerpc/signbit-1.c scan-assembler-times srdi 3 > FAIL: gcc.target/powerpc/signbit-2.c scan-assembler-times ld 1 > FAIL: gcc.target/powerpc/signbit-2.c scan-assembler-times srdi 1 > on powerpc-linux (or powerpc64-linux biarch -m32). David, This patch is fixing a regression caused by one of your testsuite patches. Please review. https://gcc.gnu.org/pipermail/gcc-patches/2020-October/557443.html > signbit-1.c is quite obviously a 64-bit only testcase given the > scan-assembler directives, and the purpose of the testcase to verify > the 64-bit only UNSPEC_SIGNBIT patterns. It could be made to pass for > -m32 by adding -mpowerpc64, but that option that isn't very effective > when bi-arch testing and results in errors on rs6000-aix. And it is > pointless to match -m32 stores to the stack followed by loads, which > is what we do at the moment. > > signbit-2.c on the other hand has more reasonable 32-bit output. > > Regression tested powerpc64-linux biarch. > > * gcc.target/powerpc/signbit-1.c: Reinstate lp64 condition. > * gcc.target/powerpc/signbit-2.c: Match 32-bit output too. > > diff --git a/gcc/testsuite/gcc.target/powerpc/signbit-1.c > b/gcc/testsuite/gcc.target/powerpc/signbit-1.c > index eb4f53e397d..1642bf46d7a 100644 > --- a/gcc/testsuite/gcc.target/powerpc/signbit-1.c > +++ b/gcc/testsuite/gcc.target/powerpc/signbit-1.c > @@ -1,4 +1,5 @@ > /* { dg-do compile } */ > +/* { dg-require-effective-target lp64 } */ > /* { dg-require-effective-target ppc_float128_sw } */ > /* { dg-require-effective-target powerpc_p8vector_ok } */ > /* { dg-options "-mdejagnu-cpu=power8 -O2 -mfloat128" } */ > diff --git a/gcc/testsuite/gcc.target/powerpc/signbit-2.c > b/gcc/testsuite/gcc.target/powerpc/signbit-2.c > index ff6af963dda..1b792916eba 100644 > --- a/gcc/testsuite/gcc.target/powerpc/signbit-2.c > +++ b/gcc/testsuite/gcc.target/powerpc/signbit-2.c > @@ -13,5 +13,7 @@ int do_signbit_kf (__float128 *a) { return > __builtin_signbit (*a); } > /* { dg-final { scan-assembler-not "lxvw4x" } } */ > /* { dg-final { scan-assembler-not "lxsd" } } */ > /* { dg-final { scan-assembler-not "lxsdx"} } */ > -/* { dg-final { scan-assembler-times "ld" 1 } } */ > -/* { dg-final { scan-assembler-times "srdi" 1 } } */ > +/* { dg-final { scan-assembler-times "ld" 1 { target lp64 } } } */ > +/* { dg-final { scan-assembler-times "srdi" 1 { target lp64 } } } */ > +/* { dg-final { scan-assembler-times "lwz"1 { target ilp32 } } } */ > +/* { dg-final { scan-assembler-times "rlwinm" 1 { target ilp32 } } } */ -- Alan Modra Australia Development Lab, IBM
Re: [PATCH 0/8] [RS6000] rs6000_rtx_costs V2
Hi Segher, I've been holding off pinging these knowing you had a lot of other review work, but maybe that's settling down now? You already OK'd 1/8, 2/8 and 6/8. [PATCH 3/8] [RS6000] rs6000_rtx_costs tidy AND https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555754.html [PATCH 4/8] [RS6000] rs6000_rtx_costs tidy break/return https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555755.html [PATCH 5/8] [RS6000] rs6000_rtx_costs cost IOR https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555756.html [PATCH 7/8] [RS6000] rs6000_rtx_costs reduce cost for SETs https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555758.html [PATCH 8/8] [RS6000] rs6000_rtx_costs for !speed https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555759.html [RS6000] rotate and mask constants https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555760.html [RS6000] Adjust testcases for power10 instructions V3 https://gcc.gnu.org/pipermail/gcc-patches/2020-October/557587.html -- Alan Modra Australia Development Lab, IBM
[RS6000] Use LIB2_SIDITI_CONV_FUNCS in place of ppc64-fp.c
This patch retires ppc64-fp.c in favour of using "LIB2_SIDITI_CONV_FUNCS = yes", which is a lot better solution than having a copy of selected libgcc2.c functions. So for powerpc64-linux we see these changes in libgcc files (plus corresponding _s.o variants). +_fixdfti.o +_fixsfti.o +_fixtfti.o +_fixunsdfti.o +_fixunssfti.o +_fixunstfti.o +_floattidf.o +_floattisf.o +_floattitf.o +_floatuntidf.o +_floatuntisf.o +_floatuntitf.o -ppc64-fp.o with these empty objects also appearing (plus _s.o variants). +_fixunsxfti.o +_fixxfti.o +_floattixf.o +_floatuntixf.o In reality we aren't getting new TI mode conversions as it might seem, because the old *di*.o files corresponding to the above files contained TI mode conversions, whereas now they contain DI mode conversions. Those match the functions provided in ppc64-fp.o, and the set of dynamic libgcc_s.so.1 symbols is identical, apart from values, to before this patch. For ppc32 we get a whole lot more empty objects replacing the empty ppc64-fp.o. Again the set of global symbol in libgcc.a and dynamic symbols in libgcc_s.so.1 are unchanged. Bootstrapped and regression tested powerpc64-linux, powerpc64le-linux, powerpc-linux and powerpc-ibm-aix7.2.4.0. OK? * config/rs6000/t-ppc64-fp (LIB2ADD): Delete. (LIB2_SIDITI_CONV_FUNCS): Define. * config/rs6000/ppc64-fp.c: Delete file. diff --git a/libgcc/config/rs6000/t-ppc64-fp b/libgcc/config/rs6000/t-ppc64-fp index 26d1730bcdb..999679fc3cb 100644 --- a/libgcc/config/rs6000/t-ppc64-fp +++ b/libgcc/config/rs6000/t-ppc64-fp @@ -1,2 +1 @@ -# Can be used unconditionally, wrapped in __powerpc64__ || __64BIT__ __ppc64__. -LIB2ADD += $(srcdir)/config/rs6000/ppc64-fp.c +LIB2_SIDITI_CONV_FUNCS = yes -- Alan Modra Australia Development Lab, IBM
Re: [PATCH 1/3] Refactor copying decl section names
On Tue, Nov 10, 2020 at 09:19:37PM -0700, Jeff Law wrote: > I think the const char * is fine. That should force resolution to the > same routine we were using earlier. Do you want to commit that fix or > shall I? Commited 693a79a355e1. -- Alan Modra Australia Development Lab, IBM
Re: [PATCH 1/3] Refactor copying decl section names
On Tue, Nov 10, 2020 at 11:45:37AM -0700, Jeff Law wrote: > > On 11/12/19 11:28 PM, Strager Neds wrote: > > * gcc/cgraph.h (symtab_node::get_section): Constify. > > (symtab_node::set_section): Declare new overload. > > * gcc/symtab.c (symtab_node::set_section): Define new overload. > > (symtab_node::copy_visibility_from): Use new overload of > > symtab_node::set_section. > > (symtab_node::resolve_alias): Same. > > * gcc/tree.h (set_decl_section_name): Declare new overload. > > * gcc/tree.c (set_decl_section_name): Define new overload. > > * gcc/c/c-decl.c (merge_decls): Use new overload of > > set_decl_section_name. > > * gcc/cp/decl.c (duplicate_decls): Same. > > * gcc/cp/method.c (use_thunk): Same. > > * gcc/cp/optimize.c (maybe_clone_body): Same. > > * gcc/d/decl.cc (finish_thunk): Same. > > * gcc/tree-emutls.c (get_emutls_init_templ_addr): Same. > > * gcc/cgraphclones.c (cgraph_node::create_virtual_clone): Use new > > overload of symtab_node::set_section. > > (cgraph_node::create_version_clone_with_body): Same. > > * gcc/trans-mem.c (ipa_tm_create_version): Same. > > I adjusted the ChangeLog, added an entry for the coroutines.cc addition > I made and pushed this to the trunk. /home/alan/src/gcc/gcc/go/go-gcc.cc:2759:44: error: call of overloaded ‘set_decl_section_name(tree_node*&, std::nullptr_t)’ is ambiguous set_decl_section_name (var_decl, NULL); ^ In file included from /home/alan/src/gcc/gcc/go/go-gcc.cc:27:0: /home/alan/src/gcc/gcc/tree.h:4283:13: note: candidate: void set_decl_section_name(tree, const char*) extern void set_decl_section_name (tree, const char *); ^ /home/alan/src/gcc/gcc/tree.h:4284:13: note: candidate: void set_decl_section_name(tree, const_tree) extern void set_decl_section_name (tree, const_tree); ^ I'm using the obvious to me "(const char *) NULL" in the call to fix this, but you might like a different style C++ fix instead. -- Alan Modra Australia Development Lab, IBM
[PATCH 2/2] [RS6000] REG_PARM_STACK_SPACE check V2
On PowerPC we can tail call if the callee has less or equal REG_PARM_STACK_SPACE than the caller, as demonstrated by the testcase. So we should use /* If reg parm stack space increases, we cannot sibcall. */ if (REG_PARM_STACK_SPACE (decl ? decl : fntype) > INCOMING_REG_PARM_STACK_SPACE (current_function_decl)) and note the change to use INCOMING_REG_PARM_STACK_SPACE. REG_PARM_STACK_SPACE has always been wrong there for PowerPC. See https://gcc.gnu.org/pipermail/gcc-patches/2014-May/389867.html for why if you're curious. Not that it matters, because PowerPC can do without this check entirely, relying on a stack slot test in generic code. a) The generic code checks that arg passing stack in the callee is not greater than that in the caller, and, b) ELFv2 only allocates reg_parm_stack_space when some parameter is passed on the stack. Point (b) means that zero reg_parm_stack_space implies zero stack space, and non-zero reg_parm_stack_space implies non-zero stack space. So the case of 0 reg_parm_stack_space in the caller and 64 in the callee will be caught by (a). Bootstrapped and regression tested powerpc64le-linux and biarch powerpc64-linux. OK? PR middle-end/97267 gcc/ * config/rs6000/rs6000-logue.c (rs6000_function_ok_for_sibcall): Remove code checking REG_PARM_STACK_SPACE. testsuite/ * gcc.target/powerpc/pr97267.c: New test. diff --git a/gcc/config/rs6000/rs6000-logue.c b/gcc/config/rs6000/rs6000-logue.c index 61eb7ce7ade..d90cd5736e1 100644 --- a/gcc/config/rs6000/rs6000-logue.c +++ b/gcc/config/rs6000/rs6000-logue.c @@ -30,7 +30,6 @@ #include "df.h" #include "tm_p.h" #include "ira.h" -#include "calls.h" #include "print-tree.h" #include "varasm.h" #include "explow.h" @@ -1134,19 +1133,6 @@ rs6000_function_ok_for_sibcall (tree decl, tree exp) else fntype = TREE_TYPE (TREE_TYPE (CALL_EXPR_FN (exp))); - /* If outgoing reg parm stack space changes, we cannot do sibcall. */ - if ((OUTGOING_REG_PARM_STACK_SPACE (fntype) - != OUTGOING_REG_PARM_STACK_SPACE (TREE_TYPE (current_function_decl))) - || (REG_PARM_STACK_SPACE (decl ? decl : fntype) - != REG_PARM_STACK_SPACE (current_function_decl))) -{ - maybe_complain_about_tail_call (exp, - "inconsistent size of stack space" - " allocated for arguments which are" - " passed in registers"); - return false; -} - /* We can't do it if the called function has more vector parameters than the current function; there's nowhere to put the VRsave code. */ if (TARGET_ALTIVEC_ABI diff --git a/gcc/testsuite/gcc.target/powerpc/pr97267.c b/gcc/testsuite/gcc.target/powerpc/pr97267.c new file mode 100644 index 000..cab46245fc9 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr97267.c @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +static int __attribute__ ((__noclone__, __noinline__)) +reg_args (int j1, int j2, int j3, int j4, int j5, int j6, int j7, int j8) +{ + return j1 + j2 + j3 + j4 + j5 + j6 + j7 + j8; +} + +int __attribute__ ((__noclone__, __noinline__)) +stack_args (int j1, int j2, int j3, int j4, int j5, int j6, int j7, int j8, + int j9) +{ + if (j9 == 0) +return 0; + return reg_args (j1, j2, j3, j4, j5, j6, j7, j8); +} + +/* { dg-final { scan-assembler-not {(?n)^\s+bl\s} } } */ -- Alan Modra Australia Development Lab, IBM
[PATCH 1/2] can_implement_as_sibling_call_p REG_PARM_STACK_SPACE check V2
On Sat, Oct 31, 2020 at 12:10:35AM +1030, Alan Modra wrote: > Would it be better if I post the patches again, restructuring them as > 1) completely no functional change just moving the existing condition >to the powerpc and i386 target hooks, and > 2) twiddling the powerpc target hook? The no function change patch. This moves an #ifdef block of code from calls.c to targetm.function_ok_for_sibcall. Only two targets, x86 and rs6000, define REG_PARM_STACK_SPACE or OUTGOING_REG_PARM_STACK_SPACE macros that might vary depending on the called function. Macros like UNITS_PER_WORD don't change over a function boundary, nor does the MIPS ABI, nor does TARGET_64BIT on PA-RISC. Other targets are even more trivially proven to not need the calls.c code. Besides cleaning up a small piece of #ifdef code, the motivation for this patch is to allow tail calls on PowerPC for functions that require less reg_parm_stack_space than their caller. The original code in calls.c only permitted tail calls when exactly equal. That will be done in a followup patch (removing the code added to rs6000-logue.c in this patch). Bootstrapped and regression tested powerpc64le-linux, powerpc64-linux, and x86_64-linux. OK? PR middle-end/97267 * calls.h (maybe_complain_about_tail_call): Declare. * calls.c (maybe_complain_about_tail_call): Make global. (can_implement_as_sibling_call_p): Delete reg_parm_stack_space param. Adjust caller. Move REG_PARM_STACK_SPACE check to.. * config/i386/i386.c (ix86_function_ok_for_sibcall): ..here, and.. * config/rs6000/rs6000-logue.c (rs6000_function_ok_for_sibcall): .. here. diff --git a/gcc/calls.c b/gcc/calls.c index a8f459632f2..1a7632d2d48 100644 --- a/gcc/calls.c +++ b/gcc/calls.c @@ -1922,7 +1922,7 @@ maybe_warn_nonstring_arg (tree fndecl, tree exp) /* Issue an error if CALL_EXPR was flagged as requiring tall-call optimization. */ -static void +void maybe_complain_about_tail_call (tree call_expr, const char *reason) { gcc_assert (TREE_CODE (call_expr) == CALL_EXPR); @@ -3525,7 +3525,6 @@ static bool can_implement_as_sibling_call_p (tree exp, rtx structure_value_addr, tree funtype, -int reg_parm_stack_space ATTRIBUTE_UNUSED, tree fndecl, int flags, tree addr, @@ -3550,20 +3549,6 @@ can_implement_as_sibling_call_p (tree exp, return false; } -#ifdef REG_PARM_STACK_SPACE - /* If outgoing reg parm stack space changes, we cannot do sibcall. */ - if (OUTGOING_REG_PARM_STACK_SPACE (funtype) - != OUTGOING_REG_PARM_STACK_SPACE (TREE_TYPE (current_function_decl)) - || (reg_parm_stack_space != REG_PARM_STACK_SPACE (current_function_decl))) -{ - maybe_complain_about_tail_call (exp, - "inconsistent size of stack space" - " allocated for arguments which are" - " passed in registers"); - return false; -} -#endif - /* Check whether the target is able to optimize the call into a sibcall. */ if (!targetm.function_ok_for_sibcall (fndecl, exp)) @@ -4088,7 +4073,6 @@ expand_call (tree exp, rtx target, int ignore) try_tail_call = can_implement_as_sibling_call_p (exp, structure_value_addr, funtype, -reg_parm_stack_space, fndecl, flags, addr, args_size); diff --git a/gcc/calls.h b/gcc/calls.h index f32b6308b58..b20d24bb888 100644 --- a/gcc/calls.h +++ b/gcc/calls.h @@ -133,6 +133,7 @@ extern bool reference_callee_copied (CUMULATIVE_ARGS *, extern void maybe_warn_alloc_args_overflow (tree, tree, tree[2], int[2]); extern tree get_attr_nonstring_decl (tree, tree * = NULL); extern bool maybe_warn_nonstring_arg (tree, tree); +extern void maybe_complain_about_tail_call (tree, const char *); enum size_range_flags { /* Set to consider zero a valid range. */ diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 502d24057b5..809c145b638 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -939,6 +939,19 @@ ix86_function_ok_for_sibcall (tree decl, tree exp) decl_or_type = type; } + /* If outgoing reg parm stack space changes, we cannot do sibcall. */ + if ((OUTGOING_REG_PARM_STACK_SPACE (type) + != OUTGOING_REG_PARM_STACK_SPACE (TREE_TYPE (current_function_decl))) + || (REG_PARM_STACK_SPACE (decl_or_type) + != REG_PARM_STACK_SPACE (current_function_decl))) +{
Re: [PATCH] PowerPC: PR libgcc/97543, build libgcc with -mno-gnu-attribute
s6000/t-linux | 18 +- > 1 file changed, 17 insertions(+), 1 deletion(-) > > diff --git a/libgcc/config/rs6000/t-linux b/libgcc/config/rs6000/t-linux > index ed821947b66..b2a079c6b54 100644 > --- a/libgcc/config/rs6000/t-linux > +++ b/libgcc/config/rs6000/t-linux > @@ -1,6 +1,22 @@ > SHLIB_MAPFILES += $(srcdir)/config/rs6000/libgcc-glibc.ver > > -HOST_LIBGCC2_CFLAGS += -mlong-double-128 > +# On the modules that deal with IBM 128-bit values, we need to make sure that > +# TFmode uses the IBM extended double format. > +IBM128_OBJS = ibm-ldouble$(objext) _powitf2$(objext) ppc64-fp$(objext) \ > + _divtc3$(object) _multc3$(object) \ > + _fixtfdi$(object) _fixunstfdi$(object) \ > + _floatditf$(objext) _floatunsditf$(objext) > + > +IBM128_S_OBJS= $(patsubst %$(objext),%_s$(objext),$(IBM128_OBJS)) > +IBM128_ALL_OBJS = $(IBM128_OBJS) $(IBM128_S_OBJS) > + > +IBM128_CFLAGS= -mlong-double-128 -Wno-psabi -mabi=ibmlongdouble > + > +$(IBM128_ALL_OBJS) : INTERNAL_CFLAGS += $(IBM128_CFLAGS) > + > +# Turn off gnu attributes for the whole library. This allows us to build > +# libgcc that supports the different long double formats. > +HOST_LIBGCC2_CFLAGS += -mno-gnu-attribute It would be good if you could pass -mno-gnu-attribute just for the libgcc_s.so objects. That way you might save users from making mistakes due to using the wrong libgcc.a somehow. I think you may be able to do that with gcc_s_compile += -mno-gnu-attribute And, yes, I completely agree that libgcc_s.so should not be marked with .gnu_attribute. > # This is a way of selecting -mcmodel=small for ppc64, which gives > # smaller and faster libgcc code. Directly specifying -mcmodel=small > -- > 2.22.0 > > > -- > Michael Meissner, IBM > IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA > email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797 -- Alan Modra Australia Development Lab, IBM
Re: [PATCH] calls.c:can_implement_as_sibling_call_p REG_PARM_STACK_SPACE check
On Fri, Oct 30, 2020 at 09:21:09AM +, Richard Sandiford wrote: > Alan Modra via Gcc-patches writes: > > This moves an #ifdef block of code from calls.c to > > targetm.function_ok_for_sibcall. Only two targets, x86 and rs6000, > > define REG_PARM_STACK_SPACE or OUTGOING_REG_PARM_STACK_SPACE macros > > that might vary depending on the called function. Macros like > > UNITS_PER_WORD don't change over a function boundary, nor does the > > MIPS ABI, nor does TARGET_64BIT on PA-RISC. Other targets are even > > more trivially seen to not need the calls.c code. > > > > Besides cleaning up a small piece of #ifdef code, the motivation for > > this patch is to allow tail calls on PowerPC for functions that > > require less reg_parm_stack_space than their caller. The original > > code in calls.c only permitted tail calls when exactly equal. > > Is there something PowerPC-specific that makes the relaxation safe > for that target while not being safe on x86? It is quite possible that x86 can relax this condition too, I'm just not familiar enough with all the x86 ABIs know with any certainty. By moving the test to the target hook we allow target maintainers to have full say in the matter. > I take your point about x86 and PowerPC being the only two affected > targets. But the interface does still take an fndecl on all targets, > so I think the target-independent assumption should be that the value > might vary depending on function. So I guess an alternative would be > to relax the target-independent condition and make the x86 hook enforce > the stricter condition (if it really is needed). Yes, except that actually the REG_PARM_STACK_SPACE condition for PowerPC can be removed entirely. I agree that doing as you suggest would be OK for PowerPC, it would just mean we continue to do some unnecessary work in the non-trivial rs6000_function_parms_need_stack. Would it be better if I post the patches again, restructuring them as 1) completely no functional change just moving the existing condition to the powerpc and i386 target hooks, and 2) twiddling the powerpc target hook? Thanks for your time spent reviewing, and comments! -- Alan Modra Australia Development Lab, IBM
[RS6000] Adjust testcases for power10 instructions V3
addic\M} 4 { target { ! has_arch_pwr10 } } } } */ +/* { dg-final { scan-assembler-times {\msubfe\M} 1 { target { ! has_arch_pwr10 } } } } */ +/* { dg-final { scan-assembler-times {\maddic\M} 3 { target { has_arch_pwr10 } } } } */ +/* { dg-final { scan-assembler-times {\msetbcr\M} 1 { target { has_arch_pwr10 } } } } */ +/* { dg-final { scan-assembler-times {\maddze\M} 3 } } */ long ne0(long a) { diff --git a/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c b/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c index 1269fe635c6..72b83ef9d08 100644 --- a/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c +++ b/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c @@ -30,5 +30,5 @@ vector signed long long splats4(void) /* { dg-final { scan-assembler-times {\mvspltis[bhw]\M} 0 } } */ /* { dg-final { scan-assembler-times {\mvsl[bhwd]\M} 0 } } */ -/* { dg-final { scan-assembler-times {\mlvx\M|\mlxv\M|\mlxvd2x\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mlvx\M|\mp?lxv\M|\mlxvd2x\M} 2 } } */ diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index 8439720baea..7e2dd358640 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -5903,6 +5903,16 @@ proc check_effective_target_has_arch_pwr9 { } { }] } +proc check_effective_target_has_arch_pwr10 { } { + return [check_no_compiler_messages arch_pwr10 assembly { + #ifndef _ARCH_PWR10 + #error does not have power10 support. + #else + /* "has power10 support" */ + #endif + }] +} + # Return 1 if this is a PowerPC target supporting -mcpu=power10. # Limit this to 64-bit linux systems for now until other targets support # power10. -- Alan Modra Australia Development Lab, IBM
Re: testsuite: Adjust target requirements for sad-vectorize and signbit
Fixes FAIL: gcc.target/powerpc/signbit-1.c scan-assembler-not stxvd2x FAIL: gcc.target/powerpc/signbit-1.c scan-assembler-times mfvsrd 3 FAIL: gcc.target/powerpc/signbit-1.c scan-assembler-times srdi 3 FAIL: gcc.target/powerpc/signbit-2.c scan-assembler-times ld 1 FAIL: gcc.target/powerpc/signbit-2.c scan-assembler-times srdi 1 on powerpc-linux (or powerpc64-linux biarch -m32). signbit-1.c is quite obviously a 64-bit only testcase given the scan-assembler directives, and the purpose of the testcase to verify the 64-bit only UNSPEC_SIGNBIT patterns. It could be made to pass for -m32 by adding -mpowerpc64, but that option that isn't very effective when bi-arch testing and results in errors on rs6000-aix. And it is pointless to match -m32 stores to the stack followed by loads, which is what we do at the moment. signbit-2.c on the other hand has more reasonable 32-bit output. Regression tested powerpc64-linux biarch. * gcc.target/powerpc/signbit-1.c: Reinstate lp64 condition. * gcc.target/powerpc/signbit-2.c: Match 32-bit output too. diff --git a/gcc/testsuite/gcc.target/powerpc/signbit-1.c b/gcc/testsuite/gcc.target/powerpc/signbit-1.c index eb4f53e397d..1642bf46d7a 100644 --- a/gcc/testsuite/gcc.target/powerpc/signbit-1.c +++ b/gcc/testsuite/gcc.target/powerpc/signbit-1.c @@ -1,4 +1,5 @@ /* { dg-do compile } */ +/* { dg-require-effective-target lp64 } */ /* { dg-require-effective-target ppc_float128_sw } */ /* { dg-require-effective-target powerpc_p8vector_ok } */ /* { dg-options "-mdejagnu-cpu=power8 -O2 -mfloat128" } */ diff --git a/gcc/testsuite/gcc.target/powerpc/signbit-2.c b/gcc/testsuite/gcc.target/powerpc/signbit-2.c index ff6af963dda..1b792916eba 100644 --- a/gcc/testsuite/gcc.target/powerpc/signbit-2.c +++ b/gcc/testsuite/gcc.target/powerpc/signbit-2.c @@ -13,5 +13,7 @@ int do_signbit_kf (__float128 *a) { return __builtin_signbit (*a); } /* { dg-final { scan-assembler-not "lxvw4x" } } */ /* { dg-final { scan-assembler-not "lxsd" } } */ /* { dg-final { scan-assembler-not "lxsdx"} } */ -/* { dg-final { scan-assembler-times "ld" 1 } } */ -/* { dg-final { scan-assembler-times "srdi" 1 } } */ +/* { dg-final { scan-assembler-times "ld" 1 { target lp64 } } } */ +/* { dg-final { scan-assembler-times "srdi" 1 { target lp64 } } } */ +/* { dg-final { scan-assembler-times "lwz"1 { target ilp32 } } } */ +/* { dg-final { scan-assembler-times "rlwinm" 1 { target ilp32 } } } */ -- Alan Modra Australia Development Lab, IBM
Re: [RS6000] float128-type-2.c unsupported
On Wed, Oct 28, 2020 at 11:35:07PM -0400, David Edelsohn wrote: > Alan, > > It is disrespectful for you to ignore the review of a maintainer and > your colleague. You may not pick and choose amongst maintainers. And > Segher should not be so disrespectful as to contradict his colleague > and co-maintainer. I'm sorry you see this as a matter of respect. I didn't see it that way at all. Segher disagreed with your review, and gave sufficient technical reason for me to commit the patch. > I replied no to your patch and requested a different solution -- one > that does not require significant effort. Please fix this testcase > the way that I requested. I am not going to do as you demand. Those tests are clearly commented as "VSX Linux 64-bit" tests. Ignoring that comment and changing them to make them run for AIX and Darwin plainly requires that I test those changes, which often results in significant effort building cross-compilers. If you wish them to run for AIX targets then that is your job as an AIX maintainer. Don't expect me to do the work. Incidentally, this is the result of your recent commit 122f0db2793 on powerpc64-linux biarch. +FAIL: gcc.target/powerpc/signbit-1.c scan-assembler-not stxvd2x +FAIL: gcc.target/powerpc/signbit-1.c scan-assembler-times mfvsrd 3 +FAIL: gcc.target/powerpc/signbit-1.c scan-assembler-times srdi 3 +FAIL: gcc.target/powerpc/signbit-2.c scan-assembler-times ld 1 +FAIL: gcc.target/powerpc/signbit-2.c scan-assembler-times srdi 1 I'm not complaining about that, or demanding that you fix those fails; In fact I'm in the middle of fixing them. But they do quite perfectly illustrate that tweaking the testsuite is never simple. -- Alan Modra Australia Development Lab, IBM
Re: [RS6000] float128-type-2.c unsupported
On Wed, Oct 28, 2020 at 01:44:54PM -0500, Segher Boessenkool wrote: > On Wed, Oct 28, 2020 at 09:18:35PM +1030, Alan Modra wrote: > > >From e7ce33cef478a826a2fe4e110b43b49586ef2438 Mon Sep 17 00:00:00 2001 > > From: Alan Modra > > Date: Wed, 28 Oct 2020 15:57:57 +1030 > > Subject: > > > > I noticed this test is unsupported on power10 when looking through > > test logs. There seems no reason why that should be the case, ie. > > likely the target test was meant to be powerpc64*-*-linux*. And that > > simplifies down further. > > The target name does not tell you if you are doing a -m32 or a -m64 > build; both powerpc-linux and powerpc64-linux can build both 32-bit and > 64-bit just fine (and hopefully identically). Having target powerpc64* > is basically always wrong. Yes. Even le/be selection should really be done with { target le } for example rather than { target powerpc*le-*-* }. One day we might want to test compilers with multi-endian support. > Your patch is fine though, modulo what David said. If there is some > selector you can use (or you can make one) that is much preferred. But > since this patch is strictly an improvement already, it is okay for > trunk (if the 2nd works on powerpc64le-linux of course ;-) ) Thanks! > > (Improving it to test on exactly the right targets would be nice :-) ) Thank you, I committed it "as is". An incremental improvement is better than no improvement. -- Alan Modra Australia Development Lab, IBM
Re: [RS6000] Don't be too clever with dg-do run and dg-do compile
On Wed, Oct 28, 2020 at 12:16:00PM -0500, Segher Boessenkool wrote: > On Wed, Oct 28, 2020 at 09:20:56PM +1030, Alan Modra wrote: > > Otherwise some versions of dejagnu go ahead and run the vsx tests > > below when they should not. To best cope with older dejagnu, put > > "run" before "compile", the idea being that if the second dg-do always > > wins then that won't cause fails. > > If they are mutually exclusive, does the order still matter? (Just FMI.) Yes, it does. Older dejagnu takes the last action regardless of the target selector success. Newer dejagnu uses the last action of a line with a successful target selector. commit 569f8718b534a2cd9511a7d640352eb0126ff492 Author: Dominik Vogt Date: Mon Mar 28 17:31:07 2016 +1100 * dg.exp (dg-do): Do not change the previously selected action if a de-selected dg-do is encountered. Signed-off-by: Ben Elliston -- Alan Modra Australia Development Lab, IBM
[RS6000] Don't be too clever with dg-do run and dg-do compile
a/gcc/testsuite/gcc.target/powerpc/vsx-load-element-extend-short.c b/gcc/testsuite/gcc.target/powerpc/vsx-load-element-extend-short.c index 68fdcdcea37..837ba79c9ab 100644 --- a/gcc/testsuite/gcc.target/powerpc/vsx-load-element-extend-short.c +++ b/gcc/testsuite/gcc.target/powerpc/vsx-load-element-extend-short.c @@ -2,8 +2,9 @@ Test of vec_xl_sext and vec_xl_zext (load into rightmost vector element and zero/sign extend). */ -/* { dg-do compile {target power10_ok} } */ -/* { dg-do run {target power10_hw} } */ +/* { dg-do run { target power10_hw } } */ +/* { dg-do compile { target { ! power10_hw } } } */ +/* { dg-require-effective-target power10_ok } */ /* { dg-require-effective-target int128 } */ /* Deliberately set optization to zero for this test to confirm diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-char.c b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-char.c index 45c49547d66..3049b1c2c28 100644 --- a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-char.c +++ b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-char.c @@ -1,8 +1,9 @@ /* Test of vec_xst_trunc (truncate and store rightmost vector element) */ -/* { dg-do compile {target power10_ok} } */ -/* { dg-do run {target power10_hw} } */ +/* { dg-do run { target power10_hw } } */ +/* { dg-do compile { target { ! power10_hw } } } */ +/* { dg-require-effective-target power10_ok } */ /* { dg-require-effective-target int128 } */ /* Deliberately set optization to zero for this test to confirm the stxvr*x instruction is generated. At higher optimization levels diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-int.c b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-int.c index f263e3d5cc9..7cc7699f8eb 100644 --- a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-int.c +++ b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-int.c @@ -1,8 +1,9 @@ /* Test of vec_xst_trunc (truncate and store rightmost vector element) */ -/* { dg-do compile {target power10_ok} } */ -/* { dg-do run {target power10_hw} } */ +/* { dg-do run { target power10_hw } } */ +/* { dg-do compile { target { ! power10_hw } } } */ +/* { dg-require-effective-target power10_ok } */ /* { dg-require-effective-target int128 } */ /* Deliberately set optization to zero for this test to confirm the stxvr*x instruction is generated. At higher optimization levels diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-longlong.c b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-longlong.c index 0eeef5e6ba9..e1bd0216611 100644 --- a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-longlong.c +++ b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-longlong.c @@ -1,8 +1,9 @@ /* Test of vec_xst_trunc (truncate and store rightmost vector element) */ -/* { dg-do compile {target power10_ok} } */ -/* { dg-do run {target power10_hw} } */ +/* { dg-do run { target power10_hw } } */ +/* { dg-do compile { target { ! power10_hw } } } */ +/* { dg-require-effective-target power10_ok } */ /* { dg-require-effective-target int128 } */ /* Deliberately set optization to zero for this test to confirm diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-short.c b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-short.c index 0186ddc552f..b173b36dbda 100644 --- a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-short.c +++ b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-short.c @@ -1,8 +1,9 @@ /* Test of vec_xst_trunc (truncate and store rightmost vector element) */ -/* { dg-do compile {target power10_ok} } */ -/* { dg-do run {target power10_hw} } */ +/* { dg-do run { target power10_hw } } */ +/* { dg-do compile { target { ! power10_hw } } } */ +/* { dg-require-effective-target power10_ok } */ /* { dg-require-effective-target int128 } */ /* Deliberately set optization to zero for this test to confirm -- Alan Modra Australia Development Lab, IBM
[RS6000] float128-type-2.c unsupported
>From e7ce33cef478a826a2fe4e110b43b49586ef2438 Mon Sep 17 00:00:00 2001 From: Alan Modra Date: Wed, 28 Oct 2020 15:57:57 +1030 Subject: I noticed this test is unsupported on power10 when looking through test logs. There seems no reason why that should be the case, ie. likely the target test was meant to be powerpc64*-*-linux*. And that simplifies down further. Regression tested powerpc64le-linux. OK? * gcc.target/powerpc/float128-type-1.c: Simplify target test. * gcc.target/powerpc/float128-type-2.c: Likewise. diff --git a/gcc/testsuite/gcc.target/powerpc/float128-type-1.c b/gcc/testsuite/gcc.target/powerpc/float128-type-1.c index 13152ac7c26..53f9e357535 100644 --- a/gcc/testsuite/gcc.target/powerpc/float128-type-1.c +++ b/gcc/testsuite/gcc.target/powerpc/float128-type-1.c @@ -1,4 +1,4 @@ -/* { dg-do compile { target { powerpc64*-*-linux* && lp64 } } } */ +/* { dg-do compile { target { *-*-linux* && lp64 } } } */ /* { dg-require-effective-target powerpc_p8vector_ok } */ /* { dg-options "-mdejagnu-cpu=power8 -O2 -mno-float128" } */ diff --git a/gcc/testsuite/gcc.target/powerpc/float128-type-2.c b/gcc/testsuite/gcc.target/powerpc/float128-type-2.c index 5644281c3d4..02dbad1fa4f 100644 --- a/gcc/testsuite/gcc.target/powerpc/float128-type-2.c +++ b/gcc/testsuite/gcc.target/powerpc/float128-type-2.c @@ -1,4 +1,4 @@ -/* { dg-do compile { target { powerpc64-*-linux* && lp64 } } } */ +/* { dg-do compile { target { *-*-linux* && lp64 } } } */ /* { dg-require-effective-target powerpc_p9vector_ok } */ /* { dg-options "-mdejagnu-cpu=power9 -O2 -mno-float128" } */ -- Alan Modra Australia Development Lab, IBM
Re: [RS6000] Do not define builtins that overload disabled builtins
commit 25ffd3d34e means we no longer define an overloaded __builtin_byte_in_set for -m32, so the more informative "__builtin_byte_in_set is not supported in this compiler configuration" is not reported. Regression tested powerpc64-linux biarch. OK? PR bootstrap/92661 * gcc.target/powerpc/byte-in-set-2.c: Update expected error. diff --git a/gcc/testsuite/gcc.target/powerpc/byte-in-set-2.c b/gcc/testsuite/gcc.target/powerpc/byte-in-set-2.c index 9a80c27fe26..34ab50e25ba 100644 --- a/gcc/testsuite/gcc.target/powerpc/byte-in-set-2.c +++ b/gcc/testsuite/gcc.target/powerpc/byte-in-set-2.c @@ -11,5 +11,5 @@ int test_byte_in_set (unsigned char b, unsigned long long set_members) { - return __builtin_byte_in_set (b, set_members); /* { dg-error "'__builtin_byte_in_set' is not supported in this compiler configuration" } */ + return __builtin_byte_in_set (b, set_members); /* { dg-warning "implicit declaration of function" } */ } -- Alan Modra Australia Development Lab, IBM
Re: testsuite: Enable and adjust powerpc fold-vec-extract/insert testcases
git a/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-int-p9.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-int-p9.c index 81ac1f1a00a..a851fd6b8dc 100644 --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-int-p9.c +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-int-p9.c @@ -58,8 +58,6 @@ testui2_cst(unsigned int x, vector unsigned int v) /* { dg-final { scan-assembler-times {\mmtvsrwz\M} 4 { target lp64 } } } */ /* { dg-final { scan-assembler-times {\mxxinsertw\M} 4 { target lp64 } } } */ - -/* { dg-final { scan-assembler-times {\maddi\M} 8 { target ilp32 } } } */ /* { dg-final { scan-assembler-times {\mstw\M} 8 { target ilp32 } } } */ /* { dg-final { scan-assembler-times {\mlxv\M} 8 { target ilp32 } } } */ /* { dg-final { scan-assembler-times {\mlvewx\M} 4 { target ilp32 } } } */ -- Alan Modra Australia Development Lab, IBM
Re: [RS6000] dg-do !compile and scan-assembler
On Tue, Oct 27, 2020 at 05:34:45PM -0500, Segher Boessenkool wrote: > Hi! > > On Tue, Oct 27, 2020 at 09:26:10PM +1030, Alan Modra wrote: > > These tests never checked assembly, because .s files were not > > produced. One test was looking for the wrong instructions. > > -/* { dg-final { scan-assembler-times {\msldbi\M} 6 } } */ > > -/* { dg-final { scan-assembler-times {\msrdbi\M} 6 } } */ > > - > > - > > +/* { dg-final { scan-assembler-times {\mvsldbi\M} 8 } } */ > > +/* { dg-final { scan-assembler-times {\mvsrdbi\M} 8 } } */ > > You also changed the instruction count here; did you check 8 is actually > what is expected here, or is that just what you saw come out? Yes, 8 is the correct count. And in this instance I amazingly changed the count before running the test and finding it "wrong". :-) -- Alan Modra Australia Development Lab, IBM
[RS6000] power10 scan-assembler tests
ot;-mdejagnu-cpu=power10 -O0 -save-temps" } */ /* { dg-final { scan-assembler-times {\mstxvrwx\M} 2 } } */ /* { dg-final { scan-assembler-times {\mstwx\M} 0 } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-longlong.c b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-longlong.c index c137ce2d19f..0eeef5e6ba9 100644 --- a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-longlong.c +++ b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-longlong.c @@ -9,7 +9,7 @@ the stxvr*x instruction is generated. At higher optimization levels the instruction we are looking for is sometimes replaced by other store instructions. */ -/* { dg-options "-mdejagnu-cpu=power10 -O0" } */ +/* { dg-options "-mdejagnu-cpu=power10 -O0 -save-temps" } */ /* { dg-final { scan-assembler-times {\mstxvrdx\M} 2 } } */ /* { dg-final { scan-assembler-times {\mstwx\M} 0 } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-short.c b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-short.c index 7d856e7c3eb..0186ddc552f 100644 --- a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-short.c +++ b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-short.c @@ -9,7 +9,7 @@ the stxvr*x instruction is generated. At higher optimization levels the instruction we are looking for is sometimes replaced by other store instructions. */ -/* { dg-options "-mdejagnu-cpu=power10 -O0" } */ +/* { dg-options "-mdejagnu-cpu=power10 -O0 -save-temps" } */ /* { dg-final { scan-assembler-times {\mstxvrhx\M} 2 } } */ /* { dg-final { scan-assembler-times {\msthx\M} 0 } } */ -- Alan Modra Australia Development Lab, IBM
[RS6000] dg-do !compile and scan-assembler
>From 6c1817cece47ce2cb36df1f57b533b9d2385f0a5 Mon Sep 17 00:00:00 2001 From: Alan Modra Date: Tue, 27 Oct 2020 17:32:13 +1030 Subject: These tests never checked assembly, because .s files were not produced. One test was looking for the wrong instructions. A typical error log PASS: gcc.target/powerpc/vec-permute-ext-runnable.c (test for excess errors) gcc.target/powerpc/vec-permute-ext-runnable.c output file does not exist UNRESOLVED: gcc.target/powerpc/vec-permute-ext-runnable.c scan-assembler-times \\mpermx\\M 10 Bootstrapped and regression tested powerpc64le-linux power8. powerpc64le-linux power10 and powerpc64-linux power7 biarch regtest still in progress. OK? * gcc.target/powerpc/vec-blend-runnable.c: Add save-temps. * gcc.target/powerpc/vec-insert-word-runnable.c: Likewise. * gcc.target/powerpc/vec-permute-ext-runnable.c: Likewise. * gcc.target/powerpc/vec-replace-word-runnable.c: Likewise. * gcc.target/powerpc/vec-splati-runnable.c: Likewise. * gcc.target/powerpc/vec-ternarylogic-3.c: Likewise. * gcc.target/powerpc/vec-ternarylogic-9.c: Likewise. * gcc.target/powerpc/vsx_mask-count-runnable.c: Likewise. * gcc.target/powerpc/vsx_mask-expand-runnable.c: Likewise. * gcc.target/powerpc/vsx_mask-extract-runnable.c: Likewise. * gcc.target/powerpc/vsx_mask-move-runnable.c: Likewise. * gcc.target/powerpc/vec-shift-double-runnable.c: Likewise, and correct assembly match. diff --git a/gcc/testsuite/gcc.target/powerpc/vec-blend-runnable.c b/gcc/testsuite/gcc.target/powerpc/vec-blend-runnable.c index 774960bbcd3..0f4b2130351 100644 --- a/gcc/testsuite/gcc.target/powerpc/vec-blend-runnable.c +++ b/gcc/testsuite/gcc.target/powerpc/vec-blend-runnable.c @@ -1,7 +1,7 @@ /* { dg-do run { target { power10_hw } } } */ /* { dg-do link { target { ! power10_hw } } } */ /* { dg-require-effective-target power10_ok } */ -/* { dg-options "-mdejagnu-cpu=power10" } */ +/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */ #include #define DEBUG 0 diff --git a/gcc/testsuite/gcc.target/powerpc/vec-insert-word-runnable.c b/gcc/testsuite/gcc.target/powerpc/vec-insert-word-runnable.c index 55ca1c4b35d..be45182a6be 100644 --- a/gcc/testsuite/gcc.target/powerpc/vec-insert-word-runnable.c +++ b/gcc/testsuite/gcc.target/powerpc/vec-insert-word-runnable.c @@ -1,7 +1,7 @@ /* { dg-do run { target { power10_hw } } } */ /* { dg-do link { target { ! power10_hw } } } */ /* { dg-require-effective-target power10_ok } */ -/* { dg-options "-mdejagnu-cpu=power10" } */ +/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */ #include #define DEBUG 0 diff --git a/gcc/testsuite/gcc.target/powerpc/vec-permute-ext-runnable.c b/gcc/testsuite/gcc.target/powerpc/vec-permute-ext-runnable.c index 2626d876d3d..3e3f9a77ecb 100644 --- a/gcc/testsuite/gcc.target/powerpc/vec-permute-ext-runnable.c +++ b/gcc/testsuite/gcc.target/powerpc/vec-permute-ext-runnable.c @@ -1,7 +1,7 @@ /* { dg-do run { target { power10_hw } } } */ /* { dg-do link { target { ! power10_hw } } } */ /* { dg-require-effective-target power10_ok } */ -/* { dg-options "-mdejagnu-cpu=power10" } */ +/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */ #include #define DEBUG 0 diff --git a/gcc/testsuite/gcc.target/powerpc/vec-replace-word-runnable.c b/gcc/testsuite/gcc.target/powerpc/vec-replace-word-runnable.c index 413b9048eca..162968316bc 100644 --- a/gcc/testsuite/gcc.target/powerpc/vec-replace-word-runnable.c +++ b/gcc/testsuite/gcc.target/powerpc/vec-replace-word-runnable.c @@ -1,7 +1,7 @@ /* { dg-do run { target { power10_hw } } } */ /* { dg-do link { target { ! power10_hw } } } */ /* { dg-require-effective-target power10_ok } */ -/* { dg-options "-mdejagnu-cpu=power10" } */ +/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */ #include diff --git a/gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable.c b/gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable.c index b8478f5c32d..128da2ad42b 100644 --- a/gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable.c +++ b/gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable.c @@ -1,7 +1,7 @@ /* { dg-do run { target { power10_hw } } } */ /* { dg-do link { target { ! power10_hw } } } */ /* { dg-require-effective-target power10_ok } */ -/* { dg-options "-mdejagnu-cpu=power10" } */ +/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */ #include #define DEBUG 0 @@ -379,7 +379,5 @@ main (int argc, char *argv []) return 0; } -/* { dg-final { scan-assembler-times {\msldbi\M} 6 } } */ -/* { dg-final { scan-assembler-times {\msrdbi\M} 6 } } */ - - +/* { dg-final { scan-assembler-times {\mvsldbi\M} 8 } } */ +/* { dg-final { scan-assembler-times {\mvsrdbi\M} 8 } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c b/gcc/testsuite/gcc.ta
[RS6000] Separate dg-require-effective-target options
Subject was "[RS6000] Tests that use int128_t and -m32" I meant to make this change before committing too. Pushed. * gcc.target/powerpc/vsx_mask-count-runnable.c: Separate options passed to dg-require-effective-target. * gcc.target/powerpc/vsx_mask-expand-runnable.c: Likewise. * gcc.target/powerpc/vsx_mask-extract-runnable.c: Likewise. * gcc.target/powerpc/vsx_mask-move-runnable.c: Likewise. diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c b/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c index 6aa165c675c..28aa7da9d1f 100644 --- a/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c +++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c @@ -1,7 +1,8 @@ /* { dg-do run { target { power10_hw } } } */ /* { dg-do link { target { ! power10_hw } } } */ /* { dg-options "-mdejagnu-cpu=power10 -O2" } */ -/* { dg-require-effective-target { int128 && power10_ok } } */ +/* { dg-require-effective-target power10_ok } */ +/* { dg-require-effective-target int128 } */ /* Check that the expected 128-bit instructions are generated if the processor supports the 128-bit integer instructions. */ diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c b/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c index 9fdfa4a8b82..68c1c3f1c9a 100644 --- a/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c +++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c @@ -1,7 +1,8 @@ /* { dg-do run { target { power10_hw } } } */ /* { dg-do link { target { ! power10_hw } } } */ /* { dg-options "-mdejagnu-cpu=power10 -O2" } */ -/* { dg-require-effective-target { int128 && power10_ok } } */ +/* { dg-require-effective-target power10_ok } */ +/* { dg-require-effective-target int128 } */ /* Check that the expected 128-bit instructions are generated if the processor supports the 128-bit integer instructions. */ diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c b/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c index a038e56c9cd..4664807a69e 100644 --- a/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c +++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c @@ -1,7 +1,8 @@ /* { dg-do run { target { power10_hw } } } */ /* { dg-do link { target { ! power10_hw } } } */ /* { dg-options "-mdejagnu-cpu=power10 -O2" } */ -/* { dg-require-effective-target { int128 && power10_ok } } */ +/* { dg-require-effective-target power10_ok } */ +/* { dg-require-effective-target int128 } */ /* Check that the expected 128-bit instructions are generated if the processor supports the 128-bit integer instructions. */ diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c b/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c index 6f87e60ea41..58954dc5fc9 100644 --- a/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c +++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c @@ -1,7 +1,8 @@ /* { dg-do run { target { power10_hw } } } */ /* { dg-do link { target { ! power10_hw } } } */ /* { dg-options "-mdejagnu-cpu=power10 -O2" } */ -/* { dg-require-effective-target { int128 && power10_ok } } */ +/* { dg-require-effective-target power10_ok } */ +/* { dg-require-effective-target int128 } */ /* Check that the expected 128-bit instructions are generated if the processor supports the 128-bit integer instructions. */ -- Alan Modra Australia Development Lab, IBM
[RS6000] Adjust testcases for power10 instructions V2
0644 --- a/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c +++ b/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c @@ -30,5 +30,5 @@ vector signed long long splats4(void) /* { dg-final { scan-assembler-times {\mvspltis[bhw]\M} 0 } } */ /* { dg-final { scan-assembler-times {\mvsl[bhwd]\M} 0 } } */ -/* { dg-final { scan-assembler-times {\mlvx\M|\mlxv\M|\mlxvd2x\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mlvx\M|\mp?lxv\M|\mlxvd2x\M} 2 } } */ -- Alan Modra Australia Development Lab, IBM
Re: [RS6000] Unsupported test options for -m32
On Mon, Oct 26, 2020 at 04:28:20PM +, Iain Sandoe wrote: > David Edelsohn via Gcc-patches wrote: > > > FAIL: gcc.target/powerpc/swaps-p8-22.c (test for excess errors) > > Excess errors: > > cc1: error: '-mcmodel' not supported in this configuration > > > > * gcc.target/powerpc/swaps-p8-22.c: Disable for -m32. > > > > diff --git a/gcc/testsuite/gcc.target/powerpc/swaps-p8-22.c > > b/gcc/testsuite/gcc.target/powerpc/swaps-p8-22.c > > index 83f6ab3a1c0..bceada41b75 100644 > > --- a/gcc/testsuite/gcc.target/powerpc/swaps-p8-22.c > > +++ b/gcc/testsuite/gcc.target/powerpc/swaps-p8-22.c > > @@ -1,5 +1,5 @@ > > /* { dg-do compile } */ > > -/* { dg-require-effective-target powerpc_p8vector_ok } */ > > +/* { dg-require-effective-target { lp64 && powerpc_p8vector_ok } } */ > > /* { dg-options "-O2 -mdejagnu-cpu=power8 -maltivec -mcmodel=large" } */ > > > > /* The expansion for vector character multiply introduces a vperm > > operation. > > > > > > Please don't fix the failure this way. This is incorrect. -m32 means > > more than Linux. This reverts my hard work to run more of the powerpc > > testsuite on AIX. AIX also is -m32. Ah, David was the culprit that broke the test on linux and darwin. ;-) > Darwin also is (powerpc-darwin) and has an m32 multilib (powerpc64-darwin) > so not reliable there either. > > > > This probably should be fixed with > > > > { dg-additional-options "-mcmodel=large" { target { lp64 || > > !powerpc*-*-linux* } } } > > > > or whatever the appropriate incantation to omit only ppc32 linux. Or maybe > > > > { dg-do compile { target { lp64 || !powerpc*-*-linux* } } } > > mcmodel will also break for powerpc64 and powerpc / m64 Darwin, so if this is > meant to be Linux-specific, that seems to be the thing to mention. $ grep mcmodel gcc/config/rs6000/*.opt gcc/config/rs6000/aix64.opt:mcmodel= gcc/config/rs6000/aix64.opt:Known code models (for use with the -mcmodel= option): gcc/config/rs6000/linux64.opt:mcmodel= gcc/config/rs6000/linux64.opt:Known code models (for use with the -mcmodel= option): aix64.opt is used for all rs6000-aix targets, linux64.opt for 64-bit and biarch powerpc linux targets. powerpc linux errors for -mcmodel when -m32 (yes, even when biarch Segher). So the proper test is a little more complicated than any suggestion given so far. This looks correct to me, bearing in mind that the test is a duplicate of swaps-p8-21.c aimed specifically at testing -mcmodel=large. * gcc.target/powerpc/swaps-p8-22.c: Enable only for aix and -m64 linux. diff --git a/gcc/testsuite/gcc.target/powerpc/swaps-p8-22.c b/gcc/testsuite/gcc.target/powerpc/swaps-p8-22.c index 83f6ab3a1c0..847aebccca8 100644 --- a/gcc/testsuite/gcc.target/powerpc/swaps-p8-22.c +++ b/gcc/testsuite/gcc.target/powerpc/swaps-p8-22.c @@ -1,4 +1,4 @@ -/* { dg-do compile } */ +/* { dg-do compile { target { *-*-aix* || { *-*-linux* && lp64 } } } } */ /* { dg-require-effective-target powerpc_p8vector_ok } */ /* { dg-options "-O2 -mdejagnu-cpu=power8 -maltivec -mcmodel=large" } */ -- Alan Modra Australia Development Lab, IBM
Re: [RS6000] Tests that use int128_t and -m32
On Mon, Oct 26, 2020 at 07:33:49AM -0500, Segher Boessenkool wrote: > On Sun, Oct 25, 2020 at 09:50:01PM +1030, Alan Modra wrote: > > All these tests fail with -m32 due to lack of int128 support, > > Is there any good reason __int128 is not enabled for rs6000 -m32, btw? Lack of addti3 and subti3 perhaps? > > in some > > cases with what I thought was not the best error message. For example > > vsx_mask-move-runnable.c:34:3: error: unknown type name 'vector' > > is misleading. The problem isn't "vector" but "vector __uint128_t". > > Ouch, yes. Do you see a simple way to fix that? I haven't looked. The only reason I commented on the error was in the hope that someone who knows gcc intimately enough to fix this without much effort would do so. :-) -- Alan Modra Australia Development Lab, IBM
Re: [RS6000] Non-pcrel tests when power10
On Mon, Oct 26, 2020 at 03:18:39PM -0500, Segher Boessenkool wrote: > Hi! > > On Thu, Oct 22, 2020 at 05:28:17PM +1030, Alan Modra wrote: > > These tests require -mno-pcrel because they are testing features > > of the non-pcrel ABI. > > > --- a/gcc/testsuite/gcc.target/powerpc/cprophard.c > > +++ b/gcc/testsuite/gcc.target/powerpc/cprophard.c > > @@ -1,6 +1,6 @@ > > /* { dg-do compile { target { powerpc*-*-* && lp64 } } } */ > > Make this { target lp64 } if you want? OK, changed. > > --- a/gcc/testsuite/gcc.target/powerpc/pr79439-1.c > > +++ b/gcc/testsuite/gcc.target/powerpc/pr79439-1.c > > @@ -1,5 +1,5 @@ > > (another) But not this one. > > --- a/gcc/testsuite/gcc.target/powerpc/pr79439-2.c > > +++ b/gcc/testsuite/gcc.target/powerpc/pr79439-2.c > > @@ -1,5 +1,5 @@ > > /* { dg-do compile { target { powerpc*-*-linux* && lp64 } } } */ > > (wow there are many) Or this. The tests are linux specific scan-assembler tests. -- Alan Modra Australia Development Lab, IBM
Re: [RS6000] Tests that use int128_t and -m32
On Sun, Oct 25, 2020 at 10:43:12AM -0400, David Edelsohn wrote: > On Sun, Oct 25, 2020 at 7:20 AM Alan Modra wrote: > > > > All these tests fail with -m32 due to lack of int128 support, in some > > cases with what I thought was not the best error message. For example > > vsx_mask-move-runnable.c:34:3: error: unknown type name 'vector' > > is misleading. The problem isn't "vector" but "vector __uint128_t". > > > > * gcc.target/powerpc/vsx-load-element-extend-char.c: Require int128. > > * gcc.target/powerpc/vsx-load-element-extend-int.c: Likewise. > > * gcc.target/powerpc/vsx-load-element-extend-longlong.c: Likewise. > > * gcc.target/powerpc/vsx-load-element-extend-short.c: Likewise. > > * gcc.target/powerpc/vsx-store-element-truncate-char.c: Likewise. > > * gcc.target/powerpc/vsx-store-element-truncate-int.c: Likewise. > > * gcc.target/powerpc/vsx-store-element-truncate-longlong.c: > > Likewise. > > * gcc.target/powerpc/vsx-store-element-truncate-short.c: Likewise. > > * gcc.target/powerpc/vsx_mask-count-runnable.c: Likewise. > > * gcc.target/powerpc/vsx_mask-expand-runnable.c: Likewise. > > * gcc.target/powerpc/vsx_mask-extract-runnable.c: Likewise. > > * gcc.target/powerpc/vsx_mask-move-runnable.c: Likewise. > > Good catch. > > Another problem with all of the vsx_mask test cases is that they use > -mcpu=power10 instead of -mdejagnu-cpu=power10. Can you follow up > with that fix or do you want me to? Sure, I can do that if you're pre-approving the patch. gcc.target/powerpc/pr93122.c too. -- Alan Modra Australia Development Lab, IBM
[RS6000] biarch test fail
I thought this one was worth at least commenting as to why it fails when biarch testing. OK? * gcc.target/powerpc/bswap64-4.c: Comment. diff --git a/gcc/testsuite/gcc.target/powerpc/bswap64-4.c b/gcc/testsuite/gcc.target/powerpc/bswap64-4.c index a3c05539652..11787000409 100644 --- a/gcc/testsuite/gcc.target/powerpc/bswap64-4.c +++ b/gcc/testsuite/gcc.target/powerpc/bswap64-4.c @@ -7,6 +7,12 @@ /* { dg-final { scan-assembler-times "ldbrx" 1 { target has_arch_pwr7 } } } */ /* { dg-final { scan-assembler-times "stdbrx" 1 { target has_arch_pwr7 } } } */ +/* This test will fail when biarch testing with + "RUNTESTFLAGS=--target_board=unix'{-m64,-m32}'" because the -m32 is + added on the command line after the dg-options -mpowerpc64, and + common/config/rs6000/rs6000-common.c:rs6000_handle_option disables + -mpowerpc64 for -m32. */ + long long swap_load (long long *a) { return __builtin_bswap64 (*a); } long long swap_reg (long long a) { return __builtin_bswap64 (a); } void swap_store (long long *a, long long b) { *a = __builtin_bswap64 (b); } -- Alan Modra Australia Development Lab, IBM
[RS6000] Remove -mpcrel from tests
When running with -m32 FAIL: gcc.target/powerpc/pr94740.c (test for excess errors) Excess errors: cc1: error: '-mpcrel' requires '-mcmodel=medium' The others don't run for -m32, but remove the unnecessary -mpcrel anyway. * gcc.target/powerpc/localentry-1.c: Remove -mpcrel from options. * gcc.target/powerpc/notoc-direct-1.c: Likewise. * gcc.target/powerpc/pr94740.c: Likewise. diff --git a/gcc/testsuite/gcc.target/powerpc/localentry-1.c b/gcc/testsuite/gcc.target/powerpc/localentry-1.c index c3c51680cfe..1343df2b8f1 100644 --- a/gcc/testsuite/gcc.target/powerpc/localentry-1.c +++ b/gcc/testsuite/gcc.target/powerpc/localentry-1.c @@ -1,11 +1,10 @@ /* { dg-do compile } */ -/* { dg-options "-mdejagnu-cpu=power10 -O2 -mpcrel" } */ +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */ /* { dg-require-effective-target powerpc_elfv2 } */ /* { dg-require-effective-target power10_ok } */ -/* Ensure we generate ".localentry fn,1" for both leaf and non-leaf functions. - At present, -mcpu=power10 does not enable pc-relative mode, so make sure we - enable it to be able to check for .localentry. */ +/* Ensure we generate ".localentry fn,1" for both leaf and non-leaf + functions. */ extern int y (int); diff --git a/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c b/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c index 74187e1d5dc..8fa09b03f4f 100644 --- a/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c +++ b/gcc/testsuite/gcc.target/powerpc/notoc-direct-1.c @@ -1,11 +1,10 @@ /* { dg-do compile } */ -/* { dg-options "-mdejagnu-cpu=power10 -O2 -mpcrel" } */ +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */ /* { dg-require-effective-target powerpc_elfv2 } */ /* { dg-require-effective-target power10_ok } */ -/* Test that calls generated from PC-relative code are annotated with @notoc. - At present, -mcpu=power10 does not enable pc-relative mode. Enable it here - explicitly until it is turned on by default. */ +/* Test that calls generated from PC-relative code are annotated with + @notoc. */ extern int yy0 (int); extern void yy1 (int); @@ -40,4 +39,3 @@ int ww (void) /* { dg-final { scan-assembler {yy0@notoc} } } */ /* { dg-final { scan-assembler {zz1@notoc} } } */ /* { dg-final { scan-assembler {zz0@notoc} } } */ - diff --git a/gcc/testsuite/gcc.target/powerpc/pr94740.c b/gcc/testsuite/gcc.target/powerpc/pr94740.c index 9c2b4644701..09decc38c2c 100644 --- a/gcc/testsuite/gcc.target/powerpc/pr94740.c +++ b/gcc/testsuite/gcc.target/powerpc/pr94740.c @@ -1,7 +1,7 @@ /* PR rtl-optimization/94740 */ /* { dg-do compile } */ /* { dg-require-effective-target power10_ok } */ -/* { dg-options "-O2 -mdejagnu-cpu=power10 -mpcrel" } */ +/* { dg-options "-O2 -mdejagnu-cpu=power10" } */ int array[8]; int -- Alan Modra Australia Development Lab, IBM
[RS6000] Unsupported test options for -m32
FAIL: gcc.target/powerpc/swaps-p8-22.c (test for excess errors) Excess errors: cc1: error: '-mcmodel' not supported in this configuration * gcc.target/powerpc/swaps-p8-22.c: Disable for -m32. diff --git a/gcc/testsuite/gcc.target/powerpc/swaps-p8-22.c b/gcc/testsuite/gcc.target/powerpc/swaps-p8-22.c index 83f6ab3a1c0..bceada41b75 100644 --- a/gcc/testsuite/gcc.target/powerpc/swaps-p8-22.c +++ b/gcc/testsuite/gcc.target/powerpc/swaps-p8-22.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-require-effective-target powerpc_p8vector_ok } */ +/* { dg-require-effective-target { lp64 && powerpc_p8vector_ok } } */ /* { dg-options "-O2 -mdejagnu-cpu=power8 -maltivec -mcmodel=large" } */ /* The expansion for vector character multiply introduces a vperm operation. -- Alan Modra Australia Development Lab, IBM
[RS6000] Tests that use int128_t and -m32
-git a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-longlong.c b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-longlong.c index 7fce6a44d4f..465fbeaf6ab 100644 --- a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-longlong.c +++ b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-longlong.c @@ -3,6 +3,7 @@ /* { dg-do compile {target power10_ok} } */ /* { dg-do run {target power10_hw} } */ +/* { dg-require-effective-target { int128 } } */ /* Deliberately set optization to zero for this test to confirm the stxvr*x instruction is generated. At higher optimization levels diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-short.c b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-short.c index 17925c87732..f87256921bf 100644 --- a/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-short.c +++ b/gcc/testsuite/gcc.target/powerpc/vsx-store-element-truncate-short.c @@ -3,6 +3,7 @@ /* { dg-do compile {target power10_ok} } */ /* { dg-do run {target power10_hw} } */ +/* { dg-require-effective-target { int128 } } */ /* Deliberately set optization to zero for this test to confirm the stxvr*x instruction is generated. At higher optimization levels diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c b/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c index 5862517eae9..6ac4ed2173f 100644 --- a/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c +++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c @@ -1,7 +1,7 @@ /* { dg-do run { target { power10_hw } } } */ /* { dg-do link { target { ! power10_hw } } } */ /* { dg-options "-mcpu=power10 -O2" } */ -/* { dg-require-effective-target power10_ok } */ +/* { dg-require-effective-target { int128 && power10_ok } } */ /* Check that the expected 128-bit instructions are generated if the processor supports the 128-bit integer instructions. */ diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c b/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c index 13b4c8afd4f..05fedf77eb9 100644 --- a/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c +++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c @@ -1,7 +1,7 @@ /* { dg-do run { target { power10_hw } } } */ /* { dg-do link { target { ! power10_hw } } } */ /* { dg-options "-mcpu=power10 -O2" } */ -/* { dg-require-effective-target power10_ok } */ +/* { dg-require-effective-target { int128 && power10_ok } } */ /* Check that the expected 128-bit instructions are generated if the processor supports the 128-bit integer instructions. */ diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c b/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c index d58a6b0b682..6e952695905 100644 --- a/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c +++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c @@ -1,7 +1,7 @@ /* { dg-do run { target { power10_hw } } } */ /* { dg-do link { target { ! power10_hw } } } */ /* { dg-options "-mcpu=power10 -O2" } */ -/* { dg-require-effective-target power10_ok } */ +/* { dg-require-effective-target { int128 && power10_ok } } */ /* Check that the expected 128-bit instructions are generated if the processor supports the 128-bit integer instructions. */ diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c b/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c index 9147d67c9d1..c2eb53d3bb2 100644 --- a/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c +++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c @@ -1,7 +1,7 @@ /* { dg-do run { target { power10_hw } } } */ /* { dg-do link { target { ! power10_hw } } } */ /* { dg-options "-mcpu=power10 -O2" } */ -/* { dg-require-effective-target power10_ok } */ +/* { dg-require-effective-target { int128 && power10_ok } } */ /* Check that the expected 128-bit instructions are generated if the processor supports the 128-bit integer instructions. */ -- Alan Modra Australia Development Lab, IBM
Re: [RS6000] VSX_MM_SUFFIX
On Sat, Oct 24, 2020 at 12:31:43PM -0500, Segher Boessenkool wrote: > On Sat, Oct 24, 2020 at 02:59:34PM +1030, Alan Modra wrote: > > Those instructions aren't generated, we don't see them anywhere on a > > power10 all-lang bootstrap except in one testcase designed to exercise > > them. > > But that is completely not relevant :-) > > If you use a macro that doesn't exist, the compiler simply does not > build! My empirical evidence to the contrary says your theoretical arguments are invalid. :-) $ gcc/xgcc -Bgcc/ -S ~/src/gcc/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c -O2 -mcpu=power10 $ grep VSX_MM vsx_mask-count-runnable.s vcntmb 9,0,1 vcntmb 9,0,1 vcntmb 9,0,1 vcntmb 9,0,1 -- Alan Modra Australia Development Lab, IBM
Re: [RS6000] VSX_MM_SUFFIX
On Fri, Oct 23, 2020 at 04:06:57PM -0500, Segher Boessenkool wrote: > On Fri, Oct 23, 2020 at 09:41:30AM +1030, Alan Modra wrote: > > On Thu, Oct 22, 2020 at 11:03:14AM -0500, Segher Boessenkool wrote: > > > On Thu, Oct 22, 2020 at 08:58:10AM -0700, Carl Love wrote: > > > > OK, looks like VSX_MM_SUFFIX doesn't exist anymore. Don't know what > > > > happed to it. Thanks. > > > > > > It never existed on trunk. Please always regstrap patches on trunk > > > before committing. It feels unnecessary at times, but now you know > > > why :-) > > > > Regression testing wouldn't have caught the problem, unless Carl had > > done so on power10. > > "regstrap" is "bootstrap + regression check". It won't bootstrap (won't > cross build even afaics?) Those instructions aren't generated, we don't see them anywhere on a power10 all-lang bootstrap except in one testcase designed to exercise them. -- Alan Modra Australia Development Lab, IBM
Re: [RS6000] altivec style lvx/stvx addresses vs power10
Hi Segher, On Fri, Oct 23, 2020 at 11:31:02AM -0500, Segher Boessenkool wrote: > On Fri, Oct 23, 2020 at 05:15:08PM +1030, Alan Modra wrote: > > The problem starts with fwprop creating > > (insn 9 4 0 2 (set (mem:V8HI (and:DI (plus:DI (reg/v/f:DI 121 [ vpp ]) > > (const_int 12 [0xc])) > > (const_int -16 [0xfff0])) [0 MEM > short int> [(void *)_4 & -16B]+0 S16 A128]) > > (reg/v:V8HI 120 [ vp1 ])) "pixel.c":6:10 1237 {vsx_movv8hi_64bit} > > which is finally thrown out as invalid by lra. lra of course does that > > by reloading the entire address. > > While it could/should just reload that 12 into a reg :-( Could you > investigate doing that? Is there any reason we should not? My first thought was to write the necessary reloads too, but I think it's better to not generate the invalid addressing in the first place. Hiding the +12 in an invalid memory address means it isn't exposed to optimization passes. -- Alan Modra Australia Development Lab, IBM
Re: [PATCH 2/8] [RS6000] rs6000_rtx_costs for AND
On Fri, Oct 23, 2020 at 07:18:44PM -0400, Hans-Peter Nilsson wrote: > On Thu, 22 Oct 2020, Alan Modra via Gcc-patches wrote: > > /* (reg) is costed at zero by rtlanal.c:rtx_cost. That sets a > > baseline for rtx costs: If a constant is valid in an insn, > > it is free. */ > > >From where is this quote? My "git grep" fails to find it for me > (on master). It seems like a port-specific commment so I'd > have expected to find it somewhere in config/rs6000. It's in a patch in the rs6000_rtx_costs series that hasn't yet been committed. https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555757.html -- Alan Modra Australia Development Lab, IBM
Re: [RS6000] Adjust testcases for power10 instructions
On Fri, Oct 23, 2020 at 01:22:31PM -0500, Segher Boessenkool wrote: > On Fri, Oct 23, 2020 at 04:45:29PM +1030, Alan Modra wrote: > > Revised patch, removing changes to > > gcc.target/powerpc/fold-vec-st-double.c, > > gcc.target/powerpc/fold-vec-st-longlong.c, > > gcc.target/powerpc/fold-vec-st-pixel.c. Fixing fails on those three > > tests will be the subject of another patch. > > Okido. > > > Most of these changes are fairly obvious. Duplicated setbcr in > > +/* { dg-final { scan-assembler-times {\maddic\M|\msetbcr\M} 4 } } */ > > +/* { dg-final { scan-assembler-times {\msubfe\M|\msetbcr\M} 1 } } */ > > is due to addic;subfe being replaced in one function with setbcr. > > But that won't really work. If there is more than one addic replaced by > setbcr, that second scan fails (because it matches at least two times > then). Sure it works. I did test the patch! If future code gen changes, the test will need updating at that point. scan-assembler tests require maintenance.. > > * gcc.dg/pr56727-2.c, > ... > > * gcc.target/powerpc/ppc-eq0-1.c, > > * gcc.target/powerpc/ppc-ne0-1.c, > > * gcc.target/powerpc/pr86731-fwrapv-longlong.c: Match power10 insns. > > This should all be behind only one "*" (so delete it on all but the > first line here). Huh, ok. > > --- a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c > > +++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c > > @@ -25,6 +25,6 @@ main1 (void) > > with no word loads (lw, lwu, lwz, lwzu, or their indexed forms) > > or word stores (stw, stwu, stwx, stwux, or their indexed forms). */ > > > > -/* { dg-final { scan-assembler "\t(lvx|lxv|lvsr|stxv)" } } */ > > +/* { dg-final { scan-assembler "\t(lvx|lxv|lvsr|stxv|plxv|pstxv)" } } */ > > /* { dg-final { scan-assembler "\t(lvx|p?lxv|lvsr|p?stxv)" } } */ > might be more readable/maintainable/extensible? OK. > > --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-load-builtin_vec_xl-char.c > > +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-load-builtin_vec_xl-char.c > > @@ -36,4 +36,4 @@ BUILD_VAR_TEST( test10, vector unsigned char, signed long > > long, vector unsigned > > BUILD_VAR_TEST( test11, vector unsigned char, signed int, vector unsigned > > char); > > BUILD_CST_TEST( test12, vector unsigned char, 8, vector unsigned char); > > > > -/* { dg-final { scan-assembler-times > > {\mlxvw4x\M|\mlxvd2x\M|\mlxvx\M|\mlvx\M} 12 } } */ > > +/* { dg-final { scan-assembler-times > > {\mlxvw4x\M|\mlxvd2x\M|\mlxvx\M|\mlvx\M|\mplxv\M} 12 } } */ > > Here, it did not allow lxv before. Should it? > > (in many files) We don't generate lxv due to the insn only supporting DQ offsets. For example, on power9 test3_cst from this test is li 9,2 lxvx 34,3,9 blr On power10 plxv 34,2(3) blr > Have you verified the p10 code generation actually makes sense? Yes, I did. > > --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-splat-longlong.c > > +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-splat-longlong.c > > @@ -25,7 +25,7 @@ vector signed long long test_sll () { const vector signed > > long long y = {34, 45} > > vector unsigned long long test_ull () { const vector unsigned long long y > > = {56, 67}; return vec_splat (y, 0b00010); } > > > > /* Assorted load instructions for the initialization with known constants. > > */ > > -/* { dg-final { scan-assembler-times {\mlvx\M|\mlxvd2x\M|\mlxv\M} 3 } } */ > > +/* { dg-final { scan-assembler-times > > {\mlvx\M|\mlxvd2x\M|\mlxv\M|\mplxv\M|\mplxv\M} 3 } } */ > > You have plxv twice here. Oops. > > --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-store-builtin_vec_xst-char.c > > +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-store-builtin_vec_xst-char.c > > @@ -36,4 +36,4 @@ BUILD_VAR_TEST( test10, vector unsigned char, signed > > long long, vector unsigned > > BUILD_VAR_TEST( test11, vector unsigned char, signed int, vector unsigned > > char ); > > BUILD_CST_TEST( test12, vector unsigned char, 12, vector unsigned char ); > > > > -/* { dg-final { scan-assembler-times > > {\mstxvw4x\M|\mstxvd2x\M|\mstxvx\M|\mstvx\M} 12 } } */ > > +/* { dg-final { scan-assembler-times > > {\mstxvw4x\M|\mstxvd2x\M|\mstxvx\M|\mstvx\M|\mpstxv\M} 12 } } */ > > Similarly, should it have plain stxv as well? No. Same reason as to why lxv isn't generated. > > --- a/gcc/testsuite/gcc.target/powerpc/lvsl-lvsr.c > > +++ b/gcc/testsuite/gcc.target/powerpc/lvsl-lvsr.c > > @@ -1,12 +1,10 @@ >
[RS6000] Power10 ICE running gcc.target/powerpc/ppc-ne0-1.c
rs6000_emit_int_cmove generates isel so the condition below needs fixing for power10. Bootstrapped and regression tested powerpc64le-linux power10 and power8. OK? * config/rs6000/rs6000.md (cstore4): Don't call rs6000_emit_int_cmove for power10 when -mno-isel. diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index 2bf5df41a9b..cd41b759f71 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -11538,7 +11538,7 @@ "" { /* Everything is best done with setbc[r] if available. */ - if (TARGET_POWER10) + if (TARGET_POWER10 && TARGET_ISEL) rs6000_emit_int_cmove (operands[0], operands[1], const1_rtx, const0_rtx); /* Expanding EQ and NE directly to some machine instructions does not help -- Alan Modra Australia Development Lab, IBM
[RS6000] altivec style lvx/stvx addresses vs power10
gcc.target/powerpc/fold-vec-st-pixel.c and other testcases fail on power10, generating addi 9,5,12 rldicr 9,9,0,59 stxv 34,0(9) rather than addi 5,5,12 stvx 2,0,5 for an altivec lvx/stvx style address. The problem starts with fwprop creating (insn 9 4 0 2 (set (mem:V8HI (and:DI (plus:DI (reg/v/f:DI 121 [ vpp ]) (const_int 12 [0xc])) (const_int -16 [0xfff0])) [0 MEM [(void *)_4 & -16B]+0 S16 A128]) (reg/v:V8HI 120 [ vp1 ])) "pixel.c":6:10 1237 {vsx_movv8hi_64bit} which is finally thrown out as invalid by lra. lra of course does that by reloading the entire address. fwprop creates the invalid address due to rs6000_legitimate_address_p trimming off the outer AND of altivec style addresses before applying other predicates. address_is_prefixed then allows the inner address. Now at the time the AND stripping was added (git commit 850e8d3d56d), rs6000_legitimate_address looked a lot simpler. This patch allows through just those addresses that were legitimate in those simpler days. (legitimate_small_data_p, legitimate_constant_pool_address_p, legitimate_lo_sum_address_p, and legitimate_offset_address_p did get to look at the inside of an AND address, but I'm fairly certain small_data, constant_pool and lo_sum addresses would never be wrapped in an AND, and offset_address_p of that time excluded altivec modes.) Regstrapped powerpc64le-linux power8 and power10, with tests on powerpc64-linux -m64/-m32 still running. OK assuming they pass? * config/rs6000/rs6000.c (rs6000_legitimate_address_p): Limit AND addressing to just lvx/stvx style addresses. diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 2f0ca7af030..7fd80fb75ff 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -9128,15 +9128,21 @@ rs6000_legitimate_address_p (machine_mode mode, rtx x, bool reg_ok_strict) bool reg_offset_p = reg_offset_addressing_ok_p (mode); bool quad_offset_p = mode_supports_dq_form (mode); - /* If this is an unaligned stvx/ldvx type address, discard the outer AND. */ + if (TARGET_ELF && RS6000_SYMBOL_REF_TLS_P (x)) +return 0; + + /* Handle unaligned altivec lvx/stvx type addresses. */ if (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode) && GET_CODE (x) == AND && CONST_INT_P (XEXP (x, 1)) && INTVAL (XEXP (x, 1)) == -16) -x = XEXP (x, 0); +{ + x = XEXP (x, 0); + return (legitimate_indirect_address_p (x, reg_ok_strict) + || legitimate_indexed_address_p (x, reg_ok_strict) + || virtual_stack_registers_memory_p (x)); +} - if (TARGET_ELF && RS6000_SYMBOL_REF_TLS_P (x)) -return 0; if (legitimate_indirect_address_p (x, reg_ok_strict)) return 1; if (TARGET_UPDATE -- Alan Modra Australia Development Lab, IBM
Re: [RS6000] Adjust testcases for power10 instructions
On Thu, Oct 22, 2020 at 11:25:09PM +1030, Alan Modra wrote: > Some of these are wrong, sorry. I need to go over and check them > thoroughly. Please consider the patch withdrawn. Revised patch, removing changes to gcc.target/powerpc/fold-vec-st-double.c, gcc.target/powerpc/fold-vec-st-longlong.c, gcc.target/powerpc/fold-vec-st-pixel.c. Fixing fails on those three tests will be the subject of another patch. Most of these changes are fairly obvious. Duplicated setbcr in +/* { dg-final { scan-assembler-times {\maddic\M|\msetbcr\M} 4 } } */ +/* { dg-final { scan-assembler-times {\msubfe\M|\msetbcr\M} 1 } } */ is due to addic;subfe being replaced in one function with setbcr. * gcc.dg/pr56727-2.c, * gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c, * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-char.c, * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-double.c, * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-float.c, * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-int.c, * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-longlong.c, * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-short.c, * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-char.c, * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-double.c, * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-float.c, * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-int.c, * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-longlong.c, * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-short.c, * gcc.target/powerpc/fold-vec-load-vec_xl-char.c, * gcc.target/powerpc/fold-vec-load-vec_xl-double.c, * gcc.target/powerpc/fold-vec-load-vec_xl-float.c, * gcc.target/powerpc/fold-vec-load-vec_xl-int.c, * gcc.target/powerpc/fold-vec-load-vec_xl-longlong.c, * gcc.target/powerpc/fold-vec-load-vec_xl-short.c, * gcc.target/powerpc/fold-vec-splat-floatdouble.c, * gcc.target/powerpc/fold-vec-splat-longlong.c, * gcc.target/powerpc/fold-vec-store-builtin_vec_xst-char.c, * gcc.target/powerpc/fold-vec-store-builtin_vec_xst-double.c, * gcc.target/powerpc/fold-vec-store-builtin_vec_xst-float.c, * gcc.target/powerpc/fold-vec-store-builtin_vec_xst-int.c, * gcc.target/powerpc/fold-vec-store-builtin_vec_xst-short.c, * gcc.target/powerpc/fold-vec-store-vec_vsx_st-char.c, * gcc.target/powerpc/fold-vec-store-vec_vsx_st-double.c, * gcc.target/powerpc/fold-vec-store-vec_vsx_st-float.c, * gcc.target/powerpc/fold-vec-store-vec_vsx_st-int.c, * gcc.target/powerpc/fold-vec-store-vec_vsx_st-longlong.c, * gcc.target/powerpc/fold-vec-store-vec_vsx_st-short.c, * gcc.target/powerpc/fold-vec-store-vec_xst-char.c, * gcc.target/powerpc/fold-vec-store-vec_xst-double.c, * gcc.target/powerpc/fold-vec-store-vec_xst-float.c, * gcc.target/powerpc/fold-vec-store-vec_xst-int.c, * gcc.target/powerpc/fold-vec-store-vec_xst-longlong.c, * gcc.target/powerpc/fold-vec-store-vec_xst-short.c, * gcc.target/powerpc/lvsl-lvsr.c, * gcc.target/powerpc/ppc-eq0-1.c, * gcc.target/powerpc/ppc-ne0-1.c, * gcc.target/powerpc/pr86731-fwrapv-longlong.c: Match power10 insns. * gcc.target/powerpc/lvsl-lvsr.c: Avoid file name match. diff --git a/gcc/testsuite/gcc.dg/pr56727-2.c b/gcc/testsuite/gcc.dg/pr56727-2.c index c54369ed25e..f055116772a 100644 --- a/gcc/testsuite/gcc.dg/pr56727-2.c +++ b/gcc/testsuite/gcc.dg/pr56727-2.c @@ -18,4 +18,4 @@ void h () /* { dg-final { scan-assembler "@(PLT|plt)" { target i?86-*-* x86_64-*-* } } } */ /* { dg-final { scan-assembler "@(PLT|plt)" { target { powerpc*-*-linux* && ilp32 } } } } */ -/* { dg-final { scan-assembler "bl f\n\\s*nop" { target { powerpc*-*-linux* && lp64 } } } } */ +/* { dg-final { scan-assembler {bl f(\n\s*nop|@notoc\n)} { target { powerpc*-*-linux* && lp64 } } } } */ diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c index 246f38fa6d1..d9f173b521e 100644 --- a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c +++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c @@ -25,6 +25,6 @@ main1 (void) with no word loads (lw, lwu, lwz, lwzu, or their indexed forms) or word stores (stw, stwu, stwx, stwux, or their indexed forms). */ -/* { dg-final { scan-assembler "\t(lvx|lxv|lvsr|stxv)" } } */ +/* { dg-final { scan-assembler "\t(lvx|lxv|lvsr|stxv|plxv|pstxv)" } } */ /* { dg-final { scan-assembler-not "\tlwz?u?x? " { xfail { powerpc-ibm-aix* } } } } */ /* { dg-final { scan-assembler-not "\tstwu?x? " } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-load-builtin_vec_xl-char.c b
Re: [RS6000] Power10 vec-splati-runnable multiple test failures
On Thu, Oct 22, 2020 at 09:08:58AM -0700, Carl Love wrote: > I see the error print statement you changed so that it would not wrap. > I have always been told it is best not to break the print statement > across two lines. The argument is it makes it harder to find it in the > code when using grep. In this case, it should be clear what file the > error statement is in. What is your take in general about breaking or > not breaking the body of an error print statement across lines? Submit to https://gcc.gnu.org/codingconventions.html which says "Formatting Conventions Line Length Lines shall be at most 80 columns." Seriously, it doesn't matter what you or I think about the formatting rules, they are in place for good reason. -- Alan Modra Australia Development Lab, IBM
Re: [RS6000] VSX_MM_SUFFIX
On Thu, Oct 22, 2020 at 11:03:14AM -0500, Segher Boessenkool wrote: > On Thu, Oct 22, 2020 at 08:58:10AM -0700, Carl Love wrote: > > OK, looks like VSX_MM_SUFFIX doesn't exist anymore. Don't know what > > happed to it. Thanks. > > It never existed on trunk. Please always regstrap patches on trunk > before committing. It feels unnecessary at times, but now you know > why :-) Regression testing wouldn't have caught the problem, unless Carl had done so on power10. "dg_do compile" tests don't run the assembler, and "dg_do run" tests with "dg_require_effective_target power10_hw" do nothing at all if not on power10 hardware. The single test that might have caught the problem, vec-splati-runnable.c wasn't even compiled. That's why I submitted https://gcc.gnu.org/pipermail/gcc-patches/2020-October/556757.html -- Alan Modra Australia Development Lab, IBM
Re: [RS6000] Adjust testcases for power10 instructions
On Thu, Oct 22, 2020 at 05:33:46PM +1030, Alan Modra wrote: > * gcc.dg/pr56727-2.c, > * gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c, > * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-char.c, > * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-double.c, > * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-float.c, > * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-int.c, > * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-longlong.c, > * gcc.target/powerpc/fold-vec-load-builtin_vec_xl-short.c, > * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-char.c, > * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-double.c, > * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-float.c, > * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-int.c, > * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-longlong.c, > * gcc.target/powerpc/fold-vec-load-vec_vsx_ld-short.c, > * gcc.target/powerpc/fold-vec-load-vec_xl-char.c, > * gcc.target/powerpc/fold-vec-load-vec_xl-double.c, > * gcc.target/powerpc/fold-vec-load-vec_xl-float.c, > * gcc.target/powerpc/fold-vec-load-vec_xl-int.c, > * gcc.target/powerpc/fold-vec-load-vec_xl-longlong.c, > * gcc.target/powerpc/fold-vec-load-vec_xl-short.c, > * gcc.target/powerpc/fold-vec-splat-floatdouble.c, > * gcc.target/powerpc/fold-vec-splat-longlong.c, > * gcc.target/powerpc/fold-vec-st-double.c, > * gcc.target/powerpc/fold-vec-st-longlong.c, > * gcc.target/powerpc/fold-vec-st-pixel.c, > * gcc.target/powerpc/fold-vec-store-builtin_vec_xst-char.c, > * gcc.target/powerpc/fold-vec-store-builtin_vec_xst-double.c, > * gcc.target/powerpc/fold-vec-store-builtin_vec_xst-float.c, > * gcc.target/powerpc/fold-vec-store-builtin_vec_xst-int.c, > * gcc.target/powerpc/fold-vec-store-builtin_vec_xst-short.c, > * gcc.target/powerpc/fold-vec-store-vec_vsx_st-char.c, > * gcc.target/powerpc/fold-vec-store-vec_vsx_st-double.c, > * gcc.target/powerpc/fold-vec-store-vec_vsx_st-float.c, > * gcc.target/powerpc/fold-vec-store-vec_vsx_st-int.c, > * gcc.target/powerpc/fold-vec-store-vec_vsx_st-longlong.c, > * gcc.target/powerpc/fold-vec-store-vec_vsx_st-short.c, > * gcc.target/powerpc/fold-vec-store-vec_xst-char.c, > * gcc.target/powerpc/fold-vec-store-vec_xst-double.c, > * gcc.target/powerpc/fold-vec-store-vec_xst-float.c, > * gcc.target/powerpc/fold-vec-store-vec_xst-int.c, > * gcc.target/powerpc/fold-vec-store-vec_xst-longlong.c, > * gcc.target/powerpc/fold-vec-store-vec_xst-short.c, > * gcc.target/powerpc/lvsl-lvsr.c, > * gcc.target/powerpc/ppc-eq0-1.c, > * gcc.target/powerpc/ppc-ne0-1.c, > * gcc.target/powerpc/pr86731-fwrapv-longlong.c: Match power10 insns. > * gcc.target/powerpc/lvsl-lvsr.c: Avoid file name match. > > Regstrapped powerpc64le-linux power10 and power8. OK? Some of these are wrong, sorry. I need to go over and check them thoroughly. Please consider the patch withdrawn. > diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-st-pixel.c > b/gcc/testsuite/gcc.target/powerpc/fold-vec-st-pixel.c > index 5b95cc73d8d..0a3aaec6d8d 100644 > --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-st-pixel.c > +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-st-pixel.c > @@ -19,4 +19,4 @@ testst_cst1 (vector pixel vp1, int i1, vector pixel * vpp) > return vec_st(vp1, 12, vpp); > } > > -/* { dg-final { scan-assembler-times {\mstvx\M} 2 } } */ > +/* { dg-final { scan-assembler-times {\mstvx\M|\mpstxv\M} 2 } } */ For example, this one. We don't get two stvx insns here on power10, as we should, but we do need altivec style addressing (ie. & -16). So pstxv should not be a pass. -- Alan Modra Australia Development Lab, IBM
Re: [PATCH] calls.c:can_implement_as_sibling_call_p REG_PARM_STACK_SPACE check
On Mon, Oct 12, 2020 at 10:37:05PM +1030, Alan Modra wrote: > Ping? > > On Fri, Oct 02, 2020 at 05:03:50PM +0930, Alan Modra wrote: > https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555309.html Ping^2 -- Alan Modra Australia Development Lab, IBM
[RS6000] Adjust testcases for power10 instructions
ian. */ /* { dg-do compile { target { powerpc64le-*-* } } } */ /* { dg-options "-O0 -Wno-deprecated" } */ -/* { dg-final { scan-assembler-times "lvsl" 2 } } */ -/* { dg-final { scan-assembler-times "lvsr" 2 } } */ -/* { dg-final { scan-assembler-times {\mlxvd2x\M|\mlxv\M} 2 } } */ +/* { dg-final { scan-assembler-times {\slvsl\s} 1 } } */ +/* { dg-final { scan-assembler-times {\slvsr\s} 1 } } */ +/* { dg-final { scan-assembler-times {\mlxvd2x\M|\mlxv\M|\mplxv\M} 2 } } */ /* { dg-final { scan-assembler-times {\m(?:v|xx)permr?\M} 2 } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/ppc-eq0-1.c b/gcc/testsuite/gcc.target/powerpc/ppc-eq0-1.c index 496a6e340c0..63799a6c8a8 100644 --- a/gcc/testsuite/gcc.target/powerpc/ppc-eq0-1.c +++ b/gcc/testsuite/gcc.target/powerpc/ppc-eq0-1.c @@ -7,4 +7,4 @@ int foo(int x) return x == 0; } -/* { dg-final { scan-assembler "cntlzw|isel" } } */ +/* { dg-final { scan-assembler {\mcntlzw\M|\misel\M|\msetbc\M} } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/ppc-ne0-1.c b/gcc/testsuite/gcc.target/powerpc/ppc-ne0-1.c index 63c4b6087df..31627b192ab 100644 --- a/gcc/testsuite/gcc.target/powerpc/ppc-ne0-1.c +++ b/gcc/testsuite/gcc.target/powerpc/ppc-ne0-1.c @@ -2,9 +2,9 @@ /* { dg-do compile } */ /* { dg-options "-O2 -mno-isel" } */ -/* { dg-final { scan-assembler-times "addic" 4 } } */ -/* { dg-final { scan-assembler-times "subfe" 1 } } */ -/* { dg-final { scan-assembler-times "addze" 3 } } */ +/* { dg-final { scan-assembler-times {\maddic\M|\msetbcr\M} 4 } } */ +/* { dg-final { scan-assembler-times {\msubfe\M|\msetbcr\M} 1 } } */ +/* { dg-final { scan-assembler-times {\maddze\M} 3 } } */ long ne0(long a) { diff --git a/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c b/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c index 1269fe635c6..b2cbc420922 100644 --- a/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c +++ b/gcc/testsuite/gcc.target/powerpc/pr86731-fwrapv-longlong.c @@ -30,5 +30,5 @@ vector signed long long splats4(void) /* { dg-final { scan-assembler-times {\mvspltis[bhw]\M} 0 } } */ /* { dg-final { scan-assembler-times {\mvsl[bhwd]\M} 0 } } */ -/* { dg-final { scan-assembler-times {\mlvx\M|\mlxv\M|\mlxvd2x\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mlvx\M|\mlxv\M|\mlxvd2x\M|\mplxv\M} 2 } } */ -- Alan Modra Australia Development Lab, IBM
[RS6000] Link power10 testcases
-git a/gcc/testsuite/gcc.target/powerpc/vec-ternarylogic-7.c b/gcc/testsuite/gcc.target/powerpc/vec-ternarylogic-7.c index b754195c2dc..45936f36869 100644 --- a/gcc/testsuite/gcc.target/powerpc/vec-ternarylogic-7.c +++ b/gcc/testsuite/gcc.target/powerpc/vec-ternarylogic-7.c @@ -1,5 +1,6 @@ -/* { dg-do run } */ -/* { dg-require-effective-target power10_hw } */ +/* { dg-do run { target { power10_hw } } } */ +/* { dg-do link { target { ! power10_hw } } } */ +/* { dg-require-effective-target power10_ok } */ /* { dg-options "-mdejagnu-cpu=power10" } */ #include diff --git a/gcc/testsuite/gcc.target/powerpc/vec-ternarylogic-9.c b/gcc/testsuite/gcc.target/powerpc/vec-ternarylogic-9.c index 0d9998e6341..71e52ee880b 100644 --- a/gcc/testsuite/gcc.target/powerpc/vec-ternarylogic-9.c +++ b/gcc/testsuite/gcc.target/powerpc/vec-ternarylogic-9.c @@ -1,5 +1,6 @@ -/* { dg-do run } */ -/* { dg-require-effective-target power10_hw } */ +/* { dg-do run { target { power10_hw } } } */ +/* { dg-do link { target { ! power10_hw } } } */ +/* { dg-require-effective-target power10_ok } */ /* { dg-require-effective-target int128 } */ /* { dg-options "-mdejagnu-cpu=power10" } */ diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c b/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c index f1e3860ee43..5862517eae9 100644 --- a/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c +++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c @@ -1,6 +1,7 @@ -/* { dg-do run } */ +/* { dg-do run { target { power10_hw } } } */ +/* { dg-do link { target { ! power10_hw } } } */ /* { dg-options "-mcpu=power10 -O2" } */ -/* { dg-require-effective-target power10_hw } */ +/* { dg-require-effective-target power10_ok } */ /* Check that the expected 128-bit instructions are generated if the processor supports the 128-bit integer instructions. */ diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c b/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c index 0c5695e4807..13b4c8afd4f 100644 --- a/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c +++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c @@ -1,6 +1,7 @@ -/* { dg-do run } */ +/* { dg-do run { target { power10_hw } } } */ +/* { dg-do link { target { ! power10_hw } } } */ /* { dg-options "-mcpu=power10 -O2" } */ -/* { dg-require-effective-target power10_hw } */ +/* { dg-require-effective-target power10_ok } */ /* Check that the expected 128-bit instructions are generated if the processor supports the 128-bit integer instructions. */ diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c b/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c index 93c3c720246..d58a6b0b682 100644 --- a/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c +++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c @@ -1,6 +1,7 @@ -/* { dg-do run } */ +/* { dg-do run { target { power10_hw } } } */ +/* { dg-do link { target { ! power10_hw } } } */ /* { dg-options "-mcpu=power10 -O2" } */ -/* { dg-require-effective-target power10_hw } */ +/* { dg-require-effective-target power10_ok } */ /* Check that the expected 128-bit instructions are generated if the processor supports the 128-bit integer instructions. */ diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c b/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c index 41dee583e59..9147d67c9d1 100644 --- a/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c +++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c @@ -1,6 +1,7 @@ -/* { dg-do run } */ +/* { dg-do run { target { power10_hw } } } */ +/* { dg-do link { target { ! power10_hw } } } */ /* { dg-options "-mcpu=power10 -O2" } */ -/* { dg-require-effective-target power10_hw } */ +/* { dg-require-effective-target power10_ok } */ /* Check that the expected 128-bit instructions are generated if the processor supports the 128-bit integer instructions. */ diff --git a/gcc/testsuite/gcc.target/powerpc/xxgenpc-runnable.c b/gcc/testsuite/gcc.target/powerpc/xxgenpc-runnable.c index 244c57365d4..d4040ea8b70 100644 --- a/gcc/testsuite/gcc.target/powerpc/xxgenpc-runnable.c +++ b/gcc/testsuite/gcc.target/powerpc/xxgenpc-runnable.c @@ -1,6 +1,7 @@ -/* { dg-do run } */ +/* { dg-do run { target { power10_hw } } } */ +/* { dg-do link { target { ! power10_hw } } } */ /* { dg-options "-mdejagnu-cpu=power10 -O2" } */ -/* { dg-require-effective-target power10_hw } */ +/* { dg-require-effective-target power10_ok } */ #include -- Alan Modra Australia Development Lab, IBM
[RS6000] dimode_off.c test
This tests behaviour near the limit of 16-bit signed offsets. If power10 prefix instructions are enabled, no such testing occurs. * gcc.target/powerpc/dimode_off.c: Add -mno-prefixed to options. Regstrapped powerpc64le-linux power10 and power8. OK? diff --git a/gcc/testsuite/gcc.target/powerpc/dimode_off.c b/gcc/testsuite/gcc.target/powerpc/dimode_off.c index 19ca40c508b..12718eafdd3 100644 --- a/gcc/testsuite/gcc.target/powerpc/dimode_off.c +++ b/gcc/testsuite/gcc.target/powerpc/dimode_off.c @@ -1,5 +1,5 @@ /* { dg-do assemble } */ -/* { dg-options "-O2 -fno-align-functions -fno-asynchronous-unwind-tables -mtraceback=no -save-temps" } */ +/* { dg-options "-O2 -fno-align-functions -fno-asynchronous-unwind-tables -mtraceback=no -mno-prefixed -save-temps" } */ void w1 (void *x, long long y) { *(long long *) (x + 32767) = y; } void w2 (void *x, long long y) { *(long long *) (x + 32766) = y; } -- Alan Modra Australia Development Lab, IBM
[RS6000] Non-pcrel tests when power10
These tests require -mno-pcrel because they are testing features of the non-pcrel ABI. * gcc.target/powerpc/cprophard.c: Add -mno-pcrel to options. * gcc.target/powerpc/float128-hw3.c: Likewise. * gcc.target/powerpc/pr79439-1.c: Likewise. * gcc.target/powerpc/pr79439-2.c: Likewise. * gcc.target/powerpc/r2_shrink-wrap.c: Likewise. Regstrapped powerpc64le-linux power10 and power8. OK? diff --git a/gcc/testsuite/gcc.target/powerpc/cprophard.c b/gcc/testsuite/gcc.target/powerpc/cprophard.c index f93081f0cb5..3c548495192 100644 --- a/gcc/testsuite/gcc.target/powerpc/cprophard.c +++ b/gcc/testsuite/gcc.target/powerpc/cprophard.c @@ -1,6 +1,6 @@ /* { dg-do compile { target { powerpc*-*-* && lp64 } } } */ /* { dg-skip-if "" { powerpc*-*-darwin* } } */ -/* { dg-options "-O2" } */ +/* { dg-options "-O2 -mno-pcrel" } */ /* { dg-final { scan-assembler {ld 2,(24|40)\(1\)} } } */ /* From a linux kernel mis-compile of net/core/skbuff.c. */ diff --git a/gcc/testsuite/gcc.target/powerpc/float128-hw3.c b/gcc/testsuite/gcc.target/powerpc/float128-hw3.c index b3bbeb25678..630c93dfcb7 100644 --- a/gcc/testsuite/gcc.target/powerpc/float128-hw3.c +++ b/gcc/testsuite/gcc.target/powerpc/float128-hw3.c @@ -1,7 +1,7 @@ /* { dg-do compile { target lp64 } } */ /* { dg-require-effective-target powerpc_p9vector_ok } */ /* { dg-require-effective-target float128 } */ -/* { dg-options "-mpower9-vector -O2 -ffast-math -std=c11" } */ +/* { dg-options "-mpower9-vector -O2 -ffast-math -std=c11 -mno-pcrel" } */ /* Test to make sure the compiler calls the external function instead of doing the built-in processing for _Float128 functions that have hardware support diff --git a/gcc/testsuite/gcc.target/powerpc/pr79439-1.c b/gcc/testsuite/gcc.target/powerpc/pr79439-1.c index 539c96f571e..8eb08a4e762 100644 --- a/gcc/testsuite/gcc.target/powerpc/pr79439-1.c +++ b/gcc/testsuite/gcc.target/powerpc/pr79439-1.c @@ -1,5 +1,5 @@ /* { dg-do compile { target { powerpc*-*-linux* && lp64 } } } */ -/* { dg-options "-O2 -fpic -fno-reorder-blocks -fno-inline-functions" } */ +/* { dg-options "-O2 -fpic -fno-reorder-blocks -fno-inline-functions -mno-pcrel" } */ /* On the Linux 64-bit ABIs, we eliminate NOP in the 'rec' call even if -fpic is used. The recursive call should call the local alias. The diff --git a/gcc/testsuite/gcc.target/powerpc/pr79439-2.c b/gcc/testsuite/gcc.target/powerpc/pr79439-2.c index b53af445265..9ebcf2579ab 100644 --- a/gcc/testsuite/gcc.target/powerpc/pr79439-2.c +++ b/gcc/testsuite/gcc.target/powerpc/pr79439-2.c @@ -1,5 +1,5 @@ /* { dg-do compile { target { powerpc*-*-linux* && lp64 } } } */ -/* { dg-options "-O2 -fpic -fno-reorder-blocks" } */ +/* { dg-options "-O2 -fpic -fno-reorder-blocks -mno-pcrel" } */ /* On the Linux 64-bit ABIs, we should not eliminate NOP in the 'rec' call if -fpic is used because rec can be interposed at link time (since it has an diff --git a/gcc/testsuite/gcc.target/powerpc/r2_shrink-wrap.c b/gcc/testsuite/gcc.target/powerpc/r2_shrink-wrap.c index b81b9b1c2ee..a74da38740b 100644 --- a/gcc/testsuite/gcc.target/powerpc/r2_shrink-wrap.c +++ b/gcc/testsuite/gcc.target/powerpc/r2_shrink-wrap.c @@ -1,5 +1,5 @@ /* { dg-do compile { target lp64 } } */ -/* { dg-options "-O2 -fdump-rtl-pro_and_epilogue" } */ +/* { dg-options "-O2 -fdump-rtl-pro_and_epilogue -mno-pcrel" } */ /* Verify we move the prologue past the TOC reference of 'j' and shrink-wrap the function. */ -- Alan Modra Australia Development Lab, IBM
[RS6000] Power10 vec-splati-runnable multiple test failures
FAIL: gcc.target/powerpc/vec-splati-runnable.c 1 blank line(s) in output FAIL: gcc.target/powerpc/vec-splati-runnable.c (test for excess errors) Excess errors: rs6000_emit_xxspltidp_v2df called ... and running the test fails. As the comment says /* Although the instruction says the results are not defined, it does seem to work, at least on Mambo. But no guarentees! */ So the simulator works but not real hardware. Regstrapped powerpc64le-linux on power10 and power8. OK? gcc/ * config/rs6000/rs6000.c (rs6000_emit_xxspltidp_v2df): Delete debug printf. Remove trailing ".\n" from inform message. Break long line. gcc/testsuite/ * gcc.target/powerpc/vec-splati-runnable.c: Don't abort on undefined output. diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index c0adc56387f..2f0ca7af030 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -27485,11 +27485,10 @@ rs6000_const_f32_to_i32 (rtx operand) void rs6000_emit_xxspltidp_v2df (rtx dst, long value) { - printf("rs6000_emit_xxspltidp_v2df called %ld\n", value); - printf("rs6000_emit_xxspltidp_v2df called 0x%lx\n", value); if (((value & 0x7F80) == 0) && ((value & 0x7F) != 0)) inform (input_location, - "the result for the xxspltidp instruction is undefined for subnormal input values.\n"); + "the result for the xxspltidp instruction " + "is undefined for subnormal input values"); emit_insn( gen_xxspltidp_v2df_inst (dst, GEN_INT (value))); } diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c b/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c index afb0bfdef3a..e5a4935644f 100644 --- a/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c +++ b/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c @@ -5,7 +5,7 @@ #define DEBUG 0 -#ifdef DEBUG +#if DEBUG #include #endif @@ -100,7 +100,7 @@ main (int argc, char *argv []) printf(" vresult_d[%i] = %e, expected_vresult_d[%i] = %e\n", i, vresult_d[i], i, expected_vresult_d[i]); #else -abort(); +; #endif } -- Alan Modra Australia Development Lab, IBM
[RS6000] VSX_MM_SUFFIX
gcc.target/powerpc/vsx_mask-count-runnable.c and others Assembler messages: Error: unrecognized opcode: `vcntmb' I'm applying this one as obvious. Ref https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549757.html * config/rs6000/vsx.md (vec_cntmb_, vec_extract_), (vec_expand_): Replace with . diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md index c023bc0baaa..d96269367bf 100644 --- a/gcc/config/rs6000/vsx.md +++ b/gcc/config/rs6000/vsx.md @@ -6035,7 +6035,7 @@ (match_operand:QI 2 "const_0_to_1_operand" "n")] UNSPEC_VCNTMB))] "TARGET_POWER10" - "vcntmb %0,%1,%2" + "vcntmb %0,%1,%2" [(set_attr "type" "vecsimple")]) (define_insn "vec_extract_" @@ -6043,7 +6043,7 @@ (unspec:SI [(match_operand:VSX_MM 1 "altivec_register_operand" "v")] UNSPEC_VEXTRACT))] "TARGET_POWER10" - "vextractm %0,%1" + "vextractm %0,%1" [(set_attr "type" "vecsimple")]) (define_insn "vec_expand_" @@ -6051,5 +6051,5 @@ (unspec:VSX_MM [(match_operand:VSX_MM 1 "vsx_register_operand" "v")] UNSPEC_VEXPAND))] "TARGET_POWER10" - "vexpandm %0,%1" + "vexpandm %0,%1" [(set_attr "type" "vecsimple")]) -- Alan Modra Australia Development Lab, IBM
Re: [PATCH 2/8] [RS6000] rs6000_rtx_costs for AND
On Wed, Oct 21, 2020 at 03:29:11PM -0500, Segher Boessenkool wrote: > Anyway: > > + || (outer_code == AND > + && rs6000_is_valid_2insn_and (x, mode))) > { > *total = COSTS_N_INSNS (1); > return true; > > It should return COSTS_N_INSNS (2) for that? No, it should not! /* (reg) is costed at zero by rtlanal.c:rtx_cost. That sets a baseline for rtx costs: If a constant is valid in an insn, it is free. */ -- Alan Modra Australia Development Lab, IBM
Re: [PATCH 2/8] [RS6000] rs6000_rtx_costs for AND
On Tue, Oct 20, 2020 at 01:55:56PM -0500, Segher Boessenkool wrote: > On Thu, Oct 08, 2020 at 09:27:54AM +1030, Alan Modra wrote: > > The existing "case AND" in this function is not sufficient for > > optabs.c:avoid_expensive_constant usage, where the AND is passed in > > outer_code. We'd like to cost AND of rs6000_is_valid_and_mask > > or rs6000_is_valid_2insn_and variety there, so that those masks aren't > > seen as expensive (ie. better to load to a reg then AND). > > > > * config/rs6000/rs6000.c (rs6000_rtx_costs): Combine CONST_INT > > AND handling with IOR/XOR. Move costing for AND with > > rs6000_is_valid_and_mask or rs6000_is_valid_2insn_and to > > CONST_INT. > > Sorry this took so long to review :-( > > On 64-bit BE this leads to *bigger* code, and closer observation shows > that some common sequences degrade on all configs. This seems to mostly > be about "andc" (and its dot form). It wasn't costed properly before, > but after your patch, a single instruction is replaced by three. > > Could you look into this? ~/build/gcc-alan/gcc$ for z in *.o; do if test `objdump -dr $z | grep andc | wc -l` != `objdump -dr ../../gcc/gcc/$z | grep andc | wc -l`; then echo $z; fi; done gimplify.o insn-emit.o insn-opinit.o insn-recog.o rs6000-string.o All of these are exactly the case I talked about in https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553919.html "Sometimes correct insn cost leads to unexpected results. For example: extern unsigned bar (void); unsigned f1 (unsigned a) { if ((a & 0x01000200) == 0x01000200) return bar (); return 0; } emits for a & 0x01000200 (set (reg) (and (reg) (const_int 0x01000200))) at expand time (two rlwinm insns) rather than the older (set (reg) (const_int 0x01000200)) (set (reg) (and (reg) (reg))) which is three insns. However, since 0x01000200 is needed later the older code after optimisation is smaller." Things have changed slightly since I wrote the above, with the two rlwinm insns being emitted at expand time, so you see (set (reg) (and (reg) (const_int 0xff0003ff))) (set (reg) (and (reg) (const_int 0x01fffe00))) but of course that doesn't change anything regarding the cost of "a & 0x01000200". -- Alan Modra Australia Development Lab, IBM
Re: [PATCH] calls.c:can_implement_as_sibling_call_p REG_PARM_STACK_SPACE check
Ping? On Fri, Oct 02, 2020 at 05:03:50PM +0930, Alan Modra wrote: https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555309.html -- Alan Modra Australia Development Lab, IBM
[PATCH 6/8] [RS6000] rs6000_rtx_costs multi-insn constants
This small patch to rs6000_rtx_costs considerably improves code generated for large constants in 64-bit code, teaching gcc that it is better to load a constant from memory than to generate a sequence of up to five dependent instructions. Note that the rs6000 backend does generate large constants as loads from memory at expand time but optimisation passes replace them with SETs of the value due to not having correct costs. PR 94393 * config/rs6000/rs6000.c (rs6000_rtx_costs): Cost multi-insn constants. diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 15a806fe307..76aedbfae6f 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -21343,7 +21343,9 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code, switch (code) { - /* On the RS/6000, if it is valid in the insn, it is free. */ + /* (reg) is costed at zero by rtlanal.c:rtx_cost. That sets a +baseline for rtx costs: If a constant is valid in an insn, +it is free. */ case CONST_INT: if (((outer_code == SET || outer_code == PLUS @@ -21404,6 +21406,17 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code, case CONST_DOUBLE: case CONST_WIDE_INT: + /* Subtract one insn here for consistency with the above code +that returns one less than the actual number of insns for +SETs. Don't subtract one for other than SETs, because other +operations will require the constant to be loaded to a +register before performing the operation. All special cases +for codes other than SET must be handled before we get +here. */ + *total = COSTS_N_INSNS (num_insns_constant (x, mode) + - (outer_code == SET ? 1 : 0)); + return true; + case CONST: case HIGH: case SYMBOL_REF: