[Bug testsuite/116080] [15 regression] New tests from r15-2233-g8d1af8f904a0c0 fail
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116080

--- Comment #19 from andi at firstfloor dot org ---
> Is musttail7.c specifically about testing recursive calls?

It is mainly about testing the frontend. In the middle end tail recursion is implemented quite differently but the frontend doesn't know. I guess we can drop this case here.

>
> As a workaround, I suggest to remove {} from foo's definition, such that it is
> TREE_PUBLIC, but TREE_ASM_WRITTEN is false.

Should be ok.
[Bug testsuite/116080] [15 regression] New tests from r15-2233-g8d1af8f904a0c0 fail
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116080

--- Comment #17 from andi at firstfloor dot org ---
This patch should fix it. Please confirm.

diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index f92f7f1af9c6..f58aed462971 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -12883,6 +12883,7 @@ proc check_effective_target_frame_pointer_for_non_leaf { } {
 # most trivial type.
 proc check_effective_target_tail_call { } {
     return [check_no_messages_and_pattern tail_call ",SIBCALL" rtl-expand {
+        // C++
         __attribute__((__noipa__)) void foo (void) { }
         __attribute__((__noipa__)) void bar (void) { foo(); }
     } {-O2 -fdump-rtl-expand-all}] ;# The "SIBCALL" note requires a detailed dump.
@@ -12893,6 +12894,7 @@ proc check_effective_target_tail_call { } {
 # is supported at -O0.
 proc check_effective_target_musttail { } {
     return [check_no_messages_and_pattern musttail ",SIBCALL" rtl-expand {
+        // C++
         __attribute__((__noipa__)) void foo (void) { }
         __attribute__((__noipa__)) void bar (void) { [[gnu::musttail]] return foo(); }
     } {-fdump-rtl-expand-all}] ;# The "SIBCALL" note requires a detailed dump.
@@ -12901,6 +12903,7 @@ proc check_effective_target_musttail { } {
 # Return 1 if the target can perform musttail for externals
 proc check_effective_target_external_musttail { } {
     return [check_no_messages_and_pattern external_musttail ",SIBCALL" rtl-expand {
+        // C++
         extern __attribute__((__noipa__)) void foo (void);
         __attribute__((__noipa__)) void bar (void) { [[gnu::musttail]] return foo(); }
     } {-fdump-rtl-expand-all}] ;# The "SIBCALL" note requires a detailed dump.
[Bug testsuite/116080] [15 regression] New tests from r15-2233-g8d1af8f904a0c0 fail
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116080

--- Comment #16 from andi at firstfloor dot org ---
On Mon, Sep 30, 2024 at 03:05:11PM +, clyon at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116080
>
> --- Comment #15 from Christophe Lyon ---
> (In reply to andi from comment #14)
> > The test relies on the
> >
> > gcc/testsuite/lib/target-supports.exp:check_effective_target_tail_call
> Are you sure?
> musttail7.c has:
> /* { dg-do compile { target { musttail && { c || c++11 } } } } */
>
> (so "musttail" and not "tail_call")

Yes musttail right.

> My log has:
> spawn -ignore SIGHUP
> /home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/builds/destdir/x86_64-pc-linux-gnu/bin/arm-eabi-g++
> musttail19057.c -fdiagnostics-plain-output -nostdinc++
> -I/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/builds/x86_64-pc-linux-gnu/arm-eabi/gcc-gcc.git~master-stage2/arm-eabi/libstdc++-v3/include/arm-eabi
> -I/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/builds/x86_64-pc-linux-gnu/arm-eabi/gcc-gcc.git~master-stage2/arm-eabi/libstdc++-v3/include
> -I/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/gcc.git~master/libstdc++-v3/libsupc++
> -I/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/gcc.git~master/libstdc++-v3/include/backward
> -I/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/gcc.git~master/libstdc++-v3/testsuite/util
> -fmessage-length=0 -fdump-rtl-expand-all -Wno-complain-wrong-lang
> -fdump-rtl-expand -S -o musttail19057.s
> (with no error)
>
> but for the actual testcase:
> spawn -ignore SIGHUP
> /home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/builds/destdir/x86_64-pc-linux-gnu/bin/arm-eabi-g++
> /home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/gcc.git~master/gcc/testsuite/c-c++-common/musttail7.c
> -fdiagnostics-plain-output -nostdinc++
> -I/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/builds/x86_64-pc-linux-gnu/arm-eabi/gcc-gcc.git~master-stage2/arm-eabi/libstdc++-v3/include/arm-eabi
> -I/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/builds/x86_64-pc-linux-gnu/arm-eabi/gcc-gcc.git~master-stage2/arm-eabi/libstdc++-v3/include
> -I/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/gcc.git~master/libstdc++-v3/libsupc++
> -I/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/gcc.git~master/libstdc++-v3/include/backward
> -I/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/gcc.git~master/libstdc++-v3/testsuite/util
> -fmessage-length=0 -std=c++11 -S -o musttail7.s

The "std=c++11" is different, so the probe is with C and the test with C++, and we run into PR115606 with the C++ compiler breaking structure tail calls on many architectures (which should really be fixed, but not here)

Need to change the probe to use C++.
[Bug testsuite/116080] [15 regression] New tests from r15-2233-g8d1af8f904a0c0 fail
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116080 --- Comment #14 from andi at firstfloor dot org --- The test relies on the gcc/testsuite/lib/target-supports.exp:check_effective_target_tail_call probe matching what the test does. Perhaps the way you are passing options doesn't pass them to the TCL based test code? The probe should be in the logs, perhaps you can find that or upload them somewhere.
[Bug testsuite/116500] gcc.dg/vect/vect-switch-ifcvt-1.c FAILs on sparc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116500

--- Comment #7 from andi at firstfloor dot org ---
Thanks. Updated patch. This one seems obvious so I'll commit soon.

diff --git a/gcc/testsuite/gcc.dg/vect/vect-switch-ifcvt-1.c b/gcc/testsuite/gcc.dg/vect/vect-switch-ifcvt-1.c
index f5352ef8ed7a..2e3a9ae3c249 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-switch-ifcvt-1.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-switch-ifcvt-1.c
@@ -1,4 +1,4 @@
-/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_condition } */
 #include "tree-vect.h"
 
 extern void abort (void);
[Bug target/116497] Need no_caller_saved_registers with SSE support
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116497

--- Comment #20 from andi at firstfloor dot org ---
On Tue, Aug 27, 2024 at 05:12:41PM +, hjl.tools at gmail dot com wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116497
>
> --- Comment #19 from H.J. Lu ---
> (In reply to andi from comment #18)
> > > > -mgeneral-regs-only works for this case, but breaks SSE.
> > >
> > > Why is __attribute__((no_caller_saved_registers)) needed on start?
> >
> > To maintain the standard ABI to its caller. Otherwise the final
> > return could clobber caller state.
>
> GCC should do the right thing without no_caller_saved_registers. If not,
> it is a GCC bug. Do you have a testcase to show such GCC bug?

The test case is the same, just commenting out SAVE_REGS.

You're right. It seems gcc does the right thing based on the callee ABIs. I hadn't realized that. So yes the attribute and the change are not really needed. Good news.
[Bug target/116497] Need no_caller_saved_registers with SSE support
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116497

--- Comment #18 from andi at firstfloor dot org ---
> > -mgeneral-regs-only works for this case, but breaks SSE.
>
> Why is __attribute__((no_caller_saved_registers)) needed on start?

To maintain the standard ABI to its caller. Otherwise the final return could clobber caller state.

Basically the optimization is to move all the save/returns to only one wrapper function, not every interpreter op function.
[Bug target/116497] Need no_caller_saved_registers with SSE support
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116497

--- Comment #13 from andi at firstfloor dot org ---
> --- Comment #11 from H.J. Lu ---
> Please provide a small testcase to show the issue.

You mean a test case for no_caller_saved_registers failing with SSE?

It's just

__attribute__((no_caller_saved_registers)) void foo(void) {}

(if you compile without any special options, or use the target dance as seen in the PHP github link above)

Or a test case for the intended register allocation benefits? That's more complicated and won't be small.
[Bug target/116497] Need no_caller_saved_registers with SSE support
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116497

--- Comment #12 from andi at firstfloor dot org ---
> no_call{er,ee}_saved_registers are i386-specific so how do we handle other
> ports? Are we going to require implementing them for all ports?

It's an optimization, so nothing is required. But yes if the other targets want the more efficient interpreters they would need an equivalent. clang's preserve_none/most currently is supported on aarch64 and x86-64.

Right now I'm just trying to get it to work for x86.
[Bug target/116497] Need no_caller_saved_registers with SSE support
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116497

--- Comment #9 from andi at firstfloor dot org ---
On Tue, Aug 27, 2024 at 08:02:53AM +, rguenth at gcc dot gnu.org wrote:
> Hmm, why would a tail call need to save extra regs over what the callers
> caller
> already saved? We're returning to that after all.

The tail call doesn't need to save anything extra, but the original function entering the tail call loop needs to, otherwise the tail calls clobbering stuff would violate assumptions of its caller.

So the transformation is

original_caller -> add no_caller_saved_registers
tailcall loop functions -> add no_callee_saved_registers
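
The split between the entry function and the tail-call loop functions can be pictured with a minimal C sketch. All names below (vm_state, op_inc, op_dec, op_halt, dispatch, run_interpreter) are hypothetical and not taken from the bug report. The attributes are only named in comments rather than applied, since whether they are accepted depends on the target, the compiler version, and the SSE restriction this bug is about.

```c
/* Hypothetical interpreter state; every name here is made up for illustration. */
struct vm_state {
    const unsigned char *pc;  /* next opcode to execute */
    long acc;                 /* accumulator */
};

enum { OP_INC = 0, OP_DEC = 1, OP_HALT = 2 };

static long dispatch(struct vm_state *vm);

/* Tail-call loop functions: each op ends by calling the next one.
   In the scheme discussed in this thread, these are the functions that
   would get no_callee_saved_registers, so they save nothing on entry. */
static long op_inc(struct vm_state *vm)
{
    vm->acc++;
    vm->pc++;
    return dispatch(vm);
}

static long op_dec(struct vm_state *vm)
{
    vm->acc--;
    vm->pc++;
    return dispatch(vm);
}

static long op_halt(struct vm_state *vm)
{
    return vm->acc;
}

/* Picks the next op; any opcode other than OP_INC/OP_DEC halts. */
static long dispatch(struct vm_state *vm)
{
    switch (*vm->pc) {
    case OP_INC: return op_inc(vm);
    case OP_DEC: return op_dec(vm);
    default:     return op_halt(vm);
    }
}

/* Entry wrapper ("original_caller" above): the one place that has to
   present the standard ABI to the rest of the program. */
long run_interpreter(const unsigned char *code)
{
    struct vm_state vm = { code, 0 };
    return dispatch(&vm);
}
```

Calling run_interpreter on a program of three OP_INC, one OP_DEC and an OP_HALT returns 2. The sketch only shows where the two attributes would sit; it makes no claim about the register allocation benefit itself.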
[Bug target/116497] Need no_caller_saved_registers with SSE support
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116497

--- Comment #8 from andi at firstfloor dot org ---
On Tue, Aug 27, 2024 at 07:58:30AM +, liuhongt at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116497
>
> Hongtao Liu changed:
>
>            What    |Removed                     |Added
>                  CC|                            |liuhongt at gcc dot gnu.org
>
> --- Comment #6 from Hongtao Liu ---
> (In reply to Andi Kleen from comment #1)
> > Disable check for no_caller_saved_registers enforcing non FP.
> >
> > diff --git a/gcc/config/i386/i386-options.cc
> > b/gcc/config/i386/i386-options.cc
> > index f79257cc764..cec652cc9e6 100644
> > --- a/gcc/config/i386/i386-options.cc
> > +++ b/gcc/config/i386/i386-options.cc
> > @@ -3639,8 +3639,8 @@ ix86_set_current_function (tree fndecl)
> >      reinit_regs ();
> >
> >    if (cfun->machine->func_type != TYPE_NORMAL
> > -      || (cfun->machine->call_saved_registers
> > -          == TYPE_NO_CALLER_SAVED_REGISTERS))
> > +      /* || (cfun->machine->call_saved_registers
> > +          == TYPE_NO_CALLER_SAVED_REGISTERS) */)
> >      {
> >        /* Don't allow SSE, MMX nor x87 instructions since they
> >           may change processor state.  */
>
> I think RA is smart enough to save and restore SSE,MMX or x87 registers, and
> we
> can remove TYPE_NO_CALLER_SAVED_REGISTERS part from this.
> Or are there any other concerns here regarding the comments? @hj

While that would work, it would add unnecessary overhead to the interpreter entry case which doesn't need to save these registers.

On the other hand maybe it is cheap enough.
[Bug target/116497] static functions ABI should be improved for SSE caller saved registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116497

--- Comment #4 from andi at firstfloor dot org ---
The change of the subject is incorrect. The transformation has nothing to do with static functions: consider LTO, or someone might write an interpreter spread over multiple files.

> always need to keep on changing the sources of the application rather than
> ever
> doing improvements to GCC that would help code that didn't even know about the
> attributes. The same is true of this whole musttail attribute. It does nothing
> except provide an error message. There are better ways of implementing that
> inside GCC really than the attribute that was added. GCC has -fopt-info which
> should have been used instead.

-fopt-info is not a useful interface for checking for a transformation that is needed for correctness.

> Here is another place where the attribute is just a way to hack around instead
> of improving GCC for ABI for static functions.

So yes the compiler might figure this out on its own for some cases, but it would need this proposed attribute semantic change anyways to do this.
[Bug c/83324] [feature request] Pragma or special syntax for guaranteed tail calls
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83324

--- Comment #32 from andi at firstfloor dot org ---
The feature is currently only supported with standard C/C++ attributes ([[clang/gnu::musttail]]), not __attribute__.

But given that you have existing code that uses the old syntax, and clang supports that too, we should probably add that too.
[Bug c/83324] [feature request] Pragma or special syntax for guaranteed tail calls
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83324

--- Comment #29 from andi at firstfloor dot org ---
The semantics of -foptimize-sibling-calls do not change. However if your program depends on sibling calls for correctness it should migrate to the new attribute.
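
A migration to the attribute might look like the following C sketch. The MUSTTAIL macro and the is_even/is_odd functions are hypothetical names chosen for illustration. The sketch assumes that __has_attribute(musttail) is true exactly on compilers that accept the __attribute__((musttail)) spelling on a return statement; comment #32 in this bug notes that, at the time of writing, GCC only accepted the [[gnu::musttail]] spelling, so this is an assumption about the compiler in use, not something the thread confirms. Without support the macro expands to nothing and the call is an ordinary one.

```c
/* Use the attribute only where the compiler reports support for it.
   Assumption: __has_attribute(musttail) implies that
   __attribute__((musttail)) is accepted on a return statement. */
#if defined(__has_attribute)
#  if __has_attribute(musttail)
#    define MUSTTAIL __attribute__((musttail))
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL /* no guarantee: plain return */
#endif

static int is_odd(unsigned n);

/* Mutually recursive pair with identical signatures, so the calls are
   eligible to be emitted as tail calls. */
static int is_even(unsigned n)
{
    if (n == 0)
        return 1;
    MUSTTAIL return is_odd(n - 1);
}

static int is_odd(unsigned n)
{
    if (n == 0)
        return 0;
    MUSTTAIL return is_even(n - 1);
}
```

Where the attribute is honoured, the compiler either emits the tail call or reports an error, which is the correctness guarantee that -foptimize-sibling-calls alone does not give.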
[Bug tree-optimization/116166] [13/14 Regression] risc-v (last) insn-emit-nn.c build takes hours
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116166

--- Comment #29 from andi at firstfloor dot org ---
> It might be interesting to have statistics on function sizes in
> insn-recog.cc to see if there's any outliers - if it's just very many
> there's nothing to do but split the file up.

Or LTO doing it for you.
[Bug middle-end/115091] Support value speculation in frontend
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115091

--- Comment #2 from andi at firstfloor dot org ---
On Wed, May 15, 2024 at 06:23:27AM +, rguenth at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115091
>
> --- Comment #1 from Richard Biener ---
> maybe represent this in a more formal way:

Makes sense.

>
>   node = __builtin_speculate (node + 1, node->next);
>
> and in GIMPLE:
>
>   _1 = node + 1;
>   _2 = node->next;
>   node = .SPECULATE (_1, _2);
>
> and during RTL expansion leave the desired representation to the targets
> which could use an UNSPEC to avoid optimizing it away.

There might be speculation constructs that don't use equal in more complex scenarios. For example you could speculate on a binary search. Perhaps could extend it to more conditions?

> Formally we'd say __builtin_speculate "uses" the first argument if its
> value is equal to the second argument value.
>
> That said, I heard CPUs have prefetchers that recognize this kind of list
> walking. I wonder why they wouldn't then also be able to speculate the
> load value like you say.

These are on the L2 or L3 level, not L1. This is about hiding L1 latencies, which normally doesn't have a prefetcher.

> Oh, and doesn't this explicit speculation of using node++ open up spectre
> style attacks?

It's not different than any other condition in this regard.
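
For reference, the list-walking case can be written by hand in GNU C roughly as follows. __builtin_speculate is only a proposal in this thread and does not exist, so the sketch uses an ordinary comparison instead. The struct layout and the name sum_list_speculative are hypothetical. The empty asm statement is an assumption about how to discourage the compiler from folding the guess back into the loaded value; it is a common trick, not a guarantee.

```c
#include <stddef.h>

/* Hypothetical list node, assumed to be usually allocated consecutively. */
struct node {
    struct node *next;
    long value;
};

long sum_list_speculative(struct node *node)
{
    long sum = 0;

    while (node != NULL) {
        sum += node->value;

        struct node *predicted = node + 1;  /* guess: next node is adjacent */
        struct node *actual = node->next;   /* the real load from memory */

        /* Empty asm hides the origin of 'predicted' from the optimizer.
           GNU C extension; intended to keep the guess as a separate value. */
        __asm__ volatile("" : "+r"(predicted));

        /* Continue with the guess when it matches the loaded pointer, so
           the CPU's branch prediction can run ahead of the load. */
        node = (actual == predicted) ? predicted : actual;
    }
    return sum;
}
```

The result is the same as a plain list walk for any list layout; only the dependency chain the CPU sees is meant to differ when the nodes happen to be consecutive.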
[Bug lto/99828] inlining failed in call to ‘always_inline’ ‘memcpy’: --param max-inline-insns-auto limit reached
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99828

--- Comment #15 from andi at firstfloor dot org ---
> Provided I cannot reproduce on the current kernel, where exactly does this
> come
> from?

Usually I had to do a longer loop of randconfig builds to find it. It only happens in some specific configs.
[Bug tree-optimization/42587] bswap not recognized for memory
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42587

--- Comment #13 from andi at firstfloor dot org ---
> The code in the initial report optimizes to bswap with GCC8.1 and later.
> Is that the test case you meant? GCC8.1 was released on May 2, 2018, well
> before your Nov comment, so maybe you meant something else.

Should be ok to close then. Thanks
[Bug target/93346] gcc does not generate BZHI
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93346

--- Comment #3 from andi at firstfloor dot org ---
> The bzhi patterns all match some odd if_then_else only to guard against
> inx & 255 == 0:

Is that guard needed? At least clang doesn't seem to care about it.
[Bug lto/66229] LTO fails with -fauto-profile on mcf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66229 --- Comment #4 from andi at firstfloor dot org --- Did some testing. Previously pretty much everything I tried failed. I don't have mcf, but git, less, gcc LTO+autofdo bootstrap all appear to work now. So it's likely fixed. Would be good if someone could confirm with the actual mcf.
[Bug lto/83375] partitioner partitions static arrays with label references
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83375

--- Comment #9 from andi at firstfloor dot org ---
It's in kernel/bpf/core.c.

It won't happen every time on a build unless you force 1to1 partitioning.
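
The construct in question, a static array initialized with label addresses, looks roughly like the following GNU C sketch. It is a made-up miniature, not the code from kernel/bpf/core.c; the opcode values and the name run_bytecode are hypothetical.

```c
/* Tiny bytecode interpreter using the GNU C "labels as values" extension. */
enum { OP_INC = 0, OP_DEC = 1, OP_HALT = 2 };

/* 'code' must contain only the three opcodes above and end with OP_HALT. */
int run_bytecode(const unsigned char *code)
{
    /* Static array whose initializers are addresses of labels in this
       function. The array is a separate static object that refers back
       into the function body. */
    static const void *const jump_table[] = {
        [OP_INC]  = &&do_inc,
        [OP_DEC]  = &&do_dec,
        [OP_HALT] = &&do_halt,
    };
    int acc = 0;

    goto *jump_table[*code];

do_inc:
    acc++;
    code++;
    goto *jump_table[*code];

do_dec:
    acc--;
    code++;
    goto *jump_table[*code];

do_halt:
    return acc;
}
```

Because the table is a static variable that references labels inside run_bytecode, the two have to stay together; that is the dependency at issue in this bug when the table and the function end up in different LTO partitions.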
[Bug testsuite/77684] many tree-prof testsuite failures in parallel make check
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77684

--- Comment #8 from andi at firstfloor dot org ---
> The log shows the same errors:
> spawn [open ...]
> Permission error mapping pages.
> Consider increasing /proc/sys/kernel/perf_event_mlock_kb,
> or try again with a smaller value of -m/--mmap_pages.
> (current value: 4294967295,0)

That's strange. It should be smaller with the -m flag. Perhaps missed some case.

-Andi

> FAIL: gcc.dg/tree-prof/pr52150.c execution,-g
[Bug gcov-profile/71672] inlining indirect calls does not work with autofdo
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71672

--- Comment #2 from andi at firstfloor dot org ---
On Wed, Apr 12, 2017 at 12:15:26PM +, marxin at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71672
>
> Martin Liška changed:
>
>            What    |Removed                       |Added
>                  CC|                              |marxin at gcc dot gnu.org
>            Assignee|unassigned at gcc dot gnu.org |marxin at gcc dot gnu.org
>
> --- Comment #1 from Martin Liška ---
> I will take a look as soon as autofdo issues will be resolved:
> https://github.com/google/autofdo/issues/43

event 79 is an event generated by the perf tools. As a workaround you can use an older version of the perf tool (it should work even with newer kernels).

The quipper library autofdo relies on is notoriously bad at keeping up with perf changes; it only supports whatever ancient version Chrome is currently shipping.

-Andi
[Bug lto/66305] -ffat-lto-objects create unreproducible objects
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66305

--- Comment #4 from andi at firstfloor dot org ---
> --- Comment #3 from lunar at debian dot org ---
> Richard Biener:
> > I think they become deterministic with -frandom-seed=0 for example.
> > They are not deterministic to support partial linking of LTO objects as far
> > as I know.
>
> They are indeed reproducible with `-frandom-seed=0`. But I guess there's a
> downside to that, right?

The downside is that incremental linking (ld -r) does not work. But the random seed is used for other things in gcc too, so you may have other problems too.

> > similar, which would be a bit more deterministic, but there are still
> > ways this could break (e.g. if someone copies object files around)
>
> Would using a hash over the section content work? In any cases, in the context
> of Debian (this applies for FreeBSD as well), we have a canonical build path
> so it would probably be fine to use it as the source of the hash.
>
> I guess one could already do this without further help by giving the
> build path to `-frandom-seed=`. This only would need some Makefile trickery.

Yes. It would probably be easier in gcc, e.g. with a new option.
[Bug lto/66305] -ffat-lto-objects create unreproducible objects
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66305

--- Comment #2 from andi at firstfloor dot org ---
On Wed, May 27, 2015 at 12:21:04PM +, rguenth at gcc dot gnu.org wrote:
> --- Comment #1 from Richard Biener ---
> I think they become deterministic with -frandom-seed=0 for example. They are
> not deterministic to support partial linking of LTO objects as far as I know.

Yes that's right. It prevents the linker from merging sections.

In theory it would be possible to use the hash of the full output path name or similar, which would be a bit more deterministic, but there are still ways this could break (e.g. if someone copies object files around)

How about your "deterministic build" tools just learn to ignore that suffix?

-Andi
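
The idea of deriving the seed from the output path name can be sketched outside the compiler, for example in a small build helper. The following C sketch is only an illustration: format_seed_option and fnv1a64 are hypothetical names, FNV-1a is an arbitrary choice of hash (the thread does not name one), and nothing here reflects how GCC itself derives its seed.

```c
#include <stdint.h>
#include <stdio.h>

/* 64-bit FNV-1a hash of a NUL-terminated string. */
static uint64_t fnv1a64(const char *s)
{
    uint64_t h = 14695981039346656037ULL;  /* FNV offset basis */

    while (*s != '\0') {
        h ^= (unsigned char)*s++;
        h *= 1099511628211ULL;             /* FNV prime */
    }
    return h;
}

/* Writes "-frandom-seed=<16 hex digits>" for the given output path into
   buf. Returns the number of characters snprintf would have written. */
int format_seed_option(char *buf, size_t len, const char *output_path)
{
    return snprintf(buf, len, "-frandom-seed=%016llx",
                    (unsigned long long)fnv1a64(output_path));
}
```

The same path always yields the same option, so rebuilding in a canonical build directory gives identical seeds. As noted above, this can still break when object files are copied or built under a different path.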
[Bug lto/61969] [4.8/4.9/5 Regression] wrong code by LTO on i?86-linux-gnu (affecting trunk, 4.9.x, and 4.8.x)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61969

--- Comment #8 from andi at firstfloor dot org ---
> only automatic vars may have a VALUE_EXPR, certainly not 'extern const' stuff.

It's an initializer for an automatic var in the source

func_52()
{
    struct S0 foo = { ... }
    ...
}

>
> What does func_52 look like before the NRV pass? It must be sth like
>
> = l_55;
>
> ?

Yes it looks like that.
[Bug lto/61635] LTO partitioner does not handle &&label in statics
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61635

--- Comment #2 from andi at firstfloor dot org ---
Test case

git clone https://github.com/andikleen/linux-misc -b lto-linus-3.15

Build with the attached kernel config (copy to .config in the build dir) and 4.9

-Andi

On Fri, Jun 27, 2014 at 11:04:00PM +, hubicka at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61635
>
> Jan Hubicka changed:
>
>            What    |Removed                       |Added
>              Status|UNCONFIRMED                   |ASSIGNED
>    Last reconfirmed|                              |2014-06-27
>                  CC|                              |hubicka at gcc dot gnu.org
>            Assignee|unassigned at gcc dot gnu.org |hubicka at gcc dot gnu.org
>      Ever confirmed|0                             |1
>
> --- Comment #1 from Jan Hubicka ---
> Yep, this is a known problem. We do not represent labels in symbol table and
> thus LTO partitioner has no clue about them. I would be interested to see the
> example that fails - theoretically the problematic static variable should
> always end up in the same partition as the function using it, because that
> function is no inline and no clone and should not get duplicated. So kernel &
> friends should fail only with -flto-partition=1to1 not with balanced/none/one
> algorithms.
>
> I plan correct solution - I already implemented labels for symtab some time
> ago
> but I would like to re-do it on the new API.
> The poor-man's class hierarchy in C was not very pretty and a lot of code is
> still not updated to not assume two basic types of symbols (variables and
> functions). With Martin Liska we are updating the APIs and once it is done I
> will add symbols for non-local labels and const_decls.
[Bug lto/50602] ICE in tree_nrv, at tree-nrv.c:155 during large LTO build
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50602

--- Comment #24 from andi at firstfloor dot org 2012-05-07 13:08:08 UTC ---
On Mon, May 07, 2012 at 08:54:10AM +, rguenther at suse dot de wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50602
>
> --- Comment #23 from rguenther at suse dot de
> 2012-05-07 08:54:10 UTC ---
> On Sat, 5 May 2012, hubicka at gcc dot gnu.org wrote:
>
> > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50602
> >
> > Jan Hubicka changed:
> >
> >            What    |Removed                     |Added
> >
> >                  CC|                            |hubicka at gcc dot gnu.org
> >
> > --- Comment #22 from Jan Hubicka 2012-05-05
> > 10:02:15 UTC ---
> > great, backporting to 4.7 should be easy, right?
>
> I'm not sure we should backport option handling changes. In general
> we still expect consistent options at compile and link time (that did
> not change in 4.7 - what changed was the implementation).

I worked around it in my kernel build, so it's not critical. But there are other problems back now :-(

-Andi