[Bug target/89627] Miscompiled constructor call
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89627 Jim Wilson changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2019-03-08 Ever confirmed|0 |1 --- Comment #1 from Jim Wilson --- This is the same problem that Andrew Burgess just reported. He found it while looking at gdb testsuite failures. He reported it against the psABI https://github.com/riscv/riscv-elf-psabi-doc/issues/88 This has a link to a patch, but the patch is trying to fix the ABI which is wrong. We only need the last two hunks of the patch which fix a bug in riscv_pass_fpr_single. https://github.com/T-J-Teru/gcc/commit/ae183138a618a01acf2cf009544d4ce0dda61ab9 I'm waiting for Andrew to submit the patch so I can approve it.
[Bug target/89411] RISC-V backend will generate wrong instruction for longlong type like lw a3,-2048(a5)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89411 --- Comment #4 from Jim Wilson --- Created attachment 45925 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45925=edit work in progress patches for a fix This implements two ways to fix it, a simple way that just fails for BLKmode, and a more complex way that passes in a SYMBOL_REF and tries to extract decl size and alignment from it. There are ifdefs so you can choose which one gets used. I'd like to do a lot more testing to see what the optimization effect is from the two patches, and to see how often they trigger, and to make sure they won't generate a compiler error on anything important (like gcc, glibc, newlib, linux kernel).
[Bug target/90015] riscv: typo "intterupt" in diagnostic
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90015 Jim Wilson changed: What|Removed |Added CC||wilson at gcc dot gnu.org --- Comment #2 from Jim Wilson --- I tested Jakub's patch with 32-bit elf and 64-bit linux cross compilers. There were no regressions. The patch is OK.
[Bug debug/90109] gstabs flag generates wrong entry for long on x86_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90109 --- Comment #2 from Jim Wilson --- long long and long double did not exist when stabs was invented. Also, 64-bit machines and C++ did not exist at the time. Also, unfortunately, stabs wasn't designed to be extensible. So there is no way to describe anything that doesn't exist in K C without using incompatible extensions, such as those enabled by the the gdb stabs extensions. So yes, invalid, though I'd add an extra caution that stabs should only be used if you need compatibility with old obsolete systems. Otherwise, it is a waste of time, as we stopped actively maintaining it about a decade ago, and probably should just drop the support from gcc.
[Bug target/89955] riscv.h improperly defines STARTFILE_PREFIX_SPEC spec
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89955 Jim Wilson changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #7 from Jim Wilson --- Fixed on mainline.
[Bug target/89955] riscv.h improperly defines STARTFILE_PREFIX_SPEC spec
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89955 --- Comment #6 from Jim Wilson --- Author: wilson Date: Thu Jun 6 23:18:48 2019 New Revision: 272021 URL: https://gcc.gnu.org/viewcvs?rev=272021=gcc=rev Log: RISC-V: Move STARTFILE_PREFIX_SPEC into target OS files. gcc/ PR target/89955 * config/riscv/riscv.h (STARTFILE_PREFIX_SPEC): Deleted. * config/riscv/freebsd.h (STARTFILE_PREFIX_SPEC): Added. * config/riscv/linux.h (STARTFILE_PREFIX_SPEC): Added. Modified: trunk/gcc/ChangeLog trunk/gcc/config/riscv/freebsd.h trunk/gcc/config/riscv/linux.h trunk/gcc/config/riscv/riscv.h
[Bug target/89955] riscv.h improperly defines STARTFILE_PREFIX_SPEC spec
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89955 --- Comment #5 from Jim Wilson --- OK, sounds like we need to move STARTFILE_PREFIX_SPEC into various OS header files then. That will require some testing. I caught a virus last week and am behind on everything, so I haven't had a chance to try to do this yet.
[Bug target/90419] RISCV --with-multilib-list support is somewhat incomplete
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90419 --- Comment #4 from Jim Wilson --- GCC worked out of the box before we started upstreaming the toolchain. And it will work out of the box again when we are done with the upstreaming. But meanwhile, we are still in the middle of upstreaming and some things only work if you have the right collection of patches or the right collection of source trees. If I change the FSF gcc tree to try to fix this, then I break the github riscv/riscv-gnu-toolchain tree, which is using the same gcc sources, and has working multilib support. I don't want to have to maintain two different gcc source trees, so I'm leaving the FSF gcc broken temporarily because I don't want to break the riscv/riscv-gnu-toolchain tree. This problem will go away when glibc is finished upstreaming. This is a problem I keep raising inside SiFive and the RISC-V Foundation because a number of things are blocked on this glibc upstreaming problem. It isn't clear when it will get fixed though. I might have to volunteer to do it myself at some point, but I'm still hoping someone else does it.
[Bug target/90419] RISCV --with-multilib-list support is somewhat incomplete
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90419 --- Comment #6 from Jim Wilson --- Comment on attachment 46379 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46379 patch: disable 32bit riscv abis in gcc multilib You are missing the g aliases. And you still have 32-bit dirnames. We have a generator script that is used to generate this file. It gives the command line at the top. You can just rerun this without the 32-bit arches and abis. Try rynning: ./multilib-generator rv64imac-lp64-rv64ima,rv64imaf,rv64imafd,rv64imafc,rv64imafdc- rv64imafdc-lp64d-rv64imafd- that should be on a single line. Then pipe that into a file to use.
[Bug target/90419] RISCV --with-multilib-list support is somewhat incomplete
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90419 --- Comment #2 from Jim Wilson --- I talked to Palmer. Apparently what you want to do is build multilibs for lp64 and lp64d, to test the linux multilib support. That isn't currently supported. I would suggest applying a local patch to change the gcc/config/riscv/t-linux-multilib file with one that just has the lp64 and lp64d multilibs. If you are trying to do this natively, you are likely to run into problems along the way, as no one has tried to do this before as far as I know.
[Bug target/90419] RISCV --with-multilib-list support is somewhat incomplete
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90419 Jim Wilson changed: What|Removed |Added CC||wilson at gcc dot gnu.org --- Comment #1 from Jim Wilson --- For cross compilers, --enable-multilib gives a decent selection of multilib targets. For native compilers, we don't have linux kernel support to run 32-bit code on 64-bit parts. Also, I'm not aware of any hardware that has the hardware support necessary to run 32-bit code on 64-bit parts, which makes the linux kernel support somewhat of a moot problem. So, currently, there is no use for building native multilib compilers. There is however a problem that the RISC-V ABI requires that libraries go in very specific places, e.g. /lib64/lp64d for the default linux ABI. In order to get the library paths correct in a native linux system compiler that will be installed into /usr/bin, one must build gcc multilib, and one must simultaneously disable the multilibs that won't work. This is what --with-multilib-list is for. If you configure --enable-multilib --with-multilib-list=lp64d then you will get the correct result. This is how Fedora configures their RISC-V compilers for instance. Supporting more than one ABI in --with-multilib-list is a major project. Currently this is implemented via sed scripts, but the fact that we have two interrelated options here -march and -mabi means that this gets very complicated very quickly. I looked at this and gave up, and decided that support for only one ABI at a time was practical. If you want more than one ABI, then this is an enhancement request not a bug, and it is unclear when or if this will get fixed. 32-bit glibc support exists but has not been merged into the glibc tree yet. This was pending on the y2038 patches. Currently all 32-bit linux targets use a 32-bit time_t that will overflow in 2038. This requires an unfortunate ABI break to fix. As RISC-V is a new target, we were waiting for the y2038 infrastructure patches to go into first (both linux kernel and glibc), so that we could avoid an ABI break shortly after contributing the 32-bit RISC-V port. I think all of the infrastructure patches are in now, but likely need some testing. I just recently reported a glibc bug due to these changes, which turned out to actually be a linux kernel bug. Anyways, the main problem now is finding a volunteer to pick up the 32-bit RISC-V glibc work and finish it. Meanwhile, the 32-bit glibc patches are available as a patch set that was sent to the libc-alpha mailing list last year, and as patches on the riscv-glibc-2.29 branch in the github.com/riscv/riscv-glibc tree. This branch is used by and known to work with the riscv/meta-riscv open-embedded tree. I agree the current situation is not ideal, but RISC-V is a very new target that isn't fully fleshed out yet, and the situation will get better in a year or two. This is mostly a user education problem at the moment. By the way, there is already a gentoo RISC-V port. A co-worker Palmer Dabbelt did one. I don't know anything about gentoo though, and I haven't tried using it. So you would have to ask him about it if interested.
[Bug other/63509] Misleading error message when building gcc without C++ compiler installed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63509 Jim Wilson changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2019-05-02 CC||wilson at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #7 from Jim Wilson --- Confirmed. Except that current sources now say configure: error: uint64_t or int64_t not found We have a configure check to verify that the C compiler works. We are missing one for the C++ compiler. In configure we have $as_echo_n "checking whether the C compiler works... " >&6; } and there is no equivalent for the C++ compiler.
[Bug middle-end/21111] IA-64 NaT consumption faults due to uninitialized register reads
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=2 --- Comment #14 from Jim Wilson --- https://wiki.debian.org/Ports/ia64 James Clarke has been active recently on the binutils and/or gcc mailing lists. My IA-64 work has dwindled down to nothing, as RISC-V work has kept me too busy to do anything else. With the architecture already set to end-of-life next year, I'm not planning to do anymore IA-64 work.
[Bug middle-end/21111] IA-64 NaT consumption faults due to uninitialized register reads
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=2 --- Comment #15 from Jim Wilson --- See also PR 87338 which has a response earlier today from John Paul Adrian Glaubitz.
[Bug other/63509] Misleading error message when building gcc without C++ compiler installed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63509 --- Comment #8 from Jim Wilson --- I filed an autoconf bug http://savannah.gnu.org/support/index.php?109676 The problem here is that autoconf only verifies that the first AC_PROG_X compiler is working. And since we include AC_PROG_CC before AC_PROG_CXX, only the C compiler is tested to see if it works. But since we are using the C++ compiler for all of the configure checks, and for the build, we really should be verifying that the C++ compiler works. We can work around the autoconf problem by putting AC_PROG_CXX before AC_PROG_CC. This will verify that the C++ compiler works, but will stop verifying that the C compiler works, which is probably OK.
[Bug target/61577] [4.9.0] can't compile on hp-ux v3 ia64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61577 Jim Wilson changed: What|Removed |Added CC||wilson at gcc dot gnu.org --- Comment #35 from Jim Wilson --- I never had access to ia64 hpux unfortunately. I did ask them for a copy once, but they couldn't figure out why I wanted it, and so they wouldn't give it to me. Steve Ellcey was working for HP at the time and maintaining the ia64-hpux support. You could try asking him. I think the ia64-hpux support was always poor. Too many things done differently than ia64-linux, requiring inconvenient hacks on our side to make our tools work on their system. I don't know if the binutils support was ever usable. The gcc support was usable at one time, but I don't know the details.
[Bug debug/90109] gstabs flag generates wrong entry for long on x86_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90109 --- Comment #5 from Jim Wilson --- Stabs requires that we emit info for all of the base types at the start. But if one of the base types does not exist for a 32-bit K C target, then we are struck, as that can't be described. And if we can't describe one or more of the base types, then we can't safely describe derived types. Currently, we just lie about the base types we can't describe, which lets us emit debug info for the derived types that we can describe. It is possible to describe some things as derived types that can't be described as base types. But if we aren't allowed to lie about base types, then we probably have to refuse to emit any debug info at all for the file, which doesn't help. You can get correct debug info for this testcase if you use stabs-plus-gdb-extensions. At least until you try to use some language feature invented in the last decade or two that we never added stabs-plus-gdb-extensions support for, and then you are screwed again. Which takes me back to don't use stabs again.
[Bug target/91041] New: obsolete arm-pe support in arm backend
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91041 Bug ID: 91041 Summary: obsolete arm-pe support in arm backend Product: gcc Version: 9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: wilson at gcc dot gnu.org Target Milestone: --- The arm/pe.h file was removed in 2012, but we still have about half a dozen references in the arm.c file to the ARM_PE macro that was defined in that file. The arm-symbian port uses some of the same features, but it doesn't define ARM_PE, the symbol support is handled via TARGET_DLLIMPORT_DECL_ATTRIBUTES. The obsolete code should be cleaned up.
[Bug target/91041] obsolete arm-pe support in arm backend
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91041 --- Comment #1 from Jim Wilson --- FYI I noticed this while trying to answer a question on gcc-help, otherwise I'm not doing any arm work at present.
[Bug target/91420] relocation truncated to fit: R_RISCV_HI20 against `.LC0' with GCC 8.2/8.3 with "-O2" on RISC-V
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91420 Jim Wilson changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2019-08-12 CC||wilson at gcc dot gnu.org Ever confirmed|0 |1
[Bug target/91602] GCC fails to build for riscv in a combined tree due to misconfigured leb128 support
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91602 Jim Wilson changed: What|Removed |Added CC||wilson at gcc dot gnu.org --- Comment #3 from Jim Wilson --- Combined tree builds are obsolete and shouldn't be used anymore. Since this only shows up in a combined tree build, I don't consider it important. If you build the toolchain the correct way, building binutils and gcc separately, the build does work. My preferred solution would be to kill combined tree build support.
[Bug target/91602] GCC fails to build for riscv in a combined tree due to misconfigured leb128 support
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91602 --- Comment #6 from Jim Wilson --- By the way, the underlying problem here is, as Andrew Waterman mentioned, that the RISC-V linker does aggressive link time relaxations to reduce code size, and this makes lib128 with label subtraction unsafe. The RISC-V port is not the only port that has this problem. The RISC-V port is just the only one that gets it right. The others can silently produce incorrect debug info. I have evidence that every binutils target except RISC-V that does aggressive link time relaxation to reduce code size is doing it wrong. There is a bug in binutils bugzilla where I listed linker relaxation code size related issues, but I have since found and fixed other problems in the RISC-V port, so it is incomplete. https://sourceware.org/bugzilla/show_bug.cgi?id=22756#c2
[Bug target/91602] GCC fails to build for riscv in a combined tree due to misconfigured leb128 support
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91602 --- Comment #5 from Jim Wilson --- The wiki is wrong. Combined tree builds should not be used anymore. Combined tree builds date back to when Cygnus was maintainer for everything. We put everything in a single source tree, and wrote configure and makefile support so that you could check them out with a single command, and configure and build them with a single command. But Cygnus is no longer controlling everything, and different packages are now in different source trees, and are meant to be compiled separately. When you build packages separately, each package after the first uses feature tests to configure itself appropriately to use the earlier packages. When you use a combined tree build, feature tests can't be used because earlier packages haven't been installed yet, so configure just makes guesses. We have the most commonly correct results hardwired in as guesses. This was fine when packages were smaller, and we had a smaller number of targets. Nowadays, there is too much risk that the guesses will be wrong. A combined tree build may not give you the exact same result as building separately because of this. Of course, if you just need a throw away toolchain to check how something works for a different architecture than your own, this is fine. But if you want to make a product release that will be used by customers, this can be dangerous. None of the major build scripts use combined tree builds. None of the linux distros (Fedora, Debian, OpenSuse) do this. None of the embedded linux build systems to this (OpenEmbedded/Yocto, Buildroot). crosstool-ng doesn't do builds this way. And github.com/riscv/riscv-gnu-toolchain doesn't do builds this way. There are so many easy ways to build toolchains that we don't need combined tree builds to work. This is of course my personal opinion, and I expect that people will disagree with me. But I rarely if ever see anyone update the combined tree support, so I will continue to tell people that it is dangerous to rely on it.
[Bug target/91635] wrong code at -O2 with __builtin_add_overflow() and shifts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91635 --- Comment #13 from Jim Wilson --- I see 5 broken patterns which matches the list already given. The four testcases are all triggering on the same splitter, which is the first define_split, lshrsi3_zero_extend_3+1. The second define_split, lshrsi3_zero_extend_3+2, requires a mask which has at least 32 zeros in the low bits. I don't think that this can occur with a paradoxical reg, because the only valid input bits are all undefined bits in the upper half of the paradoxical reg. We may as well fix it anyways though to be consistent, but this means writing a testcase appears to be impossible. The other three are post reload splitters, which will be hard to write a testcase for. So it looks like we only have one useful testcase here, but that is better than nothing. Kito and I can test RISC-V patches. Using SUBREG_REG would require generating a new reg rtx with the wider mode. May not be worth the trouble considering how uncommon that case will be. For the second define_split, the existing testcases that trigger it are unsigned long sub1 (unsigned long i) { return (i >> 32) << 32; } unsigned long sub2 (unsigned long i) { return (i >> 63) << 63; } Like I mentioned above, I don't see how that can be rewritten to use a paradoxical reg.
[Bug target/91635] wrong code at -O2 with __builtin_add_overflow() and shifts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91635 --- Comment #22 from Jim Wilson --- > First of all, I think you need to use :DI instead of :GPR for the last > define_split, as it shouldn't be iterated with. Yes, sloppy copy paste. I will fix before committing. > Second, doesn't this mean that the splitters will be matched then only > during combine and never afterwards? The define_splits were specifically written to match patterns created by combine that aren't valid instructions. So I think it is OK if they can only be matched during combine. My empirical results are good looking at libc and libstdc++ code size, I haven't seen any cases where I get worse code size by modifying them to use clobbers. > The can_create_pseudo_p () -> gen_reg_rtx (mode) way of adding intermediate > temporaries is used heavily in many other backends, just look at say i386, > rs6000 or aarch64 backends for examples (I've looked only at those 3 and > found many spots in each), so if missing REG_DEAD notes affect code quality, > perhaps it would be more useful to try to change the combiner to add those > notes (say before splitting remember highest pseudo number, after the > splitter see if we've added some new pseudos and by scanning the resulting > sequence see if we shouldn't add REG_DEAD notes for those somewhere, because > they shouldn't be used anywhere outside of the sequence. Yes, long term, I think it would be a good idea to improve combine to add REG_DEAD notes for temporary regs created by splitters. I don't plan to work on that as I have too much other stuff to do. I'd be willing to create a bug report if you want me to.
[Bug target/91635] wrong code at -O2 with __builtin_add_overflow() and shifts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91635 --- Comment #25 from Jim Wilson --- Author: wilson Date: Thu Sep 5 20:32:55 2019 New Revision: 275444 URL: https://gcc.gnu.org/viewcvs?rev=275444=gcc=rev Log: RISC-V: Fix bad insn splits with paradoxical subregs. Shifting by more than the size of a SUBREG_REG doesn't work, so we either need to disable splits if an input is paradoxical, or else we need to generate a clean temporary for intermediate results. Jakub wrote the first version of this patch, so gets primary credit for it. gcc/ PR target/91635 * config/riscv/riscv.md (zero_extendsidi2, zero_extendhi2, extend2): Don't split if paradoxical_subreg_p (operands[0]). (*lshrsi3_zero_extend_3+1, *lshrsi3_zero_extend_3+2): Add clobber and use as intermediate value. gcc/testsuite/ PR target/91635 * gcc.c-torture/execute/pr91635.c: New test. * gcc.target/riscv/shift-shift-4.c: New test. * gcc.target/riscv/shift-shift-5.c: New test. Added: trunk/gcc/testsuite/gcc.c-torture/execute/pr91635.c trunk/gcc/testsuite/gcc.target/riscv/shift-shift-4.c trunk/gcc/testsuite/gcc.target/riscv/shift-shift-5.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/riscv/riscv.md trunk/gcc/testsuite/ChangeLog
[Bug target/91635] wrong code at -O2 with __builtin_add_overflow() and shifts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91635 --- Comment #18 from Jim Wilson --- I was willing to write a patch, I just needed a day to catch up as I'm hopelessly overloaded with work. But since Jakub wrote one, it looks right to me, and I think we should use that. I'm not sure exactly what testing Kito did, and he has now left on vacation, so I will just do my own testing. I think the reason why RISC-V needs paradoxical_reg and other architectures don't is that most architectures have sense enough to define zero and sign extend instructions. RISC-V will finally get that when the draft B (bit manipulation) extension is formally approved.
[Bug target/91635] wrong code at -O2 with __builtin_add_overflow() and shifts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91635 --- Comment #19 from Jim Wilson --- I did cross toolchain builds and checks for rv32i/newlib and rv64gc/glibc. There were no regressions. Since the splitters exist to reduce code size, I looked at the libc and libstdc++ sizes with and without the patch. For the rv32i build, libstdc++ is the same size, and libc is 4 bytes smaller. I extracted a testcase. int isnanf (float x) { union u { float f; int i; } u; u.f = x; int i = u.i; i &= 0x7fff; return i > 0x7f80; } Without the patch with -O2 -S we get sllia0,a0,1 li a5,2139095040 srlia0,a0,1 sgt a0,a0,a5 ret with the patch we get li a5,-16777216 sllia0,a0,1 sgtua0,a0,a5 ret The issue here is that the extra temporary used for the splitter that generates the slli/srli gives combine a little more freedom, and it manages to find an optimization that was missed before. For the rv64gc build, libc is 38 bytes smaller and libstdc++ is 34 bytes larger. The libc issue appears to be the same as above. The libstdc++ issue is more interesting. I extracted a testcase. unsigned long sub (long l) { union u { struct s { int a : 19; unsigned int b : 13; int x; } s; long l; } u; u.l = l; return u.s.b; } Without the patch, with -O2 -S, we generate srliw a0,a0,19 ret and with the patch we get srlia0,a0,19 sllia0,a0,51 srlia0,a0,51 ret The issue here is that there is no REG_DEAD note for the temporary that we generated. So without the patch combine generates Successfully matched this instruction: (set (reg:DI 78) (zero_extract:DI (reg:DI 82) (const_int 13 [0xd]) (const_int 19 [0x13]))) and with the patch combine generates Failed to match this instruction: (parallel [ (set (reg:DI 78) (zero_extract:DI (reg:DI 82) (const_int 13 [0xd]) (const_int 19 [0x13]))) (set (reg:DI 83) (and:DI (ashift:DI (reg:DI 82) (const_int 32 [0x20])) (const_int -2251799813685248 [0xfff8]))) ]) where reg 83 is the temporary generated by the splitter which has no REG_DEAD note. Since the temporary appears to be generally useful, and lack of a REG_DEAD note causes problems, I think it would be better to modify the two define_split patterns to have a clobber instead of calling gen_reg_rtx. I do get the REG_DEAD note in this case, and the correct single instruction output for the second testcase. The three define_insn_and_split patterns can remain the same as they aren't causing any trouble. Though I suppose there is a question here of whether combine will ever accidentally substitute in a paradoxical reg for a clobber. I would hope that it doesn't. I can check for that. Since I have a new patch, and new testcases, I need to redo the builds and checks from scratch.
[Bug target/91635] wrong code at -O2 with __builtin_add_overflow() and shifts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91635 --- Comment #20 from Jim Wilson --- Created attachment 46830 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46830=edit proposed patch to fix paradoxical reg in splitter problem I get better code size with this alternative patch. I added two more testcases for the issues I found while evaluating this patch. There is no change to the rv32/newlib libc/libstdc++ library code sizes. The rv64/linux libc.so is now 94 bytes smaller. The libstdc++.so is 6 bytes larger but text is 10 bytes smaller and rodata is 16 bytes larger. I'm not sure what happened there, maybe alignment padding, but I can live with it. The text section looks like the same code, but slightly better register allocation, which allows a few more compressed instructions to be used. This has passed rv32/newlib and rv64/linux cross toolchain build and check. The new testcase from Jakub still works, along with my two new testcases. I tried adding instrumentation to the patch to abort if I ended up with a paradoxical reg in the splitters, and it never triggered, so it looks like combine is doing the right thing when allocating a reg for the clobber.
[Bug target/91683] ICE: SIGSEGV at -O when compiling for riscv64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91683 Jim Wilson changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2019-09-06 CC||wilson at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |wilson at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #1 from Jim Wilson --- Created attachment 46850 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46850=edit proposed fix This is failing in undo_all in combine, trying to write to a pointer into data that has been garbage collected. The problem starts with a riscv.md splitter which calls riscv_move_integer which checks can_create_pseudo_p and calls force_reg if allowed. This eventually calls gen_reg_rtx which calls ensure_regno_capacity which calls ggc_realloc, which invalidates the data in the undo buffer since all pseudo reg addresses change. I guess this is another reason why calling gen_reg_rtx in a splitter is a bad idea. I can fix it by adding another option to riscv_move_integer to indicate when we are calling it from inside a splitter, but maybe I will think of a better way to fix the problem later. The fact that this requires a ggc_realloc call to happen in a very precise place means that this testcase does not appear to be very useful. There is no guarantee it will trigger the problem if other parts of the compiler change.
[Bug inline-asm/85185] Wider-than-expected load for struct member used as operand of inline-asm with memory clobber at -Og
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85185 --- Comment #10 from Jim Wilson --- It is the same basic problem as before, but more serious. The original testcase uses a short variable, and can be fixed by adding a cast to int. But the new testcase does not use any short variables. There seems to be a disconnect between the C parser and the RTL expander here. the C parser is giving the asm operand a type of short unsigned int, which is the smallest integer type that the constant can fit in. The RTL expander is then generating a HImode reg for it, but loading a constant into an HImode reg will implicitly sign extend it, which is wrong for a short unsigned int asm operand. Before the expand_expr call... Breakpoint 6, expand_asm_stmt (stmt=stmt@entry=0x76274108) at ../../gcc-git/gcc/cfgexpand.c:3247 (gdb) print val $11 = (gdb) pt warning: Expression is not an assignment (and might have no effect) constant 65532> (gdb) after the expand_expr call... gdb) print op $12 = (rtx) 0x76163450 (gdb) pr warning: Expression is not an assignment (and might have no effect) (const_int -4 [0xfffc]) (gdb) then it does op = force_reg (TYPE_MODE (type), op); and we end up with gdb) print op $13 = (rtx) 0x76272828 (gdb) pr warning: Expression is not an assignment (and might have no effect) (reg:HI 72) Maybe what we need here is some equivalent to PROMOTE_MODES to force asm operands to word mode on targets where this makes more sense. But this needs to happen before the constant gets sign extended, which means it has to happen before or inside the expand_expr call.
[Bug target/91720] [10 Regression] wrong code with -Og -fno-forward-propagate -frerun-cse-after-loop -fno-tree-fre
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91720 Jim Wilson changed: What|Removed |Added CC||wilson at gcc dot gnu.org --- Comment #8 from Jim Wilson --- Created attachment 46867 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46867=edit simplest fix: don't define WORD_REGISTER_OPERATIONS I've had problems with combine, LOAD_EXTEND_OP, and WORD_REGISTER_OPERATIONS before. I think combine is making unsafe assumptions in this case, but I'm getting tired of arguing about this. So the simple solution is to just stop defining WORD_REGISTER_OPERATIONS. It makes this testcase work. I get about a 0.1% code size increase when I try it with a 64-bit toolchain, looking at libc and libstdc++ library file sizes. If this affects dhrystone or coremark code size or performance, a lot of people will be unhappy, so I will have to double check that. For a 32-bit toolchain it is more like 0.01% which is not as bad.
[Bug target/91683] ICE: SIGSEGV at -O when compiling for riscv64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91683 --- Comment #16 from Jim Wilson --- (In reply to rguent...@suse.de from comment #15) > I still don't understand. The rtx are not relocated. The only thing is the > address of the slot of the regno to rtx map. I have a debug session in comment 6 that shows that this address is changing. But maybe what you are missing here is how combine changes registers modes. It uses a call like this SUBST_MODE (regno_reg_rtx[REGNO (i2dest)], split_mode); and this macro is defined as #define SUBST_MODE(INTO, NEWVAL) do_SUBST_MODE (&(INTO), (NEWVAL)) so we are saving an address in the regno_reg_rtx array. I don't know exactly why it works this way, and I don't care, because I don't think that fixing this is the right solution for the RISC-V port. I think the RISC-V port should stop calling gen_reg_rtx inside splitters. The patch where this code appeared suggests that it was a rare hard to reproduce bug that required this fix. https://gcc.gnu.org/ml/gcc-patches/2006-01/msg00477.html Maybe SUBST_MODE was just blindly copied from SUBST_INT, and it was missed that it might work without the & in this case. This may also depend on how exactly SUBST_MODE is used, but currently it is only used for regs, which are supposed to be unique, so dropping the & may work. This may not work if SUBST_MODE gets used for otherwise, but currently it isn't.
[Bug c/91713] GP linker relaxation is not performed with -nostdlib
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91713 Jim Wilson changed: What|Removed |Added CC||wilson at gcc dot gnu.org --- Comment #1 from Jim Wilson --- This is an edge condition and an accident of circumstances. When you link with the default libraries, other stuff gets put in sdata before a, and so the variable a is in range of gp because it is at -0x7f0. When you link with -nostdlib, a is the only thing in sdata, and we run into an edge condition where a is -0x800 from gp, which is at the extreme limit, but the linker relaxation has to limit the range to deal with section alignment issues that may changes addresses after relaxation, and so we have to assume that a is out of range. If you change the example to int a = 1; int b = 2; int c = 3; int d = 4; int e = 5; int main() { return a + b + c + d + e; } then we see that all 5 variables use gp address with default libraries, and only the last 3 with -nostdlib, so we are losing the first two variables due to address range limitation at linker relaxation time. There is a somewhat related open binutils bug report https://sourceware.org/bugzilla/show_bug.cgi?id=24678 and another somewhat related binutils bug report I recently fixed https://sourceware.org/ml/binutils/2019-08/msg00244.html if gp was still computed inside the .sdata section we wouldn't have this problem, but that means undoing a change that reduced code size for most applications. Maybe there is a different way to solve the earlier problem that doesn't cause this problem. Anyways, this is a linker issue not a gcc issue.
[Bug target/91683] ICE: SIGSEGV at -O when compiling for riscv64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91683 --- Comment #5 from Jim Wilson --- (In reply to Richard Biener from comment #3) > Hmm, but shouldn't we instead fix combine to record not the reg rtx but the > regno for UNDO_MODE? Because nothing prevents this from happening elsewhere > (even in combine itself?). Combine doesn't call gen_reg_rtx, other than make_more_copies, and splitters aren't supposed to either. As Segher mentioned, if a define_split needs a temporary it is supposed to add a clobber of a match_operand. I tried an experiment, I set a global var at the start of combine_instructions, cleared it at the end, added an abort to gen_reg_rtx if the var is set, and did an x86 bootstrap. It completed successfully. A make check is running, but I don't expect to hit the abort there either. A RISC-V toolchain with my proposed fix added still fails as riscv_split_symbol called from gen_split_13 has the same problem and needs a similar fix.
[Bug target/91683] ICE: SIGSEGV at -O when compiling for riscv64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91683 --- Comment #6 from Jim Wilson --- (In reply to Richard Biener from comment #4) > Btw, the rtx of pseudos doesn't change, what changes is the address of the > entry in regno_reg_rtx[] which is the pseudo-nr -> rtx map. But I don't see > that involved... do_SUBST_MODE saves the address of the rtx being modified, in this case the address of a pseudo reg. Breakpoint 5, do_SUBST_MODE (into=0x76280458, newval=E_DImode) at ../../gcc-svn/gcc/combine.c:809 (gdb) print into $5 = (rtx *) 0x76280458 (gdb) print *into $6 = (rtx) 0x76287630 (gdb) pr warning: Expression is not an assignment (and might have no effect) (reg:SI 139) (gdb) print regno_reg_rtx[139] $7 = (rtx) 0x76287630 (gdb) print _reg_rtx[139] $8 = (rtx *) 0x76280458 (gdb) So into is an address into regno_reg_rtx. When we realloc regno_reg_rtx this changes. Breakpoint 6, emit_status::ensure_regno_capacity (this=0x1a881f0 ) at ../../gcc-svn/gcc/emit-rtl.c:1215 (gdb) next 6 (gdb) print regno_reg_rtx[139] $13 = (rtx) 0x76287630 (gdb) print _reg_rtx[139] $14 = (rtx *) 0x7627b458 (gdb) print $5 $15 = (rtx *) 0x76280458 (gdb) print *$5 $16 = (rtx) 0xa5a5a5a5a5a5a5a5 (gdb)
[Bug target/91683] ICE: SIGSEGV at -O when compiling for riscv64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91683 --- Comment #12 from Jim Wilson --- (In reply to Segher Boessenkool from comment #11) > Oh lol, I finally wake up. It is called from the splitter. That isn't > really a valid thing to do. Right. I want to eliminate that. > We have some limited support for that since a while, but it seems this > cannot ever really work? Maybe with changes to the combine undo buffer, but there is still the lack of REG_DEAD notes that can cause problems, so I think I'm still better off avoiding calling gen_reg_rtx from splitters.
[Bug target/91683] ICE: SIGSEGV at -O when compiling for riscv64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91683 --- Comment #14 from Jim Wilson --- > 3) Do we want to prohibit calling gen_reg_rtx during combine? Why did > we want it, before? Prohibiting gen_reg_rtx calls here would have helped find the bugs I'm fixing now. I've got a hack to do this to find the remaining bugs I need to fix. I don't know why anyone would want it before. I'm not sure if this ever worked right.
[Bug target/91683] ICE: SIGSEGV at -O when compiling for riscv64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91683 --- Comment #7 from Jim Wilson --- > Would --param=ggc-min-expand=0 and/or --param=ggc-min-heapsize=0 help to > reproduce the issue? I don't think so. The problem occurs in ensure_regno_capacity, which isn't affected by frequency of garbage collection. It is only affected by the number of gen_reg_rtx calls. It is already the case that ggc_free overwrites freed memory with 0xa5 when ENABLE_GC_CHECKING is on, that is what exposed the problem.
[Bug target/91683] ICE: SIGSEGV at -O when compiling for riscv64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91683 --- Comment #8 from Jim Wilson --- (In reply to Richard Biener from comment #3) > Hmm, but shouldn't we instead fix combine to record not the reg rtx but the > regno for UNDO_MODE? Because nothing prevents this from happening elsewhere > (even in combine itself?). Changing do_SUBST_MODE to save the regno instead of the address of rhe reg rtx is a possible solution, but can work only if there is no other place in combine that might take the address of a reg rtx. I can't be sure of that unless I do more analysis, which I don't plan to do as I don't think this is the right solution. Also, I think I will get better code if I fix the splitters to use a clobber. As was mentioned in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91635#c19 , calling gen_reg_rtx means no REG_DEAD note which limits what combine can do, whereas using a clobber means getting an existing reg that likely already has a REG_DEAD note which helps combine.
[Bug target/91229] RISC-V ABI problem with zero-length bit-fields and float struct fields
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91229 Jim Wilson changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #4 from Jim Wilson --- Fixed on mainline.
[Bug target/91229] RISC-V ABI problem with zero-length bit-fields and float struct fields
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91229 --- Comment #3 from Jim Wilson --- Author: wilson Date: Thu Aug 8 19:04:56 2019 New Revision: 274215 URL: https://gcc.gnu.org/viewcvs?rev=274215=gcc=rev Log: RISC-V: Fix C ABI for flattened struct with 0-length bitfield. gcc/ PR target/91229 * config/riscv/riscv.c (riscv_flatten_aggregate_field): New arg ignore_zero_width_bit_field_p. Skip zero size bitfields when true. Pass into recursive call. (riscv_flatten_aggregate_argument): New arg. Pass to riscv_flatten_aggregate_field. (riscv_pass_aggregate_in_fpr_pair_p): New local warned. Call riscv_flatten_aggregate_argument twice, with false and true as last arg. Process result twice. Compare results and warn if different. (riscv_pass_aggregate_in_fpr_and_gpr_p): Likewise. gcc/testsuite/ * gcc.target/riscv/flattened-struct-abi-1.c: New test. * gcc.target/riscv/flattened-struct-abi-2.c: New test. Added: trunk/gcc/testsuite/gcc.target/riscv/flattened-struct-abi-1.c trunk/gcc/testsuite/gcc.target/riscv/flattened-struct-abi-2.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/riscv/riscv.c trunk/gcc/testsuite/ChangeLog
[Bug target/91229] RISC-V ABI problem with zero-length bit-fields and float struct fields
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91229 Jim Wilson changed: What|Removed |Added Target||riscv*-*-* Status|UNCONFIRMED |NEW Last reconfirmed||2019-07-22 Assignee|unassigned at gcc dot gnu.org |wilson at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #1 from Jim Wilson --- There is a psABI discussion about the problem at https://github.com/riscv/riscv-elf-psabi-doc/issues/99
[Bug target/91229] RISC-V ABI problem with zero-length bit-fields and float struct fields
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91229 --- Comment #2 from Jim Wilson --- Created attachment 46617 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46617=edit proposed patch to change ABI and warn for affected structs
[Bug target/91229] New: RISC-V ABI problem with zero-length bit-fields and float struct fields
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91229 Bug ID: 91229 Summary: RISC-V ABI problem with zero-length bit-fields and float struct fields Product: gcc Version: 9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: wilson at gcc dot gnu.org Target Milestone: --- This was noticed by clang development, comparing clang against gcc to verify ABI compliance. There is a problem with the GCC implementation of the ABI, where we are accidentally emitting different code for the same struct when compiled by the C and C++ compilers. There are two cases affected by this. Here is the first example: struct s1 { int : 0; float f; int i; int : 0; }; void dummy(float, int); void f(struct s1 s) { dummy(s.f + 1.0, s.i + 1); } where we have a struct that can be passed in one FP reg and one integer GP, and here is the second example: struct s1 { int : 0; float f; float g; int : 0; }; void dummy(float, float); void f(struct s1 s) { dummy(s.f + 1.0, s.g + 2.0); } where we have a struct that can be passed in two FP regs. In both cases, the C++ compiler passes the float struct fields in FP registers, and the C compiler passes them in integer registers. The general case here is any struct with one or more zero-length bitfields, exactly two non-zero length fields, one of which must have an FP type that can fit in an FP register, and the other can be an FP type that fits in an FP register or an integer type that fits in an integer register or a integer bit-field that is the exact same size as an integer type that can fit in an integer register. Also, the target must have FP register support. The fundamental problem is that the RISC-V backend is not checking for zero-length bit-fields when deciding if a struct field can be passed in a FP register or not. Meanwhile, the C++ front end is stripping zero-length bit-fields after struct layout. So when compiling as C++ we decide that the FP struct fields can be passed in FP regs. But when compiling as C we decide that there are too many struct fields and they all get passed in integer registers. Since having the C and C++ front ends using different ABIs is undesirable, we need an ABI change. Fixing the C++ case would require inconvenient changes to the C++ front end. So fixing the C case with a RISC-V backend patch looks like the best practical solution. The affected structures are a bit obscure and not very useful, so it is hoped no real code will be affected. I've done an open-embedded world build with an instrumented compiler, and I didn't see any case that triggered my code. Not everything built though, since some stuff still doesn't have RISC-V support yet. I did have over 30,000 tasks run, so quite a bit of stuff did build.
[Bug tree-optimization/91191] New: vrp and boolean arguments
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91191 Bug ID: 91191 Summary: vrp and boolean arguments Product: gcc Version: 9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: wilson at gcc dot gnu.org Target Milestone: --- It appears that vrp isn't propagating the ranges of incoming boolean arguments. Given this example: unsigned char reg(_Bool b) { union U { unsigned char f0; _Bool f1; }; union U u; u.f1 = b; if (u.f0 > 1) { // This cannot happen // if b is only allowed // to be 0 or 1: return 42; } return 13; } clang optimizes this to unconditionally return 13, but gcc does a compare and conditionally returns either 42 or 13 depending on the result of the compare. This happens with both x86_64 and RISC-V. Looking at the vrp dumps, I see b_3(D): VARYING
[Bug target/91635] wrong code at -O2 with __builtin_add_overflow() and shifts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91635 Jim Wilson changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED Assignee|unassigned at gcc dot gnu.org |wilson at gcc dot gnu.org --- Comment #27 from Jim Wilson --- Fixed on mainline and gcc-9 branch.
[Bug target/91683] ICE: SIGSEGV at -O when compiling for riscv64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91683 --- Comment #20 from Jim Wilson --- Author: wilson Date: Thu Sep 19 01:19:25 2019 New Revision: 275925 URL: https://gcc.gnu.org/viewcvs?rev=275925=gcc=rev Log: RISC-V: Fix more splitters accidentally calling gen_reg_rtx. PR target/91683 * config/riscv/riscv-protos.h (riscv_split_symbol): New bool parameter. (riscv_move_integer): Likewise. * config/riscv/riscv.c (riscv_split_integer): Pass FALSE for new riscv_move_integer arg. (riscv_legitimize_move): Likewise. (riscv_force_temporary): New parameter in_splitter. Don't call force_reg if true. (riscv_unspec_offset_high): Pass FALSE for new riscv_force_temporary arg. (riscv_add_offset): Likewise. (riscv_split_symbol): New parameter in_splitter. Pass to riscv_force_temporary. (riscv_legitimize_address): Pass FALSE for new riscv_split_symbol arg. (riscv_move_integer): New parameter in_splitter. New local can_create_psuedo. Don't call riscv_split_integer or force_reg when in_splitter TRUE. (riscv_legitimize_const_move): Pass FALSE for new riscv_move_integer, riscv_split_symbol, and riscv_force_temporary args. * config/riscv/riscv.md (low+1): Pass TRUE for new riscv_move_integer arg. (low+2): Pass TRUE for new riscv_split_symbol arg. Modified: trunk/gcc/ChangeLog trunk/gcc/config/riscv/riscv-protos.h trunk/gcc/config/riscv/riscv.c trunk/gcc/config/riscv/riscv.md
[Bug rtl-optimization/91860] [10 Regression] ICE: in decompose, at rtl.h:2279 with -Og -fipa-cp -g --param=max-combine-insns=3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91860 --- Comment #5 from Jim Wilson --- The new testcase is essentially the same problem, but it is i1src this time not i2src, so just copying i2src earlier doesn't solve the problem, we also need a fix for i1src, or a fix elsewhere.
[Bug rtl-optimization/91860] [10 Regression] ICE: in decompose, at rtl.h:2279 with -Og -fipa-cp -g --param=max-combine-insns=3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91860 Jim Wilson changed: What|Removed |Added CC||wilson at gcc dot gnu.org --- Comment #1 from Jim Wilson --- Combine is doing a 3 instruction split into 2 which appears ok Trying 12, 14 -> 18: 12: r84:SI=0xa000 14: r76:DI=sign_extend(r84:SI+0x466) REG_DEAD r84:SI REG_EQUAL 0xa466 18: r87:DI=~r76:DI REG_DEAD r76:DI Failed to match this instruction: (set (reg:DI 87) (const_int -42087 [0x5b99])) Splitting with gen_split_11 (riscv.md:1280) Successfully matched this instruction: (set (reg:DI 76 [ _7 ]) (const_int -40960 [0x6000])) Successfully matched this instruction: (set (reg:DI 87) (plus:DI (reg:DI 76 [ _7 ]) (const_int -1127 [0xfb99]))) allowing combination of insns 12, 14 and 18 original costs 4 + 8 + 4 = 16 replacement costs 4 + 4 = 8 however a side effect of this is that when i1 is substituted into newpat, i2src becomes invalid RTL gdb) print i2src $74 = (rtx) 0x76262b90 (gdb) pr warning: Expression is not an assignment (and might have no effect) (sign_extend:DI (const_int 42086 [0xa466])) This wouldn't be a problem normally, except that i2src is later used to generate a debug instruction via propagate_for_debug. This calls simplify_replace_fn_rtx which crashes on the invalid rtl. Looks like we need to copy i2src sooner than we currently do.
[Bug rtl-optimization/91860] [10 Regression] ICE: in decompose, at rtl.h:2279 with -Og -fipa-cp -g --param=max-combine-insns=3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91860 --- Comment #2 from Jim Wilson --- Created attachment 46919 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46919=edit untested fix to copy i2src earlier works for testcase but otherwise untested
[Bug middle-end/92263] [10 Regression] ICE in commit_one_edge_insertion, at cfgrtl.c:2087 since r270758
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92263 Jim Wilson changed: What|Removed |Added Status|NEW |ASSIGNED --- Comment #9 from Jim Wilson --- Fixed on mainline. Do you need a backport to the gcc-9 branch?
[Bug middle-end/92263] [10 Regression] ICE in commit_one_edge_insertion, at cfgrtl.c:2087 since r270758
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92263 --- Comment #8 from Jim Wilson --- Author: wilson Date: Tue Nov 5 22:34:40 2019 New Revision: 277861 URL: https://gcc.gnu.org/viewcvs?rev=277861=gcc=rev Log: Allow libcalls for complex memcpy when optimizing for size. The RISC-V backend wants to use a libcall when optimizing for size if more than 6 instructions are needed. Emit_move_complex asks for no libcalls. This case requires 8 insns for rv64 and 16 insns for rv32, so we get fallback code that emits a loop. Commit_one_edge_insertion doesn't allow code inserted for a phi node on an edge to end with a branch, and so this triggers an assertion. This problem goes away if we allow libcalls when optimizing for size, which gives the code the RISC-V backend wants, and avoids triggering the assert. gcc/ PR middle-end/92263 * expr.c (emit_move_complex): Only use BLOCK_OP_NO_LIBCALL when optimize_insn_for_speed_p is true. gcc/testsuite/ PR middle-end/92263 * gcc.dg/pr92263.c: New. Added: trunk/gcc/testsuite/gcc.dg/pr92263.c Modified: trunk/gcc/ChangeLog trunk/gcc/expr.c trunk/gcc/testsuite/ChangeLog
[Bug middle-end/92263] [10 Regression] ICE in commit_one_edge_insertion, at cfgrtl.c:2087 since r270758
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92263 --- Comment #6 from Jim Wilson --- Looking at some other targets. ARM has movcc but not 128-bit long double. Aaarch has movcc and 128-bit long double, but has 128-bit load/store so this is only 4 instructions. mips64, powerpc64, and sparc64 have movcc and 128-bit long double, but emit the memcpy inline as 8 instructions. riscv64 meanwhile wants the libcall with -Os as that is 4 instructions instead of 8. For rv32 this would be 16 instructions. I'm not sure offhand if the other targets support 32-bit code and 128-bit long double. Anyways, I tracked the use of BLOCK_OP_NO_LIBCALL in emit_move_complex back to bugzilla 15289, fixed by a patch from Richard Henderson back in Dec 1 2004. I think it is just an oversight that -Os wasn't considered here. I think the correct fix is to only force BLOCK_OP_NO_LIBCALL when optimizing for speed. With this change, I get the 8 instruction sequence with -O2, and the 4 instruction libcall sequence with -Os, which is what the RISC-V backend wants, and this lets the testcase work.
[Bug middle-end/92263] [10 Regression] ICE in commit_one_edge_insertion, at cfgrtl.c:2087 since r270758
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92263 Jim Wilson changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |wilson at gcc dot gnu.org --- Comment #7 from Jim Wilson --- Created attachment 47139 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47139=edit untested proposed fix
[Bug middle-end/92263] [10 Regression] ICE in commit_one_edge_insertion, at cfgrtl.c:2087 since r270758
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92263 Jim Wilson changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #10 from Jim Wilson --- Fixed on mainline. No backport requested.
[Bug middle-end/92263] [10 Regression] ICE in commit_one_edge_insertion, at cfgrtl.c:2087 since r270758
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92263 --- Comment #3 from Jim Wilson --- I did a cross compiler build and check yesterday using up-to-date sources and did not see this failure. I've been testing regularly. I did my build on an x86_64 Ubuntu 16.04 machine with gcc-5.4 as the system compiler. Maybe this depends on the compiler used for the build? Or the exact configure command?
[Bug middle-end/92263] [10 Regression] ICE in commit_one_edge_insertion, at cfgrtl.c:2087 since r270758
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92263 --- Comment #4 from Jim Wilson --- OK, I get it now. You are using non-standard optimization options with a testsuite testcase. I can reproduce when I use your compiler options. I will take a look.
[Bug middle-end/92263] [10 Regression] ICE in commit_one_edge_insertion, at cfgrtl.c:2087 since r270758
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92263 --- Comment #5 from Jim Wilson --- The patch adds a RISC-V movcc pattern. This causes toplev.c to enable flag_tree_cselim. This optimization pass creates a complex long double conditional move via a phi node. complex long double cstore_31; ... [local count: 27903866]: cstore_30 = MEM [(void *)_8]; [local count: 55807731]: # cstore_31 = PHI <__complex__ (0.0, 0.0)(4), cstore_30(5)> MEM [(void *)_8] = cstore_31; When we try to convert gimple to rtl, eliminate_phi calls insert_value_copy_on_edge for the 32-byte long double 0 value. The constant then gets forced to memory, and we end up calling emit_block_move with BLOCK_OP_NO_LIBCALL, which ends up emitting a loop to do the memory to memory copy. Then later in commit_one_edge_insertion we split the edge, insert the code containing the loop, and then trigger an abort because the last instruction inserted is the loop back branch. I don't see where the RISC-V port did anything wrong. The load hoisting code is checking the movcc optab to see if the target supports the operation, but I don't see anything obvious like that in the cselim pass. The only obvious fix I see in the RISC-V back end is to modify riscv_expand_block_move to emit inline non-loop code for a 32-byte memory to memory copy, even when optimizing for size, which I'd rather not do. Maybe it can be fixed in commit_one_edge_insertion by allowing conditional branches but not unconditional branches, but it isn't clear why this is refusing to allow branches here in the first place. I will have to look at other targets to see why they aren't failing.
[Bug bootstrap/92709] Cross Compilation failed for Latest GCC riscv64-linux-gnu on Linux/WSL2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92709 Jim Wilson changed: What|Removed |Added CC||wilson at gcc dot gnu.org --- Comment #3 from Jim Wilson --- >make[4]: Entering directory >'/home/cqwrteur/gcc-riscv64-build/riscv64-linux-gnu/lib32/ilp32/libgcc' >make[4]: *** No rule to make target 'all'. Stop. Look in the riscv64-linux-gnu/lib32/ilp32/libgcc dir. There should be a Makefile that is a copy of the $srcdir/libgcc/Makefile.in file with some sed substition. If this isn't true, then it is the configure step that failed. You can force the library dirs to reconfigure by deleting them. It gets a bit more complicated with multilibs, I think you have to delete the top level riscv64-linux-gnu/libgcc to force a reconfigure, but just a rm -rf riscv64-linux-gnu works too and is simpler, though more stuff will be rebuilt. You might want to do a -j1 make to get an easier to read build log. I don't have a Windows machine at work, but there is only one WSL problem that I have seen reported, and it is that WSL makes filesystems case-insensitive by default which is contrary to linux practice. It is known that this will break glibc builds which uses .os and .oS for two different kinds of files. I don't think that this breaks gcc builds. But since you are trying to do a cross to riscv64-linux-gnu you will run into this problem if you haven't already.
[Bug target/91683] ICE: SIGSEGV at -O when compiling for riscv64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91683 Jim Wilson changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #22 from Jim Wilson --- Fixed on mainline and gcc-9 branch.
[Bug middle-end/91691] Cross compiling glibc produces a false maybe-uninitialized error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91691 Jim Wilson changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2019-10-10 CC||wilson at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #3 from Jim Wilson --- I can reproduce using gcc-8 and gcc-9 but not mainline. I'm using gcc options: -std=gnu11 -fgnu89-inline -mcmodel=medlow -g -O2 -Wall -Wwrite-strings -Wundef -fmerge-all-constants -frounding-math -fno-stack-protector -Wstrict-prototypes - Wold-style-definition -fmath-errno -fPIC -ftls-model=initial-exec \ tmp.i -fdump-tree-all -fdump-ipa-all Bisecting on mainline shows that the new IPA SRA pass added Sept 20 fixed this. Without that patch, I see that total_deadline is being decomposed, but apparently there is something wrong with the tracking info. If I add -fno-ipa-sra the warning goes away. The new IPA-SRA is too big of a patch to backport, so it looks like this will have to remain unfixed on the gcc-8 and gcc-9 branches.
[Bug middle-end/91691] Cross compiling glibc produces a false maybe-uninitialized error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91691 Jim Wilson changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #5 from Jim Wilson --- Fixed on mainline. Not backporting to gcc-8 or gcc-9 branches.
[Bug middle-end/24639] [meta-bug] bug to track all Wuninitialized issues
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=24639 Bug 24639 depends on bug 91691, which changed state. Bug 91691 Summary: Cross compiling glibc produces a false maybe-uninitialized error https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91691 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED
[Bug rtl-optimization/91860] [10 Regression] ICE: in decompose, at rtl.h:2279 with -Og -fipa-cp -g --param=max-combine-insns=3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91860 Jim Wilson changed: What|Removed |Added Attachment #46919|0 |1 is obsolete|| --- Comment #9 from Jim Wilson --- Created attachment 47008 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47008=edit second untested fix
[Bug rtl-optimization/91860] [10 Regression] ICE: in decompose, at rtl.h:2279 with -Og -fipa-cp -g --param=max-combine-insns=3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91860 Jim Wilson changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2019-10-08 Assignee|unassigned at gcc dot gnu.org |wilson at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #8 from Jim Wilson --- The third and fourth testcases are i2src problems, just like the first testcase. Since fixing this in combine looks complicated, I took a look at the RTL simplifier. Using the fourth testcase as my example, the problem here is that we are trying to substitute (sign_extend:DI (const_int 8160 [0x1fe0])) as the value for (reg:DI 78 [ _9 ]) in the debug insn (debug_insn 29 28 30 2 (var_location:QI d (subreg:QI (reg:DI 78 [ _9 ]) 0)) "tm\ p4.c":11:5 -1 (nil)) The place where this starts to go wrong is in simplify_truncation, where it tries to compare the size of the original mode (VOIDmode) with the subreg mode (QI) and decides that we need to sign extend the constant to convert it from VOIDmode to QImode. We actually need a truncation not a extension here. This then goes even further wrong in simplify_const_unary_operation which does scalar_int_mode imode = (op_mode == VOIDmode ? result_mode : as_a (op_mode)); which treats the CONST_INT as a QImode value, but it isn't a valid QImode CONST_INT valid due to lack of QImode sign-extension, so the following wide_int::from call triggers an consistency check and we abort. I see a number of places in simplify-rtx where it assumes that if an input is VOIDmode then it should use MAX_MODE_INT instead as a CONST_INT should always be already extended in canonical form. This seems to be the correct fix, which I want to put in simplify_truncation so that we aren't passing VOIDmode to GET_MODE_UNIT_PRECISION. I verified that this works for all four testcases.
[Bug rtl-optimization/91860] [10 Regression] ICE: in decompose, at rtl.h:2279 with -Og -fipa-cp -g --param=max-combine-insns=3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91860 --- Comment #10 from Jim Wilson --- Author: wilson Date: Fri Oct 11 18:41:35 2019 New Revision: 276901 URL: https://gcc.gnu.org/viewcvs?rev=276901=gcc=rev Log: Extend subst to simplify CONST_INT inside SIGN_EXTEND. This addresses PR 91860 which has four testcases triggering internal errors. The problem here is that in combine when handling debug insns, we are trying to substitute (sign_extend:DI (const_int 8160 [0x1fe0])) as the value for (reg:DI 78 [ _9 ]) in the debug insn (debug_insn 29 28 30 2 (var_location:QI d (subreg:QI (reg:DI 78 [ _9 ]) 0)) "tmp4.c":11:5 -1 (nil)) This eventually triggers an abort because 8160 is not a sign-extended QImode value. In subst there is already code check for a CONST_INT inside a ZERO_EXTEND and simplify it. This needs to be extended to also handle a SIGN_EXTEND the same way. gcc/ PR rtl-optimization/91860 * combine.c (subst): If new_rtx is a constant, also check for SIGN_EXTEND when deciding whether to call simplify_unary_operation. gcc/testsuite/ PR rtl-optimization/91860 * gcc.dg/pr91860-1.c: New testcase. * gcc.dg/pr91860-2.c: New testcase. * gcc.dg/pr91860-3.c: New testcase. * gcc.dg/pr91860-4.c: New testcase. Added: trunk/gcc/testsuite/gcc.dg/pr91860-1.c trunk/gcc/testsuite/gcc.dg/pr91860-2.c trunk/gcc/testsuite/gcc.dg/pr91860-3.c trunk/gcc/testsuite/gcc.dg/pr91860-4.c Modified: trunk/gcc/ChangeLog trunk/gcc/combine.c trunk/gcc/testsuite/ChangeLog
[Bug tree-optimization/90883] Generated code is worse if returned struct is unnamed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90883 --- Comment #32 from Jim Wilson --- The proposed patch looks OK to me. I suggest you submit it to gcc-patches.
[Bug rtl-optimization/92656] The zero_extend insn can't be eliminated in the combine pass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92656 --- Comment #5 from Jim Wilson --- A rewrite using dataflow would be better of course. I'm just trying to understand the problem with this testcase better, and maybe find a simple solution, but I don't think that there is one. The workarounds I see just make the code more complicated and add more risk of something else going wrong.
[Bug rtl-optimization/92656] The zero_extend insn can't be eliminated in the combine pass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92656 --- Comment #3 from Jim Wilson --- Looking at this, I see that the problem occurs in record_value_for_reg where it does if (!insn || (value && rsp->last_set_table_tick >= label_tick_ebb_start)) rsp->last_set_invalid = 1; last_set_table_tick is 2 and label_tick_ebb_start is 1 because this is the first block of the function. This actually causes a lot of variables set in the first block to be marked invalid if used in a successful combination two or more times, which then prevents the nonzero bits info from being used for any of them. There seems to be a problem with how label_tick is used. In the very first block in the body of the function, label_tick is 2 and label_tick_ebb_start is 1. This is because it is considered to be the second block in the ebb after the entry block. In the second block in the body of the function, label_tick is 3 and label_tick_ebb_start is 3. This means that every variable set in the first block gets treated differently than in every block after the first. If I add a little bit of code before the loop to force it to be the second block, then I get correct output from combine. I just added this before the loop static int j = 0; if (val) j++; This also explains why the problem only occurs with -mtune=sifive-7-series because this enables the conditional move support that turns the loop into a single block, and then the -funroll-loops option fully unrolls the loop, turning the entire function into one block, which prevents combine from handling many of the register sets correctly because everything is in the first block now. This also explains why the problem started when the 2->2 combination support was added, as that causes more successful combinations, and hence more registers getting invalidated in the first block. So the question is why we need label_tick > label_tick_ebb_start for the first block of the function. There is nothing set in the entry block other than hard registers, and those could always be handled specially by just marking them as invalid somehow before processing instructions. Or alternatively, in record_value_for_reg, maybe we can add a check for a pseudo reg only set once and not live in the prologue, and avoid marking it as invalid when we process it a second time. There are already a lot of checks like this scattered around the code.
[Bug target/89627] Miscompiled constructor call
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89627 Jim Wilson changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED Assignee|unassigned at gcc dot gnu.org |wilson at gcc dot gnu.org --- Comment #5 from Jim Wilson --- Was fixed in gcc 9 last year.
[Bug target/91602] GCC fails to build for riscv in a combined tree due to misconfigured leb128 support
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91602 --- Comment #10 from Jim Wilson --- The proposed binutils patch has multiple problems and has gone through multiple iterations. Not clear when or if we will be able to accept it. The gcc configure patch to eliminate the call to gcc_GAS_VERSION_GTE_IFELSE for in tree gas builds actually looked like the better solution to me though I haven't tried it yet. If we go this way the patch should perhaps eliminate everything related to gcc_GAS_VERSION_GTE_IFELSE which is a much bigger patch. Since combined tree builds are obsolete, this is low on my priority list.
[Bug target/91602] GCC fails to build for riscv in a combined tree due to misconfigured leb128 support
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91602 --- Comment #11 from Jim Wilson --- Since Marxin pinged this and got me thinking about this again, I realized that there is a simpler fix based on Serge's second suggestion. We can just delete the gas version number from the uleb128 gas check in configure.ac. This will force a gas feature check for uleb128 only, which solves the RISC-V build problem, and is a nice small change. I'm testing a patch for that.
[Bug target/93532] RISCV g++ hangs with optimization >= -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93532 --- Comment #11 from Jim Wilson --- I'm able to reproduce with the gcc-8-branch now. Maybe I made a mistake with my earlier build. Anyways, it looks like it is going wrong here in the reload dump Creating newreg=1856, assigning class NO_REGS to save r1856 434: fa0:SF=call [`sqrtf'] argc:0 REG_UNUSED fa0:SF REG_CALL_DECL `sqrtf' REG_EH_REGION 0 Add reg<-save after: 2446: r114:SF#0=r1856:DF 432: NOTE_INSN_BASIC_BLOCK 24 Add save<-reg after: 2445: r1856:DF=r114:SF#0 then later we appear to end up in a loop generating secondary reloads that need secondary reloads themselves, and so forth. The instruction above looks funny, trying to use a subreg to convert DFmode to SFmode. I don't think we should be generating that. So it looks like a caller save problem. If I add -fno-caller-saves the compile finishes. It appears that we need a definition for HARD_REGNO_CALLER_SAVE_MODE because the default definition can't work here. The comment in sparc.h for HARD_REGNO_CALLER_SAVE_MODE looks relevant. The same definition may work for RISC-V. Looks like the MIPS port does it the same way too.
[Bug target/93532] RISCV g++ hangs with optimization >= -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93532 --- Comment #12 from Jim Wilson --- A bisection on mainline between the gcc-8 and gcc-9 releases shows that this testcase was fixed by a combine patch for PR87600 that stops combining hard regs with pseudos to reduce register pressure. The commentary refers to ira and lra problems. A combine patch won't be as safe as a RISC-V backend patch though. I tried testing the riscv HARD_REGNO_CALLER_SAVE_MODE patch with buildroot but it turns out that it is downloading a pre-built compiler instead of building one. So dropping in the patch doesn't do anything. I will have to figure out what is going on there. Trying the riscv patch with mainline on the testcase, I see that I get better rematerialization without the confusing subregs, and I also get smaller stack frames since we are saving SFmode now to the stack instead of DFmode now. Otherwise, I don't see any significant changes to the code. I tried a make check with the riscv patch on mainline, and got an unexpected g++ testsuite failure, so I will have to look into that.
[Bug target/93532] RISCV g++ hangs with optimization >= -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93532 Jim Wilson changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |wilson at gcc dot gnu.org --- Comment #13 from Jim Wilson --- Created attachment 47794 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47794=edit untested patch to fix the problem
[Bug target/93532] RISCV g++ hangs with optimization >= -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93532 Jim Wilson changed: What|Removed |Added CC||wilson at gcc dot gnu.org --- Comment #9 from Jim Wilson --- I tried the buildroot instructions. It didn't work on an ubuntu 16.04 server machine. There is a 'python3 pip3 -q docwriter' command that hangs. I also discovered that the script isn't restartable. It runs -rf on the build directory and exits with an error. I did get it to work on my ubuntu 18.04 laptop. And it does hang, but it isn't the btPolyhedralContactClipping.cpp file that hangs for me, it is the btBoxBoxDetector.cpp file. I was able to reproduce this with a gcc-8.3.0 build using -O2 -fPIC -fstack-protector-strong options to compile the file. It does not reproduce using the top of the gcc-8-branch svn tree, suggesting that either it is already fixed, or it is maybe a memory corruption problem that is hard to reproduce. Using gdb to attach to the gcc-8.3.0 compiler, I see that it is looping in lra, but I haven't tried to debug that yet. #0 0x00705e7b in bitmap_find_bit (bit=42321, bit@entry=330, head=0x376ae88) at ../../gcc-8.3.0/gcc/bitmap.c:539 #1 bitmap_set_bit (head=0x376ae88, bit=bit@entry=42321) at ../../gcc-8.3.0/gcc/bitmap.c:600 #2 0x0099b95f in mark_regno_dead (regno=42321, mode=, point=) at ../../gcc-8.3.0/gcc/lra-lives.c:362 #3 0x0099c9c4 in process_bb_lives (dead_insn_p=false, curr_point=@0x7ffc9a90: 181876, bb=) at ../../gcc-8.3.0/gcc/lra-lives.c:842 #4 lra_create_live_ranges_1 (all_p=all_p@entry=true, dead_insn_p=dead_insn_p@entry=false) at ../../gcc-8.3.0/gcc/lra-lives.c:1337 #5 0x0099e7c0 in lra_create_live_ranges (all_p=all_p@entry=true, dead_insn_p=dead_insn_p@entry=false) at ../../gcc-8.3.0/gcc/lra-lives.c:1406 #6 0x00982d0c in lra (f=) at ../../gcc-8.3.0/gcc/lra.c:2473 #7 0x0093fa32 in do_reload () at ../../gcc-8.3.0/gcc/ira.c:5465 #8 (anonymous namespace)::pass_reload::execute (this=) at ../../gcc-8.3.0/gcc/ira.c:5649 ...
[Bug target/93532] RISCV g++ hangs with optimization >= -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93532 --- Comment #10 from Jim Wilson --- Created attachment 47774 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47774=edit testcase that reproduces for me compile with -O2 -fPIC -fstack-protector-strong
[Bug tree-optimization/90883] Generated code is worse if returned struct is unnamed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90883 Jim Wilson changed: What|Removed |Added CC||wilson at gcc dot gnu.org --- Comment #29 from Jim Wilson --- The testcase works for riscv64-elf but does not work for riscv32-elf. The difference is in the einline pass before dse1. riscv64-elf has tmp.C:12:17: optimized: Inlining constexpr C::C()/1 into C slow()/3. where as riscv32-elf has tmp.C:12:17: missed: will not early inline: C slow()/3->constexpr C::C()/1, call is cold and code would grow by 1 Since the constructor was not early inlined, dse1 can't eliminate the redundant store. The constructor eventually gets inlined between 085i.materialize-all-clones and 088t.fixup_cfg3 which allows dse2 to eliminate the redundant store. I can make the testcase work for riscv32-elf if I add --param max-inline-insns-size=1 to allow the constructor to be inlined during the einline pass. I didn't check to see if this works for the other failing targets.
[Bug target/93532] RISCV g++ hangs with optimization >= -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93532 Jim Wilson changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #19 from Jim Wilson --- Patch applied to mainline. This is just a minor optimization for gcc-10 as a combiner patch between gcc-8 and gcc-9 reduces register pressure enough to prevent the hang. Hence there is no real need for the patch in gcc-9. The patch might be useful in gcc-8, but the problem is hard to reproduce, buildroot is the only one that ran into the problem, and they can always add the patch to their tree, so not clear if we really need it on the gcc-8 branch.
[Bug target/93532] RISCV g++ hangs with optimization >= -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93532 --- Comment #20 from Jim Wilson --- Thanks for confirming that it solves the buildroot build problem. My gcc mainline g++ test failure turned out to be a thread related issue with qemu cross testing. The testcase works always on hardware, but fails maybe 10-20% of the time when run under qemu. RISC-V qemu is known to still have a few bugs in this area, though they might already be fixed in newer qemu versions than what I have.
[Bug inline-asm/93202] [RISCV] ICE when using inline asm 'h' constraint modifier
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93202 --- Comment #5 from Jim Wilson --- Jakub's patch looks OK to me.
[Bug inline-asm/93202] [RISCV] ICE when using inline asm 'h' constraint modifier
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93202 --- Comment #7 from Jim Wilson --- (In reply to Luís Marques from comment #3) > Jim Wilson: I'm not using it, I was only working on the LLVM implementation. > Could you please clarify if following modifiers are also internal only? > > 'C' Print the integer branch condition for comparison OP. > 'A' Print the atomic operation suffix for memory model OP. > 'F' Print a FENCE if the memory model requires a release. 'C' maps an rtx to a string. It is intended to be used for comparisons to emit the appropriate compare instruction, because the instruction names match the gcc internal rtx names. It can't be used for its intended purpose in an asm, as you can't get a comparison operator as an operand to an asm. Since it works with any rtx, it can be used in an asm, but is very unlikely to be useful. 'A' takes a memory order value from stdatomic.h, and emits a .acq if it is one of the memory orders that requires an acquire operation, e.g. __ATOMIC_ACQUIRE. Gcc calls this a memory model internally, and defines the values in memmodel.h. The primary use is for the atomic builtin functions, to map the memory order argument to the right instruction. This takes an integer argument, so in theory it could be used in an asm, but unlikely to be very useful. 'F" is similar, except it is to atomic releases, and emits a fence instruction. This one is a bit of historical accident. The gcc riscv port was written before we had a formal memory model defined, and so to be conservative, it emits fences in a lot of places where we probably don't need them. Now that we do have a formal memory model defined, the gcc port needs to be fixed to implement it, except there is no one to do the work, so it is unclear when it will happen. Meanwhile, the port still emits a lot of fences we don't need via 'F'. This takes an integer argument as above, so likewise in theory could be used in an asm, but unlikely to be useful. And this one has the additional problem that it needs to change in a future gcc release, though we could preserve the current meaning of the 'F' letter and use a new letter if necessary in the rewrite. The useful print operand letters are the ones for registers, constants, and addresses. These random ones used for internal gcc features aren't really useful in asms.
[Bug target/93304] RISC-V: Function with interrupt attribute use register without save/restore at prologue/epilogue
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93304 Jim Wilson changed: What|Removed |Added CC||wilson at gcc dot gnu.org --- Comment #2 from Jim Wilson --- There is a convention of using all caps for function arguments. See for instance the riscv_build_integer function comment. It would be nice to preserve this convention, but this is a very minor issue. I usually put a blank line between the function comment and the function, but again this is a very minor issue. The patch looks OK to me.
[Bug target/93333] ICE: RTL check: expected code 'const_int', have 'and' in riscv_rtx_costs, at config/riscv/riscv.c:1645
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9 --- Comment #4 from Jim Wilson --- I tried some cross testing with rtl checking enabled, and found another rtl check bug with the -msave-restore support in config/riscv/riscv-sr.c where it uses XINT to read from a CONST_INT which is wrong, as it is actually an XWINT value, and we should be using INTVAL to read the value. I've tested a patch for that, and can commit it tomorrow. -msave-restore is for embedded code size, so this shouldn't be a problem for linux users.
[Bug target/93333] ICE: RTL check: expected code 'const_int', have 'and' in riscv_rtx_costs, at config/riscv/riscv.c:1645
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9 Jim Wilson changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2020-01-21 Ever confirmed|0 |1 --- Comment #2 from Jim Wilson --- I can reproduce. Reproducing requires enabling rtl checking which is not on by default. I suspect that there are other similar problems, as we probably haven't tested a build with rtl checking enabled before. The problem is in riscv_rtx_costs which only needs to return valid values for valid rtl, and it is failing the rtl check for invalid rtl, so this isn't a major problem if rtl checking is off, but it does need to be fixed to be safe.
[Bug target/93333] ICE: RTL check: expected code 'const_int', have 'and' in riscv_rtx_costs, at config/riscv/riscv.c:1645
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9 --- Comment #3 from Jim Wilson --- Jakub's patch looks OK, and works for the testcase.
[Bug target/93333] ICE: RTL check: expected code 'const_int', have 'and' in riscv_rtx_costs, at config/riscv/riscv.c:1645
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9 Jim Wilson changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #7 from Jim Wilson --- Fixed on mainline.
[Bug target/93304] RISC-V: Function with interrupt attribute use register without save/restore at prologue/epilogue
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93304 Jim Wilson changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #4 from Jim Wilson --- Fixed on mainline.
[Bug inline-asm/93202] [RISCV] ICE when using inline asm 'h' constraint modifier
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93202 --- Comment #2 from Jim Wilson --- %h is used for the gcc internal implementation of emitting auipc. I'm skeptical that it is useful for asms. Stripping the HIGH rtx is an internal implementation detail, and does not apply to asms, as you can't get a HIGH there. Is there a reason why you are trying to use it? There may be a better solution for what you need. If we really need %h to work in asms then it probably needs some inconvenient work. I'd rather document that %h shouldn't be used in asms, or leave it undocumented as an internal gcc implementation detail. I'm assuming that you are just working on llvm support, and don't actually need %h to work in asms, you just need llvm and gcc compatibility. riscv_print_operand does use output_operand_lossage as it should. But it calls a function riscv_print_operand_reloc which calls gcc_unreachable in a switch statement. That is an oversight. It can be fixed to use output_operand_lossage too.
[Bug target/93062] Failed to generate indirect branch for long branches on riscv
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93062 Jim Wilson changed: What|Removed |Added CC||wilson at gcc dot gnu.org --- Comment #1 from Jim Wilson --- We need to check length attributes in the branch patterns, and emit different sequences depending on the length. There are multiple examples to compare with, for instance the condjump pattern in the aarch64.md file.
[Bug testsuite/93045] gc bug with test "start_unit-test-1.c"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93045 Jim Wilson changed: What|Removed |Added CC||wilson at gcc dot gnu.org --- Comment #1 from Jim Wilson --- This worked for me running natively on a fedora rawhide system with a 4.15 linux kernel.
[Bug target/93062] Failed to generate indirect branch for long branches on riscv
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93062 Jim Wilson changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2019-12-25 Ever confirmed|0 |1
[Bug middle-end/94083] New: inefficient soft-float x!=Inf code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94083 Bug ID: 94083 Summary: inefficient soft-float x!=Inf code Product: gcc Version: 9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: wilson at gcc dot gnu.org Target Milestone: --- Given a testcase like this int foo(void) { volatile float f; intn; f = __builtin_huge_valf(); n += 1 - (f != __builtin_huge_valf()); return n; } and compiling for soft-float, we end up with a call to __unordsf2 followed by a call to __lesf2. This means the floats have to be unpacked twice and checked for nan twice. This gives both poor performance and poor code size. I've confirmed this for x86, arm, and riscv. Folding in the C front end is creating an unordered less then or equal comparison against FLT_MAX. From the 004.original file n = SAVE_EXPR + n; This optimization is coming from a rule in the match.pd file. /* x != +Inf is always equal to !(x > DBL_MAX), but this introduces an exception for x a NaN so use an unordered comparison. */ When we generate rtl, we call do_compare_rtx_and_jump which notices that we don't have an operation for UNLE_EXPR, but decides we can't reverse it because it is unsafe. It tries swapping arguments, but we don't have UNGE_EXPR either. So it emits two libcalls. Converting a NE compare to a UNLE compare looks like an odd optimization. If we want to consider unordered operations as canonical operations, then maybe we should add libgcc support for the unordered operations. Or maybe we should check to see if unordered operations are handled by the target before converting a simple NE into a UNLE. The match.pd rule was changed to use UNLE in the patch for PR 64811 which fixed a problem with handling NaNs. This happened 2018-01-09. The optimization dates back to 2003-05-22 but was originally using LE which is OK for soft-float. It wasn't until the NaN bug was fixed by using UNLE instead of LE that this became an optimization problem. Maybe we just shouldn't perform this optimization when honoring NaNs? That would avoid generating the problematic unordered operation early in the optimizer.
[Bug c++/94044] [10 Regression] internal compiler error: in comptypes, at cp/typeck.c:1490 on riscv64-unknown-linux-gnu and arm-eabi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94044 --- Comment #9 from Jim Wilson --- (In reply to Jakub Jelinek from comment #8) > So perhaps to ease reproduction, tweak the hash function in this case to > always return 0? Yes, that works. I just didn't have a chance to look at the hash function last night. With the hash function hacked I can reproduce for any target and any -std=c++X value. diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index 789ccdb..4337928 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -1733,7 +1733,8 @@ hash_tmpl_and_args (tree tmpl, tree args) hashval_t spec_hasher::hash (spec_entry *e) { - return hash_tmpl_and_args (e->tmpl, e->args); + return 0; + // return hash_tmpl_and_args (e->tmpl, e->args); } /* Recursively calculate a hash value for a template argument ARG, for use
[Bug target/94136] GCC doc for built-in function __builtin___clear_cache() not 100% correct
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94136 Jim Wilson changed: What|Removed |Added CC||wilson at gcc dot gnu.org --- Comment #3 from Jim Wilson --- Another possible issue is that __clear_cache is defined in the libgcc docs. But only some platforms are defining CLEAR_INSN_CACHE so only some targets have a usable __clear_cache function.
[Bug bootstrap/92008] Build failure on cygwin
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92008 Jim Wilson changed: What|Removed |Added CC||wilson at gcc dot gnu.org --- Comment #21 from Jim Wilson --- This looks the same as a binutils bug. https://sourceware.org/bugzilla/show_bug.cgi?id=22941 The easy solution is to touch intl/plural.c after checkout, so that bison won't be run. The contrib/gcc_update script already does this. So the simplest solution for the original problem is to use contrib/gcc_update to update a gcc tree, or "contrib/gcc_update --touch" if you want to fix a gcc tree without updating it. If your gcc git source tree was already mangled by a bad bison run, you will have to manually reset it to a clean tree, e.g. "git reset --hard" or "git diff > tmp.file; patch -p1 --reverse < tmp.file; rm tmp.file" or whatever, and then run the contrib/gcc_update --touch command. Binutils unfortunately does not have an equivalent to the gcc_update script and hence requires a fix. git unfortunately does not preserve file timestamps across commit and checkout, so when you checkout a file it gets the current time. git also tends to check out files in alphabetical order. If you are on a fast filesystem, i.e. linux, plural.c and plural.y almost always get the same timestamp and bison isn't run. If you are on a slow filesystem, i.e. cygwin, plural.c is often older than plural.y, and bison must be run, and the current bison version fails. This is why it is cygwin folk that most commonly run into this problem.
[Bug c++/94044] [10 Regression] internal compiler error: in comptypes, at cp/typeck.c:1490 on riscv64-unknown-linux-gnu and arm-eabi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94044 Jim Wilson changed: What|Removed |Added CC||wilson at gcc dot gnu.org --- Comment #7 from Jim Wilson --- I made an attempt to reproduce this. I wasn't able to reproduce with an arm-eabi build. I was able to reproduce with a riscv64-linux build. The funny part is that I was able to build two compilers from the same gcc sources, one which reproduces and one which does not, which differ only in exactly how I did the build. For the failing build, I had a complete riscv-gnu-toolchain build available when configuring. For the working build it was just binutils and gcc without glibc/linux header files, and a top-of-tree binutils version unlike the first build. Debugging the two side by side, I see that execution diverges at line 9680 in cp/pt.c entry = type_specializations->find_with_hash (, hash); The working compiler has no hash hit and returns zero. The failing compiler has a hash hit, and then dies inside spec_hasher::equal. In the spec_hasher::equal function I see (gdb) print *e1 $29 = {tmpl = , args = , spec = } (gdb) print *e2 $30 = {tmpl = , args = , spec = } (gdb) pt unit-size align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x75deb7e0 precision:64 min max pointer_to_this > readonly arg:0 elt:1 > type_0 type_6 VOID align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x75ef4540> tmp.C:13:23 start: tmp.C:13:23 finish: tmp.C:13:35>> (gdb) print e2->args $32 = (gdb) pt unit-size align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x75deb7e0 precision:64 min max pointer_to_this > readonly arg:0 elt:1 > type_0 type_6 VOID align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x75efa2a0> tmp.C:13:23 start: tmp.C:13:23 finish: tmp.C:13:35>> (gdb) It then eventually dies inside comptypes because TREE_CODE (t1) is type_pack_expansion. And also TREE_CODE (t2) is type_pack_expansion. This is called from the SIZEOF_EXPR case in cp_tree_equal. If tree addresses are being used for the hash codes, this could just be bad luck whether it fails or not.
[Bug c++/64697] C++11 thread_local: relocation truncated to fit: R_X86_64_PC32 against undefined symbol `TLS init function for N::ptd'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64697 --- Comment #24 from Jim Wilson --- Joel Sherrill offered to create a binutils bug report for this.
[Bug c++/64697] C++11 thread_local: relocation truncated to fit: R_X86_64_PC32 against undefined symbol `TLS init function for N::ptd'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64697 Jim Wilson changed: What|Removed |Added CC||wilson at gcc dot gnu.org --- Comment #22 from Jim Wilson --- This looks like a binutils bug to me. A call to an undefined weak function should never be executed, so it is OK for the linker to convert that call instruction into anything convenient. There is no need for a relocation that can reach an address of zero. We can convert the call instruction to call itself, or the next instruction, or change it to a nop, what ever is convenient, it doesn't really matter. A number of binutils ports already have code to handle related problems. ARM and RISC-V for sure. Probably others. It looks like this support is missing from the x86_64 port. I'd suggest refiling this as a binutils bug. See for instance https://sourceware.org/bugzilla/show_bug.cgi?id=23244 for a RISC-V example of the same problem. But we need a new bug for the x86_64 problem. RISC-V has a register hard wired to zero, so I rewrite the call instruction to use x0 as the base address. The arm port turns the call into a nop.