[Bug target/89627] Miscompiled constructor call

2019-03-07 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89627

Jim Wilson  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-03-08
 Ever confirmed|0   |1

--- Comment #1 from Jim Wilson  ---
This is the same problem that Andrew Burgess just reported.  He found it while
looking at gdb testsuite failures.  He reported it against the psABI
https://github.com/riscv/riscv-elf-psabi-doc/issues/88
This has a link to a patch, but the patch is trying to fix the ABI which is
wrong.  We only need the last two hunks of the patch which fix a bug in
riscv_pass_fpr_single.

https://github.com/T-J-Teru/gcc/commit/ae183138a618a01acf2cf009544d4ce0dda61ab9

I'm waiting for Andrew to submit the patch so I can approve it.

[Bug target/89411] RISC-V backend will generate wrong instruction for longlong type like lw a3,-2048(a5)

2019-03-07 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89411

--- Comment #4 from Jim Wilson  ---
Created attachment 45925
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45925=edit
work in progress patches for a fix

This implements two ways to fix it, a simple way that just fails for BLKmode,
and a more complex way that passes in a SYMBOL_REF and tries to extract decl
size and alignment from it.  There are ifdefs so you can choose which one gets
used.  I'd like to do a lot more testing to see what the optimization effect is
from the two patches, and to see how often they trigger, and to make sure they
won't generate a compiler error on anything important (like gcc, glibc, newlib,
linux kernel).

[Bug target/90015] riscv: typo "intterupt" in diagnostic

2019-04-08 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90015

Jim Wilson  changed:

   What|Removed |Added

 CC||wilson at gcc dot gnu.org

--- Comment #2 from Jim Wilson  ---
I tested Jakub's patch with 32-bit elf and 64-bit linux cross compilers.  There
were no regressions.  The patch is OK.

[Bug debug/90109] gstabs flag generates wrong entry for long on x86_64

2019-04-16 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90109

--- Comment #2 from Jim Wilson  ---
long long and long double did not exist when stabs was invented.  Also, 64-bit
machines and C++ did not exist at the time.  Also, unfortunately, stabs wasn't
designed to be extensible.  So there is no way to describe anything that
doesn't exist in K C without using incompatible extensions, such as those
enabled by the the gdb stabs extensions.

So yes, invalid, though I'd add an extra caution that stabs should only be used
if you need compatibility with old obsolete systems.  Otherwise, it is a waste
of time, as we stopped actively maintaining it about a decade ago, and probably
should just drop the support from gcc.

[Bug target/89955] riscv.h improperly defines STARTFILE_PREFIX_SPEC spec

2019-06-06 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89955

Jim Wilson  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #7 from Jim Wilson  ---
Fixed on mainline.

[Bug target/89955] riscv.h improperly defines STARTFILE_PREFIX_SPEC spec

2019-06-06 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89955

--- Comment #6 from Jim Wilson  ---
Author: wilson
Date: Thu Jun  6 23:18:48 2019
New Revision: 272021

URL: https://gcc.gnu.org/viewcvs?rev=272021=gcc=rev
Log:
RISC-V: Move STARTFILE_PREFIX_SPEC into target OS files.

gcc/
PR target/89955
* config/riscv/riscv.h (STARTFILE_PREFIX_SPEC): Deleted.
* config/riscv/freebsd.h (STARTFILE_PREFIX_SPEC): Added.
* config/riscv/linux.h (STARTFILE_PREFIX_SPEC): Added.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/riscv/freebsd.h
trunk/gcc/config/riscv/linux.h
trunk/gcc/config/riscv/riscv.h

[Bug target/89955] riscv.h improperly defines STARTFILE_PREFIX_SPEC spec

2019-05-30 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89955

--- Comment #5 from Jim Wilson  ---
OK, sounds like we need to move STARTFILE_PREFIX_SPEC into various OS header
files then.  That will require some testing.  I caught a virus last week and am
behind on everything, so I haven't had a chance to try to do this yet.

[Bug target/90419] RISCV --with-multilib-list support is somewhat incomplete

2019-05-10 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90419

--- Comment #4 from Jim Wilson  ---
GCC worked out of the box before we started upstreaming the toolchain.  And it
will work out of the box again when we are done with the upstreaming.  But
meanwhile, we are still in the middle of upstreaming and some things only work
if you have the right collection of patches or the right collection of source
trees.  If I change the FSF gcc tree to try to fix this, then I break the
github riscv/riscv-gnu-toolchain tree, which is using the same gcc sources, and
has working multilib support.  I don't want to have to maintain two different
gcc source trees, so I'm leaving the FSF gcc broken temporarily because I don't
want to break the riscv/riscv-gnu-toolchain tree.  This problem will go away
when glibc is finished upstreaming.  This is a problem I keep raising inside
SiFive and the RISC-V Foundation because a number of things are blocked on this
glibc upstreaming problem.  It isn't clear when it will get fixed though.  I
might have to volunteer to do it myself at some point, but I'm still hoping
someone else does it.

[Bug target/90419] RISCV --with-multilib-list support is somewhat incomplete

2019-05-18 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90419

--- Comment #6 from Jim Wilson  ---
Comment on attachment 46379
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46379
patch: disable 32bit riscv abis in gcc multilib

You are missing the g aliases.  And you still have 32-bit dirnames.  We have a
generator script that is used to generate this file.  It gives the command line
at the top.  You can just rerun this without the 32-bit arches and abis.

Try rynning:
./multilib-generator
rv64imac-lp64-rv64ima,rv64imaf,rv64imafd,rv64imafc,rv64imafdc-
rv64imafdc-lp64d-rv64imafd-
that should be on a single line.  Then pipe that into a file to use.

[Bug target/90419] RISCV --with-multilib-list support is somewhat incomplete

2019-05-09 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90419

--- Comment #2 from Jim Wilson  ---
I talked to Palmer.  Apparently what you want to do is build multilibs for lp64
and lp64d, to test the linux multilib support.  That isn't currently supported.
 I would suggest applying a local patch to change the
gcc/config/riscv/t-linux-multilib file with one that just has the lp64 and
lp64d multilibs.  If you are trying to do this natively, you are likely to run
into problems along the way, as no one has tried to do this before as far as I
know.

[Bug target/90419] RISCV --with-multilib-list support is somewhat incomplete

2019-05-09 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90419

Jim Wilson  changed:

   What|Removed |Added

 CC||wilson at gcc dot gnu.org

--- Comment #1 from Jim Wilson  ---
For cross compilers, --enable-multilib gives a decent selection of multilib
targets.

For native compilers, we don't have linux kernel support to run 32-bit code on
64-bit parts.  Also, I'm not aware of any hardware that has the hardware
support necessary to run 32-bit code on 64-bit parts, which makes the linux
kernel support somewhat of a moot problem.  So, currently, there is no use for
building native multilib compilers.

There is however a problem that the RISC-V ABI requires that libraries go in
very specific places, e.g. /lib64/lp64d for the default linux ABI.  In order to
get the library paths correct in a native linux system compiler that will be
installed into /usr/bin, one must build gcc multilib, and one must
simultaneously disable the multilibs that won't work.  This is what
--with-multilib-list is for.  If you configure --enable-multilib
--with-multilib-list=lp64d then you will get the correct result.  This is how
Fedora configures their RISC-V compilers for instance.

Supporting more than one ABI in --with-multilib-list is a major project. 
Currently this is implemented via sed scripts, but the fact that we have two
interrelated options here -march and -mabi means that this gets very
complicated very quickly.  I looked at this and gave up, and decided that
support for only one ABI at a time was practical.  If you want more than one
ABI, then this is an enhancement request not a bug, and it is unclear when or
if this will get fixed.

32-bit glibc support exists but has not been merged into the glibc tree yet. 
This was pending on the y2038 patches.  Currently all 32-bit linux targets use
a 32-bit time_t that will overflow in 2038.  This requires an unfortunate ABI
break to fix.  As RISC-V is a new target, we were waiting for the y2038
infrastructure patches to go into first (both linux kernel and glibc), so that
we could avoid an ABI break shortly after contributing the 32-bit RISC-V port. 
I think all of the infrastructure patches are in now, but likely need some
testing.  I just recently reported a glibc bug due to these changes, which
turned out to actually be a linux kernel bug.  Anyways, the main problem now is
finding a volunteer to pick up the 32-bit RISC-V glibc work and finish it. 
Meanwhile, the 32-bit glibc patches are available as a patch set that was sent
to the libc-alpha mailing list last year, and as patches on the
riscv-glibc-2.29 branch in the github.com/riscv/riscv-glibc tree.  This branch
is used by and known to work with the riscv/meta-riscv open-embedded tree.

I agree the current situation is not ideal, but RISC-V is a very new target
that isn't fully fleshed out yet, and the situation will get better in a year
or two.  This is mostly a user education problem at the moment.

By the way, there is already a gentoo RISC-V port.  A co-worker Palmer Dabbelt
did one.  I don't know anything about gentoo though, and I haven't tried using
it.  So you would have to ask him about it if interested.

[Bug other/63509] Misleading error message when building gcc without C++ compiler installed

2019-05-01 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63509

Jim Wilson  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-05-02
 CC||wilson at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #7 from Jim Wilson  ---
Confirmed.  Except that current sources now say
configure: error: uint64_t or int64_t not found

We have a configure check to verify that the C compiler works.  We are missing
one for the C++ compiler.  In configure we have
$as_echo_n "checking whether the C compiler works... " >&6; }
and there is no equivalent for the C++ compiler.

[Bug middle-end/21111] IA-64 NaT consumption faults due to uninitialized register reads

2019-04-29 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=2

--- Comment #14 from Jim Wilson  ---
https://wiki.debian.org/Ports/ia64

James Clarke has been active recently on the binutils and/or gcc mailing lists.
 My IA-64 work has dwindled down to nothing, as RISC-V work has kept me too
busy to do anything else.  With the architecture already set to end-of-life
next year, I'm not planning to do anymore IA-64 work.

[Bug middle-end/21111] IA-64 NaT consumption faults due to uninitialized register reads

2019-04-29 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=2

--- Comment #15 from Jim Wilson  ---
See also PR 87338 which has a response earlier today from John Paul Adrian
Glaubitz.

[Bug other/63509] Misleading error message when building gcc without C++ compiler installed

2019-05-04 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63509

--- Comment #8 from Jim Wilson  ---
I filed an autoconf bug
http://savannah.gnu.org/support/index.php?109676

The problem here is that autoconf only verifies that the first AC_PROG_X
compiler is working.  And since we include AC_PROG_CC before AC_PROG_CXX, only
the C compiler is tested to see if it works.  But since we are using the C++
compiler for all of the configure checks, and for the build, we really should
be verifying that the C++ compiler works.

We can work around the autoconf problem by putting AC_PROG_CXX before
AC_PROG_CC.  This will verify that the C++ compiler works, but will stop
verifying that the C compiler works, which is probably OK.

[Bug target/61577] [4.9.0] can't compile on hp-ux v3 ia64

2019-07-04 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61577

Jim Wilson  changed:

   What|Removed |Added

 CC||wilson at gcc dot gnu.org

--- Comment #35 from Jim Wilson  ---
I never had access to ia64 hpux unfortunately.  I did ask them for a copy once,
but they couldn't figure out why I wanted it, and so they wouldn't give it to
me.  Steve Ellcey was working for HP at the time and maintaining the ia64-hpux
support.  You could try asking him.  I think the ia64-hpux support was always
poor.  Too many things done differently than ia64-linux, requiring inconvenient
hacks on our side to make our tools work on their system.  I don't know if the
binutils support was ever usable.  The gcc support was usable at one time, but
I don't know the details.

[Bug debug/90109] gstabs flag generates wrong entry for long on x86_64

2019-04-17 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90109

--- Comment #5 from Jim Wilson  ---
Stabs requires that we emit info for all of the base types at the start.  But
if one of the base types does not exist for a 32-bit K C target, then we are
struck, as that can't be described.  And if we can't describe one or more of
the base types, then we can't safely describe derived types. Currently, we just
lie about the base types we can't describe, which lets us emit debug info for
the derived types that we can describe.  It is possible to describe some things
as derived types that can't be described as base types.  But if we aren't
allowed to lie about base types, then we probably have to refuse to emit any
debug info at all for the file, which doesn't help.

You can get correct debug info for this testcase if you use
stabs-plus-gdb-extensions.  At least until you try to use some language feature
invented in the last decade or two that we never added
stabs-plus-gdb-extensions support for, and then you are screwed again.

Which takes me back to don't use stabs again.

[Bug target/91041] New: obsolete arm-pe support in arm backend

2019-06-30 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91041

Bug ID: 91041
   Summary: obsolete arm-pe support in arm backend
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: wilson at gcc dot gnu.org
  Target Milestone: ---

The arm/pe.h file was removed in 2012, but we still have about half a dozen
references in the arm.c file to the ARM_PE macro that was defined in that file.
 The arm-symbian port uses some of the same features, but it doesn't define
ARM_PE, the symbol support is handled via TARGET_DLLIMPORT_DECL_ATTRIBUTES. 
The obsolete code should be cleaned up.

[Bug target/91041] obsolete arm-pe support in arm backend

2019-07-01 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91041

--- Comment #1 from Jim Wilson  ---
FYI I noticed this while trying to answer a question on gcc-help, otherwise I'm
not doing any arm work at present.

[Bug target/91420] relocation truncated to fit: R_RISCV_HI20 against `.LC0' with GCC 8.2/8.3 with "-O2" on RISC-V

2019-08-12 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91420

Jim Wilson  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-08-12
 CC||wilson at gcc dot gnu.org
 Ever confirmed|0   |1

[Bug target/91602] GCC fails to build for riscv in a combined tree due to misconfigured leb128 support

2019-08-30 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91602

Jim Wilson  changed:

   What|Removed |Added

 CC||wilson at gcc dot gnu.org

--- Comment #3 from Jim Wilson  ---
Combined tree builds are obsolete and shouldn't be used anymore.  Since this
only shows up in a combined tree build, I don't consider it important.  If you
build the toolchain the correct way, building binutils and gcc separately, the
build does work.  My preferred solution would be to kill combined tree build
support.

[Bug target/91602] GCC fails to build for riscv in a combined tree due to misconfigured leb128 support

2019-08-30 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91602

--- Comment #6 from Jim Wilson  ---
By the way, the underlying problem here is, as Andrew Waterman mentioned, that
the RISC-V linker does aggressive link time relaxations to reduce code size,
and this makes lib128 with label subtraction unsafe.  The RISC-V port is not
the only port that has this problem.  The RISC-V port is just the only one that
gets it right.  The others can silently produce incorrect debug info.  I have
evidence that every binutils target except RISC-V that does aggressive link
time relaxation to reduce code size is doing it wrong.  There is a bug in
binutils bugzilla where I listed linker relaxation code size related issues,
but I have since found and fixed other problems in the RISC-V port, so it is
incomplete.
  https://sourceware.org/bugzilla/show_bug.cgi?id=22756#c2

[Bug target/91602] GCC fails to build for riscv in a combined tree due to misconfigured leb128 support

2019-08-30 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91602

--- Comment #5 from Jim Wilson  ---
The wiki is wrong.  Combined tree builds should not be used anymore.

Combined tree builds date back to when Cygnus was maintainer for everything. 
We put everything in a single source tree, and wrote configure and makefile
support so that you could check them out with a single command, and configure
and build them with a single command.  But Cygnus is no longer controlling
everything, and different packages are now in different source trees, and are
meant to be compiled separately.

When you build packages separately, each package after the first uses feature
tests to configure itself appropriately to use the earlier packages.  When you
use a combined tree build, feature tests can't be used because earlier packages
haven't been installed yet, so configure just makes guesses.  We have the most
commonly correct results hardwired in as guesses.  This was fine when packages
were smaller, and we had a smaller number of targets.  Nowadays, there is too
much risk that the guesses will be wrong. A combined tree build may not give
you the exact same result as building separately because of this.  Of course,
if you just need a throw away toolchain to check how something works for a
different architecture than your own, this is fine.  But if you want to make a
product release that will be used by customers, this can be dangerous.

None of the major build scripts use combined tree builds.  None of the linux
distros (Fedora, Debian, OpenSuse) do this.  None of the embedded linux build
systems to this (OpenEmbedded/Yocto, Buildroot).  crosstool-ng doesn't do
builds this way.  And github.com/riscv/riscv-gnu-toolchain doesn't do builds
this way.  There are so many easy ways to build toolchains that we don't need
combined tree builds to work.

This is of course my personal opinion, and I expect that people will disagree
with me.  But I rarely if ever see anyone update the combined tree support, so
I will continue to tell people that it is dangerous to rely on it.

[Bug target/91635] wrong code at -O2 with __builtin_add_overflow() and shifts

2019-09-03 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91635

--- Comment #13 from Jim Wilson  ---
I see 5 broken patterns which matches the list already given.  The four
testcases are all triggering on the same splitter, which is the first
define_split, lshrsi3_zero_extend_3+1.  The second define_split,
lshrsi3_zero_extend_3+2, requires a mask which has at least 32 zeros in the low
bits.  I don't think that this can occur with a paradoxical reg, because the
only valid input bits are all undefined bits in the upper half of the
paradoxical reg.  We may as well fix it anyways though to be consistent, but
this means writing a testcase appears to be impossible.  The other three are
post reload splitters, which will be hard to write a testcase for.  So it looks
like we only have one useful testcase here, but that is better than nothing.

Kito and I can test RISC-V patches.

Using SUBREG_REG would require generating a new reg rtx with the wider mode. 
May not be worth the trouble considering how uncommon that case will be.

For the second define_split, the existing testcases that trigger it are
unsigned long
sub1 (unsigned long i)
{
  return (i >> 32) << 32;
}
unsigned long
sub2 (unsigned long i)
{
  return (i >> 63) << 63;
}
Like I mentioned above, I don't see how that can be rewritten to use a
paradoxical reg.

[Bug target/91635] wrong code at -O2 with __builtin_add_overflow() and shifts

2019-09-05 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91635

--- Comment #22 from Jim Wilson  ---
> First of all, I think you need to use :DI instead of :GPR for the last
> define_split, as it shouldn't be iterated with.

Yes, sloppy copy paste.  I will fix before committing.

> Second, doesn't this mean that the splitters will be matched then only
> during combine and never afterwards?

The define_splits were specifically written to match patterns created by
combine that aren't valid instructions.  So I think it is OK if they can only
be matched during combine.  My empirical results are good looking at libc and
libstdc++ code size, I haven't seen any cases where I get worse code size by
modifying them to use clobbers.

> The can_create_pseudo_p () -> gen_reg_rtx (mode) way of adding intermediate
> temporaries is used heavily in many other backends, just look at say i386,
> rs6000 or aarch64 backends for examples (I've looked only at those 3 and
> found many spots in each), so if missing REG_DEAD notes affect code quality,
> perhaps it would be more useful to try to change the combiner to add those
> notes (say before splitting remember highest pseudo number, after the
> splitter see if we've added some new pseudos and by scanning the resulting
> sequence see if we shouldn't add REG_DEAD notes for those somewhere, because
> they shouldn't be used anywhere outside of the sequence.

Yes, long term, I think it would be a good idea to improve combine to add
REG_DEAD notes for temporary regs created by splitters.  I don't plan to work
on that as I have too much other stuff to do.  I'd be willing to create a bug
report if you want me to.

[Bug target/91635] wrong code at -O2 with __builtin_add_overflow() and shifts

2019-09-05 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91635

--- Comment #25 from Jim Wilson  ---
Author: wilson
Date: Thu Sep  5 20:32:55 2019
New Revision: 275444

URL: https://gcc.gnu.org/viewcvs?rev=275444=gcc=rev
Log:
RISC-V: Fix bad insn splits with paradoxical subregs.

Shifting by more than the size of a SUBREG_REG doesn't work, so we either
need to disable splits if an input is paradoxical, or else we need to
generate a clean temporary for intermediate results.

Jakub wrote the first version of this patch, so gets primary credit for it.

gcc/
PR target/91635
* config/riscv/riscv.md (zero_extendsidi2, zero_extendhi2,
extend2): Don't split if
paradoxical_subreg_p (operands[0]).
(*lshrsi3_zero_extend_3+1, *lshrsi3_zero_extend_3+2): Add clobber and
use as intermediate value.

gcc/testsuite/
PR target/91635
* gcc.c-torture/execute/pr91635.c: New test.
* gcc.target/riscv/shift-shift-4.c: New test.
* gcc.target/riscv/shift-shift-5.c: New test.

Added:
trunk/gcc/testsuite/gcc.c-torture/execute/pr91635.c
trunk/gcc/testsuite/gcc.target/riscv/shift-shift-4.c
trunk/gcc/testsuite/gcc.target/riscv/shift-shift-5.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/riscv/riscv.md
trunk/gcc/testsuite/ChangeLog

[Bug target/91635] wrong code at -O2 with __builtin_add_overflow() and shifts

2019-09-04 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91635

--- Comment #18 from Jim Wilson  ---
I was willing to write a patch, I just needed a day to catch up as I'm
hopelessly overloaded with work.  But since Jakub wrote one, it looks right to
me, and I think we should use that.  I'm not sure exactly what testing Kito
did, and he has now left on vacation, so I will just do my own testing.

I think the reason why RISC-V needs paradoxical_reg and other architectures
don't is that most architectures have sense enough to define zero and sign
extend instructions.  RISC-V will finally get that when the draft B (bit
manipulation) extension is formally approved.

[Bug target/91635] wrong code at -O2 with __builtin_add_overflow() and shifts

2019-09-04 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91635

--- Comment #19 from Jim Wilson  ---
I did cross toolchain builds and checks for rv32i/newlib and rv64gc/glibc. 
There were no regressions.

Since the splitters exist to reduce code size, I looked at the libc and
libstdc++ sizes with and without the patch.

For the rv32i build, libstdc++ is the same size, and libc is 4 bytes smaller. 
I extracted a testcase.
int
isnanf (float x)
{
  union u { float f; int i; } u;
  u.f = x;
  int i = u.i;
  i &= 0x7fff;
  return i > 0x7f80;
}
Without the patch with -O2 -S we get
sllia0,a0,1
li  a5,2139095040
srlia0,a0,1
sgt a0,a0,a5
ret
with the patch we get
li  a5,-16777216
sllia0,a0,1
sgtua0,a0,a5
ret
The issue here is that the extra temporary used for the splitter that generates
the slli/srli gives combine a little more freedom, and it manages to find an
optimization that was missed before.

For the rv64gc build, libc is 38 bytes smaller and libstdc++ is 34 bytes
larger.  The libc issue appears to be the same as above.  The libstdc++ issue
is more interesting.  I extracted a testcase.
unsigned long
sub (long l)
{
  union u {
struct s { int a : 19; unsigned int b : 13; int x; } s;
long l;
  } u;
  u.l = l;
  return u.s.b;
}
Without the patch, with -O2 -S, we generate
srliw   a0,a0,19
ret
and with the patch we get
srlia0,a0,19
sllia0,a0,51
srlia0,a0,51
ret
The issue here is that there is no REG_DEAD note for the temporary that we
generated. So without the patch combine generates
Successfully matched this instruction:
(set (reg:DI 78)
(zero_extract:DI (reg:DI 82)
(const_int 13 [0xd])
(const_int 19 [0x13])))
and with the patch combine generates
Failed to match this instruction:
(parallel [
(set (reg:DI 78)
(zero_extract:DI (reg:DI 82)
(const_int 13 [0xd])
(const_int 19 [0x13])))
(set (reg:DI 83)
(and:DI (ashift:DI (reg:DI 82)
(const_int 32 [0x20]))
(const_int -2251799813685248 [0xfff8])))
])
where reg 83 is the temporary generated by the splitter which has no REG_DEAD
note.

Since the temporary appears to be generally useful, and lack of a REG_DEAD note
causes problems, I think it would be better to modify the two define_split
patterns to have a clobber instead of calling gen_reg_rtx.  I do get the
REG_DEAD note in this case, and the correct single instruction output for the
second testcase.  The three define_insn_and_split patterns can remain the same
as they aren't causing any trouble.

Though I suppose there is a question here of whether combine will ever
accidentally substitute in a paradoxical reg for a clobber.  I would hope that
it doesn't.  I can check for that.

Since I have a new patch, and new testcases, I need to redo the builds and
checks from scratch.

[Bug target/91635] wrong code at -O2 with __builtin_add_overflow() and shifts

2019-09-05 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91635

--- Comment #20 from Jim Wilson  ---
Created attachment 46830
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46830=edit
proposed patch to fix paradoxical reg in splitter problem

I get better code size with this alternative patch.  I added two more testcases
for the issues I found while evaluating this patch.  There is no change to the
rv32/newlib libc/libstdc++ library code sizes.  The rv64/linux libc.so is now
94 bytes smaller.  The libstdc++.so is 6 bytes larger but text is 10 bytes
smaller and rodata is 16 bytes larger.  I'm not sure what happened there, maybe
alignment padding, but I can live with it.  The text section looks like the
same code, but slightly better register allocation, which allows a few more
compressed instructions to be used.  This has passed rv32/newlib and rv64/linux
cross toolchain build and check.  The new testcase from Jakub still works,
along with my two new testcases.

I tried adding instrumentation to the patch to abort if I ended up with a
paradoxical reg in the splitters, and it never triggered, so it looks like
combine is doing the right thing when allocating a reg for the clobber.

[Bug target/91683] ICE: SIGSEGV at -O when compiling for riscv64

2019-09-06 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91683

Jim Wilson  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-09-06
 CC||wilson at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |wilson at gcc dot 
gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Jim Wilson  ---
Created attachment 46850
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46850=edit
proposed fix

This is failing in undo_all in combine, trying to write to a pointer into data
that has been garbage collected.

The problem starts with a riscv.md splitter which calls riscv_move_integer
which checks can_create_pseudo_p and calls force_reg if allowed.  This
eventually calls gen_reg_rtx which calls ensure_regno_capacity which calls
ggc_realloc, which invalidates the data in the undo buffer since all pseudo reg
addresses change.  I guess this is another reason why calling gen_reg_rtx in a
splitter is a bad idea.

I can fix it by adding another option to riscv_move_integer to indicate when we
are calling it from inside a splitter, but maybe I will think of a better way
to fix the problem later.

The fact that this requires a ggc_realloc call to happen in a very precise
place means that this testcase does not appear to be very useful.  There is no
guarantee it will trigger the problem if other parts of the compiler change.

[Bug inline-asm/85185] Wider-than-expected load for struct member used as operand of inline-asm with memory clobber at -Og

2019-09-11 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85185

--- Comment #10 from Jim Wilson  ---
It is the same basic problem as before, but more serious.  The original
testcase uses a short variable, and can be fixed by adding a cast to int.  But
the new testcase does not use any short variables.

There seems to be a disconnect between the C parser and the RTL expander here. 
the C parser is giving the asm operand a type of short unsigned int, which is
the smallest integer type that the constant can fit in.  The RTL expander is
then generating a HImode reg for it, but loading a constant into an HImode reg
will implicitly sign extend it, which is wrong for a short unsigned int asm
operand.

Before the expand_expr call...

Breakpoint 6, expand_asm_stmt (stmt=stmt@entry=0x76274108)
at ../../gcc-git/gcc/cfgexpand.c:3247
(gdb) print val
$11 = 
(gdb) pt
warning: Expression is not an assignment (and might have no effect)
  constant 65532>
(gdb) 

after the expand_expr call...
gdb) print op
$12 = (rtx) 0x76163450
(gdb) pr
warning: Expression is not an assignment (and might have no effect)
(const_int -4 [0xfffc])
(gdb) 

then it does
op = force_reg (TYPE_MODE (type), op);
and we end up with
gdb) print op
$13 = (rtx) 0x76272828
(gdb) pr
warning: Expression is not an assignment (and might have no effect)
(reg:HI 72)


Maybe what we need here is some equivalent to PROMOTE_MODES to force asm
operands to word mode on targets where this makes more sense.  But this needs
to happen before the constant gets sign extended, which means it has to happen
before or inside the expand_expr call.

[Bug target/91720] [10 Regression] wrong code with -Og -fno-forward-propagate -frerun-cse-after-loop -fno-tree-fre

2019-09-10 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91720

Jim Wilson  changed:

   What|Removed |Added

 CC||wilson at gcc dot gnu.org

--- Comment #8 from Jim Wilson  ---
Created attachment 46867
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46867=edit
simplest fix: don't define WORD_REGISTER_OPERATIONS

I've had problems with combine, LOAD_EXTEND_OP, and WORD_REGISTER_OPERATIONS
before.  I think combine is making unsafe assumptions in this case, but I'm
getting tired of arguing about this.  So the simple solution is to just stop
defining WORD_REGISTER_OPERATIONS.  It makes this testcase work.

I get about a 0.1% code size increase when I try it with a 64-bit toolchain,
looking at libc and libstdc++ library file sizes.  If this affects dhrystone or
coremark code size or performance, a lot of people will be unhappy, so I will
have to double check that.  For a 32-bit toolchain it is more like 0.01% which
is not as bad.

[Bug target/91683] ICE: SIGSEGV at -O when compiling for riscv64

2019-09-10 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91683

--- Comment #16 from Jim Wilson  ---
(In reply to rguent...@suse.de from comment #15)
> I still don't understand. The rtx are not relocated. The only thing is the
> address of the slot of the regno to rtx map.

I have a debug session in comment 6 that shows that this address is changing. 
But maybe what you are missing here is how combine changes registers modes.  It
uses a call like this
  SUBST_MODE (regno_reg_rtx[REGNO (i2dest)], split_mode);
and this macro is defined as
#define SUBST_MODE(INTO, NEWVAL)  do_SUBST_MODE (&(INTO), (NEWVAL))
so we are saving an address in the regno_reg_rtx array.

I don't know exactly why it works this way, and I don't care, because I don't
think that fixing this is the right solution for the RISC-V port.  I think the
RISC-V port should stop calling gen_reg_rtx inside splitters.

The patch where this code appeared suggests that it was a rare hard to
reproduce bug that required this fix.
https://gcc.gnu.org/ml/gcc-patches/2006-01/msg00477.html
Maybe SUBST_MODE was just blindly copied from SUBST_INT, and it was missed that
it might work without the & in this case.  This may also depend on how exactly
SUBST_MODE is used, but currently it is only used for regs, which are supposed
to be unique, so dropping the & may work.  This may not work if SUBST_MODE gets
used for otherwise, but currently it isn't.

[Bug c/91713] GP linker relaxation is not performed with -nostdlib

2019-09-09 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91713

Jim Wilson  changed:

   What|Removed |Added

 CC||wilson at gcc dot gnu.org

--- Comment #1 from Jim Wilson  ---
This is an edge condition and an accident of circumstances.  When you link with
the default libraries, other stuff gets put in sdata before a, and so the
variable a is in range of gp because it is at -0x7f0.  When you link with
-nostdlib, a is the only thing in sdata, and we run into an edge condition
where a is -0x800 from gp, which is at the extreme limit, but the linker
relaxation has to limit the range to deal with section alignment issues that
may changes addresses after relaxation, and so we have to assume that a is out
of range.

If you change the example to
int a = 1;
int b = 2;
int c = 3;
int d = 4;
int e = 5;

int main()
{
return a + b + c + d + e;
}
then we see that all 5 variables use gp address with default libraries, and
only the last 3 with -nostdlib, so we are losing the first two variables due to
address range limitation at linker relaxation time.

There is a somewhat related open binutils bug report
  https://sourceware.org/bugzilla/show_bug.cgi?id=24678
and another somewhat related binutils bug report I recently fixed
  https://sourceware.org/ml/binutils/2019-08/msg00244.html
if gp was still computed inside the .sdata section we wouldn't have this
problem, but that means undoing a change that reduced code size for most
applications.  Maybe there is a different way to solve the earlier problem that
doesn't cause this problem.

Anyways, this is a linker issue not a gcc issue.

[Bug target/91683] ICE: SIGSEGV at -O when compiling for riscv64

2019-09-09 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91683

--- Comment #5 from Jim Wilson  ---
(In reply to Richard Biener from comment #3)
> Hmm, but shouldn't we instead fix combine to record not the reg rtx but the
> regno for UNDO_MODE?  Because nothing prevents this from happening elsewhere
> (even in combine itself?).

Combine doesn't call gen_reg_rtx, other than make_more_copies, and splitters
aren't supposed to either.  As Segher mentioned, if a define_split needs a
temporary it is supposed to add a clobber of a match_operand.

I tried an experiment, I set a global var at the start of combine_instructions,
cleared it at the end, added an abort to gen_reg_rtx if the var is set, and did
an x86 bootstrap.  It completed successfully.  A make check is running, but I
don't expect to hit the abort there either.

A RISC-V toolchain with my proposed fix added still fails as riscv_split_symbol
called from gen_split_13 has the same problem and needs a similar fix.

[Bug target/91683] ICE: SIGSEGV at -O when compiling for riscv64

2019-09-09 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91683

--- Comment #6 from Jim Wilson  ---
(In reply to Richard Biener from comment #4)
> Btw, the rtx of pseudos doesn't change, what changes is the address of the
> entry in regno_reg_rtx[] which is the pseudo-nr -> rtx map.  But I don't see
> that involved...

do_SUBST_MODE saves the address of the rtx being modified, in this case the
address of a pseudo reg.

Breakpoint 5, do_SUBST_MODE (into=0x76280458, newval=E_DImode)
at ../../gcc-svn/gcc/combine.c:809
(gdb) print into
$5 = (rtx *) 0x76280458
(gdb) print *into
$6 = (rtx) 0x76287630
(gdb) pr
warning: Expression is not an assignment (and might have no effect)
(reg:SI 139)
(gdb) print regno_reg_rtx[139]
$7 = (rtx) 0x76287630
(gdb) print _reg_rtx[139]
$8 = (rtx *) 0x76280458
(gdb) 

So into is an address into regno_reg_rtx.  When we realloc regno_reg_rtx this
changes.

Breakpoint 6, emit_status::ensure_regno_capacity (this=0x1a881f0 )
at ../../gcc-svn/gcc/emit-rtl.c:1215
(gdb) next 6
(gdb) print regno_reg_rtx[139]
$13 = (rtx) 0x76287630
(gdb) print _reg_rtx[139]
$14 = (rtx *) 0x7627b458
(gdb) print $5
$15 = (rtx *) 0x76280458
(gdb) print *$5
$16 = (rtx) 0xa5a5a5a5a5a5a5a5
(gdb)

[Bug target/91683] ICE: SIGSEGV at -O when compiling for riscv64

2019-09-09 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91683

--- Comment #12 from Jim Wilson  ---
(In reply to Segher Boessenkool from comment #11)
> Oh lol, I finally wake up.  It is called from the splitter.  That isn't
> really a valid thing to do.

Right.  I want to eliminate that.

> We have some limited support for that since a while, but it seems this
> cannot ever really work?

Maybe with changes to the combine undo buffer, but there is still the lack of
REG_DEAD notes that can cause problems, so I think I'm still better off
avoiding calling gen_reg_rtx from splitters.

[Bug target/91683] ICE: SIGSEGV at -O when compiling for riscv64

2019-09-09 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91683

--- Comment #14 from Jim Wilson  ---
> 3) Do we want to prohibit calling gen_reg_rtx during combine?  Why did
> we want it, before?

Prohibiting gen_reg_rtx calls here would have helped find the bugs I'm fixing
now.  I've got a hack to do this to find the remaining bugs I need to fix.

I don't know why anyone would want it before.  I'm not sure if this ever worked
right.

[Bug target/91683] ICE: SIGSEGV at -O when compiling for riscv64

2019-09-09 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91683

--- Comment #7 from Jim Wilson  ---
> Would --param=ggc-min-expand=0 and/or --param=ggc-min-heapsize=0 help to
> reproduce the issue?

I don't think so.  The problem occurs in ensure_regno_capacity, which isn't
affected by frequency of garbage collection.  It is only affected by the number
of gen_reg_rtx calls.

It is already the case that ggc_free overwrites freed memory with 0xa5 when
ENABLE_GC_CHECKING is on, that is what exposed the problem.

[Bug target/91683] ICE: SIGSEGV at -O when compiling for riscv64

2019-09-09 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91683

--- Comment #8 from Jim Wilson  ---
(In reply to Richard Biener from comment #3)
> Hmm, but shouldn't we instead fix combine to record not the reg rtx but the
> regno for UNDO_MODE?  Because nothing prevents this from happening elsewhere
> (even in combine itself?).

Changing do_SUBST_MODE to save the regno instead of the address of rhe reg rtx
is a possible solution, but can work only if there is no other place in combine
that might take the address of a reg rtx.  I can't be sure of that unless I do
more analysis, which I don't plan to do as I don't think this is the right
solution.

Also, I think I will get better code if I fix the splitters to use a clobber. 
As was mentioned in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91635#c19 ,
calling gen_reg_rtx means no REG_DEAD note which limits what combine can do,
whereas using a clobber means getting an existing reg that likely already has a
REG_DEAD note which helps combine.

[Bug target/91229] RISC-V ABI problem with zero-length bit-fields and float struct fields

2019-08-08 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91229

Jim Wilson  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Jim Wilson  ---
Fixed on mainline.

[Bug target/91229] RISC-V ABI problem with zero-length bit-fields and float struct fields

2019-08-08 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91229

--- Comment #3 from Jim Wilson  ---
Author: wilson
Date: Thu Aug  8 19:04:56 2019
New Revision: 274215

URL: https://gcc.gnu.org/viewcvs?rev=274215=gcc=rev
Log:
RISC-V: Fix C ABI for flattened struct with 0-length bitfield.

gcc/
PR target/91229
* config/riscv/riscv.c (riscv_flatten_aggregate_field): New arg
ignore_zero_width_bit_field_p.  Skip zero size bitfields when true.
Pass into recursive call.
(riscv_flatten_aggregate_argument): New arg.  Pass to
riscv_flatten_aggregate_field.
(riscv_pass_aggregate_in_fpr_pair_p): New local warned.  Call
riscv_flatten_aggregate_argument twice, with false and true as last
arg.  Process result twice.  Compare results and warn if different.
(riscv_pass_aggregate_in_fpr_and_gpr_p): Likewise.

gcc/testsuite/
* gcc.target/riscv/flattened-struct-abi-1.c: New test.
* gcc.target/riscv/flattened-struct-abi-2.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/riscv/flattened-struct-abi-1.c
trunk/gcc/testsuite/gcc.target/riscv/flattened-struct-abi-2.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/riscv/riscv.c
trunk/gcc/testsuite/ChangeLog

[Bug target/91229] RISC-V ABI problem with zero-length bit-fields and float struct fields

2019-07-22 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91229

Jim Wilson  changed:

   What|Removed |Added

 Target||riscv*-*-*
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-07-22
   Assignee|unassigned at gcc dot gnu.org  |wilson at gcc dot 
gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Jim Wilson  ---
There is a psABI discussion about the problem at
https://github.com/riscv/riscv-elf-psabi-doc/issues/99

[Bug target/91229] RISC-V ABI problem with zero-length bit-fields and float struct fields

2019-07-22 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91229

--- Comment #2 from Jim Wilson  ---
Created attachment 46617
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46617=edit
proposed patch to change ABI and warn for affected structs

[Bug target/91229] New: RISC-V ABI problem with zero-length bit-fields and float struct fields

2019-07-22 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91229

Bug ID: 91229
   Summary: RISC-V ABI problem with zero-length bit-fields and
float struct fields
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: wilson at gcc dot gnu.org
  Target Milestone: ---

This was noticed by clang development, comparing clang against gcc to verify
ABI compliance.

There is a problem with the GCC implementation of the ABI, where we are
accidentally emitting different code for the same struct when compiled by the C
and C++ compilers.

There are two cases affected by this.  Here is the first example:
struct s1 { int : 0; float f; int i; int : 0; };

void dummy(float, int);

void f(struct s1 s) {
  dummy(s.f + 1.0, s.i + 1);
}
where we have a struct that can be passed in one FP reg and one integer GP, and
here is the second example:
struct s1 { int : 0; float f; float g; int : 0; };

void dummy(float, float);

void f(struct s1 s) {
  dummy(s.f + 1.0, s.g + 2.0);
}
where we have a struct that can be passed in two FP regs.  In both cases, the
C++ compiler passes the float struct fields in FP registers, and the C compiler
passes them in integer registers.

The general case here is any struct with one or more zero-length bitfields,
exactly two non-zero length fields, one of which must have an FP type that can
fit in an FP register, and the other can be an FP type that fits in an FP
register or an integer type that fits in an integer register or a integer
bit-field that is the exact same size as an integer type that can fit in an
integer register.  Also, the target must have FP register support.

The fundamental problem is that the RISC-V backend is not checking for
zero-length bit-fields when deciding if a struct field can be passed in a FP
register or not.  Meanwhile, the C++ front end is stripping zero-length
bit-fields after struct layout.  So when compiling as C++ we decide that the FP
struct fields can be passed in FP regs.  But when compiling as C we decide that
there are too many struct fields and they all get passed in integer registers.

Since having the C and C++ front ends using different ABIs is undesirable, we
need an ABI change.  Fixing the C++ case would require inconvenient changes to
the C++ front end.  So fixing the C case with a RISC-V backend patch looks like
the best practical solution.

The affected structures are a bit obscure and not very useful, so it is hoped
no real code will be affected.  I've done an open-embedded world build with an
instrumented compiler, and I didn't see any case that triggered my code.  Not
everything built though, since some stuff still doesn't have RISC-V support
yet.  I did have over 30,000 tasks run, so quite a bit of stuff did build.

[Bug tree-optimization/91191] New: vrp and boolean arguments

2019-07-17 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91191

Bug ID: 91191
   Summary: vrp and boolean arguments
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: wilson at gcc dot gnu.org
  Target Milestone: ---

It appears that vrp isn't propagating the ranges of incoming boolean arguments.
 Given this example:

unsigned char reg(_Bool b) {
union U {
unsigned char f0;
_Bool f1;
};
union U u;
u.f1 = b;
if (u.f0 > 1) { 
// This cannot happen
// if b is only allowed
// to be 0 or 1:
return 42;   
}
return 13;
}

clang optimizes this to unconditionally return 13, but gcc does a compare and
conditionally returns either 42 or 13 depending on the result of the compare.
This happens with both x86_64 and RISC-V.

Looking at the vrp dumps, I see
b_3(D): VARYING

[Bug target/91635] wrong code at -O2 with __builtin_add_overflow() and shifts

2019-09-19 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91635

Jim Wilson  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED
   Assignee|unassigned at gcc dot gnu.org  |wilson at gcc dot 
gnu.org

--- Comment #27 from Jim Wilson  ---
Fixed on mainline and gcc-9 branch.

[Bug target/91683] ICE: SIGSEGV at -O when compiling for riscv64

2019-09-18 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91683

--- Comment #20 from Jim Wilson  ---
Author: wilson
Date: Thu Sep 19 01:19:25 2019
New Revision: 275925

URL: https://gcc.gnu.org/viewcvs?rev=275925=gcc=rev
Log:
RISC-V: Fix more splitters accidentally calling gen_reg_rtx.

PR target/91683
* config/riscv/riscv-protos.h (riscv_split_symbol): New bool parameter.
(riscv_move_integer): Likewise.
* config/riscv/riscv.c (riscv_split_integer): Pass FALSE for new
riscv_move_integer arg.
(riscv_legitimize_move): Likewise.
(riscv_force_temporary): New parameter in_splitter.  Don't call
force_reg if true.
(riscv_unspec_offset_high): Pass FALSE for new riscv_force_temporary
arg.
(riscv_add_offset): Likewise.
(riscv_split_symbol): New parameter in_splitter.  Pass to
riscv_force_temporary.
(riscv_legitimize_address): Pass FALSE for new riscv_split_symbol
arg.
(riscv_move_integer): New parameter in_splitter.  New local
can_create_psuedo.  Don't call riscv_split_integer or force_reg when
in_splitter TRUE.
(riscv_legitimize_const_move): Pass FALSE for new riscv_move_integer,
riscv_split_symbol, and riscv_force_temporary args.
* config/riscv/riscv.md (low+1): Pass TRUE for new
riscv_move_integer arg.
(low+2): Pass TRUE for new riscv_split_symbol arg.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/riscv/riscv-protos.h
trunk/gcc/config/riscv/riscv.c
trunk/gcc/config/riscv/riscv.md

[Bug rtl-optimization/91860] [10 Regression] ICE: in decompose, at rtl.h:2279 with -Og -fipa-cp -g --param=max-combine-insns=3

2019-09-26 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91860

--- Comment #5 from Jim Wilson  ---
The new testcase is essentially the same problem, but it is i1src this time not
i2src, so just copying i2src earlier doesn't solve the problem, we also need a
fix for i1src, or a fix elsewhere.

[Bug rtl-optimization/91860] [10 Regression] ICE: in decompose, at rtl.h:2279 with -Og -fipa-cp -g --param=max-combine-insns=3

2019-09-23 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91860

Jim Wilson  changed:

   What|Removed |Added

 CC||wilson at gcc dot gnu.org

--- Comment #1 from Jim Wilson  ---
Combine is doing a 3 instruction split into 2 which appears ok

Trying 12, 14 -> 18:
   12: r84:SI=0xa000
   14: r76:DI=sign_extend(r84:SI+0x466)
  REG_DEAD r84:SI
  REG_EQUAL 0xa466
   18: r87:DI=~r76:DI
  REG_DEAD r76:DI
Failed to match this instruction:
(set (reg:DI 87)
(const_int -42087 [0x5b99]))
Splitting with gen_split_11 (riscv.md:1280)
Successfully matched this instruction:
(set (reg:DI 76 [ _7 ])
(const_int -40960 [0x6000]))
Successfully matched this instruction:
(set (reg:DI 87)
(plus:DI (reg:DI 76 [ _7 ])
(const_int -1127 [0xfb99])))
allowing combination of insns 12, 14 and 18
original costs 4 + 8 + 4 = 16
replacement costs 4 + 4 = 8

however a side effect of this is that when i1 is substituted into newpat, i2src
becomes invalid RTL

gdb) print i2src
$74 = (rtx) 0x76262b90
(gdb) pr
warning: Expression is not an assignment (and might have no effect)
(sign_extend:DI (const_int 42086 [0xa466]))

This wouldn't be a problem normally, except that i2src is later used to
generate a debug instruction via propagate_for_debug.  This calls
simplify_replace_fn_rtx which crashes on the invalid rtl.

Looks like we need to copy i2src sooner than we currently do.

[Bug rtl-optimization/91860] [10 Regression] ICE: in decompose, at rtl.h:2279 with -Og -fipa-cp -g --param=max-combine-insns=3

2019-09-23 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91860

--- Comment #2 from Jim Wilson  ---
Created attachment 46919
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46919=edit
untested fix to copy i2src earlier

works for testcase but otherwise untested

[Bug middle-end/92263] [10 Regression] ICE in commit_one_edge_insertion, at cfgrtl.c:2087 since r270758

2019-11-05 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92263

Jim Wilson  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED

--- Comment #9 from Jim Wilson  ---
Fixed on mainline.

Do you need a backport to the gcc-9 branch?

[Bug middle-end/92263] [10 Regression] ICE in commit_one_edge_insertion, at cfgrtl.c:2087 since r270758

2019-11-05 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92263

--- Comment #8 from Jim Wilson  ---
Author: wilson
Date: Tue Nov  5 22:34:40 2019
New Revision: 277861

URL: https://gcc.gnu.org/viewcvs?rev=277861=gcc=rev
Log:
Allow libcalls for complex memcpy when optimizing for size.

The RISC-V backend wants to use a libcall when optimizing for size if
more than 6 instructions are needed.  Emit_move_complex asks for no
libcalls.  This case requires 8 insns for rv64 and 16 insns for rv32,
so we get fallback code that emits a loop.  Commit_one_edge_insertion
doesn't allow code inserted for a phi node on an edge to end with a
branch, and so this triggers an assertion.  This problem goes away if
we allow libcalls when optimizing for size, which gives the code the
RISC-V backend wants, and avoids triggering the assert.

gcc/
PR middle-end/92263
* expr.c (emit_move_complex): Only use BLOCK_OP_NO_LIBCALL when
optimize_insn_for_speed_p is true.

gcc/testsuite/
PR middle-end/92263
* gcc.dg/pr92263.c: New.

Added:
trunk/gcc/testsuite/gcc.dg/pr92263.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/expr.c
trunk/gcc/testsuite/ChangeLog

[Bug middle-end/92263] [10 Regression] ICE in commit_one_edge_insertion, at cfgrtl.c:2087 since r270758

2019-10-30 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92263

--- Comment #6 from Jim Wilson  ---
Looking at some other targets.  ARM has movcc but not 128-bit long double. 
Aaarch has movcc and 128-bit long double, but has 128-bit load/store so this is
only 4 instructions.  mips64, powerpc64, and sparc64 have movcc and 128-bit
long double, but emit the memcpy inline as 8 instructions.  riscv64 meanwhile
wants the libcall with -Os as that is 4 instructions instead of 8.  For rv32
this would be 16 instructions.  I'm not sure offhand if the other targets
support 32-bit code and 128-bit long double.

Anyways, I tracked the use of BLOCK_OP_NO_LIBCALL in emit_move_complex back to
bugzilla 15289, fixed by a patch from Richard Henderson back in Dec 1 2004.  I
think it is just an oversight that -Os wasn't considered here.  I think the
correct fix is to only force BLOCK_OP_NO_LIBCALL when optimizing for speed. 
With this change, I get the 8 instruction sequence with -O2, and the 4
instruction libcall sequence with -Os, which is what the RISC-V backend wants,
and this lets the testcase work.

[Bug middle-end/92263] [10 Regression] ICE in commit_one_edge_insertion, at cfgrtl.c:2087 since r270758

2019-10-30 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92263

Jim Wilson  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |wilson at gcc dot 
gnu.org

--- Comment #7 from Jim Wilson  ---
Created attachment 47139
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47139=edit
untested proposed fix

[Bug middle-end/92263] [10 Regression] ICE in commit_one_edge_insertion, at cfgrtl.c:2087 since r270758

2019-11-14 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92263

Jim Wilson  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #10 from Jim Wilson  ---
Fixed on mainline.  No backport requested.

[Bug middle-end/92263] [10 Regression] ICE in commit_one_edge_insertion, at cfgrtl.c:2087 since r270758

2019-10-29 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92263

--- Comment #3 from Jim Wilson  ---
I did a cross compiler build and check yesterday using up-to-date sources and
did not see this failure.  I've been testing regularly.  I did my build on an
x86_64 Ubuntu 16.04 machine with gcc-5.4 as the system compiler.  Maybe this
depends on the compiler used for the build?  Or the exact configure command?

[Bug middle-end/92263] [10 Regression] ICE in commit_one_edge_insertion, at cfgrtl.c:2087 since r270758

2019-10-29 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92263

--- Comment #4 from Jim Wilson  ---
OK, I get it now.  You are using non-standard optimization options with a
testsuite testcase.  I can reproduce when I use your compiler options.  I will
take a look.

[Bug middle-end/92263] [10 Regression] ICE in commit_one_edge_insertion, at cfgrtl.c:2087 since r270758

2019-10-29 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92263

--- Comment #5 from Jim Wilson  ---
The patch adds a RISC-V movcc pattern.  This causes toplev.c to enable
flag_tree_cselim.  This optimization pass creates a complex long double
conditional move via a phi node.
  complex long double cstore_31;
  ...
   [local count: 27903866]:
  cstore_30 = MEM  [(void *)_8];

   [local count: 55807731]:
  # cstore_31 = PHI <__complex__ (0.0, 0.0)(4), cstore_30(5)>
  MEM  [(void *)_8] = cstore_31;
When we try to convert gimple to rtl, eliminate_phi calls
insert_value_copy_on_edge for the 32-byte long double 0 value.  The constant
then gets forced to memory, and we end up calling emit_block_move with
BLOCK_OP_NO_LIBCALL, which ends up emitting a loop to do the memory to memory
copy.  Then later in commit_one_edge_insertion we split the edge, insert the
code containing the loop, and then trigger an abort because the last
instruction inserted is the loop back branch.

I don't see where the RISC-V port did anything wrong.  The load hoisting code
is checking the movcc optab to see if the target supports the operation, but I
don't see anything obvious like that in the cselim pass.  The only obvious fix
I see in the RISC-V back end is to modify riscv_expand_block_move to emit
inline non-loop code for a 32-byte memory to memory copy, even when optimizing
for size, which I'd rather not do.  Maybe it can be fixed in
commit_one_edge_insertion by allowing conditional branches but not
unconditional branches, but it isn't clear why this is refusing to allow
branches here in the first place.  I will have to look at other targets to see
why they aren't failing.

[Bug bootstrap/92709] Cross Compilation failed for Latest GCC riscv64-linux-gnu on Linux/WSL2

2019-12-02 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92709

Jim Wilson  changed:

   What|Removed |Added

 CC||wilson at gcc dot gnu.org

--- Comment #3 from Jim Wilson  ---
>make[4]: Entering directory 
>'/home/cqwrteur/gcc-riscv64-build/riscv64-linux-gnu/lib32/ilp32/libgcc'
>make[4]: *** No rule to make target 'all'.  Stop.

Look in the riscv64-linux-gnu/lib32/ilp32/libgcc dir.  There should be a
Makefile that is a copy of the $srcdir/libgcc/Makefile.in file with some sed
substition.  If this isn't true, then it is the configure step that failed.

You can force the library dirs to reconfigure by deleting them.  It gets a bit
more complicated with multilibs, I think you have to delete the top level
riscv64-linux-gnu/libgcc to force a reconfigure, but just a rm -rf
riscv64-linux-gnu works too and is simpler, though more stuff will be rebuilt.

You might want to do a -j1 make to get an easier to read build log.

I don't have a Windows machine at work, but there is only one WSL problem that
I have seen reported, and it is that WSL makes filesystems case-insensitive by
default which is contrary to linux practice.  It is known that this will break
glibc builds which uses .os and .oS for two different kinds of files.  I don't
think that this breaks gcc builds.  But since you are trying to do a cross to
riscv64-linux-gnu you will run into this problem if you haven't already.

[Bug target/91683] ICE: SIGSEGV at -O when compiling for riscv64

2019-09-20 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91683

Jim Wilson  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #22 from Jim Wilson  ---
Fixed on mainline and gcc-9 branch.

[Bug middle-end/91691] Cross compiling glibc produces a false maybe-uninitialized error

2019-10-09 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91691

Jim Wilson  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-10-10
 CC||wilson at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #3 from Jim Wilson  ---
I can reproduce using gcc-8 and gcc-9 but not mainline.  I'm using gcc options:
-std=gnu11 -fgnu89-inline  -mcmodel=medlow -g -O2 -Wall -Wwrite-strings -Wundef 
-fmerge-all-constants -frounding-math -fno-stack-protector -Wstrict-prototypes
-
Wold-style-definition -fmath-errno   -fPIC   -ftls-model=initial-exec \
tmp.i -fdump-tree-all -fdump-ipa-all 

Bisecting on mainline shows that the new IPA SRA pass added Sept 20 fixed this.
Without that patch, I see that total_deadline is being decomposed, but
apparently there is something wrong with the tracking info. If I add
-fno-ipa-sra the warning goes away.

The new IPA-SRA is too big of a patch to backport, so it looks like this will
have to remain unfixed on the gcc-8 and gcc-9 branches.

[Bug middle-end/91691] Cross compiling glibc produces a false maybe-uninitialized error

2019-10-10 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91691

Jim Wilson  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Jim Wilson  ---
Fixed on mainline.  Not backporting to gcc-8 or gcc-9 branches.

[Bug middle-end/24639] [meta-bug] bug to track all Wuninitialized issues

2019-10-10 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=24639
Bug 24639 depends on bug 91691, which changed state.

Bug 91691 Summary: Cross compiling glibc produces a false maybe-uninitialized 
error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91691

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug rtl-optimization/91860] [10 Regression] ICE: in decompose, at rtl.h:2279 with -Og -fipa-cp -g --param=max-combine-insns=3

2019-10-08 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91860

Jim Wilson  changed:

   What|Removed |Added

  Attachment #46919|0   |1
is obsolete||

--- Comment #9 from Jim Wilson  ---
Created attachment 47008
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47008=edit
second untested fix

[Bug rtl-optimization/91860] [10 Regression] ICE: in decompose, at rtl.h:2279 with -Og -fipa-cp -g --param=max-combine-insns=3

2019-10-08 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91860

Jim Wilson  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2019-10-08
   Assignee|unassigned at gcc dot gnu.org  |wilson at gcc dot 
gnu.org
 Ever confirmed|0   |1

--- Comment #8 from Jim Wilson  ---
The third and fourth testcases are i2src problems, just like the first
testcase.

Since fixing this in combine looks complicated, I took a look at the RTL
simplifier.  Using the fourth testcase as my example, the problem here is that
we are trying to substitute
(sign_extend:DI (const_int 8160 [0x1fe0]))
as the value for
(reg:DI 78 [ _9 ])
in the debug insn
(debug_insn 29 28 30 2 (var_location:QI d (subreg:QI (reg:DI 78 [ _9 ]) 0))
"tm\
p4.c":11:5 -1
 (nil))

The place where this starts to go wrong is in simplify_truncation, where it
tries to compare the size of the original mode (VOIDmode) with the subreg mode
(QI) and decides that we need to sign extend the constant to convert it from
VOIDmode to QImode.  We actually need a truncation not a extension here.  This
then goes even further wrong in simplify_const_unary_operation which does
  scalar_int_mode imode = (op_mode == VOIDmode
   ? result_mode
   : as_a  (op_mode));
which treats the CONST_INT as a QImode value, but it isn't a valid QImode
CONST_INT valid due to lack of QImode sign-extension, so the following
wide_int::from call triggers an consistency check and we abort.

I see a number of places in simplify-rtx where it assumes that if an input is
VOIDmode then it should use MAX_MODE_INT instead as a CONST_INT should always
be already extended in canonical form.  This seems to be the correct fix, which
I want to put in simplify_truncation so that we aren't passing VOIDmode to
GET_MODE_UNIT_PRECISION.  I verified that this works for all four testcases.

[Bug rtl-optimization/91860] [10 Regression] ICE: in decompose, at rtl.h:2279 with -Og -fipa-cp -g --param=max-combine-insns=3

2019-10-11 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91860

--- Comment #10 from Jim Wilson  ---
Author: wilson
Date: Fri Oct 11 18:41:35 2019
New Revision: 276901

URL: https://gcc.gnu.org/viewcvs?rev=276901=gcc=rev
Log:
Extend subst to simplify CONST_INT inside SIGN_EXTEND.

This addresses PR 91860 which has four testcases triggering internal errors.
The problem here is that in combine when handling debug insns, we are trying
to substitute
(sign_extend:DI (const_int 8160 [0x1fe0]))
as the value for
(reg:DI 78 [ _9 ])
in the debug insn
(debug_insn 29 28 30 2 (var_location:QI d (subreg:QI (reg:DI 78 [ _9 ]) 0))
"tmp4.c":11:5 -1
 (nil))
This eventually triggers an abort because 8160 is not a sign-extended
QImode value.

In subst there is already code check for a CONST_INT inside a ZERO_EXTEND and
simplify it.  This needs to be extended to also handle a SIGN_EXTEND the same
way.

gcc/
PR rtl-optimization/91860
* combine.c (subst): If new_rtx is a constant, also check for
SIGN_EXTEND when deciding whether to call simplify_unary_operation.

gcc/testsuite/
PR rtl-optimization/91860
* gcc.dg/pr91860-1.c: New testcase.
* gcc.dg/pr91860-2.c: New testcase.
* gcc.dg/pr91860-3.c: New testcase.
* gcc.dg/pr91860-4.c: New testcase.

Added:
trunk/gcc/testsuite/gcc.dg/pr91860-1.c
trunk/gcc/testsuite/gcc.dg/pr91860-2.c
trunk/gcc/testsuite/gcc.dg/pr91860-3.c
trunk/gcc/testsuite/gcc.dg/pr91860-4.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/combine.c
trunk/gcc/testsuite/ChangeLog

[Bug tree-optimization/90883] Generated code is worse if returned struct is unnamed

2020-03-02 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90883

--- Comment #32 from Jim Wilson  ---
The proposed patch looks OK to me.  I suggest you submit it to gcc-patches.

[Bug rtl-optimization/92656] The zero_extend insn can't be eliminated in the combine pass

2020-02-27 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92656

--- Comment #5 from Jim Wilson  ---
A rewrite using dataflow would be better of course.  I'm just trying to
understand the problem with this testcase better, and maybe find a simple
solution, but I don't think that there is one.  The workarounds I see just make
the code more complicated and add more risk of something else going wrong.

[Bug rtl-optimization/92656] The zero_extend insn can't be eliminated in the combine pass

2020-02-27 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92656

--- Comment #3 from Jim Wilson  ---
Looking at this, I see that the problem occurs in record_value_for_reg where it
does
  if (!insn
  || (value && rsp->last_set_table_tick >= label_tick_ebb_start))
rsp->last_set_invalid = 1;
last_set_table_tick is 2 and label_tick_ebb_start is 1 because this is the
first block of the function.  This actually causes a lot of variables set in
the first block to be marked invalid if used in a successful combination two or
more times, which then prevents the nonzero bits info from being used for any
of them.

There seems to be a problem with how label_tick is used.  In the very first
block in the body of the function, label_tick is 2 and label_tick_ebb_start is
1.  This is because it is considered to be the second block in the ebb after
the entry block.  In the second block in the body of the function, label_tick
is 3 and label_tick_ebb_start is 3.  This means that every variable set in the
first block gets treated differently than in every block after the first.

If I add a little bit of code before the loop to force it to be the second
block, then I get correct output from combine.  I just added this before the
loop
  static int j = 0;
  if (val)
j++;

This also explains why the problem only occurs with -mtune=sifive-7-series
because this enables the conditional move support that turns the loop into a
single block, and then the -funroll-loops option fully unrolls the loop,
turning the entire function into one block, which prevents combine from
handling many of the register sets correctly because everything is in the first
block now.

This also explains why the problem started when the 2->2 combination support
was added, as that causes more successful combinations, and hence more
registers getting invalidated in the first block.

So the question is why we need label_tick > label_tick_ebb_start for the first
block of the function.  There is nothing set in the entry block other than hard
registers, and those could always be handled specially by just marking them as
invalid somehow before processing instructions.

Or alternatively, in record_value_for_reg, maybe we can add a check for a
pseudo reg only set once and not live in the prologue, and avoid marking it as
invalid when we process it a second time.  There are already a lot of checks
like this scattered around the code.

[Bug target/89627] Miscompiled constructor call

2020-01-27 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89627

Jim Wilson  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED
   Assignee|unassigned at gcc dot gnu.org  |wilson at gcc dot 
gnu.org

--- Comment #5 from Jim Wilson  ---
Was fixed in gcc 9 last year.

[Bug target/91602] GCC fails to build for riscv in a combined tree due to misconfigured leb128 support

2020-01-29 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91602

--- Comment #10 from Jim Wilson  ---
The proposed binutils patch has multiple problems and has gone through multiple
iterations.  Not clear when or if we will be able to accept it.

The gcc configure patch to eliminate the call to gcc_GAS_VERSION_GTE_IFELSE for
in tree gas builds actually looked like the better solution to me though I
haven't tried it yet.  If we go this way the patch should perhaps eliminate
everything related to gcc_GAS_VERSION_GTE_IFELSE which is a much bigger patch.

Since combined tree builds are obsolete, this is low on my priority list.

[Bug target/91602] GCC fails to build for riscv in a combined tree due to misconfigured leb128 support

2020-01-29 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91602

--- Comment #11 from Jim Wilson  ---
Since Marxin pinged this and got me thinking about this again, I realized that
there is a simpler fix based on Serge's second suggestion.  We can just delete
the gas version number from the uleb128 gas check in configure.ac.  This will
force a gas feature check for uleb128 only, which solves the RISC-V build
problem, and is a nice small change.  I'm testing a patch for that.

[Bug target/93532] RISCV g++ hangs with optimization >= -O2

2020-02-04 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93532

--- Comment #11 from Jim Wilson  ---
I'm able to reproduce with the gcc-8-branch now.  Maybe I made a mistake with
my earlier build.  Anyways, it looks like it is going wrong here in the reload
dump

  
  Creating newreg=1856, assigning class NO_REGS to save r1856
  434: fa0:SF=call [`sqrtf'] argc:0
  REG_UNUSED fa0:SF
  REG_CALL_DECL `sqrtf'
  REG_EH_REGION 0
Add reg<-save after:
 2446: r114:SF#0=r1856:DF

  432: NOTE_INSN_BASIC_BLOCK 24
Add save<-reg after:
 2445: r1856:DF=r114:SF#0

  

then later we appear to end up in a loop generating secondary reloads that need
secondary reloads themselves, and so forth.  The instruction above looks funny,
trying to use a subreg to convert DFmode to SFmode.  I don't think we should be
generating that.

So it looks like a caller save problem.  If I add -fno-caller-saves the compile
finishes.  It appears that we need a definition for HARD_REGNO_CALLER_SAVE_MODE
because the default definition can't work here.  The comment in sparc.h for
HARD_REGNO_CALLER_SAVE_MODE looks relevant.  The same definition may work for
RISC-V.  Looks like the MIPS port does it the same way too.

[Bug target/93532] RISCV g++ hangs with optimization >= -O2

2020-02-06 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93532

--- Comment #12 from Jim Wilson  ---
A bisection on mainline between the gcc-8 and gcc-9 releases shows that this
testcase was fixed by a combine patch for PR87600 that stops combining hard
regs with pseudos to reduce register pressure.  The commentary refers to ira
and lra problems.  A combine patch won't be as safe as a RISC-V backend patch
though.

I tried testing the riscv HARD_REGNO_CALLER_SAVE_MODE patch with buildroot but
it turns out that it is downloading a pre-built compiler instead of building
one.  So dropping in the patch doesn't do anything.  I will have to figure out
what is going on there.

Trying the riscv patch with mainline on the testcase, I see that I get better
rematerialization without the confusing subregs, and I also get smaller stack
frames since we are saving SFmode now to the stack instead of DFmode now. 
Otherwise, I don't see any significant changes to the code.

I tried a make check with the riscv patch on mainline, and got an unexpected
g++ testsuite failure, so I will have to look into that.

[Bug target/93532] RISCV g++ hangs with optimization >= -O2

2020-02-06 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93532

Jim Wilson  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |wilson at gcc dot 
gnu.org

--- Comment #13 from Jim Wilson  ---
Created attachment 47794
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47794=edit
untested patch to fix the problem

[Bug target/93532] RISCV g++ hangs with optimization >= -O2

2020-02-03 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93532

Jim Wilson  changed:

   What|Removed |Added

 CC||wilson at gcc dot gnu.org

--- Comment #9 from Jim Wilson  ---
I tried the buildroot instructions.  It didn't work on an ubuntu 16.04 server
machine.  There is a 'python3 pip3 -q docwriter' command that hangs.  I also
discovered that the script isn't restartable.  It runs -rf on the build
directory and exits with an error.  I did get it to work on my ubuntu 18.04
laptop.  And it does hang, but it isn't the btPolyhedralContactClipping.cpp
file that hangs for me, it is the btBoxBoxDetector.cpp file.  I was able to
reproduce this with a gcc-8.3.0 build using -O2 -fPIC -fstack-protector-strong
options to compile the file.  It does not reproduce using the top of the
gcc-8-branch svn tree, suggesting that either it is already fixed, or it is
maybe a memory corruption problem that is hard to reproduce.

Using gdb to attach to the gcc-8.3.0 compiler, I see that it is looping in lra,
but I haven't tried to debug that yet.

#0  0x00705e7b in bitmap_find_bit (bit=42321, bit@entry=330, 
head=0x376ae88) at ../../gcc-8.3.0/gcc/bitmap.c:539
#1  bitmap_set_bit (head=0x376ae88, bit=bit@entry=42321)
at ../../gcc-8.3.0/gcc/bitmap.c:600
#2  0x0099b95f in mark_regno_dead (regno=42321, mode=, 
point=) at ../../gcc-8.3.0/gcc/lra-lives.c:362
#3  0x0099c9c4 in process_bb_lives (dead_insn_p=false, 
curr_point=@0x7ffc9a90: 181876, bb=)
at ../../gcc-8.3.0/gcc/lra-lives.c:842
#4  lra_create_live_ranges_1 (all_p=all_p@entry=true, 
dead_insn_p=dead_insn_p@entry=false)
at ../../gcc-8.3.0/gcc/lra-lives.c:1337
#5  0x0099e7c0 in lra_create_live_ranges (all_p=all_p@entry=true, 
dead_insn_p=dead_insn_p@entry=false)
at ../../gcc-8.3.0/gcc/lra-lives.c:1406
#6  0x00982d0c in lra (f=)
at ../../gcc-8.3.0/gcc/lra.c:2473
#7  0x0093fa32 in do_reload () at ../../gcc-8.3.0/gcc/ira.c:5465
#8  (anonymous namespace)::pass_reload::execute (this=)
at ../../gcc-8.3.0/gcc/ira.c:5649
...

[Bug target/93532] RISCV g++ hangs with optimization >= -O2

2020-02-03 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93532

--- Comment #10 from Jim Wilson  ---
Created attachment 47774
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47774=edit
testcase that reproduces for me

compile with -O2 -fPIC -fstack-protector-strong

[Bug tree-optimization/90883] Generated code is worse if returned struct is unnamed

2020-02-20 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90883

Jim Wilson  changed:

   What|Removed |Added

 CC||wilson at gcc dot gnu.org

--- Comment #29 from Jim Wilson  ---
The testcase works for riscv64-elf but does not work for riscv32-elf.  The
difference is in the einline pass before dse1.  riscv64-elf has
tmp.C:12:17: optimized:  Inlining constexpr C::C()/1 into C slow()/3.
where as riscv32-elf has
tmp.C:12:17: missed:   will not early inline: C slow()/3->constexpr
C::C()/1, call is cold and code would grow by 1

Since the constructor was not early inlined, dse1 can't eliminate the redundant
store.  The constructor eventually gets inlined between
085i.materialize-all-clones and 088t.fixup_cfg3 which allows dse2 to eliminate
the redundant store.

I can make the testcase work for riscv32-elf if I add
--param max-inline-insns-size=1
to allow the constructor to be inlined during the einline pass.  I didn't check
to see if this works for the other failing targets.

[Bug target/93532] RISCV g++ hangs with optimization >= -O2

2020-02-08 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93532

Jim Wilson  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #19 from Jim Wilson  ---
Patch applied to mainline.  This is just a minor optimization for gcc-10 as a
combiner patch between gcc-8 and gcc-9 reduces register pressure enough to
prevent the hang.  Hence there is no real need for the patch in gcc-9.  The
patch might be useful in gcc-8, but the problem is hard to reproduce, buildroot
is the only one that ran into the problem, and they can always add the patch to
their tree, so not clear if we really need it on the gcc-8 branch.

[Bug target/93532] RISCV g++ hangs with optimization >= -O2

2020-02-08 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93532

--- Comment #20 from Jim Wilson  ---
Thanks for confirming that it solves the buildroot build problem.

My gcc mainline g++ test failure turned out to be a thread related issue with
qemu cross testing.  The testcase works always on hardware, but fails maybe
10-20% of the time when run under qemu.  RISC-V qemu is known to still have a
few bugs in this area, though they might already be fixed in newer qemu
versions than what I have.

[Bug inline-asm/93202] [RISCV] ICE when using inline asm 'h' constraint modifier

2020-01-08 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93202

--- Comment #5 from Jim Wilson  ---
Jakub's patch looks OK to me.

[Bug inline-asm/93202] [RISCV] ICE when using inline asm 'h' constraint modifier

2020-01-08 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93202

--- Comment #7 from Jim Wilson  ---
(In reply to Luís Marques from comment #3)
> Jim Wilson: I'm not using it, I was only working on the LLVM implementation.
> Could you please clarify if following modifiers are also internal only?
> 
>   'C'  Print the integer branch condition for comparison OP.
>   'A'  Print the atomic operation suffix for memory model OP.
>   'F'  Print a FENCE if the memory model requires a release.

'C' maps an rtx to a string.  It is intended to be used for comparisons to emit
the appropriate compare instruction, because the instruction names match the
gcc internal rtx names.  It can't be used for its intended purpose in an asm,
as you can't get a comparison operator as an operand to an asm.  Since it works
with any rtx, it can be used in an asm, but is very unlikely to be useful.

'A' takes a memory order value from stdatomic.h, and emits a .acq if it is one
of the memory orders that requires an acquire operation, e.g. __ATOMIC_ACQUIRE.
 Gcc calls this a memory model internally, and defines the values in
memmodel.h.  The primary use is for the atomic builtin functions, to map the
memory order argument to the right instruction.  This takes an integer
argument, so in theory it could be used in an asm, but unlikely to be very
useful.

'F" is similar, except it is to atomic releases, and emits a fence instruction.
 This one is a bit of historical accident.  The gcc riscv port was written
before we had a formal memory model defined, and so to be conservative, it
emits fences in a lot of places where we probably don't need them.  Now that we
do have a formal memory model defined, the gcc port needs to be fixed to
implement it, except there is no one to do the work, so it is unclear when it
will happen.  Meanwhile, the port still emits a lot of fences we don't need via
'F'.  This takes an integer argument as above, so likewise in theory could be
used in an asm, but unlikely to be useful.  And this one has the additional
problem that it needs to change in a future gcc release, though we could
preserve the current meaning of the 'F' letter and use a new letter if
necessary in the rewrite.

The useful print operand letters are the ones for registers, constants, and
addresses.  These random ones used for internal gcc features aren't really
useful in asms.

[Bug target/93304] RISC-V: Function with interrupt attribute use register without save/restore at prologue/epilogue

2020-01-17 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93304

Jim Wilson  changed:

   What|Removed |Added

 CC||wilson at gcc dot gnu.org

--- Comment #2 from Jim Wilson  ---
There is a convention of using all caps for function arguments.  See for
instance the riscv_build_integer function comment.  It would be nice to
preserve this convention, but this is a very minor issue.  I usually put a
blank line between the function comment and the function, but again this is a
very minor issue.

The patch looks OK to me.

[Bug target/93333] ICE: RTL check: expected code 'const_int', have 'and' in riscv_rtx_costs, at config/riscv/riscv.c:1645

2020-01-20 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9

--- Comment #4 from Jim Wilson  ---
I tried some cross testing with rtl checking enabled, and found another rtl
check bug with the -msave-restore support in config/riscv/riscv-sr.c where it
uses XINT to read from a CONST_INT which is wrong, as it is actually an XWINT
value, and we should be using INTVAL to read the value.  I've tested a patch
for that, and can commit it tomorrow.  -msave-restore is for embedded code
size, so this shouldn't be a problem for linux users.

[Bug target/93333] ICE: RTL check: expected code 'const_int', have 'and' in riscv_rtx_costs, at config/riscv/riscv.c:1645

2020-01-20 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9

Jim Wilson  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2020-01-21
 Ever confirmed|0   |1

--- Comment #2 from Jim Wilson  ---
I can reproduce.  Reproducing requires enabling rtl checking which is not on by
default.  I suspect that there are other similar problems, as we probably
haven't tested a build with rtl checking enabled before.

The problem is in riscv_rtx_costs which only needs to return valid values for
valid rtl, and it is failing the rtl check for invalid rtl, so this isn't a
major problem if rtl checking is off, but it does need to be fixed to be safe.

[Bug target/93333] ICE: RTL check: expected code 'const_int', have 'and' in riscv_rtx_costs, at config/riscv/riscv.c:1645

2020-01-20 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9

--- Comment #3 from Jim Wilson  ---
Jakub's patch looks OK, and works for the testcase.

[Bug target/93333] ICE: RTL check: expected code 'const_int', have 'and' in riscv_rtx_costs, at config/riscv/riscv.c:1645

2020-01-21 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9

Jim Wilson  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #7 from Jim Wilson  ---
Fixed on mainline.

[Bug target/93304] RISC-V: Function with interrupt attribute use register without save/restore at prologue/epilogue

2020-01-21 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93304

Jim Wilson  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Jim Wilson  ---
Fixed on mainline.

[Bug inline-asm/93202] [RISCV] ICE when using inline asm 'h' constraint modifier

2020-01-08 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93202

--- Comment #2 from Jim Wilson  ---
%h is used for the gcc internal implementation of emitting auipc.  I'm
skeptical that it is useful for asms.  Stripping the HIGH rtx is an internal
implementation detail, and does not apply to asms, as you can't get a HIGH
there.  Is there a reason why you are trying to use it?  There may be a better
solution for what you need.  If we really need %h to work in asms then it
probably needs some inconvenient work.  I'd rather document that %h shouldn't
be used in asms, or leave it undocumented as an internal gcc implementation
detail.  I'm assuming that you are just working on llvm support, and don't
actually need %h to work in asms, you just need llvm and gcc compatibility.

riscv_print_operand does use output_operand_lossage as it should.  But it calls
a function riscv_print_operand_reloc which calls gcc_unreachable in a switch
statement.  That is an oversight.  It can be fixed to use
output_operand_lossage too.

[Bug target/93062] Failed to generate indirect branch for long branches on riscv

2019-12-24 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93062

Jim Wilson  changed:

   What|Removed |Added

 CC||wilson at gcc dot gnu.org

--- Comment #1 from Jim Wilson  ---
We need to check length attributes in the branch patterns, and emit different
sequences depending on the length.  There are multiple examples to compare
with, for instance the condjump pattern in the aarch64.md file.

[Bug testsuite/93045] gc bug with test "start_unit-test-1.c"

2019-12-24 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93045

Jim Wilson  changed:

   What|Removed |Added

 CC||wilson at gcc dot gnu.org

--- Comment #1 from Jim Wilson  ---
This worked for me running natively on a fedora rawhide system with a 4.15
linux kernel.

[Bug target/93062] Failed to generate indirect branch for long branches on riscv

2019-12-24 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93062

Jim Wilson  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-12-25
 Ever confirmed|0   |1

[Bug middle-end/94083] New: inefficient soft-float x!=Inf code

2020-03-06 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94083

Bug ID: 94083
   Summary: inefficient soft-float x!=Inf code
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: wilson at gcc dot gnu.org
  Target Milestone: ---

Given a testcase like this
int foo(void) {
  volatile float f;
  intn;
  f = __builtin_huge_valf();
  n += 1 - (f != __builtin_huge_valf());
  return n;
}
and compiling for soft-float, we end up with a call to __unordsf2 followed by a
call to __lesf2.  This means the floats have to be unpacked twice and checked
for nan twice. This gives both poor performance and poor code size.  I've
confirmed this for x86, arm, and riscv.

Folding in the C front end is creating an unordered less then or equal
comparison against FLT_MAX.  From the 004.original file
n = SAVE_EXPR  + n;

This optimization is coming from a rule in the match.pd file.
  /* x != +Inf is always equal to !(x > DBL_MAX), but this introduces
 an exception for x a NaN so use an unordered comparison.  */

When we generate rtl, we call do_compare_rtx_and_jump which notices that we
don't have an operation for UNLE_EXPR, but decides we can't reverse it because
it is unsafe.  It tries swapping arguments, but we don't have UNGE_EXPR either.
 So it emits two libcalls.

Converting a NE compare to a UNLE compare looks like an odd optimization.  If
we want to consider unordered operations as canonical operations, then maybe we
should add libgcc support for the unordered operations.  Or maybe we should
check to see if unordered operations are handled by the target before
converting a simple NE into a UNLE.

The match.pd rule was changed to use UNLE in the patch for PR 64811 which fixed
a problem with handling NaNs.  This happened 2018-01-09.  The optimization
dates back to 2003-05-22 but was originally using LE which is OK for
soft-float.  It wasn't until the NaN bug was fixed by using UNLE instead of LE
that this became an optimization problem.

Maybe we just shouldn't perform this optimization when honoring NaNs?  That
would avoid generating the problematic unordered operation early in the
optimizer.

[Bug c++/94044] [10 Regression] internal compiler error: in comptypes, at cp/typeck.c:1490 on riscv64-unknown-linux-gnu and arm-eabi

2020-03-14 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94044

--- Comment #9 from Jim Wilson  ---
(In reply to Jakub Jelinek from comment #8)
> So perhaps to ease reproduction, tweak the hash function in this case to
> always return 0?

Yes, that works.  I just didn't have a chance to look at the hash function last
night.  With the hash function hacked I can reproduce for any target and any
-std=c++X value.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 789ccdb..4337928 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -1733,7 +1733,8 @@ hash_tmpl_and_args (tree tmpl, tree args)
 hashval_t
 spec_hasher::hash (spec_entry *e)
 {
-  return hash_tmpl_and_args (e->tmpl, e->args);
+  return 0;
+  //  return hash_tmpl_and_args (e->tmpl, e->args);
 }

 /* Recursively calculate a hash value for a template argument ARG, for use

[Bug target/94136] GCC doc for built-in function __builtin___clear_cache() not 100% correct

2020-03-11 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94136

Jim Wilson  changed:

   What|Removed |Added

 CC||wilson at gcc dot gnu.org

--- Comment #3 from Jim Wilson  ---
Another possible issue is that __clear_cache is defined in the libgcc docs. 
But only some platforms are defining CLEAR_INSN_CACHE so only some targets have
a usable __clear_cache function.

[Bug bootstrap/92008] Build failure on cygwin

2020-04-16 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92008

Jim Wilson  changed:

   What|Removed |Added

 CC||wilson at gcc dot gnu.org

--- Comment #21 from Jim Wilson  ---
This looks the same as a binutils bug.
https://sourceware.org/bugzilla/show_bug.cgi?id=22941

The easy solution is to touch intl/plural.c after checkout, so that bison won't
be run.  The contrib/gcc_update script already does this.  So the simplest
solution for the original problem is to use contrib/gcc_update to update a gcc
tree, or "contrib/gcc_update --touch" if you want to fix a gcc tree without
updating it.  If your gcc git source tree was already mangled by a bad bison
run, you will have to manually reset it to a clean tree, e.g. "git reset
--hard" or "git diff > tmp.file; patch -p1 --reverse < tmp.file; rm tmp.file"
or whatever, and then run the contrib/gcc_update --touch command.  Binutils
unfortunately does not have an equivalent to the gcc_update script and hence
requires a fix.

git unfortunately does not preserve file timestamps across commit and checkout,
so when you checkout a file it gets the current time.  git also tends to check
out files in alphabetical order.  If you are on a fast filesystem, i.e. linux,
plural.c and plural.y almost always get the same timestamp and bison isn't run.
 If you are on a slow filesystem, i.e. cygwin, plural.c is often older than
plural.y, and bison must be run, and the current bison version fails.  This is
why it is cygwin folk that most commonly run into this problem.

[Bug c++/94044] [10 Regression] internal compiler error: in comptypes, at cp/typeck.c:1490 on riscv64-unknown-linux-gnu and arm-eabi

2020-03-13 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94044

Jim Wilson  changed:

   What|Removed |Added

 CC||wilson at gcc dot gnu.org

--- Comment #7 from Jim Wilson  ---
I made an attempt to reproduce this.  I wasn't able to reproduce with an
arm-eabi build.  I was able to reproduce with a riscv64-linux build.  The funny
part is that I was able to build two compilers from the same gcc sources, one
which reproduces and one which does not, which differ only in exactly how I did
the build.  For the failing build, I had a complete riscv-gnu-toolchain build
available when configuring.  For the working build it was just binutils and gcc
without glibc/linux header files, and a top-of-tree binutils version unlike the
first build.

Debugging the two side by side, I see that execution diverges at line 9680 in
cp/pt.c
  entry = type_specializations->find_with_hash (, hash);
The working compiler has no hash hit and returns zero.  The failing compiler
has a hash hit, and then dies inside spec_hasher::equal.

In the spec_hasher::equal function I see

(gdb) print *e1
$29 = {tmpl = ,
  args = , spec = }
(gdb) print *e2
$30 = {tmpl = ,
  args = , spec = }

(gdb) pt
 
unit-size 
align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x75deb7e0 precision:64 min  max 
pointer_to_this >
readonly
arg:0  elt:1 >
type_0 type_6 VOID
align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x75ef4540>
tmp.C:13:23 start: tmp.C:13:23 finish: tmp.C:13:35>>
(gdb) print e2->args
$32 = 
(gdb) pt
 
unit-size 
align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x75deb7e0 precision:64 min  max 
pointer_to_this >
readonly
arg:0  elt:1 >
type_0 type_6 VOID
align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x75efa2a0>
tmp.C:13:23 start: tmp.C:13:23 finish: tmp.C:13:35>>
(gdb) 

It then eventually dies inside comptypes because TREE_CODE (t1) is
type_pack_expansion.  And also TREE_CODE (t2) is type_pack_expansion.  This is
called from the SIZEOF_EXPR case in cp_tree_equal.

If tree addresses are being used for the hash codes, this could just be bad
luck whether it fails or not.

[Bug c++/64697] C++11 thread_local: relocation truncated to fit: R_X86_64_PC32 against undefined symbol `TLS init function for N::ptd'

2020-03-26 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64697

--- Comment #24 from Jim Wilson  ---
Joel Sherrill offered to create a binutils bug report for this.

[Bug c++/64697] C++11 thread_local: relocation truncated to fit: R_X86_64_PC32 against undefined symbol `TLS init function for N::ptd'

2020-03-26 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64697

Jim Wilson  changed:

   What|Removed |Added

 CC||wilson at gcc dot gnu.org

--- Comment #22 from Jim Wilson  ---
This looks like a binutils bug to me.  A call to an undefined weak function
should never be executed, so it is OK for the linker to convert that call
instruction into anything convenient.  There is no need for a relocation that
can reach an address of zero.  We can convert the call instruction to call
itself, or the next instruction, or change it to a nop, what ever is
convenient, it doesn't really matter.

A number of binutils ports already have code to handle related problems.  ARM
and RISC-V for sure.  Probably others.  It looks like this support is missing
from the x86_64 port.  I'd suggest refiling this as a binutils bug.  See for
instance
  https://sourceware.org/bugzilla/show_bug.cgi?id=23244
for a RISC-V example of the same problem.  But we need a new bug for the x86_64
problem.  RISC-V has a register hard wired to zero, so I rewrite the call
instruction to use x0 as the base address.  The arm port turns the call into a
nop.

<    1   2   3   4   5   >