[Bug target/63304] Aarch64 pc-relative load offset out of range

2016-01-25 Thread ard.biesheuvel at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #50 from ard.biesheuvel at linaro dot org ---
(In reply to Ramana Radhakrishnan from comment #49)
> (In reply to ard.biesheuvel from comment #48)
> > (In reply to Richard Earnshaw from comment #47)
> > > (In reply to ard.biesheuvel from comment #46)
> > > > One issue that this causes, which I did not see mentioned anywhere in 
> > > > the
> > > > thread, is that the use of adrp/add and adrp/ldr imposes a 4 KB section
> > > > alignment. In EDK2 Tianocore (UEFI reference implementation), we
> > > > deliberately use -mcmodel=large to get around this requirement, since 
> > > > code
> > > > size is a big deal when executing from NOR flash, and the architecture 
> > > > of
> > > > EDK2 (lots and lots of small separately linked binaries) makes
> > > > the overhead of 4 KB section alignment prohibitive. (It uses 32 byte 
> > > > section
> > > > alignment unless there are objects like a vector table that require 
> > > > more)
> > > 
> > > Huh?  It imposes a 4k *SEGMENT* alignment.  It doesn't impose a section
> > > alignment.
> > 
> > Indeed, apologies for mixing up the lingo.
> > 
> > But my point is that -mcmodel=large did not use to impose such a minimum
> > alignment, and with this change, it now does. Would it perhaps make sense to
> > default enable this feature only for -mcmodel=small (which already uses
> > adrp/add anyway)
> 
> 
> -mcmodel=large allows for images > 1MiB to be compiled, this change enables
> functions > 1MiB in size to exist in images compiled and linked with
> -mcmodel=large.
> 

AIUI the small model allows images up to 4 GB (since that is the range of
adrp+lo12), and the large model allows any size, by emitting cross section
references as literals containing absolute addresses (rather than movz/movk
sequences). This relies on the literals themselves being in range, which is
usually the case, of course, if they are emitted into the same section, modulo
the uses cases that caused this bug in the first place. (BTW my GCC man page
erroneously lists the tiny model as having a 1 GB range but this should
obviously be 1 MB)

So by enabling this feature by default, you are now imposing a 4 GB image limit
on the large model as well, since the literals are moved to a different section
(which, given our choice for the large model, is not guaranteed to be within 4
GB in the final image), and referenced via instructions that only have a 4 GB
range.

> As of now, if you really want to use -mcmodel=large for working around this,
> you already have -mpc-relative-literal-loads to allow this.

I am arguing that enabling this feature essentially breaks the large model, so
it should not be enabled by default, and perhaps be made mutually exclusive.

[Bug target/63304] Aarch64 pc-relative load offset out of range

2016-01-25 Thread ramana at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #49 from Ramana Radhakrishnan  ---
(In reply to ard.biesheuvel from comment #48)
> (In reply to Richard Earnshaw from comment #47)
> > (In reply to ard.biesheuvel from comment #46)
> > > One issue that this causes, which I did not see mentioned anywhere in the
> > > thread, is that the use of adrp/add and adrp/ldr imposes a 4 KB section
> > > alignment. In EDK2 Tianocore (UEFI reference implementation), we
> > > deliberately use -mcmodel=large to get around this requirement, since code
> > > size is a big deal when executing from NOR flash, and the architecture of
> > > EDK2 (lots and lots of small separately linked binaries) makes
> > > the overhead of 4 KB section alignment prohibitive. (It uses 32 byte 
> > > section
> > > alignment unless there are objects like a vector table that require more)
> > 
> > Huh?  It imposes a 4k *SEGMENT* alignment.  It doesn't impose a section
> > alignment.
> 
> Indeed, apologies for mixing up the lingo.
> 
> But my point is that -mcmodel=large did not use to impose such a minimum
> alignment, and with this change, it now does. Would it perhaps make sense to
> default enable this feature only for -mcmodel=small (which already uses
> adrp/add anyway)


-mcmodel=large allows for images > 1MiB to be compiled, this change enables
functions > 1MiB in size to exist in images compiled and linked with
-mcmodel=large.

As of now, if you really want to use -mcmodel=large for working around this,
you already have -mpc-relative-literal-loads to allow this.

[Bug target/63304] Aarch64 pc-relative load offset out of range

2016-01-25 Thread ard.biesheuvel at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #48 from ard.biesheuvel at linaro dot org ---
(In reply to Richard Earnshaw from comment #47)
> (In reply to ard.biesheuvel from comment #46)
> > One issue that this causes, which I did not see mentioned anywhere in the
> > thread, is that the use of adrp/add and adrp/ldr imposes a 4 KB section
> > alignment. In EDK2 Tianocore (UEFI reference implementation), we
> > deliberately use -mcmodel=large to get around this requirement, since code
> > size is a big deal when executing from NOR flash, and the architecture of
> > EDK2 (lots and lots of small separately linked binaries) makes
> > the overhead of 4 KB section alignment prohibitive. (It uses 32 byte section
> > alignment unless there are objects like a vector table that require more)
> 
> Huh?  It imposes a 4k *SEGMENT* alignment.  It doesn't impose a section
> alignment.

Indeed, apologies for mixing up the lingo.

But my point is that -mcmodel=large did not use to impose such a minimum
alignment, and with this change, it now does. Would it perhaps make sense to
default enable this feature only for -mcmodel=small (which already uses
adrp/add anyway)

[Bug target/63304] Aarch64 pc-relative load offset out of range

2016-01-25 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #47 from Richard Earnshaw  ---
(In reply to ard.biesheuvel from comment #46)
> One issue that this causes, which I did not see mentioned anywhere in the
> thread, is that the use of adrp/add and adrp/ldr imposes a 4 KB section
> alignment. In EDK2 Tianocore (UEFI reference implementation), we
> deliberately use -mcmodel=large to get around this requirement, since code
> size is a big deal when executing from NOR flash, and the architecture of
> EDK2 (lots and lots of small separately linked binaries) makes
> the overhead of 4 KB section alignment prohibitive. (It uses 32 byte section
> alignment unless there are objects like a vector table that require more)

Huh?  It imposes a 4k *SEGMENT* alignment.  It doesn't impose a section
alignment.

[Bug target/63304] Aarch64 pc-relative load offset out of range

2016-01-24 Thread ard.biesheuvel at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

ard.biesheuvel at linaro dot org changed:

   What|Removed |Added

 CC||ard.biesheuvel at linaro dot 
org

--- Comment #46 from ard.biesheuvel at linaro dot org ---
One issue that this causes, which I did not see mentioned anywhere in the
thread, is that the use of adrp/add and adrp/ldr imposes a 4 KB section
alignment. In EDK2 Tianocore (UEFI reference implementation), we deliberately
use -mcmodel=large to get around this requirement, since code size is a big
deal when executing from NOR flash, and the architecture of EDK2 (lots and lots
of small separately linked binaries) makes
the overhead of 4 KB section alignment prohibitive. (It uses 32 byte section
alignment unless there are objects like a vector table that require more)

[Bug target/63304] Aarch64 pc-relative load offset out of range

2016-01-21 Thread ramana at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |6.0

--- Comment #45 from Ramana Radhakrishnan  ---
I no longer have plans to fix this in the GCC 5 branch.

[Bug target/63304] Aarch64 pc-relative load offset out of range

2016-01-21 Thread ramana at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #44 from Ramana Radhakrishnan  ---

(In reply to Christophe Lyon from comment #39)
> We have backported r227748, 229160 and 229161 to our linaro-gcc-5 branch,
> and we got a bug report from the kernel team.

Sorry about the breakage. 

> 
> Indeed, when the kernel is configured with CONFIG_ARM64_ERRATUM_843419, the
> support for relocations R_AARCH64_ADR_PREL_PG_HI21 and
> R_AARCH64_ADR_PREL_PG_HI21_NC is removed from the kernel to avoid loading
> objects with possibly offending sequences. In this case the kernel is built
> with
> -mcmodel=large.
> 
> With these backports, these relocations are generated again by default.

The reason for the code generation to be in this form , is to store literal
pools in rodata and not have any data in the code segment at all. Now when
these need to be addressed that far away, addressing them with adrp / load is
reasonable as you are thinking of a text + rodata segment to be > 4GB before
this fails.

In an ideal world this would be implemented by the 4 insn mov / movk sequence
to construct the literal address, however those relocations are not likely to
be supported by the kernel loader (or rather I haven't checked). 


> 
> Adding -mpc-relative-literal-loads to the kernel Makefile in
> arch/arm64/Makefile does fix the build, but since this option is not
> supported by GCC if it does not contain these backports (and the compiler
> aborts with an error), it's not obvious how to modify the Makefile to decide
> when to use this option.

(In reply to Christophe Lyon from comment #43)
> Indeed, that seems safer.
> However, reading http://www.spinics.net/lists/arm-kernel/msg445346.html
> there is a comment saying:
> --
> Note that the kernel itself must be linked with a version of ld
> which fixes potentially affected ADRP instructions through the
> use of veneers
> ---
> 
> But I don't know how this is enforced.


With kernel modules, there's no way the linker fix can be used as kernel
modules are simply put just object files. When I did the work, I had forgotten
about the erratum and the repercussions on kernel modules. Thus
-mpc-relative-literal-loads is quite a reasonable workaround to the problem.


> 
> Or... change GCC's -mfix-cortex-a53-843419 to change the default for
> -mpc-relative-literal-loads and add -mfix-cortex-a53-843419 to the kernel
> Makefile.


That's probably not that bad an idea as a short term workaround for GCC 6.

[Bug target/63304] Aarch64 pc-relative load offset out of range

2016-01-21 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #43 from Christophe Lyon  ---
Indeed, that seems safer.
However, reading http://www.spinics.net/lists/arm-kernel/msg445346.html there
is a comment saying:
--
Note that the kernel itself must be linked with a version of ld
which fixes potentially affected ADRP instructions through the
use of veneers
---

But I don't know how this is enforced.

[Bug target/63304] Aarch64 pc-relative load offset out of range

2016-01-21 Thread cbaylis at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #42 from cbaylis at gcc dot gnu.org ---

While we're suggesting fixes to the kernel, wouldn't it be better if
CONFIG_ARM64_ERRATUM_843419 forced the kernel to be built with the linker
workarounds if the kernel is configured

KBUILD_LDFLAGS += --fix-cortex-a53-843419
or maybe
KBUILD_LDFLAGS += $(call ld-option, --fix-cortex-a53-843419)

[Bug target/63304] Aarch64 pc-relative load offset out of range

2016-01-21 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #41 from Christophe Lyon  ---
Indeed, having:

ifeq ($(CONFIG_ARM64_ERRATUM_843419), y)
KBUILD_CFLAGS_MODULE+= -mcmodel=large
KBUILD_CFLAGS_MODULE+= $(call cc-option, -mpc-relative-literal-loads)
endif

in arch/arm64/Makefile does the trick.

[Bug target/63304] Aarch64 pc-relative load offset out of range

2016-01-21 Thread cbaylis at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

cbaylis at gcc dot gnu.org changed:

   What|Removed |Added

 CC||cbaylis at gcc dot gnu.org

--- Comment #40 from cbaylis at gcc dot gnu.org ---
The kernel does support a mechanism to add a command line option for only
compilers which support it.

Documentation/kbuild/makefiles.txt:
cc-option
cc-option is used to check if $(CC) supports a given option, and if
not supported to use an optional second option.

I think you could therefore add this line to arch/arm64/Makefile
KBUILD_CFLAGS   += $(call cc-option,-mpc-relative-literal-loads)

[Bug target/63304] Aarch64 pc-relative load offset out of range

2016-01-21 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #39 from Christophe Lyon  ---
We have backported r227748, 229160 and 229161 to our linaro-gcc-5 branch, and
we got a bug report from the kernel team.

Indeed, when the kernel is configured with CONFIG_ARM64_ERRATUM_843419, the
support for relocations R_AARCH64_ADR_PREL_PG_HI21 and
R_AARCH64_ADR_PREL_PG_HI21_NC is removed from the kernel to avoid loading
objects with possibly offending sequences. In this case the kernel is built
with
-mcmodel=large.

With these backports, these relocations are generated again by default.

Adding -mpc-relative-literal-loads to the kernel Makefile in
arch/arm64/Makefile does fix the build, but since this option is not supported
by GCC if it does not contain these backports (and the compiler aborts with an
error), it's not obvious how to modify the Makefile to decide when to use this
option.

So, although Ramana said he would backport these to the FSF gcc-5 branch, maybe
it's safer not to.

Or change the default?

Or... change GCC's -mfix-cortex-a53-843419 to change the default for
-mpc-relative-literal-loads and add -mfix-cortex-a53-843419 to the kernel
Makefile.

[Bug target/63304] Aarch64 pc-relative load offset out of range

2016-01-03 Thread tulipawn at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #38 from PeteVine  ---
Hi all,

I'm trying to narrow down a similar issue on armv7 and would welcome any
suggestions where to start:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69082

Thx

[Bug target/63304] Aarch64 pc-relative load offset out of range

2015-11-13 Thread e.menezes at samsung dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #37 from Evandro  ---
Here's what I had in mind:
https://gcc.gnu.org/ml/gcc-patches/2015-11/msg01787.html

Feedback is welcome.

[Bug target/63304] Aarch64 pc-relative load offset out of range

2015-11-06 Thread e.menezes at samsung dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #36 from Evandro  ---
(In reply to Ramana Radhakrishnan from comment #35)
> (In reply to Evandro from comment #32)
> > Because of side effects of the Haiffa scheduler, the loads now pile up, and
> > the ADRPs may affect the load issue rate rather badly if not fused.  At leas
> > on our processor.  
> 
> In straight line code I can imagine this happening - In loopy code I would
> have expected the constants to be hoisted - atleast that's what I remember
> seeing in my analysis. You have seen -mcprelative-literal-loads haven't you
> ? 

The cases that I have in mind involve SL code in functions which are called
form a loop.  Since they are external, only LTO would address such cases.  And,
since we do not control how they are built, we have to handle them as they
come.

As long as there's an opening to investigate the benefits and drawbacks of
reverting to the legacy way considering the function size, I think that it's
interesting to find out the results.

Thank you.

[Bug target/63304] Aarch64 pc-relative load offset out of range

2015-11-06 Thread ramana at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #35 from Ramana Radhakrishnan  ---
(In reply to Evandro from comment #32)
> (In reply to Ramana Radhakrishnan from comment #31)
> > (In reply to Evandro from comment #30)
> > > The performance impact of always referring to constants as if they were 
> > > far
> > > away is significant on targets which do not fuse ADRP and LDR together. 
> > 
> > What happens if you split them up and schedule them appropriately ? I didn't
> > see any significant impact in my benchmarking on implementations that did
> > not implement such fusion. Where people want performance in these cases they
> > can well use -mpc-relative-literal-loads or -mcmodel=tiny - it's in there
> > already.
> 
> Because of side effects of the Haiffa scheduler, the loads now pile up, and
> the ADRPs may affect the load issue rate rather badly if not fused.  At leas
> on our processor.  

In straight line code I can imagine this happening - In loopy code I would have
expected the constants to be hoisted - atleast that's what I remember seeing in
my analysis. You have seen -mcprelative-literal-loads haven't you ? 

> 
> Which brings another point, shouldn't there be just one ADRP per BB or,
> ideally, per function?  Or am I missing something?

That isn't really how this thing works. Well the constants are shared across
the program now as you say down thread. 

> 
> I'll investigate placing the constant after the function, as before, if the
> estimated function size allows for it.  I think that eliminating the ADRPs
> could potentially be more beneficial to code size than merging constants in
> a common literal pool (v. http://bit.ly/1Ptc8nh).

I was actually surprised by the amount of constant sharing that was happening
in what I looked at. 


Thanks,
Ramana

[Bug target/63304] Aarch64 pc-relative load offset out of range

2015-11-06 Thread e.menezes at samsung dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #34 from Evandro  ---
(In reply to Wilco from comment #33)
> (In reply to Evandro from comment #32)
> ADRP latency to load-address should be zero on any OoO core - ADRP is
> basically a move-immediate, so can execute early and hide any latency.

In an ideal world, yes.  In the actual world, they compete for limited
resources that could be used by other insns.

> > Which brings another point, shouldn't there be just one ADRP per BB or,
> > ideally, per function?  Or am I missing something?
> 
> That's not possible in this case as the section is mergeable. An alternative
> implementation using anchors may be feasible, but GCC is extremely bad at
> using anchors efficiently - functions using several global variables also
> end up with a large number of ADRPs when you'd expect a single ADRP.

I see.  

I'll investigate placing the constant after the function, as before, if the
estimated function size allows for it.  I think that eliminating the ADRPs
could potentially be more beneficial to code size than merging constants in a
common literal pool (v. http://bit.ly/1Ptc8nh).

Thank you.

[Bug target/63304] Aarch64 pc-relative load offset out of range

2015-11-06 Thread wdijkstr at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #33 from Wilco  ---
(In reply to Evandro from comment #32)
> (In reply to Ramana Radhakrishnan from comment #31)
> > (In reply to Evandro from comment #30)
> > > The performance impact of always referring to constants as if they were 
> > > far
> > > away is significant on targets which do not fuse ADRP and LDR together. 
> > 
> > What happens if you split them up and schedule them appropriately ? I didn't
> > see any significant impact in my benchmarking on implementations that did
> > not implement such fusion. Where people want performance in these cases they
> > can well use -mpc-relative-literal-loads or -mcmodel=tiny - it's in there
> > already.
> 
> Because of side effects of the Haiffa scheduler, the loads now pile up, and
> the ADRPs may affect the load issue rate rather badly if not fused.  At leas
> on our processor.  

ADRP latency to load-address should be zero on any OoO core - ADRP is basically
a move-immediate, so can execute early and hide any latency.

> Which brings another point, shouldn't there be just one ADRP per BB or,
> ideally, per function?  Or am I missing something?

That's not possible in this case as the section is mergeable. An alternative
implementation using anchors may be feasible, but GCC is extremely bad at using
anchors efficiently - functions using several global variables also end up with
a large number of ADRPs when you'd expect a single ADRP.

[Bug target/63304] Aarch64 pc-relative load offset out of range

2015-11-06 Thread e.menezes at samsung dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #32 from Evandro  ---
(In reply to Ramana Radhakrishnan from comment #31)
> (In reply to Evandro from comment #30)
> > The performance impact of always referring to constants as if they were far
> > away is significant on targets which do not fuse ADRP and LDR together. 
> 
> What happens if you split them up and schedule them appropriately ? I didn't
> see any significant impact in my benchmarking on implementations that did
> not implement such fusion. Where people want performance in these cases they
> can well use -mpc-relative-literal-loads or -mcmodel=tiny - it's in there
> already.

Because of side effects of the Haiffa scheduler, the loads now pile up, and the
ADRPs may affect the load issue rate rather badly if not fused.  At leas on our
processor.  

Which brings another point, shouldn't there be just one ADRP per BB or,
ideally, per function?  Or am I missing something?

> > What's the status of the solution that evaluates the function size? 
> 
> I am not working on that follow-up as I didn't see the real need for it in
> the benchmarking results I was looking at. You are welcome to investigate.

OK

[Bug target/63304] Aarch64 pc-relative load offset out of range

2015-11-06 Thread ramana at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #31 from Ramana Radhakrishnan  ---
(In reply to Evandro from comment #30)
> The performance impact of always referring to constants as if they were far
> away is significant on targets which do not fuse ADRP and LDR together. 

What happens if you split them up and schedule them appropriately ? I didn't
see any significant impact in my benchmarking on implementations that did not
implement such fusion. Where people want performance in these cases they can
well use -mpc-relative-literal-loads or -mcmodel=tiny - it's in there already.

> What's the status of the solution that evaluates the function size? 

I am not working on that follow-up as I didn't see the real need for it in the
benchmarking results I was looking at. You are welcome to investigate.

> Should
> this be optionally enabled only?  

It is enabled by default for -mcmodel=small and -mcmodel=large. 

And no because it has been done after quite a lot of complaints from the
general user community that people are unable to build large software bases
with the compiler.

> Could the assembler be left to address this issue by
> relaxing such loads? :-P  

No...

[Bug target/63304] Aarch64 pc-relative load offset out of range

2015-11-06 Thread e.menezes at samsung dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #30 from Evandro  ---
The performance impact of always referring to constants as if they were far
away is significant on targets which do not fuse ADRP and LDR together.  What's
the status of the solution that evaluates the function size?  Should this be
optionally enabled only?  Would it be the case to come up with a medium code
model? :-P  Could the assembler be left to address this issue by relaxing such
loads? :-P  

Thank you.

[Bug target/63304] Aarch64 pc-relative load offset out of range

2015-10-21 Thread ramana at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #29 from Ramana Radhakrishnan  ---
Author: ramana
Date: Thu Oct 22 04:26:50 2015
New Revision: 229160

URL: https://gcc.gnu.org/viewcvs?rev=229160&root=gcc&view=rev
Log:
[Patch AArch64 63304] Fix issue with global state.


Jiong pointed out privately that there was a thinko
in the way in which the global state was being
set and reset. I don't like adding such
global state but 


2015-10-22  Ramana Radhakrishnan  

PR target/63304
* config/aarch64/aarch64.c (aarch64_nopcrelative_literal_loads): New.
(aarch64_expand_mov_immediate): Use aarch64_nopcrelative_literal_loads.
(aarch64_classify_address): Likewise.
(aarch64_secondary_reload): Likewise.
(aarch64_override_options_after_change_1): Adjust.
* config/aarch64/aarch64.md
(aarch64_reload_movcp):
Use aarch64_nopcrelative_literal_loads.
(aarch64_reload_movcp): Likewise.
* config/aarch64/aarch64-protos.h (aarch64_nopcrelative_literal_loads): 
Declare

2015-10-22  Jiong Wang  
Ramana Radhakrishnan  

PR target/63304
* gcc.target/aarch64/pr63304_1.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/aarch64/pr63304_1.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/aarch64/aarch64-protos.h
trunk/gcc/config/aarch64/aarch64.c
trunk/gcc/config/aarch64/aarch64.md
trunk/gcc/testsuite/ChangeLog


[Bug target/63304] Aarch64 pc-relative load offset out of range

2015-10-09 Thread ramana at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #28 from Ramana Radhakrishnan  ---
Author: ramana
Revision: 228644
Modified property: svn:log

Modified: svn:log at Fri Oct  9 11:08:05 2015
--
--- svn:log (original)
+++ svn:log Fri Oct  9 11:08:05 2015
@@ -1,45 +1,43 @@
-[AArch64] Handle literal pools for functions > 1 MiB in size.
-
+[PATCH PR target/67366 2/2] [gimple-fold.c] Support movmisalign optabs in
gimple-fold.c

-This patch fixes the issue in PR63304 where we have
-functions that are > 1MiB. The idea is to use adrp / ldr or adrp / add
-instructions to address the literal pools under the use of a command line
-option. I would like to turn this on by default on trunk but keep this
-disabled by default for the release branches in order to get some
-serious testing for this feature while it bakes on trunk.
+This patch by Richard allows for movmisalign optabs to be supported
+in gimple-fold.c. This caused a bit of pain in the testsuite with
strlenopt-8.c
+in conjunction with the ARM support for movmisalign_optabs as the test
+was coded up to do different things depending on whether the target
+supported misaligned access or not. However now with unaligned access
+being allowed for different levels of the architecture in the arm backend,
+the concept of the helper function non_strict_align mapping identically
+to the definition of STRICT_ALIGNMENT disappears.

-As a follow-up I would like to try and see if estimate_num_insns or
-something else can give us a heuristic to turn this on for "large" functions.
-After all the number of incidences of this are quite low in real life,
-so may be we should look to restrict this use as much as possible on the
-grounds that this code generation implies an extra integer register for
-addressing for every floating point and vector constant and I don't think
-that's great in code that already may have high register pressure.
+Adjusted thusly for ARM. The testsuite/lib changes were tested with an
+arm-none-eabi multilib that included architecture variants that did not
+support unaligned access and architecture variants that did.

-Tested on aarch64-none-elf with no regressions. A previous
-version was bootstrapped and regression tested.
+The testing matrix for this patch was:

-Applied to trunk.
+1. x86_64 bootstrap and regression test - no regressions.
+2. armhf bootstrap and regression test - no regressions.
+3. arm-none-eabi cross build and regression test for

-regards
-Ramana
+{-marm/-march=armv7-a/-mfpu=vfpv3-d16/-mfloat-abi=softfp}
+{-mthumb/-march=armv8-a/-mfpu=crypto-neon-fp-armv8/-mfloat-abi=hard}
+{-marm/-mcpu=arm7tdmi/-mfloat-abi=soft}
+{-mthumb/-mcpu=arm7tdmi/-mfloat-abi=soft}

-2015-09-14  Ramana Radhakrishnan  
+with no regressions.

-   PR target/63304
-   * config/aarch64/aarch64.c (aarch64_expand_mov_immediate): Handle
-   nopcrelative_literal_loads.
-   (aarch64_classify_address): Likewise.
-   (aarch64_constant_pool_reload_icode): Define.
-   (aarch64_secondary_reload): Handle secondary reloads for
-   literal pools.
-   (aarch64_override_options): Handle nopcrelative_literal_loads.
-   (aarch64_classify_symbol): Handle nopcrelative_literal_loads.
-   * config/aarch64/aarch64.md
(aarch64_reload_movcp):
-   Define.
-   (aarch64_reload_movcp): Likewise.
-   * config/aarch64/aarch64.opt (mpc-relative-literal-loads): New option.
-   * config/aarch64/predicates.md (aarch64_constant_pool_symref): New
-   predicate.
-   * doc/invoke.texi (mpc-relative-literal-loads): Document.
+Ok to apply ?

+2015-10-09  Richard Biener  
+
+   PR target/67366
+   * gimple-fold.c (optabs-query.h): Include
+   (gimple_fold_builtin_memory_op): Allow unaligned stores
+   when movmisalign_optabs are available.
+
+2015-10-09  Ramana Radhakrishnan  
+
+   PR target/67366
+   * lib/target-supports.exp (check_effective_target_non_strict_align):
+   Adjust for arm*-*-*.
+   * gcc.target/arm/pr67366.c: New test.


[Bug target/63304] Aarch64 pc-relative load offset out of range

2015-10-09 Thread ramana at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #27 from Ramana Radhakrishnan  ---
(In reply to Ramana Radhakrishnan from comment #26)
> Author: ramana
> Date: Fri Oct  9 10:58:06 2015
> New Revision: 228644
> 
> URL: https://gcc.gnu.org/viewcvs?rev=228644&root=gcc&view=rev
> Log:
> [AArch64] Handle literal pools for functions > 1 MiB in size.
> 
> 
> This patch fixes the issue in PR63304 where we have
> functions that are > 1MiB. The idea is to use adrp / ldr or adrp / add
> instructions to address the literal pools under the use of a command line
> option. I would like to turn this on by default on trunk but keep this
> disabled by default for the release branches in order to get some
> serious testing for this feature while it bakes on trunk.
> 
> As a follow-up I would like to try and see if estimate_num_insns or
> something else can give us a heuristic to turn this on for "large" functions.
> After all the number of incidences of this are quite low in real life,
> so may be we should look to restrict this use as much as possible on the
> grounds that this code generation implies an extra integer register for
> addressing for every floating point and vector constant and I don't think
> that's great in code that already may have high register pressure.
> 
> Tested on aarch64-none-elf with no regressions. A previous
> version was bootstrapped and regression tested.
> 
> Applied to trunk.
> 
> regards
> Ramana
> 
> 2015-09-14  Ramana Radhakrishnan  
> 
>   PR target/63304
>   * config/aarch64/aarch64.c (aarch64_expand_mov_immediate): Handle
>   nopcrelative_literal_loads.
>   (aarch64_classify_address): Likewise.
>   (aarch64_constant_pool_reload_icode): Define.
>   (aarch64_secondary_reload): Handle secondary reloads for
>   literal pools.
>   (aarch64_override_options): Handle nopcrelative_literal_loads.
>   (aarch64_classify_symbol): Handle nopcrelative_literal_loads.
>   * config/aarch64/aarch64.md (aarch64_reload_movcp):
>   Define.
>   (aarch64_reload_movcp): Likewise.
>   * config/aarch64/aarch64.opt (mpc-relative-literal-loads): New option.
>   * config/aarch64/predicates.md (aarch64_constant_pool_symref): New
>   predicate.
>   * doc/invoke.texi (mpc-relative-literal-loads): Document.
> 
> 
> Added:
> trunk/gcc/testsuite/gcc.target/arm/pr67366.c
> Modified:
> trunk/gcc/ChangeLog
> trunk/gcc/gimple-fold.c
> trunk/gcc/testsuite/ChangeLog
> trunk/gcc/testsuite/lib/target-supports.exp

This commit really was for PR67366.. Copied wrong text into the commit message.

[Bug target/63304] Aarch64 pc-relative load offset out of range

2015-10-09 Thread ramana at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #26 from Ramana Radhakrishnan  ---
Author: ramana
Date: Fri Oct  9 10:58:06 2015
New Revision: 228644

URL: https://gcc.gnu.org/viewcvs?rev=228644&root=gcc&view=rev
Log:
[AArch64] Handle literal pools for functions > 1 MiB in size.


This patch fixes the issue in PR63304 where we have
functions that are > 1MiB. The idea is to use adrp / ldr or adrp / add
instructions to address the literal pools under the use of a command line
option. I would like to turn this on by default on trunk but keep this
disabled by default for the release branches in order to get some
serious testing for this feature while it bakes on trunk.

As a follow-up I would like to try and see if estimate_num_insns or
something else can give us a heuristic to turn this on for "large" functions.
After all the number of incidences of this are quite low in real life,
so may be we should look to restrict this use as much as possible on the
grounds that this code generation implies an extra integer register for
addressing for every floating point and vector constant and I don't think
that's great in code that already may have high register pressure.

Tested on aarch64-none-elf with no regressions. A previous
version was bootstrapped and regression tested.

Applied to trunk.

regards
Ramana

2015-09-14  Ramana Radhakrishnan  

PR target/63304
* config/aarch64/aarch64.c (aarch64_expand_mov_immediate): Handle
nopcrelative_literal_loads.
(aarch64_classify_address): Likewise.
(aarch64_constant_pool_reload_icode): Define.
(aarch64_secondary_reload): Handle secondary reloads for
literal pools.
(aarch64_override_options): Handle nopcrelative_literal_loads.
(aarch64_classify_symbol): Handle nopcrelative_literal_loads.
* config/aarch64/aarch64.md
(aarch64_reload_movcp):
Define.
(aarch64_reload_movcp): Likewise.
* config/aarch64/aarch64.opt (mpc-relative-literal-loads): New option.
* config/aarch64/predicates.md (aarch64_constant_pool_symref): New
predicate.
* doc/invoke.texi (mpc-relative-literal-loads): Document.


Added:
trunk/gcc/testsuite/gcc.target/arm/pr67366.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/gimple-fold.c
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/lib/target-supports.exp


[Bug target/63304] Aarch64 pc-relative load offset out of range

2015-09-14 Thread ramana at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #25 from Ramana Radhakrishnan  ---
Author: ramana
Date: Mon Sep 14 13:16:59 2015
New Revision: 227748

URL: https://gcc.gnu.org/viewcvs?rev=227748&root=gcc&view=rev
Log:
[AArch64] Handle literal pools for functions > 1 MiB in size.


This patch fixes the issue in PR63304 where we have
functions that are > 1MiB. The idea is to use adrp / ldr or adrp / add
instructions to address the literal pools under the use of a command line
option. I would like to turn this on by default on trunk but keep this
disabled by default for the release branches in order to get some
serious testing for this feature while it bakes on trunk.

As a follow-up I would like to try and see if estimate_num_insns or
something else can give us a heuristic to turn this on for "large" functions.
After all the number of incidences of this are quite low in real life,
so may be we should look to restrict this use as much as possible on the
grounds that this code generation implies an extra integer register for
addressing for every floating point and vector constant and I don't think
that's great in code that already may have high register pressure.

Tested on aarch64-none-elf with no regressions. A previous
version was bootstrapped and regression tested.

Applied to trunk.

regards
Ramana

2015-09-14  Ramana Radhakrishnan  

PR target/63304
* config/aarch64/aarch64.c (aarch64_expand_mov_immediate): Handle
nopcrelative_literal_loads.
(aarch64_classify_address): Likewise.
(aarch64_constant_pool_reload_icode): Define.
(aarch64_secondary_reload): Handle secondary reloads for
literal pools.
(aarch64_override_options): Handle nopcrelative_literal_loads.
(aarch64_classify_symbol): Handle nopcrelative_literal_loads.
* config/aarch64/aarch64.md
(aarch64_reload_movcp):
Define.
(aarch64_reload_movcp): Likewise.
* config/aarch64/aarch64.opt (mpc-relative-literal-loads): New option.
* config/aarch64/predicates.md (aarch64_constant_pool_symref): New
predicate.
* doc/invoke.texi (mpc-relative-literal-loads): Document.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/aarch64/aarch64.c
trunk/gcc/config/aarch64/aarch64.md
trunk/gcc/config/aarch64/aarch64.opt
trunk/gcc/config/aarch64/iterators.md
trunk/gcc/config/aarch64/predicates.md
trunk/gcc/doc/invoke.texi


[Bug target/63304] Aarch64 pc-relative load offset out of range

2015-09-11 Thread ramana at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #24 from Ramana Radhakrishnan  ---
Author: ramana
Date: Fri Sep 11 09:44:26 2015
New Revision: 227679

URL: https://gcc.gnu.org/viewcvs?rev=227679&root=gcc&view=rev
Log:
Remove separate movtf pattern - Use an iterator for all FP modes.

movtf is unnecessary as a separate expander. Move this to be with
the standard scalar floating point expanders.

Achieved by adding a new iterator and then using the same.

Tested cross aarch64-none-elf and no regressions.

Rebased version from https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00767.html


2015-09-10  Ramana Radhakrishnan  

PR target/63304
* config/aarch64/aarch.md (mov:GPF_F16): Use GPF_TF_F16.
(movtf): Delete.
* config/aarch64/iterators.md (GPF_TF_F16): New.
(GPF_F16): Delete.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/aarch64/aarch64.md
trunk/gcc/config/aarch64/iterators.md


[Bug target/63304] Aarch64 pc-relative load offset out of range

2015-08-07 Thread david.abdurachmanov at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #23 from David Abdurachmanov  
---
GCC trunk
  r226676
  or 15af172f2a0ea281969e3105da9f9bb100097c7d from
git://gcc.gnu.org/git/gcc.git
  Date:   Thu Aug 6 14:26:18 2015 +)

Rebased and applied:

  https://gcc.gnu.org/ml/gcc-patches/2015-07/msg02074.html
  https://gcc.gnu.org/ml/gcc-patches/2015-07/msg02258.html

binutils trunk
  f0ce0d3a331129309a46a6a9ac85fce35acae72b from
git://sourceware.org/git/binutils-gdb.git
  Date:   Thu Jul 23 16:01:01 2015 +0100)

Rebases and applied:
  https://sourceware.org/ml/binutils/2015-07/msg00137.html

$ gcc -dumpversion
6.0.0
$ ld --version
GNU ld (GNU Binutils) 2.25.51.20150806

I __successfully__ compiled OpenLoops 1.1.1 and MCFM 6.3 packages.


[Bug target/63304] Aarch64 pc-relative load offset out of range

2015-07-30 Thread ramana at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #22 from Ramana Radhakrishnan  ---
(In reply to David Abdurachmanov from comment #21)
> I am on vacations now, but I already marked this on my TODO list. Once I
> find a free time slot I will give it a spin. I will try to report in a few
> days.
> 
> BTW, I will also show up at GNU Tools Cauldron 2015.

I must also add that this patch is a pre-requisite for the appropriate
iterators to be available.

https://gcc.gnu.org/ml/gcc-patches/2015-07/msg02074.html

and this may need some rebasing after Alan's latest patches to handle fp16.


[Bug target/63304] Aarch64 pc-relative load offset out of range

2015-07-29 Thread david.abdurachmanov at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #21 from David Abdurachmanov  
---
I am on vacations now, but I already marked this on my TODO list. Once I find a
free time slot I will give it a spin. I will try to report in a few days.

BTW, I will also show up at GNU Tools Cauldron 2015.


[Bug target/63304] Aarch64 pc-relative load offset out of range

2015-07-29 Thread ramana at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #20 from Ramana Radhakrishnan  ---
(In reply to Ramana Radhakrishnan from comment #19)
> (In reply to Ramana Radhakrishnan from comment #18)
> > I'm taking a look into this.
> 
> RFC here - https://gcc.gnu.org/ml/gcc-patches/2015-07/msg02258.html

David - can you see if this works for you ?

regards
Ramana


[Bug target/63304] Aarch64 pc-relative load offset out of range

2015-07-27 Thread ramana at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #19 from Ramana Radhakrishnan  ---
(In reply to Ramana Radhakrishnan from comment #18)
> I'm taking a look into this.

RFC here - https://gcc.gnu.org/ml/gcc-patches/2015-07/msg02258.html


[Bug target/63304] Aarch64 pc-relative load offset out of range

2015-07-22 Thread ramana at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

Ramana Radhakrishnan  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 CC||ramana at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |ramana at gcc dot 
gnu.org

--- Comment #18 from Ramana Radhakrishnan  ---
I'm taking a look into this.


[Bug target/63304] Aarch64 pc-relative load offset out of range

2015-07-20 Thread wdijkstr at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

Wilco  changed:

   What|Removed |Added

 CC||wdijkstr at arm dot com

--- Comment #17 from Wilco  ---
Well there seem to be 2 ways to address this:

* If a function is huge, emit literals as const data. This enables the use of
anchors and sharing of literals across all functions in a compilation unit.

* Reserve a register in the adr/ldr literal patterns and add a 2-instruction
sequence using adrp when out of range. Ideally the register should only be
reserved if a function is huge.


[Bug target/63304] Aarch64 pc-relative load offset out of range

2015-07-20 Thread jiwang at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

Jiong Wang  changed:

   What|Removed |Added

 CC||jiwang at gcc dot gnu.org

--- Comment #16 from Jiong Wang  ---
Have done a quick look at this, basic ideas to fix this:

  * generate a special pattern which initialize literal pool start address.
  * implement TARGET_MACHINE_DEPENDENT_REORG to calculate whehter the
pc-relative literal load is within range.
  * output final insruction sequences which initializing literal pool start
address based on the result from reorg pass analysis. Use movk/z, adrp +
add,
single adr for different distance.


[Bug target/63304] Aarch64 pc-relative load offset out of range

2015-07-20 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

Andrew Pinski  changed:

   What|Removed |Added

   Assignee|pinskia at gcc dot gnu.org |unassigned at gcc dot 
gnu.org

--- Comment #15 from Andrew Pinski  ---
Not working on this any time soon.  But someone from ARM really should look
into fixing this as it blocks standard C/C++ code from HPC and distros.


[Bug target/63304] Aarch64 pc-relative load offset out of range

2015-07-19 Thread david.abdurachmanov at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #14 from David Abdurachmanov  
---
I hit another two cases of this.

1. g2root tool, which converts GEANT geometry to ROOT geometry. It create a
single function, which contains lots of descriptions of material, shapes, etc.
all describing some 3D objects and physical properties. These can be very huge.

Also could think of MILL computing as possible example. They use C++ as they
assembler (at least for simulation). Thus you have something like:

void somefunc(void) {
  add(b0, b1);
  add(b2, b3);
  ...
}

IIRC, by compiling it they generate a simulator to run this program on specific
CPU.

2. mcfm 6.3 package, to be precise: mcfm-6.3/src/WW/triangle11new.f

triangle11new.s: Assembler messages:
triangle11new.s:587: Error: pc-relative load offset out of range
triangle11new.s:592: Error: pc-relative load offset out of range

..
   587 ldr x0, .LC2
   588 fmovd0, x2
   589 fmovd1, x0
   590 fdivd0, d0, d1
   591 fmovx2, d0
   592 ldr x0, .LC2
..

346645 .LC2:
346646 .word   0
346647 .word   1073741824
346648 .align  3
..

Distance between ldr instruction and .LC2 is 346058 assembly lines.

Here is the source file:
https://github.com/cms-externals/MCFM/blob/master/src/WW/triangle11new.f (just
1355 sloc).

It's those huge computations, which are killing it. Similar issue as in
OpenLoops package.


[Bug target/63304] Aarch64 pc-relative load offset out of range

2015-01-10 Thread david.abdurachmanov at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #13 from David Abdurachmanov  
---
Created attachment 34416
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34416&action=edit
31-lines, minimal testcase

I am adding 31-lines minimal testcase. Should be good enough for GCC testsuite.

$ gcc -O0 -c pr63304-testcase.c
/tmp/ccKdBqsL.s: Assembler messages:
/tmp/ccKdBqsL.s:58: Error: pc-relative load offset out of range
/tmp/ccKdBqsL.s:62: Error: pc-relative load offset out of range

Yesterday I looked into RTL output and assembly of some failures for OpenLoops.
The function was 400+K lines in assembly. On x86_64 it was something 180+K
lines of assembly and ~1.3MB for function body size.

I can confirm that offset is above 1MB mark and it was trying to load
__float128/long double to q1 from constant pool for passing to `addtf3`.


[Bug target/63304] Aarch64 pc-relative load offset out of range

2015-01-05 Thread david.abdurachmanov at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #12 from David Abdurachmanov  
---
I decided to re-enable -Os for OpenLoops. Then use powerful hardware with
32-physical-cores (x86_64) and 0.5TB of RAM to see if I could get lucky. Fired
up QEMU user mode with Fedora for AArch64 chroot for compiling.

It did resolve some of failures, but not everything. I have seen processes
going as far as 25+GB of RSS and taking hours to finish. This is the reason
package is compiled with -O0.


[Bug target/63304] Aarch64 pc-relative load offset out of range

2014-12-28 Thread david.abdurachmanov at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

David Abdurachmanov  changed:

   What|Removed |Added

 CC||david.abdurachmanov at gmail 
dot c
   ||om

--- Comment #11 from David Abdurachmanov  
---
I believe, I hit this issue with OpenLoops package on AArch64, while it works
fine on x86_64.

It shows up on 3 of Fortran files:
scons: *** [process_obj/pplljjj/loop_pplljjj_eexddxggg_1_qp.os] Error 1
scons: *** [process_obj/pplljjj/loop_pplljjj_eexuuxggg_1_qp.os] Error 1
scons: *** [process_obj/pplljjj/loop_pplljjj_eexbbxggg_1_qp.os] Error 1

I get "Error: pc-relative load offset out of range" 864 times.

Tested with GCC 4.9.1 and binutils 2.24.

I also looked into f9edeb70961d404caac5a849b0783c53228ddf62 / 213078, with it
applied on top it gets worse, i.e. more files are affected. From 864 it
increases to 1392 errors.

This particular library (pplljjj) is 158MB on x86_64. Most of it (153MB)
belongs to .text section.

The code must be compiled with -O0 otherwise machines have hard time handling
it:

gfortran: internal compiler error: Killed (program f951)
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.

@Andrew,
I am happy to test your patches if necessary.


[Bug target/63304] Aarch64 pc-relative load offset out of range

2014-10-11 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||assemble-failure
   Severity|minor   |normal

--- Comment #10 from Andrew Pinski  ---
More point is with LTO you can get this even with optimization turned on.  If
the text section is big enough.


[Bug target/63304] Aarch64 pc-relative load offset out of range

2014-10-11 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #9 from Andrew Pinski  ---
Actually this is much worse than what is mentioned here.  Having the constant
pool be part of the .text section really does not work if the alignment of the
constants are smaller than 4 byte aligned.  The assembler rejects the debugging
info as the end label of the text section is not 4 byte aligned.
I am working on fixing this bug and the bug dealing with the constant pool in
the text section (which is wrong and is not shared between translational units
really).


[Bug target/63304] Aarch64 pc-relative load offset out of range

2014-09-19 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

Andrew Pinski  changed:

   What|Removed |Added

 Status|RESOLVED|NEW
 Resolution|WONTFIX |---

--- Comment #8 from Andrew Pinski  ---
This code is still valid and really should be fixed; maybe not today or for GCC
5 or GCC 6 but it is still a bug.


[Bug target/63304] Aarch64 pc-relative load offset out of range

2014-09-19 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

Venkataramanan  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 CC||doko at ubuntu dot com
 Resolution|--- |WONTFIX

--- Comment #7 from Venkataramanan  ---
Ok I am closing this as wont fix now. Checked with Mitthias who reported the
problem to us and he is fine building it with -O2.


[Bug target/63304] Aarch64 pc-relative load offset out of range

2014-09-19 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

Richard Earnshaw  changed:

   What|Removed |Added

   Priority|P3  |P5
 Status|RESOLVED|NEW
 Resolution|WONTFIX |---
   Assignee|unassigned at gcc dot gnu.org  |venkataramanan.kumar at 
amd dot co
   ||m
   Severity|normal  |minor

--- Comment #6 from Richard Earnshaw  ---
OK.  But I still wouldn't loose much sleep over it.  It wouldn't surprise me
that if you turn the optimizer on it then blows up by using too much memory...


[Bug target/63304] Aarch64 pc-relative load offset out of range

2014-09-19 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #5 from Venkataramanan  ---
We got inspired by this bug.
https://bugs.linaro.org/show_bug.cgi?id=400 
It happens at -O0 now.


[Bug target/63304] Aarch64 pc-relative load offset out of range

2014-09-19 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

Richard Earnshaw  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |WONTFIX

--- Comment #4 from Richard Earnshaw  ---
I see this as a theoretical problem that's unlikely to ever manifest itself
with real code (functions generating .5 million instructions would take
insanely long to compile in real life).

Unless you can show some *real* code that exhibits this problem I propose we
don't try to fix this; it's just a limit on the compiler.

We have bigger problems to worry about.


[Bug target/63304] Aarch64 pc-relative load offset out of range

2014-09-19 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

Andrew Pinski  changed:

   What|Removed |Added

 Target||aarch64
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2014-09-19
 Ever confirmed|0   |1

--- Comment #3 from Andrew Pinski  ---
Confirmed.

>From .expand:
(insn 5 4 6 (set (reg:DF 75)
(mem/u/c:DF (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0  S8 A64]))
/home/pinskia/Downloads/x.c:7 -1
 (nil))

Note next time please use macros to create your testcase, it is ok sometimes to
use non preprocessed source


[Bug target/63304] Aarch64 pc-relative load offset out of range

2014-09-18 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #2 from Venkataramanan  ---
Marcus, can you please assign it to me if it is confirmed.


[Bug target/63304] Aarch64 pc-relative load offset out of range

2014-09-18 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

--- Comment #1 from Venkataramanan  ---
Created attachment 33515
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33515&action=edit
Attached test case