Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-24 Thread Xi Ruoyao
On Thu, 2024-01-25 at 08:48 +0800, chenglulu wrote:
> 
> 在 2024/1/24 上午3:36, Xi Ruoyao 写道:
> > On Mon, 2024-01-22 at 15:27 +0800, chenglulu wrote:
> > > > > The failure of this test case was because the compiler believes that 
> > > > > two
> > > > > (UNSPEC_PCREL_64_PART2 [(symbol)]) instances would always produce the
> > > > > same result, but this isn't true because the result depends on PC.  
> > > > > Thus
> > > > > (pc) needed to be included in the RTX, like:
> > > > > 
> > > > >     [(set (match_operand:DI 0 "register_operand" "=r")
> > > > >   (unspec:DI [(match_operand:DI 2 "") (pc)]
> > > > > UNSPEC_LA_PCREL_64_PART1))
> > > > >  (set (match_operand:DI 1 "register_operand" "=r")
> > > > >   (unspec:DI [(match_dup 2) (pc)] UNSPEC_LA_PCREL_64_PART2))]
> > > > > 
> > > > > With this the buggy REG_UNUSED notes were gone.  But it then prevented
> > > > > the CSE when loading the address of __tls_get_addr (i.e. if we address
> > > > > 10 TLE_LD symbols in a function it would emit 10 instances of 
> > > > > "la.global
> > > > > __tls_get_addr") so I added an REG_EQUAL note for it.  For symbols 
> > > > > other
> > > > > than __tls_get_addr such notes are added automatically by optimization
> > > > > passes.
> > > > > 
> > > > > Updated patch attached.
> > > > > 
> > > > I'm eliminating redundant la.global directives in my macro
> > > > implementation.
> > > > 
> > > > I will be testing this patch.
> > > > 
> > > > 
> > > > 
> > > > 
> > > With this patch, spec2006 can pass the test, but spec2017 621 and 654
> > > tests fail.
> > > I haven't debugged the specific cause of the problem yet.
> > Try removing the TARGET_DELEGITIMIZE_ADDRESS hook?  After eating some
> > unhealthy food in the midnight I realized the hook only
> > papers over the same issue caused spec2006 failure.  I tried a bootstrap
> > with BOOT_CFLAGS=-O2 -g -mcmodel=extreme and TARGET_DELEGITIMIZE_ADDRESS
> > commented out, and there is no more spurious "note: non-delegitimized
> > UNSPEC UNSPEC_LA_PCREL_64_PART1 (42) found in variable location" things.
> > I feel that this hook is still written in a buggy way, so maybe removing
> > it will solve the spec2017 issue.
> > 
> I found the problem. Binutils did not consider the four instructions 
> when converting the type from TLS IE to TLS LE, which caused the conversion 
> error.

Oooops.  We better fix this quickly as the Binutils 2.42 release is
imminent.

Maybe we can just disable TLS linker optimization once we see an
R_LARCH_TLS_DESC64* or R_LARCH_TLS_IE64*.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-24 Thread chenglulu



在 2024/1/24 上午3:36, Xi Ruoyao 写道:

On Mon, 2024-01-22 at 15:27 +0800, chenglulu wrote:

The failure of this test case was because the compiler believes that two
(UNSPEC_PCREL_64_PART2 [(symbol)]) instances would always produce the
same result, but this isn't true because the result depends on PC.  Thus
(pc) needed to be included in the RTX, like:

    [(set (match_operand:DI 0 "register_operand" "=r")
  (unspec:DI [(match_operand:DI 2 "") (pc)]
UNSPEC_LA_PCREL_64_PART1))
     (set (match_operand:DI 1 "register_operand" "=r")
  (unspec:DI [(match_dup 2) (pc)] UNSPEC_LA_PCREL_64_PART2))]

With this the buggy REG_UNUSED notes were gone.  But it then prevented
the CSE when loading the address of __tls_get_addr (i.e. if we address
10 TLE_LD symbols in a function it would emit 10 instances of "la.global
__tls_get_addr") so I added an REG_EQUAL note for it.  For symbols other
than __tls_get_addr such notes are added automatically by optimization
passes.

Updated patch attached.


I'm eliminating redundant la.global directives in my macro
implementation.

I will be testing this patch.





With this patch, spec2006 can pass the test, but spec2017 621 and 654
tests fail.
I haven't debugged the specific cause of the problem yet.

Try removing the TARGET_DELEGITIMIZE_ADDRESS hook?  After eating some
unhealthy food in the midnight I realized the hook only
papers over the same issue caused spec2006 failure.  I tried a bootstrap
with BOOT_CFLAGS=-O2 -g -mcmodel=extreme and TARGET_DELEGITIMIZE_ADDRESS
commented out, and there is no more spurious "note: non-delegitimized
UNSPEC UNSPEC_LA_PCREL_64_PART1 (42) found in variable location" things.
I feel that this hook is still written in a buggy way, so maybe removing
it will solve the spec2017 issue.

I found the problem. Binutils did not consider the four instructions 
when converting


the type from TLS IE to TLS LE, which caused the conversion error.




Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-23 Thread Xi Ruoyao
On Mon, 2024-01-22 at 15:27 +0800, chenglulu wrote:
> > > The failure of this test case was because the compiler believes that two
> > > (UNSPEC_PCREL_64_PART2 [(symbol)]) instances would always produce the
> > > same result, but this isn't true because the result depends on PC.  Thus
> > > (pc) needed to be included in the RTX, like:
> > > 
> > >    [(set (match_operand:DI 0 "register_operand" "=r")
> > >  (unspec:DI [(match_operand:DI 2 "") (pc)] 
> > > UNSPEC_LA_PCREL_64_PART1))
> > >     (set (match_operand:DI 1 "register_operand" "=r")
> > >  (unspec:DI [(match_dup 2) (pc)] UNSPEC_LA_PCREL_64_PART2))]
> > > 
> > > With this the buggy REG_UNUSED notes were gone.  But it then prevented
> > > the CSE when loading the address of __tls_get_addr (i.e. if we address
> > > 10 TLE_LD symbols in a function it would emit 10 instances of "la.global
> > > __tls_get_addr") so I added an REG_EQUAL note for it.  For symbols other
> > > than __tls_get_addr such notes are added automatically by optimization
> > > passes.
> > > 
> > > Updated patch attached.
> > > 
> > I'm eliminating redundant la.global directives in my macro 
> > implementation.
> > 
> > I will be testing this patch.
> > 
> > 
> > 
> > 
> With this patch, spec2006 can pass the test, but spec2017 621 and 654 
> tests fail.
> I haven't debugged the specific cause of the problem yet.

Try removing the TARGET_DELEGITIMIZE_ADDRESS hook?  After eating some
unhealthy food in the midnight I realized the hook only
papers over the same issue caused spec2006 failure.  I tried a bootstrap
with BOOT_CFLAGS=-O2 -g -mcmodel=extreme and TARGET_DELEGITIMIZE_ADDRESS
commented out, and there is no more spurious "note: non-delegitimized
UNSPEC UNSPEC_LA_PCREL_64_PART1 (42) found in variable location" things.
I feel that this hook is still written in a buggy way, so maybe removing
it will solve the spec2017 issue.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-21 Thread chenglulu



在 2024/1/19 下午4:51, chenglulu 写道:


在 2024/1/19 下午1:46, Xi Ruoyao 写道:

On Wed, 2024-01-17 at 17:57 +0800, chenglulu wrote:
Virtual register 1479 will be used in insn 2744, but register 1479 
was

assigned the REG_UNUSED attribute in the previous instruction.

The attached file is the wrong file.
The compilation command is as follows:

$ ./gcc/cc1 -fpreprocessed regrename.i -quiet -dp -dumpbase 
regrename.c

-dumpbase-ext .c -mno-relax -mabi=lp64d -march=loongarch64 -mfpu=64
-msimd=lasx -mcmodel=extreme -mtune=loongarch64 -g3 -O2
-Wno-int-conversion -Wno-implicit-int 
-Wno-implicit-function-declaration

-Wno-incompatible-pointer-types -version -o regrename.s
-mexplicit-relocs=always -fdump-rtl-all-all

I've seen some "guality" test failures in GCC test suite as well.
Normally I just ignore the guality failures but this time they look 
very

suspicious.  I'll investigate these issues...

I've also seen this type of failed regression tests and I'll 
continue to

look at this issue as well.

The guality regression is simple: I didn't call
delegitimize_mem_from_attrs (the default TARGET_DELEGITIMIZE_ADDRESS) in
the custom implementation.

The failure of this test case was because the compiler believes that two
(UNSPEC_PCREL_64_PART2 [(symbol)]) instances would always produce the
same result, but this isn't true because the result depends on PC.  Thus
(pc) needed to be included in the RTX, like:

   [(set (match_operand:DI 0 "register_operand" "=r")
 (unspec:DI [(match_operand:DI 2 "") (pc)] 
UNSPEC_LA_PCREL_64_PART1))

    (set (match_operand:DI 1 "register_operand" "=r")
 (unspec:DI [(match_dup 2) (pc)] UNSPEC_LA_PCREL_64_PART2))]

With this the buggy REG_UNUSED notes were gone.  But it then prevented
the CSE when loading the address of __tls_get_addr (i.e. if we address
10 TLE_LD symbols in a function it would emit 10 instances of "la.global
__tls_get_addr") so I added an REG_EQUAL note for it.  For symbols other
than __tls_get_addr such notes are added automatically by optimization
passes.

Updated patch attached.

I'm eliminating redundant la.global directives in my macro 
implementation.


I will be testing this patch.




With this patch, spec2006 can pass the test, but spec2017 621 and 654 
tests fail.

I haven't debugged the specific cause of the problem yet.



Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-19 Thread chenglulu



在 2024/1/19 下午1:46, Xi Ruoyao 写道:

On Wed, 2024-01-17 at 17:57 +0800, chenglulu wrote:

Virtual register 1479 will be used in insn 2744, but register 1479 was
assigned the REG_UNUSED attribute in the previous instruction.

The attached file is the wrong file.
The compilation command is as follows:

$ ./gcc/cc1 -fpreprocessed regrename.i -quiet -dp -dumpbase regrename.c
-dumpbase-ext .c -mno-relax -mabi=lp64d -march=loongarch64 -mfpu=64
-msimd=lasx -mcmodel=extreme -mtune=loongarch64 -g3 -O2
-Wno-int-conversion -Wno-implicit-int -Wno-implicit-function-declaration
-Wno-incompatible-pointer-types -version -o regrename.s
-mexplicit-relocs=always -fdump-rtl-all-all

I've seen some "guality" test failures in GCC test suite as well.
Normally I just ignore the guality failures but this time they look very
suspicious.  I'll investigate these issues...


I've also seen this type of failed regression tests and I'll continue to
look at this issue as well.

The guality regression is simple: I didn't call
delegitimize_mem_from_attrs (the default TARGET_DELEGITIMIZE_ADDRESS) in
the custom implementation.

The failure of this test case was because the compiler believes that two
(UNSPEC_PCREL_64_PART2 [(symbol)]) instances would always produce the
same result, but this isn't true because the result depends on PC.  Thus
(pc) needed to be included in the RTX, like:

   [(set (match_operand:DI 0 "register_operand" "=r")
 (unspec:DI [(match_operand:DI 2 "") (pc)] UNSPEC_LA_PCREL_64_PART1))
(set (match_operand:DI 1 "register_operand" "=r")
 (unspec:DI [(match_dup 2) (pc)] UNSPEC_LA_PCREL_64_PART2))]

With this the buggy REG_UNUSED notes were gone.  But it then prevented
the CSE when loading the address of __tls_get_addr (i.e. if we address
10 TLE_LD symbols in a function it would emit 10 instances of "la.global
__tls_get_addr") so I added an REG_EQUAL note for it.  For symbols other
than __tls_get_addr such notes are added automatically by optimization
passes.

Updated patch attached.


I'm eliminating redundant la.global directives in my macro implementation.

I will be testing this patch.







Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-18 Thread Xi Ruoyao
On Wed, 2024-01-17 at 17:57 +0800, chenglulu wrote:
> > > Virtual register 1479 will be used in insn 2744, but register 1479 was
> > > assigned the REG_UNUSED attribute in the previous instruction.
> > > 
> > > The attached file is the wrong file.
> > > The compilation command is as follows:
> > > 
> > > $ ./gcc/cc1 -fpreprocessed regrename.i -quiet -dp -dumpbase regrename.c
> > > -dumpbase-ext .c -mno-relax -mabi=lp64d -march=loongarch64 -mfpu=64
> > > -msimd=lasx -mcmodel=extreme -mtune=loongarch64 -g3 -O2
> > > -Wno-int-conversion -Wno-implicit-int -Wno-implicit-function-declaration
> > > -Wno-incompatible-pointer-types -version -o regrename.s
> > > -mexplicit-relocs=always -fdump-rtl-all-all
> > I've seen some "guality" test failures in GCC test suite as well.
> > Normally I just ignore the guality failures but this time they look very
> > suspicious.  I'll investigate these issues...
> > 
> I've also seen this type of failed regression tests and I'll continue to 
> look at this issue as well.

The guality regression is simple: I didn't call
delegitimize_mem_from_attrs (the default TARGET_DELEGITIMIZE_ADDRESS) in
the custom implementation.

The failure of this test case was because the compiler believes that two
(UNSPEC_PCREL_64_PART2 [(symbol)]) instances would always produce the
same result, but this isn't true because the result depends on PC.  Thus
(pc) needed to be included in the RTX, like:

  [(set (match_operand:DI 0 "register_operand" "=r")
(unspec:DI [(match_operand:DI 2 "") (pc)] UNSPEC_LA_PCREL_64_PART1))
   (set (match_operand:DI 1 "register_operand" "=r")
(unspec:DI [(match_dup 2) (pc)] UNSPEC_LA_PCREL_64_PART2))]

With this the buggy REG_UNUSED notes were gone.  But it then prevented
the CSE when loading the address of __tls_get_addr (i.e. if we address
10 TLE_LD symbols in a function it would emit 10 instances of "la.global
__tls_get_addr") so I added an REG_EQUAL note for it.  For symbols other
than __tls_get_addr such notes are added automatically by optimization
passes.

Updated patch attached.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University
From e9d789f8dcb52984b0f894fdecc402a49c5ad6d7 Mon Sep 17 00:00:00 2001
From: Xi Ruoyao 
Date: Fri, 5 Jan 2024 18:40:06 +0800
Subject: [PATCH v2] LoongArch: Don't split the instructions containing relocs
 for extreme code model

The ABI mandates the pcalau12i/addi.d/lu32i.d/lu52i.d instructions for
addressing a symbol to be adjacent.  So model them as "one large
instruction", i.e. define_insn, with two output registers.  The real
address is the sum of these two registers.

The advantage of this approach is the RTL passes can still use ldx/stx
instructions to skip an addi.d instruction.

gcc/ChangeLog:

	* config/loongarch/loongarch.md (unspec): Add
	UNSPEC_LA_PCREL_64_PART1 and UNSPEC_LA_PCREL_64_PART2.
	(la_pcrel64_two_parts): New define_insn.
	* config/loongarch/loongarch.cc (loongarch_tls_symbol): Fix a
	typo in the comment.
	(loongarch_call_tls_get_addr): If TARGET_CMODEL_EXTREME, use
	la_pcrel64_two_parts for addressing the TLS symbol and
	__tls_get_addr.  Emit an REG_EQUAL note to allow CSE addressing
	__tls_get_addr.
	(loongarch_legitimize_tls_address): If TARGET_CMODEL_EXTREME,
	address TLS IE symbols with la_pcrel64_two_parts.
	(loongarch_split_symbol): If TARGET_CMODEL_EXTREME, address
	symbols with la_pcrel64_two_parts.
	(TARGET_DELEGITIMIZE_ADDRESS): Define.
	(loongarch_delegitimize_address): Implement the target hook.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/func-call-extreme-1.c (dg-options):
	Use -O2 instead of -O0 to ensure the pcalau12i/addi/lu32i/lu52i
	instruction sequences are not reordered by the compiler.
	(NOIPA): Disallow interprocedural optimizations.
	* gcc.target/loongarch/func-call-extreme-2.c: Remove the content
	duplicated from func-call-extreme-1.c, include it instead.
	(dg-options): Likewise.
	* gcc.target/loongarch/func-call-extreme-3.c (dg-options):
	Likewise.
	* gcc.target/loongarch/func-call-extreme-4.c (dg-options):
	Likewise.
	* gcc.target/loongarch/cmodel-extreme-1.c: New test.
	* gcc.target/loongarch/cmodel-extreme-2.c: New test.
---
 gcc/config/loongarch/loongarch.cc | 135 +++---
 gcc/config/loongarch/loongarch.md |  21 +++
 .../gcc.target/loongarch/cmodel-extreme-1.c   |  18 +++
 .../gcc.target/loongarch/cmodel-extreme-2.c   |   7 +
 .../loongarch/func-call-extreme-1.c   |  14 +-
 .../loongarch/func-call-extreme-2.c   |  29 +---
 .../loongarch/func-call-extreme-3.c   |   2 +-
 .../loongarch/func-call-extreme-4.c   |   2 +-
 8 files changed, 144 insertions(+), 84 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/cmodel-extreme-1.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/cmodel-extreme-2.c

diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index 82467474288..358d2f8f3f5 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/con

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-17 Thread chenglulu



在 2024/1/17 下午5:50, Xi Ruoyao 写道:

On Wed, 2024-01-17 at 17:38 +0800, chenglulu wrote:

在 2024/1/13 下午9:05, Xi Ruoyao 写道:

在 2024-01-13星期六的 15:01 +0800,chenglulu写道:

在 2024/1/12 下午7:42, Xi Ruoyao 写道:

在 2024-01-12星期五的 09:46 +0800,chenglulu写道:


I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS:
we need a target hook to tell the generic code
UNSPEC_LA_PCREL_64_PART{1,2} are just a wrapper around symbols, or we'll
see millions lines of messages like

../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
UNSPEC_LA_PCREL_64_PART1 (42) found in variable location

I build GCC with -mcmodel=extreme in BOOT_CFLAGS, but I haven't reproduced the 
problem you mentioned.

   $ ../configure --host=loongarch64-linux-gnu 
--target=loongarch64-linux-gnu --build=loongarch64-linux-gnu \
   --with-arch=loongarch64 --with-abi=lp64d --enable-tls 
--enable-languages=c,c++,fortran,lto --enable-plugin \
   --disable-multilib --disable-host-shared --enable-bootstrap 
--enable-checking=release
   $ make BOOT_FLAGS="-mcmodel=extreme"

What did I do wrong?:-(

BOOT_CFLAGS, not BOOT_FLAGS :).


This is so strange. My compilation here stopped due to syntax problems,

and I still haven't reproduced the information you mentioned about
UNSPEC_LA_PCREL_64_PART1.

I used:

../gcc/configure --with-system-zlib --disable-fixincludes \
   --enable-default-ssp --enable-default-pie \
   --disable-werror --disable-multilib \
   --prefix=/home/xry111/gcc-dev

and then

make STAGE1_{C,CXX}FLAGS="-O2 -g" -j8 \
   BOOT_{C,CXX}FLAGS="-O2 -g -mcmodel=extreme" &| tee gcc-build.log

I guess "-g" is needed to reproduce the issue as well as the messages
were produced in dwarf generation.


I have reproduced this problem, and it can be solved by adding a hook.

But unfortunately, when using '-mcmodel=extreme -mexplicit-relocs=always'

to test spec2006 403.gcc, an error will occur. Others have not been
tested yet.

I roughly debugged it, and the problem should be this:

The problem is that the address of the instruction ‘ldx.d $r12, $r25,
$r6’ is wrong.

Wrong assembly:

     5826 pcalau12i   $r13,%got_pc_hi20(recog_data)
   5827 addi.d  $r12,$r0,%got_pc_lo12(recog_data)
   5828 lu32i.d $r12,%got64_pc_lo20(recog_data)
   5829 lu52i.d $r12,$r12,%got64_pc_hi12(recog_data)
   5830 ldx.d   $r12,$r13,$r12
   5831 ld.b    $r8,$r12,997
   5832 .loc 1 829 18 discriminator 1 view .LVU1527
   5833 ble $r8,$r0,.L476
   5834 ld.d    $r6,$r3,16
   5835 ld.d    $r9,$r3,88
   5836 .LBB189 = .
   5837 .loc 1 839 24 view .LVU1528
   5838 alsl.d  $r7,$r19,$r19,2
   5839 ldx.d   $r12,$r25,$r6
   5840 addi.d  $r17,$r3,120
   5841 .LBE189 = .
   5842 .loc 1 829 18 discriminator 1 view .LVU1529
   5843 or  $r13,$r0,$r0
   5844 addi.d  $r4,$r12,992

Assembly that works fine using macros:

3040 la.global   $r12,$r13,recog_data
3041 ld.b    $r9,$r12,997
3042 ble $r9,$r0,.L475
3043 alsl.d  $r5,$r16,$r16,2
3044 la.global   $r15,$r17,recog_data
3045 addi.d  $r4,$r12,992
3046 addi.d  $r18,$r3,48
3047 or  $r13,$r0,$r0

Comparing the assembly, we can see that lines 5844 and 3045 have the
same function,

but there is a problem with the base address register optimization at
line 5844.

regrename.c.283r.loop2_init:

(insn 6 497 2741 34 (set (reg:DI 180 [ ivtmp.713D.15724 ])
  (const_int 0 [0])) "regrename.c":829:18 discrim 1 156
{*movdi_64bit}
(nil))
(insn 2741 6 2744 34 (parallel [
  (set (reg:DI 1502)
  (unspec:DI [
  (symbol_ref:DI ("recog_data") [flags 0xc0]
)
  ] UNSPEC_LA_PCREL_64_PART1))
  (set (reg/f:DI 1479)
  (unspec:DI [
  (symbol_ref:DI ("recog_data") [flags 0xc0]
)
  ] UNSPEC_LA_PCREL_64_PART2))
  ]) -1
   (expr_list:REG_UNUSED (reg/f:DI 1479)
(nil)))
(insn 2744 2741 2745 34 (set (reg/f:DI 1503)
  (mem:DI (plus:DI (reg/f:DI 1479)
  (reg:DI 1502)) [0  S8 A8])) 156 {*movdi_64bit}
   (expr_list:REG_EQUAL (symbol_ref:DI ("recog_data") [flags 0xc0]
)
(nil)))


Virtual register 1479 will be used in insn 2744, but register 1479 was
assigned the REG_UNUSED attribute in the previous instruction.

The attached file is the wrong file.
The compilation command is as follows:

$ ./gcc/cc1 -fpreprocessed regrename.i -quiet -dp -dumpbase regrename.c
-dumpbase-ext .c -mno-relax -mabi=lp64d -march=loongarch64 -mfpu=64
-msimd=lasx -mcmodel=extreme -mtune=loongarch64 -g3 -O2
-Wno-int-conversion -Wno-implicit-int -Wno-implicit-function-declaration
-Wno-incompatible-pointer-types -version -o regrename.s
-mexplicit-relocs=always -fdump-rtl-all-all

I've seen some "guality" test failure

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-17 Thread Xi Ruoyao
On Wed, 2024-01-17 at 17:38 +0800, chenglulu wrote:
> 
> 在 2024/1/13 下午9:05, Xi Ruoyao 写道:
> > 在 2024-01-13星期六的 15:01 +0800,chenglulu写道:
> > > 在 2024/1/12 下午7:42, Xi Ruoyao 写道:
> > > > 在 2024-01-12星期五的 09:46 +0800,chenglulu写道:
> > > > 
> > > > > > I found an issue bootstrapping GCC with -mcmodel=extreme in 
> > > > > > BOOT_CFLAGS:
> > > > > > we need a target hook to tell the generic code
> > > > > > UNSPEC_LA_PCREL_64_PART{1,2} are just a wrapper around symbols, or 
> > > > > > we'll
> > > > > > see millions lines of messages like
> > > > > > 
> > > > > > ../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
> > > > > > UNSPEC_LA_PCREL_64_PART1 (42) found in variable location
> > > > > I build GCC with -mcmodel=extreme in BOOT_CFLAGS, but I haven't 
> > > > > reproduced the problem you mentioned.
> > > > > 
> > > > >   $ ../configure --host=loongarch64-linux-gnu 
> > > > > --target=loongarch64-linux-gnu --build=loongarch64-linux-gnu \
> > > > >   --with-arch=loongarch64 --with-abi=lp64d --enable-tls 
> > > > > --enable-languages=c,c++,fortran,lto --enable-plugin \
> > > > >   --disable-multilib --disable-host-shared --enable-bootstrap 
> > > > > --enable-checking=release
> > > > >   $ make BOOT_FLAGS="-mcmodel=extreme"
> > > > > 
> > > > > What did I do wrong?:-(
> > > > BOOT_CFLAGS, not BOOT_FLAGS :).
> > > > 
> > > This is so strange. My compilation here stopped due to syntax problems,
> > > 
> > > and I still haven't reproduced the information you mentioned about
> > > UNSPEC_LA_PCREL_64_PART1.
> > I used:
> > 
> > ../gcc/configure --with-system-zlib --disable-fixincludes \
> >   --enable-default-ssp --enable-default-pie \
> >   --disable-werror --disable-multilib \
> >   --prefix=/home/xry111/gcc-dev
> > 
> > and then
> > 
> > make STAGE1_{C,CXX}FLAGS="-O2 -g" -j8 \
> >   BOOT_{C,CXX}FLAGS="-O2 -g -mcmodel=extreme" &| tee gcc-build.log
> > 
> > I guess "-g" is needed to reproduce the issue as well as the messages
> > were produced in dwarf generation.
> > 
> I have reproduced this problem, and it can be solved by adding a hook.
> 
> But unfortunately, when using '-mcmodel=extreme -mexplicit-relocs=always'
> 
> to test spec2006 403.gcc, an error will occur. Others have not been 
> tested yet.
> 
> I roughly debugged it, and the problem should be this:
> 
> The problem is that the address of the instruction ‘ldx.d $r12, $r25, 
> $r6’ is wrong.
> 
> Wrong assembly:
> 
>     5826 pcalau12i   $r13,%got_pc_hi20(recog_data)
>   5827 addi.d  $r12,$r0,%got_pc_lo12(recog_data)
>   5828 lu32i.d $r12,%got64_pc_lo20(recog_data)
>   5829 lu52i.d $r12,$r12,%got64_pc_hi12(recog_data)
>   5830 ldx.d   $r12,$r13,$r12
>   5831 ld.b    $r8,$r12,997
>   5832 .loc 1 829 18 discriminator 1 view .LVU1527
>   5833 ble $r8,$r0,.L476
>   5834 ld.d    $r6,$r3,16
>   5835 ld.d    $r9,$r3,88
>   5836 .LBB189 = .
>   5837 .loc 1 839 24 view .LVU1528
>   5838 alsl.d  $r7,$r19,$r19,2
>   5839 ldx.d   $r12,$r25,$r6
>   5840 addi.d  $r17,$r3,120
>   5841 .LBE189 = .
>   5842 .loc 1 829 18 discriminator 1 view .LVU1529
>   5843 or  $r13,$r0,$r0
>   5844 addi.d  $r4,$r12,992
> 
> Assembly that works fine using macros:
> 
> 3040 la.global   $r12,$r13,recog_data
> 3041 ld.b    $r9,$r12,997
> 3042 ble $r9,$r0,.L475
> 3043 alsl.d  $r5,$r16,$r16,2
> 3044 la.global   $r15,$r17,recog_data
> 3045 addi.d  $r4,$r12,992
> 3046 addi.d  $r18,$r3,48
> 3047 or  $r13,$r0,$r0
> 
> Comparing the assembly, we can see that lines 5844 and 3045 have the 
> same function,
> 
> but there is a problem with the base address register optimization at 
> line 5844.
> 
> regrename.c.283r.loop2_init:
> 
> (insn 6 497 2741 34 (set (reg:DI 180 [ ivtmp.713D.15724 ])
>  (const_int 0 [0])) "regrename.c":829:18 discrim 1 156 
> {*movdi_64bit}
> (nil))
> (insn 2741 6 2744 34 (parallel [
>  (set (reg:DI 1502)
>  (unspec:DI [
>  (symbol_ref:DI ("recog_data") [flags 0xc0]  
> )
>  ] UNSPEC_LA_PCREL_64_PART1))
>  (set (reg/f:DI 1479)
>  (unspec:DI [
>  (symbol_ref:DI ("recog_data") [flags 0xc0]  
> )
>  ] UNSPEC_LA_PCREL_64_PART2))
>  ]) -1
>   (expr_list:REG_UNUSED (reg/f:DI 1479)
> (nil)))
> (insn 2744 2741 2745 34 (set (reg/f:DI 1503)
>  (mem:DI (plus:DI (reg/f:DI 1479)
>  (reg:DI 1502)) [0  S8 A8])) 156 {*movdi_64bit}
>   (expr_list:REG_EQUAL (symbol_ref:DI ("recog_data") [flags 0xc0] 
> )
> (nil)))
> 
> 
> Virtual register 1479 will be used in insn 2744, but register 1479 was
> assigned the REG_UNUSED attribute in the previous instruction.
> 
> The attached file is the wrong file.

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-13 Thread chenglulu



在 2024/1/13 下午9:05, Xi Ruoyao 写道:

在 2024-01-13星期六的 15:01 +0800,chenglulu写道:

在 2024/1/12 下午7:42, Xi Ruoyao 写道:

在 2024-01-12星期五的 09:46 +0800,chenglulu写道:


I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS:
we need a target hook to tell the generic code
UNSPEC_LA_PCREL_64_PART{1,2} are just a wrapper around symbols, or we'll
see millions lines of messages like

../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
UNSPEC_LA_PCREL_64_PART1 (42) found in variable location

I build GCC with -mcmodel=extreme in BOOT_CFLAGS, but I haven't reproduced the 
problem you mentioned.

  $ ../configure --host=loongarch64-linux-gnu 
--target=loongarch64-linux-gnu --build=loongarch64-linux-gnu \
  --with-arch=loongarch64 --with-abi=lp64d --enable-tls 
--enable-languages=c,c++,fortran,lto --enable-plugin \
  --disable-multilib --disable-host-shared --enable-bootstrap 
--enable-checking=release
  $ make BOOT_FLAGS="-mcmodel=extreme"

What did I do wrong?:-(

BOOT_CFLAGS, not BOOT_FLAGS :).


This is so strange. My compilation here stopped due to syntax problems,

and I still haven't reproduced the information you mentioned about
UNSPEC_LA_PCREL_64_PART1.

I used:

../gcc/configure --with-system-zlib --disable-fixincludes \
  --enable-default-ssp --enable-default-pie \
  --disable-werror --disable-multilib \
  --prefix=/home/xry111/gcc-dev

and then

make STAGE1_{C,CXX}FLAGS="-O2 -g" -j8 \
  BOOT_{C,CXX}FLAGS="-O2 -g -mcmodel=extreme" &| tee gcc-build.log

I guess "-g" is needed to reproduce the issue as well as the messages
were produced in dwarf generation.


Oh, okay, I'll try this method!:-)







Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-13 Thread Xi Ruoyao
在 2024-01-13星期六的 15:01 +0800,chenglulu写道:
> 
> 在 2024/1/12 下午7:42, Xi Ruoyao 写道:
> > 在 2024-01-12星期五的 09:46 +0800,chenglulu写道:
> > 
> > > > I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS:
> > > > we need a target hook to tell the generic code
> > > > UNSPEC_LA_PCREL_64_PART{1,2} are just a wrapper around symbols, or we'll
> > > > see millions lines of messages like
> > > > 
> > > > ../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
> > > > UNSPEC_LA_PCREL_64_PART1 (42) found in variable location
> > > I build GCC with -mcmodel=extreme in BOOT_CFLAGS, but I haven't 
> > > reproduced the problem you mentioned.
> > > 
> > >  $ ../configure --host=loongarch64-linux-gnu 
> > > --target=loongarch64-linux-gnu --build=loongarch64-linux-gnu \
> > >  --with-arch=loongarch64 --with-abi=lp64d --enable-tls 
> > > --enable-languages=c,c++,fortran,lto --enable-plugin \
> > >  --disable-multilib --disable-host-shared --enable-bootstrap 
> > > --enable-checking=release
> > >  $ make BOOT_FLAGS="-mcmodel=extreme"
> > > 
> > > What did I do wrong?:-(
> > BOOT_CFLAGS, not BOOT_FLAGS :).
> > 
> This is so strange. My compilation here stopped due to syntax problems,
> 
> and I still haven't reproduced the information you mentioned about 
> UNSPEC_LA_PCREL_64_PART1.

I used:

../gcc/configure --with-system-zlib --disable-fixincludes \
 --enable-default-ssp --enable-default-pie \
 --disable-werror --disable-multilib \
 --prefix=/home/xry111/gcc-dev

and then

make STAGE1_{C,CXX}FLAGS="-O2 -g" -j8 \
 BOOT_{C,CXX}FLAGS="-O2 -g -mcmodel=extreme" &| tee gcc-build.log

I guess "-g" is needed to reproduce the issue as well as the messages
were produced in dwarf generation.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-12 Thread chenglulu



在 2024/1/12 下午7:42, Xi Ruoyao 写道:

在 2024-01-12星期五的 09:46 +0800,chenglulu写道:


I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS:
we need a target hook to tell the generic code
UNSPEC_LA_PCREL_64_PART{1,2} are just a wrapper around symbols, or we'll
see millions lines of messages like

../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
UNSPEC_LA_PCREL_64_PART1 (42) found in variable location

I build GCC with -mcmodel=extreme in BOOT_CFLAGS, but I haven't reproduced the 
problem you mentioned.

     $ ../configure --host=loongarch64-linux-gnu --target=loongarch64-linux-gnu 
--build=loongarch64-linux-gnu \
     --with-arch=loongarch64 --with-abi=lp64d --enable-tls 
--enable-languages=c,c++,fortran,lto --enable-plugin \
     --disable-multilib --disable-host-shared --enable-bootstrap 
--enable-checking=release
     $ make BOOT_FLAGS="-mcmodel=extreme"

What did I do wrong?:-(

BOOT_CFLAGS, not BOOT_FLAGS :).


This is so strange. My compilation here stopped due to syntax problems,

and I still haven't reproduced the information you mentioned about 
UNSPEC_LA_PCREL_64_PART1.





Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-12 Thread Xi Ruoyao
在 2024-01-12星期五的 09:46 +0800,chenglulu写道:

> > I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS:
> > we need a target hook to tell the generic code
> > UNSPEC_LA_PCREL_64_PART{1,2} are just a wrapper around symbols, or we'll
> > see millions lines of messages like
> > 
> > ../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
> > UNSPEC_LA_PCREL_64_PART1 (42) found in variable location
> 
> I build GCC with -mcmodel=extreme in BOOT_CFLAGS, but I haven't reproduced 
> the problem you mentioned.
> 
>     $ ../configure --host=loongarch64-linux-gnu 
> --target=loongarch64-linux-gnu --build=loongarch64-linux-gnu \
>     --with-arch=loongarch64 --with-abi=lp64d --enable-tls 
> --enable-languages=c,c++,fortran,lto --enable-plugin \
>     --disable-multilib --disable-host-shared --enable-bootstrap 
> --enable-checking=release
>     $ make BOOT_FLAGS="-mcmodel=extreme"
> 
> What did I do wrong?:-(

BOOT_CFLAGS, not BOOT_FLAGS :).

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-11 Thread chenglulu



I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS:
we need a target hook to tell the generic code
UNSPEC_LA_PCREL_64_PART{1,2} are just a wrapper around symbols, or we'll
see millions lines of messages like

../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
UNSPEC_LA_PCREL_64_PART1 (42) found in variable location

I build GCC with -mcmodel=extreme in BOOT_CFLAGS, but I haven't 
reproduced the problem you mentioned.


$../configure --host=loongarch64-linux-gnu 
--target=loongarch64-linux-gnu --build=loongarch64-linux-gnu \


    --with-arch=loongarch64 --with-abi=lp64d --enable-tls 
--enable-languages=c,c++,fortran,lto --enable-plugin \


    --disable-multilib --disable-host-shared --enable-bootstrap 
--enable-checking=release


    $ make BOOT_FLAGS="-mcmodel=extreme"

What did I do wrong?:-(




Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-05 Thread Xi Ruoyao
On Fri, 2024-01-05 at 20:45 +0800, chenglulu wrote:
> 
> 在 2024/1/5 下午7:55, Xi Ruoyao 写道:
> > On Fri, 2024-01-05 at 18:25 +0800, Xi Ruoyao wrote:
> > > On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote:
> > > > 在 2024/1/5 下午4:37, Xi Ruoyao 写道:
> > > > > On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
> > > > > >    bool
> > > > > >    loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
> > > > > >    {
> > > > > > +  /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be 
> > > > > > adjancent
> > > > > > + so that the linker can infer the PC of pcalau12i to apply 
> > > > > > relocations
> > > > > > + to lu32i.d and lu52i.d.  Otherwise, the results would be 
> > > > > > incorrect if
> > > > > > + these four instructions are not in the same 4KiB page.
> > > > > > + Therefore, macro instructions are used when cmodel=extreme.  
> > > > > > */
> > > > > > +  if (loongarch_symbol_extreme_p (type))
> > > > > > +    return false;
> > > > > I think this is a bit of strange.  With 
> > > > > -mexplicit-relocs={auto,always}
> > > > > we should still use explicit relocs, but coding all 4 instructions
> > > > > altogether as
> > > > > 
> > > > > "pcalau12i.d\t%1,%pc64_hi12(%2)\n\taddi.d\t%0,$r0,%pclo12(%2)\n\tlu32i.d\t%0,%pc64_lo20(%2)\n\tlu52i.d\t%0,%0,%pc64_hi12(%2)"
> > > > > 
> > > > > Give me several hours trying to implement this...
> > > > > 
> > > > I think there is no difference between macros and these instructions put
> > > > together. If implement it in a split form, I think I can try it through
> > > > TARGET_SCHED_MACRO_FUSION_PAIR_P
> > We don't need to split the insn.  We can just add a "large insn"
> > containing the assembly output we want.
> > 
> > See the attached patch.  Note that TLS LE/LD/GD needs a fix too because
> > they are basically an variation of GOT addressing.
> > 
> > I've ran some small tests and now trying to bootstrap GCC with -
> > mcmodel=extreme in BOOT_CFLAGS...
> > 
> > > There is a difference:
> > > 
> > > int x;
> > > int t() { return x; }
> > > 
> > > pcalau12i.d t0, %pc_hi20(x)
> > > addi.d t1, r0, %pc_lo12(x)
> > > lu32i.d t1, %pc64_lo20(x)
> > > lu52i.d t1, t1, %pc64_hi12(x)
> > > ldx.w a0, t0, t1
> > > 
> > > is slightly better than
> > > 
> > > pcalau12i.d t0, %pc_hi20(x)
> > > addi.d t1, r0, %pc_lo12(x)
> > > lu32i.d t1, %pc64_lo20(x)
> > > lu52i.d t1, t1, %pc64_hi12(x)
> > > addi.d t0, t0, t1
> > > ld.w a0, t0, 0
> > > 
> > > And generating macros when -mexplicit-relocs=always can puzzle people
> > > (it says "always" :-\ ).
> > > 
> Thumbs up! This method is much better than my method, I learned 
> something! grateful!
> But I still have to test the accuracy.

I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS:
we need a target hook to tell the generic code
UNSPEC_LA_PCREL_64_PART{1,2} are just a wrapper around symbols, or we'll
see millions lines of messages like

../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
UNSPEC_LA_PCREL_64_PART1 (42) found in variable location

diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index 4f89c4af323..410e1b5e693 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -10868,6 +10868,24 @@ loongarch_asm_code_end (void)
 #undef DUMP_FEATURE
 }
 
+static rtx loongarch_delegitimize_address (rtx op)
+{
+  if (GET_CODE (op) == UNSPEC)
+  {
+int unspec = XINT (op, 1);
+switch (unspec)
+  {
+  case UNSPEC_LA_PCREL_64_PART1:
+  case UNSPEC_LA_PCREL_64_PART2:
+   return XVECEXP (op, 0, 0);
+  default:
+   return op;
+  }
+  }
+
+  return op;
+}
+
 /* Initialize the GCC target structure.  */
 #undef TARGET_ASM_ALIGNED_HI_OP
 #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
@@ -11129,6 +11147,10 @@ loongarch_asm_code_end (void)
 #define TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT \
   loongarch_builtin_support_vector_misalignment
 
+#undef TARGET_DELEGITIMIZE_ADDRESS
+#define TARGET_DELEGITIMIZE_ADDRESS \
+  loongarch_delegitimize_address
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-loongarch.h"

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-05 Thread chenglulu



在 2024/1/5 下午7:55, Xi Ruoyao 写道:

On Fri, 2024-01-05 at 18:25 +0800, Xi Ruoyao wrote:

On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote:

在 2024/1/5 下午4:37, Xi Ruoyao 写道:

On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:

   bool
   loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
   {
+  /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjancent
+ so that the linker can infer the PC of pcalau12i to apply relocations
+ to lu32i.d and lu52i.d.  Otherwise, the results would be incorrect if
+ these four instructions are not in the same 4KiB page.
+ Therefore, macro instructions are used when cmodel=extreme.  */
+  if (loongarch_symbol_extreme_p (type))
+    return false;

I think this is a bit of strange.  With -mexplicit-relocs={auto,always}
we should still use explicit relocs, but coding all 4 instructions
altogether as

"pcalau12i.d\t%1,%pc64_hi12(%2)\n\taddi.d\t%0,$r0,%pclo12(%2)\n\tlu32i.d\t%0,%pc64_lo20(%2)\n\tlu52i.d\t%0,%0,%pc64_hi12(%2)"

Give me several hours trying to implement this...


I think there is no difference between macros and these instructions put
together. If implement it in a split form, I think I can try it through
TARGET_SCHED_MACRO_FUSION_PAIR_P

We don't need to split the insn.  We can just add a "large insn"
containing the assembly output we want.

See the attached patch.  Note that TLS LE/LD/GD needs a fix too because
they are basically an variation of GOT addressing.

I've ran some small tests and now trying to bootstrap GCC with -
mcmodel=extreme in BOOT_CFLAGS...


There is a difference:

int x;
int t() { return x; }

pcalau12i.d t0, %pc_hi20(x)
addi.d t1, r0, %pc_lo12(x)
lu32i.d t1, %pc64_lo20(x)
lu52i.d t1, t1, %pc64_hi12(x)
ldx.w a0, t0, t1

is slightly better than

pcalau12i.d t0, %pc_hi20(x)
addi.d t1, r0, %pc_lo12(x)
lu32i.d t1, %pc64_lo20(x)
lu52i.d t1, t1, %pc64_hi12(x)
addi.d t0, t0, t1
ld.w a0, t0, 0

And generating macros when -mexplicit-relocs=always can puzzle people
(it says "always" :-\ ).

Thumbs up! This method is much better than my method, I learned 
something! grateful!

But I still have to test the accuracy.



Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-05 Thread Xi Ruoyao
On Fri, 2024-01-05 at 18:25 +0800, Xi Ruoyao wrote:
> On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote:
> > 
> > 在 2024/1/5 下午4:37, Xi Ruoyao 写道:
> > > On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
> > > >   bool
> > > >   loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
> > > >   {
> > > > +  /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be 
> > > > adjancent
> > > > + so that the linker can infer the PC of pcalau12i to apply 
> > > > relocations
> > > > + to lu32i.d and lu52i.d.  Otherwise, the results would be 
> > > > incorrect if
> > > > + these four instructions are not in the same 4KiB page.
> > > > + Therefore, macro instructions are used when cmodel=extreme.  */
> > > > +  if (loongarch_symbol_extreme_p (type))
> > > > +    return false;
> > > I think this is a bit of strange.  With -mexplicit-relocs={auto,always}
> > > we should still use explicit relocs, but coding all 4 instructions
> > > altogether as
> > > 
> > > "pcalau12i.d\t%1,%pc64_hi12(%2)\n\taddi.d\t%0,$r0,%pclo12(%2)\n\tlu32i.d\t%0,%pc64_lo20(%2)\n\tlu52i.d\t%0,%0,%pc64_hi12(%2)"
> > > 
> > > Give me several hours trying to implement this...
> > > 
> > I think there is no difference between macros and these instructions put 
> > together. If implement it in a split form, I think I can try it through 
> > TARGET_SCHED_MACRO_FUSION_PAIR_P

We don't need to split the insn.  We can just add a "large insn"
containing the assembly output we want.

See the attached patch.  Note that TLS LE/LD/GD needs a fix too because
they are basically an variation of GOT addressing.

I've ran some small tests and now trying to bootstrap GCC with -
mcmodel=extreme in BOOT_CFLAGS...

> 
> There is a difference:
> 
> int x;
> int t() { return x; }
> 
> pcalau12i.d t0, %pc_hi20(x)
> addi.d t1, r0, %pc_lo12(x)
> lu32i.d t1, %pc64_lo20(x)
> lu52i.d t1, t1, %pc64_hi12(x)
> ldx.w a0, t0, t1
> 
> is slightly better than
> 
> pcalau12i.d t0, %pc_hi20(x)
> addi.d t1, r0, %pc_lo12(x)
> lu32i.d t1, %pc64_lo20(x)
> lu52i.d t1, t1, %pc64_hi12(x)
> addi.d t0, t0, t1
> ld.w a0, t0, 0
> 
> And generating macros when -mexplicit-relocs=always can puzzle people
> (it says "always" :-\ ).
> 

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University
From f6f75b1fd2dbd30255f127f59d16a2683fa22d58 Mon Sep 17 00:00:00 2001
From: Xi Ruoyao 
Date: Fri, 5 Jan 2024 18:40:06 +0800
Subject: [PATCH] LoongArch: Don't split the instructions containing relocs for
 extreme code model

The ABI mandates the pcalau12i/addi.d/lu32i.d/lu52i.d instructions for
addressing a symbol to be adjacent.  So model them as "one large
instruction", i.e. define_insn, with two output registers.  The real
address is the sum of these two registers.

The advantage of this approach is the RTL passes can still use ldx/stx
instructions to skip an addi.d instruction.

gcc/ChangeLog:

	* config/loongarch/loongarch.md (unspec): Add
	UNSPEC_LA_PCREL_64_PART1 and UNSPEC_LA_PCREL_64_PART2.
	(la_pcrel64_two_parts): New define_insn.
	* config/loongarch/loongarch.cc (loongarch_tls_symbol): Fix a
	typo in the comment.
	(loongarch_call_tls_get_addr): If TARGET_CMODEL_EXTREME, use
	la_pcrel64_two_parts for addressing the TLS symbol and
	__tls_get_addr.
	(loongarch_legitimize_tls_address): If TARGET_CMODEL_EXTREME,
	address TLS IE symbols with la_pcrel64_two_parts.
	(loongarch_split_symbol): If TARGET_CMODEL_EXTREME, address
	symbols with la_pcrel64_two_parts.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/func-call-extreme-1.c (dg-options):
	Use -O2 instead of -O0 to ensure the pcalau12i/addi/lu32i/lu52i
	instruction sequences are not reordered by the compiler.
	(NOIPA): Disallow interprocedural optimizations.
	* gcc.target/loongarch/func-call-extreme-2.c: Remove the content
	duplicated from func-call-extreme-1.c, include it instead.
	(dg-options): Likewise.
	* gcc.target/loongarch/func-call-extreme-3.c (dg-options):
	Likewise.
	* gcc.target/loongarch/func-call-extreme-4.c (dg-options):
	Likewise.
	* gcc.target/loongarch/cmodel-extreme-1.c: New test.
	* gcc.target/loongarch/cmodel-extreme-2.c: New test.
---
 gcc/config/loongarch/loongarch.cc | 100 +-
 gcc/config/loongarch/loongarch.md |  21 
 .../gcc.target/loongarch/cmodel-extreme-1.c   |  18 
 .../gcc.target/loongarch/cmodel-extreme-2.c   |   7 ++
 .../loongarch/func-call-extreme-1.c   |  14 +--
 .../loongarch/func-call-extreme-2.c   |  29 +
 .../loongarch/func-call-extreme-3.c   |   2 +-
 .../loongarch/func-call-extreme-4.c   |   2 +-
 8 files changed, 109 insertions(+), 84 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/cmodel-extreme-1.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/cmodel-extreme-2.c

diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index db83232884f..7c01169b422 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/lo

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-05 Thread Xi Ruoyao
On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote:
> 
> 在 2024/1/5 下午4:37, Xi Ruoyao 写道:
> > On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
> > >   bool
> > >   loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
> > >   {
> > > +  /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be 
> > > adjancent
> > > + so that the linker can infer the PC of pcalau12i to apply 
> > > relocations
> > > + to lu32i.d and lu52i.d.  Otherwise, the results would be incorrect 
> > > if
> > > + these four instructions are not in the same 4KiB page.
> > > + Therefore, macro instructions are used when cmodel=extreme.  */
> > > +  if (loongarch_symbol_extreme_p (type))
> > > +    return false;
> > I think this is a bit of strange.  With -mexplicit-relocs={auto,always}
> > we should still use explicit relocs, but coding all 4 instructions
> > altogether as
> > 
> > "pcalau12i.d\t%1,%pc64_hi12(%2)\n\taddi.d\t%0,$r0,%pclo12(%2)\n\tlu32i.d\t%0,%pc64_lo20(%2)\n\tlu52i.d\t%0,%0,%pc64_hi12(%2)"
> > 
> > Give me several hours trying to implement this...
> > 
> I think there is no difference between macros and these instructions put 
> together. If implement it in a split form, I think I can try it through 
> TARGET_SCHED_MACRO_FUSION_PAIR_P

There is a difference:

int x;
int t() { return x; }

pcalau12i.d t0, %pc_hi20(x)
addi.d t1, r0, %pc_lo12(x)
lu32i.d t1, %pc64_lo20(x)
lu52i.d t1, t1, %pc64_hi12(x)
ldx.w a0, t0, t1

is slightly better than

pcalau12i.d t0, %pc_hi20(x)
addi.d t1, r0, %pc_lo12(x)
lu32i.d t1, %pc64_lo20(x)
lu52i.d t1, t1, %pc64_hi12(x)
addi.d t0, t0, t1
ld.w a0, t0, 0

And generating macros when -mexplicit-relocs=always can puzzle people
(it says "always" :-\ ).

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-05 Thread chenglulu



在 2024/1/5 下午4:37, Xi Ruoyao 写道:

On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:

  bool
  loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
  {
+  /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjancent
+ so that the linker can infer the PC of pcalau12i to apply relocations
+ to lu32i.d and lu52i.d.  Otherwise, the results would be incorrect if
+ these four instructions are not in the same 4KiB page.
+ Therefore, macro instructions are used when cmodel=extreme.  */
+  if (loongarch_symbol_extreme_p (type))
+    return false;

I think this is a bit of strange.  With -mexplicit-relocs={auto,always}
we should still use explicit relocs, but coding all 4 instructions
altogether as

"pcalau12i.d\t%1,%pc64_hi12(%2)\n\taddi.d\t%0,$r0,%pclo12(%2)\n\tlu32i.d\t%0,%pc64_lo20(%2)\n\tlu52i.d\t%0,%0,%pc64_hi12(%2)"

Give me several hours trying to implement this...

I think there is no difference between macros and these instructions put 
together. If implement it in a split form, I think I can try it through 
TARGET_SCHED_MACRO_FUSION_PAIR_P




Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-05 Thread chenglulu



在 2024/1/5 下午4:37, Xi Ruoyao 写道:

On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:

  bool
  loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
  {
+  /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjancent
+ so that the linker can infer the PC of pcalau12i to apply relocations
+ to lu32i.d and lu52i.d.  Otherwise, the results would be incorrect if
+ these four instructions are not in the same 4KiB page.
+ Therefore, macro instructions are used when cmodel=extreme.  */
+  if (loongarch_symbol_extreme_p (type))
+    return false;

I think this is a bit of strange.  With -mexplicit-relocs={auto,always}
we should still use explicit relocs, but coding all 4 instructions
altogether as

"pcalau12i.d\t%1,%pc64_hi12(%2)\n\taddi.d\t%0,$r0,%pclo12(%2)\n\tlu32i.d\t%0,%pc64_lo20(%2)\n\tlu52i.d\t%0,%0,%pc64_hi12(%2)"

Give me several hours trying to implement this...


You mean to take the last add directive out separately?



Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-05 Thread Xi Ruoyao
On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
>  bool
>  loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
>  {
> +  /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjancent
> + so that the linker can infer the PC of pcalau12i to apply relocations
> + to lu32i.d and lu52i.d.  Otherwise, the results would be incorrect if
> + these four instructions are not in the same 4KiB page.
> + Therefore, macro instructions are used when cmodel=extreme.  */
> +  if (loongarch_symbol_extreme_p (type))
> +    return false;

I think this is a bit of strange.  With -mexplicit-relocs={auto,always}
we should still use explicit relocs, but coding all 4 instructions
altogether as

"pcalau12i.d\t%1,%pc64_hi12(%2)\n\taddi.d\t%0,$r0,%pclo12(%2)\n\tlu32i.d\t%0,%pc64_lo20(%2)\n\tlu52i.d\t%0,%0,%pc64_hi12(%2)"

Give me several hours trying to implement this...

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


[PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-04 Thread Lulu Cheng
Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjancent so that 
the
linker can infer the PC of pcalau12i to apply relocations to lu32i.d and 
lu52i.d.
Otherwise, the results would be incorrect if these four instructions are not in
the same 4KiB page.

See the link for details:
https://github.com/loongson/la-abi-specs/blob/release/laelf.adoc#extreme-code-model.

gcc/ChangeLog:

* config/loongarch/loongarch.cc (loongarch_symbol_extreme_p): Add
function declaration.
(loongarch_explicit_relocs_p): Use the macro instruction to get
the symbol address when loongarch_symbol_extreme_p returns true.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/attr-model-1.c: Modify the content of the search
string in the test case.
* gcc.target/loongarch/attr-model-2.c: Likewise.
* gcc.target/loongarch/attr-model-3.c: Likewise.
* gcc.target/loongarch/attr-model-4.c: Likewise.
* gcc.target/loongarch/func-call-extreme-1.c: Likewise.
* gcc.target/loongarch/func-call-extreme-2.c: Likewise.
* gcc.target/loongarch/func-call-extreme-3.c: Likewise.
* gcc.target/loongarch/func-call-extreme-4.c: Likewise.
---
 gcc/config/loongarch/loongarch.cc | 11 +++
 gcc/testsuite/gcc.target/loongarch/attr-model-1.c |  2 +-
 gcc/testsuite/gcc.target/loongarch/attr-model-2.c |  2 +-
 gcc/testsuite/gcc.target/loongarch/attr-model-3.c |  2 +-
 gcc/testsuite/gcc.target/loongarch/attr-model-4.c |  2 +-
 .../gcc.target/loongarch/func-call-extreme-1.c|  6 +++---
 .../gcc.target/loongarch/func-call-extreme-2.c|  6 +++---
 .../gcc.target/loongarch/func-call-extreme-3.c|  6 +++---
 .../gcc.target/loongarch/func-call-extreme-4.c|  6 +++---
 9 files changed, 27 insertions(+), 16 deletions(-)

diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index 6a3321327ea..3b4b28f3bcc 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -264,6 +264,9 @@ const char *const
 loongarch_fp_conditions[16]= {LARCH_FP_CONDITIONS (STRINGIFY)};
 #undef STRINGIFY
 
+static bool
+loongarch_symbol_extreme_p (enum loongarch_symbol_type type);
+
 /* Size of guard page.  */
 #define STACK_CLASH_PROTECTION_GUARD_SIZE \
   (1 << param_stack_clash_protection_guard_size)
@@ -1963,6 +1966,14 @@ loongarch_symbolic_constant_p (rtx x, enum 
loongarch_symbol_type *symbol_type)
 bool
 loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
 {
+  /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjancent
+ so that the linker can infer the PC of pcalau12i to apply relocations
+ to lu32i.d and lu52i.d.  Otherwise, the results would be incorrect if
+ these four instructions are not in the same 4KiB page.
+ Therefore, macro instructions are used when cmodel=extreme.  */
+  if (loongarch_symbol_extreme_p (type))
+return false;
+
   if (la_opt_explicit_relocs != EXPLICIT_RELOCS_AUTO)
 return la_opt_explicit_relocs == EXPLICIT_RELOCS_ALWAYS;
 
diff --git a/gcc/testsuite/gcc.target/loongarch/attr-model-1.c 
b/gcc/testsuite/gcc.target/loongarch/attr-model-1.c
index 916d715b98b..65acb29162c 100644
--- a/gcc/testsuite/gcc.target/loongarch/attr-model-1.c
+++ b/gcc/testsuite/gcc.target/loongarch/attr-model-1.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-mexplicit-relocs -mcmodel=normal -O2" } */
-/* { dg-final { scan-assembler-times "%pc64_hi12" 2 } } */
+/* { dg-final { scan-assembler-times "la\.local\t\\\$r\[0-9\]+,\\\$r15," 2 } } 
*/
 
 #define ATTR_MODEL_TEST
 #include "attr-model-test.c"
diff --git a/gcc/testsuite/gcc.target/loongarch/attr-model-2.c 
b/gcc/testsuite/gcc.target/loongarch/attr-model-2.c
index a74c795ac3e..cf0f079e39a 100644
--- a/gcc/testsuite/gcc.target/loongarch/attr-model-2.c
+++ b/gcc/testsuite/gcc.target/loongarch/attr-model-2.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-mexplicit-relocs -mcmodel=extreme -O2" } */
-/* { dg-final { scan-assembler-times "%pc64_hi12" 3 } } */
+/* { dg-final { scan-assembler-times "la\.local\t\\\$r\[0-9\]+,\\\$r15," 3 } } 
*/
 
 #define ATTR_MODEL_TEST
 #include "attr-model-test.c"
diff --git a/gcc/testsuite/gcc.target/loongarch/attr-model-3.c 
b/gcc/testsuite/gcc.target/loongarch/attr-model-3.c
index 5622d508678..7c270d462f7 100644
--- a/gcc/testsuite/gcc.target/loongarch/attr-model-3.c
+++ b/gcc/testsuite/gcc.target/loongarch/attr-model-3.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-mexplicit-relocs=auto -mcmodel=normal -O2" } */
-/* { dg-final { scan-assembler-times "%pc64_hi12" 2 } } */
+/* { dg-final { scan-assembler-times "la\.local\t\\\$r\[0-9\]+,\\\$r15," 2 } } 
*/
 
 #define ATTR_MODEL_TEST
 #include "attr-model-test.c"
diff --git a/gcc/testsuite/gcc.target/loongarch/attr-model-4.c 
b/gcc/testsuite/gcc.target/loongarch/attr-model-4.c
index 482724bb974..627d630c36d 100644
--- a/gcc/testsuite/gcc.tar