Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.
On Thu, 2024-01-25 at 08:48 +0800, chenglulu wrote:
> On 2024/1/24 at 3:36 AM, Xi Ruoyao wrote:
> > On Mon, 2024-01-22 at 15:27 +0800, chenglulu wrote:
> > > > > This test case failed because the compiler believes that two
> > > > > (UNSPEC_LA_PCREL_64_PART2 [(symbol)]) instances would always
> > > > > produce the same result, but this isn't true because the result
> > > > > depends on PC. Thus (pc) needed to be included in the RTX, like:
> > > > >
> > > > >   [(set (match_operand:DI 0 "register_operand" "=r")
> > > > >         (unspec:DI [(match_operand:DI 2 "") (pc)]
> > > > >                    UNSPEC_LA_PCREL_64_PART1))
> > > > >    (set (match_operand:DI 1 "register_operand" "=r")
> > > > >         (unspec:DI [(match_dup 2) (pc)] UNSPEC_LA_PCREL_64_PART2))]
> > > > >
> > > > > With this the buggy REG_UNUSED notes were gone. But it then prevented
> > > > > the CSE when loading the address of __tls_get_addr (i.e. if we address
> > > > > 10 TLS LD symbols in a function it would emit 10 instances of
> > > > > "la.global __tls_get_addr") so I added a REG_EQUAL note for it. For
> > > > > symbols other than __tls_get_addr such notes are added automatically
> > > > > by optimization passes.
> > > > >
> > > > > Updated patch attached.
> > > >
> > > > I'm eliminating redundant la.global instructions in my macro
> > > > implementation.
> > > >
> > > > I will be testing this patch.
> > >
> > > With this patch, spec2006 can pass the test, but spec2017 621 and 654
> > > tests fail.
> > > I haven't debugged the specific cause of the problem yet.
> >
> > Try removing the TARGET_DELEGITIMIZE_ADDRESS hook? After eating some
> > unhealthy food at midnight I realized the hook only papers over the
> > same issue that caused the spec2006 failure. I tried a bootstrap
> > with BOOT_CFLAGS="-O2 -g -mcmodel=extreme" and TARGET_DELEGITIMIZE_ADDRESS
> > commented out, and there are no more spurious "note: non-delegitimized
> > UNSPEC UNSPEC_LA_PCREL_64_PART1 (42) found in variable location" messages.
> > I feel that this hook is still written in a buggy way, so maybe removing
> > it will solve the spec2017 issue.
>
> I found the problem. Binutils did not consider the four-instruction
> sequence when converting TLS IE to TLS LE, which caused the conversion
> error.

Oooops. We'd better fix this quickly as the Binutils 2.42 release is
imminent. Maybe we can just disable TLS linker optimization once we see
an R_LARCH_TLS_DESC64* or R_LARCH_TLS_IE64*.

-- 
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
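The workaround proposed at the end of this message can be pictured with a small toy model. It is only a sketch, not Binutils code: the function name and the list-of-strings input are hypothetical, and only the two relocation name prefixes come from the message itself. The relocation names in the usage part are assumed to be the usual LoongArch ones.

```python
# Prefixes named in the message; a relocation whose name starts with one of
# them belongs to a four-instruction (extreme code model) TLS sequence.
EXTREME_TLS_PREFIXES = ("R_LARCH_TLS_DESC64", "R_LARCH_TLS_IE64")


def tls_relaxation_allowed(reloc_names):
    """Return False if any relocation marks an extreme-model TLS access.

    `reloc_names` is a hypothetical list of relocation type names; a real
    linker would walk ELF relocation records instead of strings.
    """
    return not any(name.startswith(EXTREME_TLS_PREFIXES) for name in reloc_names)


if __name__ == "__main__":
    normal = ["R_LARCH_TLS_IE_PC_HI20", "R_LARCH_TLS_IE_PC_LO12"]
    extreme = normal + ["R_LARCH_TLS_IE64_PC_LO20", "R_LARCH_TLS_IE64_PC_HI12"]
    print(tls_relaxation_allowed(normal))   # two-instruction sequence
    print(tls_relaxation_allowed(extreme))  # four-instruction sequence
```

The check is deliberately coarse: one 64-bit TLS relocation disables the optimization for everything passed in, which matches the "just disable it once we see one" wording rather than a per-sequence decision.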
Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.
On 2024/1/24 at 3:36 AM, Xi Ruoyao wrote:
> On Mon, 2024-01-22 at 15:27 +0800, chenglulu wrote:
> > > > This test case failed because the compiler believes that two
> > > > (UNSPEC_LA_PCREL_64_PART2 [(symbol)]) instances would always
> > > > produce the same result, but this isn't true because the result
> > > > depends on PC. Thus (pc) needed to be included in the RTX, like:
> > > >
> > > >   [(set (match_operand:DI 0 "register_operand" "=r")
> > > >         (unspec:DI [(match_operand:DI 2 "") (pc)]
> > > >                    UNSPEC_LA_PCREL_64_PART1))
> > > >    (set (match_operand:DI 1 "register_operand" "=r")
> > > >         (unspec:DI [(match_dup 2) (pc)] UNSPEC_LA_PCREL_64_PART2))]
> > > >
> > > > With this the buggy REG_UNUSED notes were gone. But it then prevented
> > > > the CSE when loading the address of __tls_get_addr (i.e. if we address
> > > > 10 TLS LD symbols in a function it would emit 10 instances of
> > > > "la.global __tls_get_addr") so I added a REG_EQUAL note for it. For
> > > > symbols other than __tls_get_addr such notes are added automatically
> > > > by optimization passes.
> > > >
> > > > Updated patch attached.
> > >
> > > I'm eliminating redundant la.global instructions in my macro
> > > implementation.
> > >
> > > I will be testing this patch.
> >
> > With this patch, spec2006 can pass the test, but spec2017 621 and 654
> > tests fail.
> > I haven't debugged the specific cause of the problem yet.
>
> Try removing the TARGET_DELEGITIMIZE_ADDRESS hook? After eating some
> unhealthy food at midnight I realized the hook only papers over the
> same issue that caused the spec2006 failure. I tried a bootstrap
> with BOOT_CFLAGS="-O2 -g -mcmodel=extreme" and TARGET_DELEGITIMIZE_ADDRESS
> commented out, and there are no more spurious "note: non-delegitimized
> UNSPEC UNSPEC_LA_PCREL_64_PART1 (42) found in variable location" messages.
> I feel that this hook is still written in a buggy way, so maybe removing
> it will solve the spec2017 issue.

I found the problem. Binutils did not consider the four-instruction
sequence when converting TLS IE to TLS LE, which caused the conversion
error.
Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.
On Mon, 2024-01-22 at 15:27 +0800, chenglulu wrote:
> > > This test case failed because the compiler believes that two
> > > (UNSPEC_LA_PCREL_64_PART2 [(symbol)]) instances would always produce
> > > the same result, but this isn't true because the result depends on
> > > PC. Thus (pc) needed to be included in the RTX, like:
> > >
> > >   [(set (match_operand:DI 0 "register_operand" "=r")
> > >         (unspec:DI [(match_operand:DI 2 "") (pc)]
> > >                    UNSPEC_LA_PCREL_64_PART1))
> > >    (set (match_operand:DI 1 "register_operand" "=r")
> > >         (unspec:DI [(match_dup 2) (pc)] UNSPEC_LA_PCREL_64_PART2))]
> > >
> > > With this the buggy REG_UNUSED notes were gone. But it then prevented
> > > the CSE when loading the address of __tls_get_addr (i.e. if we address
> > > 10 TLS LD symbols in a function it would emit 10 instances of
> > > "la.global __tls_get_addr") so I added a REG_EQUAL note for it. For
> > > symbols other than __tls_get_addr such notes are added automatically
> > > by optimization passes.
> > >
> > > Updated patch attached.
> >
> > I'm eliminating redundant la.global instructions in my macro
> > implementation.
> >
> > I will be testing this patch.
>
> With this patch, spec2006 can pass the test, but spec2017 621 and 654
> tests fail.
> I haven't debugged the specific cause of the problem yet.

Try removing the TARGET_DELEGITIMIZE_ADDRESS hook? After eating some
unhealthy food at midnight I realized the hook only papers over the
same issue that caused the spec2006 failure. I tried a bootstrap
with BOOT_CFLAGS="-O2 -g -mcmodel=extreme" and TARGET_DELEGITIMIZE_ADDRESS
commented out, and there are no more spurious "note: non-delegitimized
UNSPEC UNSPEC_LA_PCREL_64_PART1 (42) found in variable location" messages.
I feel that this hook is still written in a buggy way, so maybe removing
it will solve the spec2017 issue.

-- 
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.
On 2024/1/19 at 4:51 PM, chenglulu wrote:
> On 2024/1/19 at 1:46 PM, Xi Ruoyao wrote:
> > On Wed, 2024-01-17 at 17:57 +0800, chenglulu wrote:
> > > > > Virtual register 1479 is used in insn 2744, but register 1479 was
> > > > > given a REG_UNUSED note in the previous instruction.
> > > > >
> > > > > The attached file is the file that goes wrong.
> > > > > The compilation command is as follows:
> > > > >
> > > > > $ ./gcc/cc1 -fpreprocessed regrename.i -quiet -dp -dumpbase regrename.c \
> > > > >     -dumpbase-ext .c -mno-relax -mabi=lp64d -march=loongarch64 -mfpu=64 \
> > > > >     -msimd=lasx -mcmodel=extreme -mtune=loongarch64 -g3 -O2 \
> > > > >     -Wno-int-conversion -Wno-implicit-int -Wno-implicit-function-declaration \
> > > > >     -Wno-incompatible-pointer-types -version -o regrename.s \
> > > > >     -mexplicit-relocs=always -fdump-rtl-all-all
> > > >
> > > > I've seen some "guality" test failures in the GCC test suite as well.
> > > > Normally I just ignore the guality failures but this time they look
> > > > very suspicious. I'll investigate these issues...
> > >
> > > I've also seen this type of regression test failure, and I'll continue
> > > to look at this issue as well.
> >
> > The guality regression is simple: I didn't call
> > delegitimize_mem_from_attrs (the default TARGET_DELEGITIMIZE_ADDRESS) in
> > the custom implementation.
> >
> > This test case failed because the compiler believes that two
> > (UNSPEC_LA_PCREL_64_PART2 [(symbol)]) instances would always produce
> > the same result, but this isn't true because the result depends on PC.
> > Thus (pc) needed to be included in the RTX, like:
> >
> >   [(set (match_operand:DI 0 "register_operand" "=r")
> >         (unspec:DI [(match_operand:DI 2 "") (pc)]
> >                    UNSPEC_LA_PCREL_64_PART1))
> >    (set (match_operand:DI 1 "register_operand" "=r")
> >         (unspec:DI [(match_dup 2) (pc)] UNSPEC_LA_PCREL_64_PART2))]
> >
> > With this the buggy REG_UNUSED notes were gone. But it then prevented
> > the CSE when loading the address of __tls_get_addr (i.e. if we address
> > 10 TLS LD symbols in a function it would emit 10 instances of
> > "la.global __tls_get_addr") so I added a REG_EQUAL note for it. For
> > symbols other than __tls_get_addr such notes are added automatically
> > by optimization passes.
> >
> > Updated patch attached.
>
> I'm eliminating redundant la.global instructions in my macro
> implementation.
>
> I will be testing this patch.

With this patch, spec2006 can pass the test, but spec2017 621 and 654
tests fail.
I haven't debugged the specific cause of the problem yet.
Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.
On 2024/1/19 at 1:46 PM, Xi Ruoyao wrote:
> On Wed, 2024-01-17 at 17:57 +0800, chenglulu wrote:
> > > > Virtual register 1479 is used in insn 2744, but register 1479 was
> > > > given a REG_UNUSED note in the previous instruction.
> > > >
> > > > The attached file is the file that goes wrong.
> > > > The compilation command is as follows:
> > > >
> > > > $ ./gcc/cc1 -fpreprocessed regrename.i -quiet -dp -dumpbase regrename.c \
> > > >     -dumpbase-ext .c -mno-relax -mabi=lp64d -march=loongarch64 -mfpu=64 \
> > > >     -msimd=lasx -mcmodel=extreme -mtune=loongarch64 -g3 -O2 \
> > > >     -Wno-int-conversion -Wno-implicit-int -Wno-implicit-function-declaration \
> > > >     -Wno-incompatible-pointer-types -version -o regrename.s \
> > > >     -mexplicit-relocs=always -fdump-rtl-all-all
> > >
> > > I've seen some "guality" test failures in the GCC test suite as well.
> > > Normally I just ignore the guality failures but this time they look
> > > very suspicious. I'll investigate these issues...
> >
> > I've also seen this type of regression test failure, and I'll continue
> > to look at this issue as well.
>
> The guality regression is simple: I didn't call
> delegitimize_mem_from_attrs (the default TARGET_DELEGITIMIZE_ADDRESS) in
> the custom implementation.
>
> This test case failed because the compiler believes that two
> (UNSPEC_LA_PCREL_64_PART2 [(symbol)]) instances would always produce
> the same result, but this isn't true because the result depends on PC.
> Thus (pc) needed to be included in the RTX, like:
>
>   [(set (match_operand:DI 0 "register_operand" "=r")
>         (unspec:DI [(match_operand:DI 2 "") (pc)]
>                    UNSPEC_LA_PCREL_64_PART1))
>    (set (match_operand:DI 1 "register_operand" "=r")
>         (unspec:DI [(match_dup 2) (pc)] UNSPEC_LA_PCREL_64_PART2))]
>
> With this the buggy REG_UNUSED notes were gone. But it then prevented
> the CSE when loading the address of __tls_get_addr (i.e. if we address
> 10 TLS LD symbols in a function it would emit 10 instances of
> "la.global __tls_get_addr") so I added a REG_EQUAL note for it. For
> symbols other than __tls_get_addr such notes are added automatically
> by optimization passes.
>
> Updated patch attached.

I'm eliminating redundant la.global instructions in my macro
implementation.

I will be testing this patch.
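The guality remark quoted above (the custom hook forgot to call the default one) can be illustrated with a toy model. This is a sketch under stated assumptions, not the code from the patch: expressions are modelled as plain tuples, `default_delegitimize` is only a stand-in for delegitimize_mem_from_attrs, and the `mem_attrs` wrapper it strips is invented for the illustration.

```python
# The two unspec names from the thread that merely wrap a symbol.
UNSPEC_WRAPPERS = {"UNSPEC_LA_PCREL_64_PART1", "UNSPEC_LA_PCREL_64_PART2"}


def default_delegitimize(expr):
    """Stand-in for the generic default hook (hypothetical behaviour):
    strip an invented ('mem_attrs', inner) wrapper if present."""
    if isinstance(expr, tuple) and expr and expr[0] == "mem_attrs":
        return expr[1]
    return expr


def custom_delegitimize(expr):
    """Run the default first, then unwrap ('unspec', symbol, name)."""
    expr = default_delegitimize(expr)
    if (isinstance(expr, tuple) and len(expr) == 3
            and expr[0] == "unspec" and expr[2] in UNSPEC_WRAPPERS):
        return expr[1]  # the wrapped symbol
    return expr


if __name__ == "__main__":
    sym = ("symbol_ref", "recog_data")
    wrapped = ("mem_attrs", ("unspec", sym, "UNSPEC_LA_PCREL_64_PART1"))
    print(custom_delegitimize(wrapped))
```

The point of the ordering is that a target-specific hook replaces the default rather than extending it, so whatever the default would have simplified is lost unless the custom hook calls it explicitly.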
Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.
On Wed, 2024-01-17 at 17:57 +0800, chenglulu wrote:
> > > Virtual register 1479 is used in insn 2744, but register 1479 was
> > > given a REG_UNUSED note in the previous instruction.
> > >
> > > The attached file is the file that goes wrong.
> > > The compilation command is as follows:
> > >
> > > $ ./gcc/cc1 -fpreprocessed regrename.i -quiet -dp -dumpbase regrename.c \
> > >     -dumpbase-ext .c -mno-relax -mabi=lp64d -march=loongarch64 -mfpu=64 \
> > >     -msimd=lasx -mcmodel=extreme -mtune=loongarch64 -g3 -O2 \
> > >     -Wno-int-conversion -Wno-implicit-int -Wno-implicit-function-declaration \
> > >     -Wno-incompatible-pointer-types -version -o regrename.s \
> > >     -mexplicit-relocs=always -fdump-rtl-all-all
> >
> > I've seen some "guality" test failures in the GCC test suite as well.
> > Normally I just ignore the guality failures but this time they look very
> > suspicious. I'll investigate these issues...
>
> I've also seen this type of regression test failure, and I'll continue
> to look at this issue as well.

The guality regression is simple: I didn't call
delegitimize_mem_from_attrs (the default TARGET_DELEGITIMIZE_ADDRESS) in
the custom implementation.

This test case failed because the compiler believes that two
(UNSPEC_LA_PCREL_64_PART2 [(symbol)]) instances would always produce the
same result, but this isn't true because the result depends on PC. Thus
(pc) needed to be included in the RTX, like:

  [(set (match_operand:DI 0 "register_operand" "=r")
        (unspec:DI [(match_operand:DI 2 "") (pc)]
                   UNSPEC_LA_PCREL_64_PART1))
   (set (match_operand:DI 1 "register_operand" "=r")
        (unspec:DI [(match_dup 2) (pc)] UNSPEC_LA_PCREL_64_PART2))]

With this the buggy REG_UNUSED notes were gone. But it then prevented
the CSE when loading the address of __tls_get_addr (i.e. if we address
10 TLS LD symbols in a function it would emit 10 instances of "la.global
__tls_get_addr") so I added a REG_EQUAL note for it. For symbols other
than __tls_get_addr such notes are added automatically by optimization
passes.

Updated patch attached.

-- 
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University

From e9d789f8dcb52984b0f894fdecc402a49c5ad6d7 Mon Sep 17 00:00:00 2001
From: Xi Ruoyao
Date: Fri, 5 Jan 2024 18:40:06 +0800
Subject: [PATCH v2] LoongArch: Don't split the instructions containing relocs
 for extreme code model

The ABI mandates the pcalau12i/addi.d/lu32i.d/lu52i.d instructions for
addressing a symbol to be adjacent. So model them as "one large
instruction", i.e. define_insn, with two output registers. The real
address is the sum of these two registers.

The advantage of this approach is the RTL passes can still use ldx/stx
instructions to skip an addi.d instruction.

gcc/ChangeLog:

	* config/loongarch/loongarch.md (unspec): Add
	UNSPEC_LA_PCREL_64_PART1 and UNSPEC_LA_PCREL_64_PART2.
	(la_pcrel64_two_parts): New define_insn.
	* config/loongarch/loongarch.cc (loongarch_tls_symbol): Fix a
	typo in the comment.
	(loongarch_call_tls_get_addr): If TARGET_CMODEL_EXTREME, use
	la_pcrel64_two_parts for addressing the TLS symbol and
	__tls_get_addr. Emit an REG_EQUAL note to allow CSE addressing
	__tls_get_addr.
	(loongarch_legitimize_tls_address): If TARGET_CMODEL_EXTREME,
	address TLS IE symbols with la_pcrel64_two_parts.
	(loongarch_split_symbol): If TARGET_CMODEL_EXTREME, address
	symbols with la_pcrel64_two_parts.
	(TARGET_DELEGITIMIZE_ADDRESS): Define.
	(loongarch_delegitimize_address): Implement the target hook.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/func-call-extreme-1.c (dg-options): Use
	-O2 instead of -O0 to ensure the pcalau12i/addi/lu32i/lu52i
	instruction sequences are not reordered by the compiler.
	(NOIPA): Disallow interprocedural optimizations.
	* gcc.target/loongarch/func-call-extreme-2.c: Remove the content
	duplicated from func-call-extreme-1.c, include it instead.
	(dg-options): Likewise.
	* gcc.target/loongarch/func-call-extreme-3.c (dg-options):
	Likewise.
	* gcc.target/loongarch/func-call-extreme-4.c (dg-options):
	Likewise.
	* gcc.target/loongarch/cmodel-extreme-1.c: New test.
	* gcc.target/loongarch/cmodel-extreme-2.c: New test.
---
 gcc/config/loongarch/loongarch.cc             | 135 +++---
 gcc/config/loongarch/loongarch.md             |  21 +++
 .../gcc.target/loongarch/cmodel-extreme-1.c   |  18 +++
 .../gcc.target/loongarch/cmodel-extreme-2.c   |   7 +
 .../loongarch/func-call-extreme-1.c           |  14 +-
 .../loongarch/func-call-extreme-2.c           |  29 +---
 .../loongarch/func-call-extreme-3.c           |   2 +-
 .../loongarch/func-call-extreme-4.c           |   2 +-
 8 files changed, 144 insertions(+), 84 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/cmodel-extreme-1.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/cmodel-extreme-2.c

diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index 82467474288..358d2f8f3f5 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/con
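The CSE reasoning in this message (identical unspecs wrongly treated as equal, the (pc) operand stopping that, and the REG_EQUAL note restoring sharing where it is safe) can be sketched with a toy value-numbering pass. This models the argument only, not GCC's cse pass: the tuple encoding of expressions, the `("pc", address)` operand and the `reg_equal` field are all simplifying assumptions.

```python
def local_cse(insns):
    """Toy value numbering over a list of (dest, expr, reg_equal).

    `expr` is a hashable tuple describing the computation. `reg_equal`
    is an optional hashable value the result is known to equal (a
    stand-in for a REG_EQUAL note). Returns {dest: earlier_dest} for
    every destination found to be redundant.
    """
    seen = {}
    redundant = {}
    for dest, expr, reg_equal in insns:
        # A note, when present, says what the value is regardless of how
        # it was computed, so it is used as the lookup key.
        key = reg_equal if reg_equal is not None else expr
        if key in seen:
            redundant[dest] = seen[key]
        else:
            seen[key] = dest
    return redundant


if __name__ == "__main__":
    sym = ("symbol_ref", "__tls_get_addr")
    # Without (pc): both parts look identical and get merged (unsafe).
    no_pc = [("r1", ("unspec_part2", sym), None),
             ("r2", ("unspec_part2", sym), None)]
    # With (pc): each instance is distinct, so nothing is merged.
    with_pc = [("r1", ("unspec_part2", sym, ("pc", 0x100)), None),
               ("r2", ("unspec_part2", sym, ("pc", 0x200)), None)]
    # With a note on the final address: sharing becomes possible again.
    noted = [(dest, expr, sym) for dest, expr, _ in with_pc]
    print(local_cse(no_pc), local_cse(with_pc), local_cse(noted))
```

In this model the note is attached to the value that really is position independent (the symbol's final address), which is why merging on it is safe while merging the PC-relative halves is not.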
Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.
On 2024/1/17 at 5:50 PM, Xi Ruoyao wrote:
> On Wed, 2024-01-17 at 17:38 +0800, chenglulu wrote:
> > On 2024/1/13 at 9:05 PM, Xi Ruoyao wrote:
> > > On Sat, 2024-01-13 at 15:01 +0800, chenglulu wrote:
> > > > On 2024/1/12 at 7:42 PM, Xi Ruoyao wrote:
> > > > > On Fri, 2024-01-12 at 09:46 +0800, chenglulu wrote:
> > > > > > > I found an issue bootstrapping GCC with -mcmodel=extreme in
> > > > > > > BOOT_CFLAGS: we need a target hook to tell the generic code that
> > > > > > > UNSPEC_LA_PCREL_64_PART{1,2} are just wrappers around symbols, or
> > > > > > > we'll see millions of lines of messages like
> > > > > > >
> > > > > > > ../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
> > > > > > > UNSPEC_LA_PCREL_64_PART1 (42) found in variable location
> > > > > >
> > > > > > I built GCC with -mcmodel=extreme in BOOT_CFLAGS, but I haven't
> > > > > > reproduced the problem you mentioned.
> > > > > >
> > > > > > $ ../configure --host=loongarch64-linux-gnu \
> > > > > >     --target=loongarch64-linux-gnu --build=loongarch64-linux-gnu \
> > > > > >     --with-arch=loongarch64 --with-abi=lp64d --enable-tls \
> > > > > >     --enable-languages=c,c++,fortran,lto --enable-plugin \
> > > > > >     --disable-multilib --disable-host-shared --enable-bootstrap \
> > > > > >     --enable-checking=release
> > > > > > $ make BOOT_FLAGS="-mcmodel=extreme"
> > > > > >
> > > > > > What did I do wrong? :-(
> > > > >
> > > > > BOOT_CFLAGS, not BOOT_FLAGS :).
> > > >
> > > > This is so strange. My compilation here stopped due to syntax
> > > > problems, and I still haven't reproduced the messages you mentioned
> > > > about UNSPEC_LA_PCREL_64_PART1.
> > >
> > > I used:
> > >
> > > ../gcc/configure --with-system-zlib --disable-fixincludes \
> > >   --enable-default-ssp --enable-default-pie \
> > >   --disable-werror --disable-multilib \
> > >   --prefix=/home/xry111/gcc-dev
> > >
> > > and then
> > >
> > > make STAGE1_{C,CXX}FLAGS="-O2 -g" -j8 \
> > >   BOOT_{C,CXX}FLAGS="-O2 -g -mcmodel=extreme" |& tee gcc-build.log
> > >
> > > I guess "-g" is needed to reproduce the issue as well, as the messages
> > > were produced in DWARF generation.
> >
> > I have reproduced this problem, and it can be solved by adding a hook.
> >
> > But unfortunately, when using '-mcmodel=extreme -mexplicit-relocs=always'
> > to test spec2006 403.gcc, an error occurs. The other benchmarks have not
> > been tested yet.
> >
> > I roughly debugged it, and the problem appears to be this:
> >
> > The address used by the instruction 'ldx.d $r12, $r25, $r6' is wrong.
> >
> > Wrong assembly:
> >
> > 5826 pcalau12i $r13,%got_pc_hi20(recog_data)
> > 5827 addi.d $r12,$r0,%got_pc_lo12(recog_data)
> > 5828 lu32i.d $r12,%got64_pc_lo20(recog_data)
> > 5829 lu52i.d $r12,$r12,%got64_pc_hi12(recog_data)
> > 5830 ldx.d $r12,$r13,$r12
> > 5831 ld.b $r8,$r12,997
> > 5832 .loc 1 829 18 discriminator 1 view .LVU1527
> > 5833 ble $r8,$r0,.L476
> > 5834 ld.d $r6,$r3,16
> > 5835 ld.d $r9,$r3,88
> > 5836 .LBB189 = .
> > 5837 .loc 1 839 24 view .LVU1528
> > 5838 alsl.d $r7,$r19,$r19,2
> > 5839 ldx.d $r12,$r25,$r6
> > 5840 addi.d $r17,$r3,120
> > 5841 .LBE189 = .
> > 5842 .loc 1 829 18 discriminator 1 view .LVU1529
> > 5843 or $r13,$r0,$r0
> > 5844 addi.d $r4,$r12,992
> >
> > Assembly that works fine using macros:
> >
> > 3040 la.global $r12,$r13,recog_data
> > 3041 ld.b $r9,$r12,997
> > 3042 ble $r9,$r0,.L475
> > 3043 alsl.d $r5,$r16,$r16,2
> > 3044 la.global $r15,$r17,recog_data
> > 3045 addi.d $r4,$r12,992
> > 3046 addi.d $r18,$r3,48
> > 3047 or $r13,$r0,$r0
> >
> > Comparing the assembly, we can see that lines 5844 and 3045 perform the
> > same operation, but there is a problem with the base address register
> > optimization at line 5844.
> >
> > regrename.c.283r.loop2_init:
> >
> > (insn 6 497 2741 34 (set (reg:DI 180 [ ivtmp.713D.15724 ])
> >         (const_int 0 [0])) "regrename.c":829:18 discrim 1 156 {*movdi_64bit}
> >      (nil))
> > (insn 2741 6 2744 34 (parallel [
> >             (set (reg:DI 1502)
> >                 (unspec:DI [
> >                         (symbol_ref:DI ("recog_data") [flags 0xc0] )
> >                     ] UNSPEC_LA_PCREL_64_PART1))
> >             (set (reg/f:DI 1479)
> >                 (unspec:DI [
> >                         (symbol_ref:DI ("recog_data") [flags 0xc0] )
> >                     ] UNSPEC_LA_PCREL_64_PART2))
> >         ]) -1
> >      (expr_list:REG_UNUSED (reg/f:DI 1479)
> >         (nil)))
> > (insn 2744 2741 2745 34 (set (reg/f:DI 1503)
> >         (mem:DI (plus:DI (reg/f:DI 1479)
> >                 (reg:DI 1502)) [0 S8 A8])) 156 {*movdi_64bit}
> >      (expr_list:REG_EQUAL (symbol_ref:DI ("recog_data") [flags 0xc0] )
> >         (nil)))
> >
> > Virtual register 1479 is used in insn 2744, but register 1479 was
> > given a REG_UNUSED note in the previous instruction.
> >
> > The attached file is the file that goes wrong.
> > The compilation command is as follows:
> >
> > $ ./gcc/cc1 -fpreprocessed regrename.i -quiet -dp -dumpbase regrename.c \
> >     -dumpbase-ext .c -mno-relax -mabi=lp64d -march=loongarch64 -mfpu=64 \
> >     -msimd=lasx -mcmodel=extreme -mtune=loongarch64 -g3 -O2 \
> >     -Wno-int-conversion -Wno-implicit-int -Wno-implicit-function-declaration \
> >     -Wno-incompatible-pointer-types -version -o regrename.s \
> >     -mexplicit-relocs=always -fdump-rtl-all-all
>
> I've seen some "guality" test failure
Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.
On Wed, 2024-01-17 at 17:38 +0800, chenglulu wrote:
> On 2024/1/13 at 9:05 PM, Xi Ruoyao wrote:
> > On Sat, 2024-01-13 at 15:01 +0800, chenglulu wrote:
> > > On 2024/1/12 at 7:42 PM, Xi Ruoyao wrote:
> > > > On Fri, 2024-01-12 at 09:46 +0800, chenglulu wrote:
> > > > > > I found an issue bootstrapping GCC with -mcmodel=extreme in
> > > > > > BOOT_CFLAGS: we need a target hook to tell the generic code that
> > > > > > UNSPEC_LA_PCREL_64_PART{1,2} are just wrappers around symbols, or
> > > > > > we'll see millions of lines of messages like
> > > > > >
> > > > > > ../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
> > > > > > UNSPEC_LA_PCREL_64_PART1 (42) found in variable location
> > > > >
> > > > > I built GCC with -mcmodel=extreme in BOOT_CFLAGS, but I haven't
> > > > > reproduced the problem you mentioned.
> > > > >
> > > > > $ ../configure --host=loongarch64-linux-gnu \
> > > > >     --target=loongarch64-linux-gnu --build=loongarch64-linux-gnu \
> > > > >     --with-arch=loongarch64 --with-abi=lp64d --enable-tls \
> > > > >     --enable-languages=c,c++,fortran,lto --enable-plugin \
> > > > >     --disable-multilib --disable-host-shared --enable-bootstrap \
> > > > >     --enable-checking=release
> > > > > $ make BOOT_FLAGS="-mcmodel=extreme"
> > > > >
> > > > > What did I do wrong? :-(
> > > >
> > > > BOOT_CFLAGS, not BOOT_FLAGS :).
> > >
> > > This is so strange. My compilation here stopped due to syntax
> > > problems, and I still haven't reproduced the messages you mentioned
> > > about UNSPEC_LA_PCREL_64_PART1.
> >
> > I used:
> >
> > ../gcc/configure --with-system-zlib --disable-fixincludes \
> >   --enable-default-ssp --enable-default-pie \
> >   --disable-werror --disable-multilib \
> >   --prefix=/home/xry111/gcc-dev
> >
> > and then
> >
> > make STAGE1_{C,CXX}FLAGS="-O2 -g" -j8 \
> >   BOOT_{C,CXX}FLAGS="-O2 -g -mcmodel=extreme" |& tee gcc-build.log
> >
> > I guess "-g" is needed to reproduce the issue as well, as the messages
> > were produced in DWARF generation.
>
> I have reproduced this problem, and it can be solved by adding a hook.
>
> But unfortunately, when using '-mcmodel=extreme -mexplicit-relocs=always'
> to test spec2006 403.gcc, an error occurs. The other benchmarks have not
> been tested yet.
>
> I roughly debugged it, and the problem appears to be this:
>
> The address used by the instruction 'ldx.d $r12, $r25, $r6' is wrong.
>
> Wrong assembly:
>
> 5826 pcalau12i $r13,%got_pc_hi20(recog_data)
> 5827 addi.d $r12,$r0,%got_pc_lo12(recog_data)
> 5828 lu32i.d $r12,%got64_pc_lo20(recog_data)
> 5829 lu52i.d $r12,$r12,%got64_pc_hi12(recog_data)
> 5830 ldx.d $r12,$r13,$r12
> 5831 ld.b $r8,$r12,997
> 5832 .loc 1 829 18 discriminator 1 view .LVU1527
> 5833 ble $r8,$r0,.L476
> 5834 ld.d $r6,$r3,16
> 5835 ld.d $r9,$r3,88
> 5836 .LBB189 = .
> 5837 .loc 1 839 24 view .LVU1528
> 5838 alsl.d $r7,$r19,$r19,2
> 5839 ldx.d $r12,$r25,$r6
> 5840 addi.d $r17,$r3,120
> 5841 .LBE189 = .
> 5842 .loc 1 829 18 discriminator 1 view .LVU1529
> 5843 or $r13,$r0,$r0
> 5844 addi.d $r4,$r12,992
>
> Assembly that works fine using macros:
>
> 3040 la.global $r12,$r13,recog_data
> 3041 ld.b $r9,$r12,997
> 3042 ble $r9,$r0,.L475
> 3043 alsl.d $r5,$r16,$r16,2
> 3044 la.global $r15,$r17,recog_data
> 3045 addi.d $r4,$r12,992
> 3046 addi.d $r18,$r3,48
> 3047 or $r13,$r0,$r0
>
> Comparing the assembly, we can see that lines 5844 and 3045 perform the
> same operation, but there is a problem with the base address register
> optimization at line 5844.
>
> regrename.c.283r.loop2_init:
>
> (insn 6 497 2741 34 (set (reg:DI 180 [ ivtmp.713D.15724 ])
>         (const_int 0 [0])) "regrename.c":829:18 discrim 1 156 {*movdi_64bit}
>      (nil))
> (insn 2741 6 2744 34 (parallel [
>             (set (reg:DI 1502)
>                 (unspec:DI [
>                         (symbol_ref:DI ("recog_data") [flags 0xc0] )
>                     ] UNSPEC_LA_PCREL_64_PART1))
>             (set (reg/f:DI 1479)
>                 (unspec:DI [
>                         (symbol_ref:DI ("recog_data") [flags 0xc0] )
>                     ] UNSPEC_LA_PCREL_64_PART2))
>         ]) -1
>      (expr_list:REG_UNUSED (reg/f:DI 1479)
>         (nil)))
> (insn 2744 2741 2745 34 (set (reg/f:DI 1503)
>         (mem:DI (plus:DI (reg/f:DI 1479)
>                 (reg:DI 1502)) [0 S8 A8])) 156 {*movdi_64bit}
>      (expr_list:REG_EQUAL (symbol_ref:DI ("recog_data") [flags 0xc0] )
>         (nil)))
>
> Virtual register 1479 is used in insn 2744, but register 1479 was
> given a REG_UNUSED note in the previous instruction.
>
> The attached file is the file that goes wrong.
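The inconsistency reported in the quoted RTL dump (a register noted as unused by the insn that sets it, then read by the next insn) can be checked mechanically. The following is a minimal sketch over a made-up representation: insns are plain dicts with `uid`, `sets`, `uses` and `unused` keys, which is an assumption for illustration and not how GCC stores RTL or notes.

```python
def find_bad_unused_notes(insns):
    """Return (uid, reg) pairs where a register noted REG_UNUSED at its
    definition is read by a later insn before being set again.

    `insns` is a straight-line list of dicts with keys 'uid', 'sets',
    'uses' and 'unused' (a hypothetical simplification of one block).
    """
    bad = []
    for index, insn in enumerate(insns):
        for reg in sorted(insn["unused"]):
            for later in insns[index + 1:]:
                # Reads happen before writes within one insn.
                if reg in later["uses"]:
                    bad.append((insn["uid"], reg))
                    break
                if reg in later["sets"]:
                    break
    return bad


if __name__ == "__main__":
    # Register numbers and insn uids taken from the dump in the thread.
    block = [
        {"uid": 2741, "sets": {1502, 1479}, "uses": set(), "unused": {1479}},
        {"uid": 2744, "sets": {1503}, "uses": {1479, 1502}, "unused": set()},
    ]
    print(find_bad_unused_notes(block))
```

Run on the two insns from the dump, the check flags register 1479 at insn 2741, which is exactly the contradiction described in the message.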
Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.
On 2024/1/13 at 9:05 PM, Xi Ruoyao wrote:
> On Sat, 2024-01-13 at 15:01 +0800, chenglulu wrote:
> > On 2024/1/12 at 7:42 PM, Xi Ruoyao wrote:
> > > On Fri, 2024-01-12 at 09:46 +0800, chenglulu wrote:
> > > > > I found an issue bootstrapping GCC with -mcmodel=extreme in
> > > > > BOOT_CFLAGS: we need a target hook to tell the generic code that
> > > > > UNSPEC_LA_PCREL_64_PART{1,2} are just wrappers around symbols, or
> > > > > we'll see millions of lines of messages like
> > > > >
> > > > > ../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
> > > > > UNSPEC_LA_PCREL_64_PART1 (42) found in variable location
> > > >
> > > > I built GCC with -mcmodel=extreme in BOOT_CFLAGS, but I haven't
> > > > reproduced the problem you mentioned.
> > > >
> > > > $ ../configure --host=loongarch64-linux-gnu \
> > > >     --target=loongarch64-linux-gnu --build=loongarch64-linux-gnu \
> > > >     --with-arch=loongarch64 --with-abi=lp64d --enable-tls \
> > > >     --enable-languages=c,c++,fortran,lto --enable-plugin \
> > > >     --disable-multilib --disable-host-shared --enable-bootstrap \
> > > >     --enable-checking=release
> > > > $ make BOOT_FLAGS="-mcmodel=extreme"
> > > >
> > > > What did I do wrong? :-(
> > >
> > > BOOT_CFLAGS, not BOOT_FLAGS :).
> >
> > This is so strange. My compilation here stopped due to syntax
> > problems, and I still haven't reproduced the messages you mentioned
> > about UNSPEC_LA_PCREL_64_PART1.
>
> I used:
>
> ../gcc/configure --with-system-zlib --disable-fixincludes \
>   --enable-default-ssp --enable-default-pie \
>   --disable-werror --disable-multilib \
>   --prefix=/home/xry111/gcc-dev
>
> and then
>
> make STAGE1_{C,CXX}FLAGS="-O2 -g" -j8 \
>   BOOT_{C,CXX}FLAGS="-O2 -g -mcmodel=extreme" |& tee gcc-build.log
>
> I guess "-g" is needed to reproduce the issue as well, as the messages
> were produced in DWARF generation.

Oh, okay, I'll try this method! :-)
Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.
On Sat, 2024-01-13 at 15:01 +0800, chenglulu wrote:
> On 2024/1/12 at 7:42 PM, Xi Ruoyao wrote:
> > On Fri, 2024-01-12 at 09:46 +0800, chenglulu wrote:
> > > > I found an issue bootstrapping GCC with -mcmodel=extreme in
> > > > BOOT_CFLAGS: we need a target hook to tell the generic code that
> > > > UNSPEC_LA_PCREL_64_PART{1,2} are just wrappers around symbols, or
> > > > we'll see millions of lines of messages like
> > > >
> > > > ../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
> > > > UNSPEC_LA_PCREL_64_PART1 (42) found in variable location
> > >
> > > I built GCC with -mcmodel=extreme in BOOT_CFLAGS, but I haven't
> > > reproduced the problem you mentioned.
> > >
> > > $ ../configure --host=loongarch64-linux-gnu \
> > >     --target=loongarch64-linux-gnu --build=loongarch64-linux-gnu \
> > >     --with-arch=loongarch64 --with-abi=lp64d --enable-tls \
> > >     --enable-languages=c,c++,fortran,lto --enable-plugin \
> > >     --disable-multilib --disable-host-shared --enable-bootstrap \
> > >     --enable-checking=release
> > > $ make BOOT_FLAGS="-mcmodel=extreme"
> > >
> > > What did I do wrong? :-(
> >
> > BOOT_CFLAGS, not BOOT_FLAGS :).
>
> This is so strange. My compilation here stopped due to syntax
> problems, and I still haven't reproduced the messages you mentioned
> about UNSPEC_LA_PCREL_64_PART1.

I used:

../gcc/configure --with-system-zlib --disable-fixincludes \
  --enable-default-ssp --enable-default-pie \
  --disable-werror --disable-multilib \
  --prefix=/home/xry111/gcc-dev

and then

make STAGE1_{C,CXX}FLAGS="-O2 -g" -j8 \
  BOOT_{C,CXX}FLAGS="-O2 -g -mcmodel=extreme" |& tee gcc-build.log

I guess "-g" is needed to reproduce the issue as well, as the messages
were produced in DWARF generation.

-- 
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.
On 2024/1/12 at 7:42 PM, Xi Ruoyao wrote:
> On Fri, 2024-01-12 at 09:46 +0800, chenglulu wrote:
> > > I found an issue bootstrapping GCC with -mcmodel=extreme in
> > > BOOT_CFLAGS: we need a target hook to tell the generic code that
> > > UNSPEC_LA_PCREL_64_PART{1,2} are just wrappers around symbols, or
> > > we'll see millions of lines of messages like
> > >
> > > ../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
> > > UNSPEC_LA_PCREL_64_PART1 (42) found in variable location
> >
> > I built GCC with -mcmodel=extreme in BOOT_CFLAGS, but I haven't
> > reproduced the problem you mentioned.
> >
> > $ ../configure --host=loongarch64-linux-gnu \
> >     --target=loongarch64-linux-gnu --build=loongarch64-linux-gnu \
> >     --with-arch=loongarch64 --with-abi=lp64d --enable-tls \
> >     --enable-languages=c,c++,fortran,lto --enable-plugin \
> >     --disable-multilib --disable-host-shared --enable-bootstrap \
> >     --enable-checking=release
> > $ make BOOT_FLAGS="-mcmodel=extreme"
> >
> > What did I do wrong? :-(
>
> BOOT_CFLAGS, not BOOT_FLAGS :).

This is so strange. My compilation here stopped due to syntax problems,
and I still haven't reproduced the messages you mentioned about
UNSPEC_LA_PCREL_64_PART1.
Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.
On Fri, 2024-01-12 at 09:46 +0800, chenglulu wrote:
> > I found an issue bootstrapping GCC with -mcmodel=extreme in
> > BOOT_CFLAGS: we need a target hook to tell the generic code that
> > UNSPEC_LA_PCREL_64_PART{1,2} are just wrappers around symbols, or
> > we'll see millions of lines of messages like
> >
> > ../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
> > UNSPEC_LA_PCREL_64_PART1 (42) found in variable location
>
> I built GCC with -mcmodel=extreme in BOOT_CFLAGS, but I haven't
> reproduced the problem you mentioned.
>
> $ ../configure --host=loongarch64-linux-gnu \
>     --target=loongarch64-linux-gnu --build=loongarch64-linux-gnu \
>     --with-arch=loongarch64 --with-abi=lp64d --enable-tls \
>     --enable-languages=c,c++,fortran,lto --enable-plugin \
>     --disable-multilib --disable-host-shared --enable-bootstrap \
>     --enable-checking=release
> $ make BOOT_FLAGS="-mcmodel=extreme"
>
> What did I do wrong? :-(

BOOT_CFLAGS, not BOOT_FLAGS :).

-- 
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.
> I found an issue bootstrapping GCC with -mcmodel=extreme in
> BOOT_CFLAGS: we need a target hook to tell the generic code that
> UNSPEC_LA_PCREL_64_PART{1,2} are just wrappers around symbols, or
> we'll see millions of lines of messages like
>
> ../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
> UNSPEC_LA_PCREL_64_PART1 (42) found in variable location

I built GCC with -mcmodel=extreme in BOOT_CFLAGS, but I haven't
reproduced the problem you mentioned.

$ ../configure --host=loongarch64-linux-gnu \
    --target=loongarch64-linux-gnu --build=loongarch64-linux-gnu \
    --with-arch=loongarch64 --with-abi=lp64d --enable-tls \
    --enable-languages=c,c++,fortran,lto --enable-plugin \
    --disable-multilib --disable-host-shared --enable-bootstrap \
    --enable-checking=release
$ make BOOT_FLAGS="-mcmodel=extreme"

What did I do wrong? :-(
Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.
On Fri, 2024-01-05 at 20:45 +0800, chenglulu wrote:
>
> On 2024/1/5 at 7:55 PM, Xi Ruoyao wrote:
> > On Fri, 2024-01-05 at 18:25 +0800, Xi Ruoyao wrote:
> > > On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote:
> > > > On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote:
> > > > > On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
> > > > > > bool
> > > > > > loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
> > > > > > {
> > > > > > +  /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjancent
> > > > > > +     so that the linker can infer the PC of pcalau12i to apply relocations
> > > > > > +     to lu32i.d and lu52i.d.  Otherwise, the results would be incorrect if
> > > > > > +     these four instructions are not in the same 4KiB page.
> > > > > > +     Therefore, macro instructions are used when cmodel=extreme.  */
> > > > > > +  if (loongarch_symbol_extreme_p (type))
> > > > > > +    return false;
> > > > > I think this is a bit of strange.  With -mexplicit-relocs={auto,always}
> > > > > we should still use explicit relocs, but coding all 4 instructions
> > > > > altogether as
> > > > >
> > > > > "pcalau12i.d\t%1,%pc64_hi12(%2)\n\taddi.d\t%0,$r0,%pclo12(%2)\n\tlu32i.d\t%0,%pc64_lo20(%2)\n\tlu52i.d\t%0,%0,%pc64_hi12(%2)"
> > > > >
> > > > > Give me several hours trying to implement this...
> > > >
> > > > I think there is no difference between macros and these instructions put
> > > > together. If implement it in a split form, I think I can try it through
> > > > TARGET_SCHED_MACRO_FUSION_PAIR_P
> > We don't need to split the insn.  We can just add a "large insn"
> > containing the assembly output we want.
> >
> > See the attached patch.  Note that TLS LE/LD/GD needs a fix too because
> > they are basically an variation of GOT addressing.
> >
> > I've ran some small tests and now trying to bootstrap GCC with
> > -mcmodel=extreme in BOOT_CFLAGS...
> >
> > > There is a difference:
> > >
> > > int x;
> > > int t() { return x; }
> > >
> > > pcalau12i.d t0, %pc_hi20(x)
> > > addi.d t1, r0, %pc_lo12(x)
> > > lu32i.d t1, %pc64_lo20(x)
> > > lu52i.d t1, t1, %pc64_hi12(x)
> > > ldx.w a0, t0, t1
> > >
> > > is slightly better than
> > >
> > > pcalau12i.d t0, %pc_hi20(x)
> > > addi.d t1, r0, %pc_lo12(x)
> > > lu32i.d t1, %pc64_lo20(x)
> > > lu52i.d t1, t1, %pc64_hi12(x)
> > > addi.d t0, t0, t1
> > > ld.w a0, t0, 0
> > >
> > > And generating macros when -mexplicit-relocs=always can puzzle people
> > > (it says "always" :-\ ).
> >
> Thumbs up! This method is much better than my method, I learned
> something! grateful!
> But I still have to test the accuracy.

I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS:
we need a target hook to tell the generic code that
UNSPEC_LA_PCREL_64_PART{1,2} are just wrappers around symbols, or we'll
see millions of lines of messages like

../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
UNSPEC_LA_PCREL_64_PART1 (42) found in variable location

diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index 4f89c4af323..410e1b5e693 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -10868,6 +10868,24 @@ loongarch_asm_code_end (void)
 #undef DUMP_FEATURE
 }
 
+static rtx loongarch_delegitimize_address (rtx op)
+{
+  if (GET_CODE (op) == UNSPEC)
+    {
+      int unspec = XINT (op, 1);
+      switch (unspec)
+        {
+        case UNSPEC_LA_PCREL_64_PART1:
+        case UNSPEC_LA_PCREL_64_PART2:
+          return XVECEXP (op, 0, 0);
+        default:
+          return op;
+        }
+    }
+
+  return op;
+}
+
 /* Initialize the GCC target structure.  */
 #undef TARGET_ASM_ALIGNED_HI_OP
 #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
@@ -11129,6 +11147,10 @@ loongarch_asm_code_end (void)
 #define TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT \
   loongarch_builtin_support_vector_misalignment
 
+#undef TARGET_DELEGITIMIZE_ADDRESS
+#define TARGET_DELEGITIMIZE_ADDRESS \
+  loongarch_delegitimize_address
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-loongarch.h"

-- 
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
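To illustrate what "just a wrapper around symbols" means for the hook in the patch above, here is a minimal standalone sketch in C. It does not use GCC's real RTL types: `toy_rtx`, `toy_code`, `toy_unspec` and `toy_delegitimize` are hypothetical names invented for this illustration, and the model only mirrors the unwrap-or-return-unchanged logic of `loongarch_delegitimize_address`.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for GCC's RTL.  None of these names exist in GCC;
   they are assumptions made for this sketch only.  */
enum toy_code { TOY_SYMBOL_REF, TOY_UNSPEC };
enum toy_unspec { TOY_LA_PCREL_64_PART1, TOY_LA_PCREL_64_PART2, TOY_OTHER };

struct toy_rtx
{
  enum toy_code code;
  enum toy_unspec unspec;       /* meaningful only for TOY_UNSPEC */
  const struct toy_rtx *inner;  /* first wrapped operand, or NULL */
  const char *symbol;           /* meaningful only for TOY_SYMBOL_REF */
};

/* Return the wrapped operand for the two PC-relative wrappers and
   every other expression unchanged, as the hook in the patch does.  */
static const struct toy_rtx *
toy_delegitimize (const struct toy_rtx *op)
{
  assert (op != NULL);
  if (op->code == TOY_UNSPEC
      && (op->unspec == TOY_LA_PCREL_64_PART1
          || op->unspec == TOY_LA_PCREL_64_PART2))
    return op->inner;
  return op;
}
```

With a wrapper whose `inner` points at a symbol node, `toy_delegitimize` hands back the symbol node itself, which is the form the generic variable-location code expects to see; any other node passes through untouched.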
Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.
On 2024/1/5 at 7:55 PM, Xi Ruoyao wrote:
> On Fri, 2024-01-05 at 18:25 +0800, Xi Ruoyao wrote:
> > On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote:
> > > On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote:
> > > > On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
> > > > > bool
> > > > > loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
> > > > > {
> > > > > +  /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjancent
> > > > > +     so that the linker can infer the PC of pcalau12i to apply relocations
> > > > > +     to lu32i.d and lu52i.d.  Otherwise, the results would be incorrect if
> > > > > +     these four instructions are not in the same 4KiB page.
> > > > > +     Therefore, macro instructions are used when cmodel=extreme.  */
> > > > > +  if (loongarch_symbol_extreme_p (type))
> > > > > +    return false;
> > > > I think this is a bit of strange.  With -mexplicit-relocs={auto,always}
> > > > we should still use explicit relocs, but coding all 4 instructions
> > > > altogether as
> > > >
> > > > "pcalau12i.d\t%1,%pc64_hi12(%2)\n\taddi.d\t%0,$r0,%pclo12(%2)\n\tlu32i.d\t%0,%pc64_lo20(%2)\n\tlu52i.d\t%0,%0,%pc64_hi12(%2)"
> > > >
> > > > Give me several hours trying to implement this...
> > >
> > > I think there is no difference between macros and these instructions put
> > > together. If implement it in a split form, I think I can try it through
> > > TARGET_SCHED_MACRO_FUSION_PAIR_P
> We don't need to split the insn.  We can just add a "large insn"
> containing the assembly output we want.
>
> See the attached patch.  Note that TLS LE/LD/GD needs a fix too because
> they are basically an variation of GOT addressing.
>
> I've ran some small tests and now trying to bootstrap GCC with
> -mcmodel=extreme in BOOT_CFLAGS...
>
> > There is a difference:
> >
> > int x;
> > int t() { return x; }
> >
> > pcalau12i.d t0, %pc_hi20(x)
> > addi.d t1, r0, %pc_lo12(x)
> > lu32i.d t1, %pc64_lo20(x)
> > lu52i.d t1, t1, %pc64_hi12(x)
> > ldx.w a0, t0, t1
> >
> > is slightly better than
> >
> > pcalau12i.d t0, %pc_hi20(x)
> > addi.d t1, r0, %pc_lo12(x)
> > lu32i.d t1, %pc64_lo20(x)
> > lu52i.d t1, t1, %pc64_hi12(x)
> > addi.d t0, t0, t1
> > ld.w a0, t0, 0
> >
> > And generating macros when -mexplicit-relocs=always can puzzle people
> > (it says "always" :-\ ).

Thumbs up! This method is much better than mine. I learned something,
thank you! But I still have to test its correctness.
Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.
On Fri, 2024-01-05 at 18:25 +0800, Xi Ruoyao wrote:
> On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote:
> >
> > On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote:
> > > On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
> > > > bool
> > > > loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
> > > > {
> > > > +  /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjancent
> > > > +     so that the linker can infer the PC of pcalau12i to apply relocations
> > > > +     to lu32i.d and lu52i.d.  Otherwise, the results would be incorrect if
> > > > +     these four instructions are not in the same 4KiB page.
> > > > +     Therefore, macro instructions are used when cmodel=extreme.  */
> > > > +  if (loongarch_symbol_extreme_p (type))
> > > > +    return false;
> > > I think this is a bit of strange.  With -mexplicit-relocs={auto,always}
> > > we should still use explicit relocs, but coding all 4 instructions
> > > altogether as
> > >
> > > "pcalau12i.d\t%1,%pc64_hi12(%2)\n\taddi.d\t%0,$r0,%pclo12(%2)\n\tlu32i.d\t%0,%pc64_lo20(%2)\n\tlu52i.d\t%0,%0,%pc64_hi12(%2)"
> > >
> > > Give me several hours trying to implement this...
> >
> > I think there is no difference between macros and these instructions put
> > together. If implement it in a split form, I think I can try it through
> > TARGET_SCHED_MACRO_FUSION_PAIR_P

We don't need to split the insn.  We can just add a "large insn"
containing the assembly output we want.

See the attached patch.  Note that TLS LE/LD/GD need a fix too because
they are basically a variation of GOT addressing.

I've run some small tests and am now trying to bootstrap GCC with
-mcmodel=extreme in BOOT_CFLAGS...

>
> There is a difference:
>
> int x;
> int t() { return x; }
>
> pcalau12i.d t0, %pc_hi20(x)
> addi.d t1, r0, %pc_lo12(x)
> lu32i.d t1, %pc64_lo20(x)
> lu52i.d t1, t1, %pc64_hi12(x)
> ldx.w a0, t0, t1
>
> is slightly better than
>
> pcalau12i.d t0, %pc_hi20(x)
> addi.d t1, r0, %pc_lo12(x)
> lu32i.d t1, %pc64_lo20(x)
> lu52i.d t1, t1, %pc64_hi12(x)
> addi.d t0, t0, t1
> ld.w a0, t0, 0
>
> And generating macros when -mexplicit-relocs=always can puzzle people
> (it says "always" :-\ ).
>

-- 
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University

From f6f75b1fd2dbd30255f127f59d16a2683fa22d58 Mon Sep 17 00:00:00 2001
From: Xi Ruoyao
Date: Fri, 5 Jan 2024 18:40:06 +0800
Subject: [PATCH] LoongArch: Don't split the instructions containing relocs for extreme code model

The ABI mandates the pcalau12i/addi.d/lu32i.d/lu52i.d instructions for
addressing a symbol to be adjacent.  So model them as "one large
instruction", i.e. define_insn, with two output registers.  The real
address is the sum of these two registers.

The advantage of this approach is the RTL passes can still use ldx/stx
instructions to skip an addi.d instruction.

gcc/ChangeLog:

	* config/loongarch/loongarch.md (unspec): Add
	UNSPEC_LA_PCREL_64_PART1 and UNSPEC_LA_PCREL_64_PART2.
	(la_pcrel64_two_parts): New define_insn.
	* config/loongarch/loongarch.cc (loongarch_tls_symbol): Fix a
	typo in the comment.
	(loongarch_call_tls_get_addr): If TARGET_CMODEL_EXTREME, use
	la_pcrel64_two_parts for addressing the TLS symbol and
	__tls_get_addr.
	(loongarch_legitimize_tls_address): If TARGET_CMODEL_EXTREME,
	address TLS IE symbols with la_pcrel64_two_parts.
	(loongarch_split_symbol): If TARGET_CMODEL_EXTREME, address
	symbols with la_pcrel64_two_parts.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/func-call-extreme-1.c (dg-options):
	Use -O2 instead of -O0 to ensure the pcalau12i/addi/lu32i/lu52i
	instruction sequences are not reordered by the compiler.
	(NOIPA): Disallow interprocedural optimizations.
	* gcc.target/loongarch/func-call-extreme-2.c: Remove the content
	duplicated from func-call-extreme-1.c, include it instead.
	(dg-options): Likewise.
	* gcc.target/loongarch/func-call-extreme-3.c (dg-options):
	Likewise.
	* gcc.target/loongarch/func-call-extreme-4.c (dg-options):
	Likewise.
	* gcc.target/loongarch/cmodel-extreme-1.c: New test.
	* gcc.target/loongarch/cmodel-extreme-2.c: New test.
---
 gcc/config/loongarch/loongarch.cc             | 100 +-
 gcc/config/loongarch/loongarch.md             |  21
 .../gcc.target/loongarch/cmodel-extreme-1.c   |  18
 .../gcc.target/loongarch/cmodel-extreme-2.c   |   7 ++
 .../loongarch/func-call-extreme-1.c           |  14 +--
 .../loongarch/func-call-extreme-2.c           |  29 +
 .../loongarch/func-call-extreme-3.c           |   2 +-
 .../loongarch/func-call-extreme-4.c           |   2 +-
 8 files changed, 109 insertions(+), 84 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/cmodel-extreme-1.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/cmodel-extreme-2.c

diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index db83232884f..7c01169b422 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/lo
Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.
On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote:
>
> On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote:
> > On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
> > > bool
> > > loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
> > > {
> > > +  /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjancent
> > > +     so that the linker can infer the PC of pcalau12i to apply relocations
> > > +     to lu32i.d and lu52i.d.  Otherwise, the results would be incorrect if
> > > +     these four instructions are not in the same 4KiB page.
> > > +     Therefore, macro instructions are used when cmodel=extreme.  */
> > > +  if (loongarch_symbol_extreme_p (type))
> > > +    return false;
> > I think this is a bit of strange.  With -mexplicit-relocs={auto,always}
> > we should still use explicit relocs, but coding all 4 instructions
> > altogether as
> >
> > "pcalau12i.d\t%1,%pc64_hi12(%2)\n\taddi.d\t%0,$r0,%pclo12(%2)\n\tlu32i.d\t%0,%pc64_lo20(%2)\n\tlu52i.d\t%0,%0,%pc64_hi12(%2)"
> >
> > Give me several hours trying to implement this...
>
> I think there is no difference between macros and these instructions put
> together. If implement it in a split form, I think I can try it through
> TARGET_SCHED_MACRO_FUSION_PAIR_P

There is a difference:

int x;
int t() { return x; }

pcalau12i t0, %pc_hi20(x)
addi.d t1, r0, %pc_lo12(x)
lu32i.d t1, %pc64_lo20(x)
lu52i.d t1, t1, %pc64_hi12(x)
ldx.w a0, t0, t1

is slightly better than

pcalau12i t0, %pc_hi20(x)
addi.d t1, r0, %pc_lo12(x)
lu32i.d t1, %pc64_lo20(x)
lu52i.d t1, t1, %pc64_hi12(x)
add.d t0, t0, t1
ld.w a0, t0, 0

And generating macros when -mexplicit-relocs=always can puzzle people
(it says "always" :-\ ).

-- 
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
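The point of the two listings is that the four-instruction sequence leaves the address split across two registers, and `ldx.w` can add them for free. A minimal sketch of that split in C follows, assuming only the architectural semantics of pcalau12i, addi.d, lu32i.d and lu52i.d. The names `la_pair`, `sext` and `la_pcrel64_model` are hypothetical, and the way the immediates are chosen here is a construction for illustration, not the relocation formulas from the ABI document.

```c
#include <assert.h>
#include <stdint.h>

/* The two registers written by the four-instruction sequence.  */
struct la_pair { uint64_t t0; uint64_t t1; };

/* Sign-extend the low BITS bits of V to 64 bits.  */
static uint64_t
sext (uint64_t v, unsigned bits)
{
  assert (bits >= 1 && bits <= 32);
  uint64_t sign = (uint64_t) 1 << (bits - 1);
  v &= (sign << 1) - 1;
  return (v ^ sign) - sign;
}

/* Pick immediates so that t0 + t1 == SYM when pcalau12i executes at PC,
   then replay the four instructions on a pair of model registers.  */
static struct la_pair
la_pcrel64_model (uint64_t pc, uint64_t sym)
{
  uint64_t pc_page = pc & ~(uint64_t) 0xfff;
  uint64_t lo12 = sym & 0xfff;
  /* What addi.d leaves in t1[31:0] after sign extension.  */
  uint64_t low32 = sext (lo12, 12) & 0xffffffffu;
  uint64_t rest = sym - pc_page - low32;        /* low 12 bits are zero */
  uint64_t hi20 = (rest >> 12) & 0xfffff;
  uint64_t page_off = sext (hi20 << 12, 32);
  uint64_t upper = (rest - page_off) >> 32;     /* low 32 bits are zero */
  uint64_t lo20 = upper & 0xfffff;
  uint64_t hi12 = upper >> 20;

  struct la_pair r;
  r.t0 = pc_page + page_off;                                  /* pcalau12i */
  r.t1 = sext (lo12, 12);                                     /* addi.d */
  r.t1 = (sext (lo20, 20) << 32) | (r.t1 & 0xffffffffu);      /* lu32i.d */
  r.t1 = (hi12 << 52) | (r.t1 & (((uint64_t) 1 << 52) - 1));  /* lu52i.d */
  return r;
}
```

Neither register holds the address on its own; only the sum `t0 + t1` does. That is why the first listing can end with a single `ldx.w a0, t0, t1`, while the second needs a separate add before the load.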
Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.
On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote:
> On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
> > bool
> > loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
> > {
> > +  /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjancent
> > +     so that the linker can infer the PC of pcalau12i to apply relocations
> > +     to lu32i.d and lu52i.d.  Otherwise, the results would be incorrect if
> > +     these four instructions are not in the same 4KiB page.
> > +     Therefore, macro instructions are used when cmodel=extreme.  */
> > +  if (loongarch_symbol_extreme_p (type))
> > +    return false;
> I think this is a bit of strange.  With -mexplicit-relocs={auto,always}
> we should still use explicit relocs, but coding all 4 instructions
> altogether as
>
> "pcalau12i.d\t%1,%pc64_hi12(%2)\n\taddi.d\t%0,$r0,%pclo12(%2)\n\tlu32i.d\t%0,%pc64_lo20(%2)\n\tlu52i.d\t%0,%0,%pc64_hi12(%2)"
>
> Give me several hours trying to implement this...

I think there is no difference between the macros and these instructions
put together. If we implement it in a split form, I think I can try it
through TARGET_SCHED_MACRO_FUSION_PAIR_P.
Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.
On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote:
> On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
> > bool
> > loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
> > {
> > +  /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjancent
> > +     so that the linker can infer the PC of pcalau12i to apply relocations
> > +     to lu32i.d and lu52i.d.  Otherwise, the results would be incorrect if
> > +     these four instructions are not in the same 4KiB page.
> > +     Therefore, macro instructions are used when cmodel=extreme.  */
> > +  if (loongarch_symbol_extreme_p (type))
> > +    return false;
> I think this is a bit of strange.  With -mexplicit-relocs={auto,always}
> we should still use explicit relocs, but coding all 4 instructions
> altogether as
>
> "pcalau12i.d\t%1,%pc64_hi12(%2)\n\taddi.d\t%0,$r0,%pclo12(%2)\n\tlu32i.d\t%0,%pc64_lo20(%2)\n\tlu52i.d\t%0,%0,%pc64_hi12(%2)"
>
> Give me several hours trying to implement this...

Do you mean taking the last add instruction out separately?
Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.
On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
> bool
> loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
> {
> +  /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjancent
> +     so that the linker can infer the PC of pcalau12i to apply relocations
> +     to lu32i.d and lu52i.d.  Otherwise, the results would be incorrect if
> +     these four instructions are not in the same 4KiB page.
> +     Therefore, macro instructions are used when cmodel=extreme.  */
> +  if (loongarch_symbol_extreme_p (type))
> +    return false;

I think this is a bit strange.  With -mexplicit-relocs={auto,always}
we should still use explicit relocs, but code all 4 instructions
together as

"pcalau12i\t%1,%pc_hi20(%2)\n\taddi.d\t%0,$r0,%pc_lo12(%2)\n\tlu32i.d\t%0,%pc64_lo20(%2)\n\tlu52i.d\t%0,%0,%pc64_hi12(%2)"

Give me several hours to try implementing this...

-- 
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
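The comment in the quoted patch says the linker infers the PC of pcalau12i when it applies relocations to lu32i.d and lu52i.d. A minimal sketch in C of why that only works for an adjacent sequence follows. It assumes the linker subtracts a fixed distance from the address of the later instruction (8 bytes for lu32i.d, 12 for lu52i.d, which is what adjacency of 4-byte instructions implies); the helper names `inferred_pcalau12i_pc` and `page_inference_ok` are hypothetical and are not binutils functions.

```c
#include <assert.h>
#include <stdint.h>

/* Every LoongArch instruction is 4 bytes long.  */
#define INSN_BYTES 4u
#define PAGE_MASK_4K (~(uint64_t) 0xfff)

/* The PC a linker would assume for pcalau12i when it sees a relocation
   on the instruction at RELOC_PC, which sits SLOT instructions after
   pcalau12i in the adjacent layout (addi.d = 1, lu32i.d = 2,
   lu52i.d = 3).  The fixed-distance rule is an assumption of this
   sketch.  */
static uint64_t
inferred_pcalau12i_pc (uint64_t reloc_pc, unsigned slot)
{
  assert (slot <= 3);
  return reloc_pc - (uint64_t) slot * INSN_BYTES;
}

/* Return 1 if the 4KiB page inferred from the instruction at RELOC_PC
   matches the page of the real pcalau12i at PCALAU12I_PC, else 0.  */
static int
page_inference_ok (uint64_t pcalau12i_pc, uint64_t reloc_pc, unsigned slot)
{
  uint64_t real_page = pcalau12i_pc & PAGE_MASK_4K;
  uint64_t guessed_page
    = inferred_pcalau12i_pc (reloc_pc, slot) & PAGE_MASK_4K;
  return real_page == guessed_page;
}
```

For example, with pcalau12i at 0x10ff8 and an adjacent lu32i.d at 0x11000, the inferred page is still 0x10000 even though the sequence straddles a page boundary. If the scheduler moved lu32i.d to 0x11010, the inferred PC would be 0x11008, the page would be 0x11000, and the upper bits computed for the symbol would no longer match what pcalau12i produced.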
[PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.
Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjacent so
that the linker can infer the PC of pcalau12i to apply relocations to
lu32i.d and lu52i.d.  Otherwise, the results would be incorrect if these
four instructions are not in the same 4KiB page.

See the link for details:
https://github.com/loongson/la-abi-specs/blob/release/laelf.adoc#extreme-code-model.

gcc/ChangeLog:

	* config/loongarch/loongarch.cc (loongarch_symbol_extreme_p):
	Add function declaration.
	(loongarch_explicit_relocs_p): Use the macro instruction to get
	the symbol address when loongarch_symbol_extreme_p returns true.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/attr-model-1.c: Modify the content of the
	search string in the test case.
	* gcc.target/loongarch/attr-model-2.c: Likewise.
	* gcc.target/loongarch/attr-model-3.c: Likewise.
	* gcc.target/loongarch/attr-model-4.c: Likewise.
	* gcc.target/loongarch/func-call-extreme-1.c: Likewise.
	* gcc.target/loongarch/func-call-extreme-2.c: Likewise.
	* gcc.target/loongarch/func-call-extreme-3.c: Likewise.
	* gcc.target/loongarch/func-call-extreme-4.c: Likewise.
---
 gcc/config/loongarch/loongarch.cc                      | 11 +++
 gcc/testsuite/gcc.target/loongarch/attr-model-1.c      |  2 +-
 gcc/testsuite/gcc.target/loongarch/attr-model-2.c      |  2 +-
 gcc/testsuite/gcc.target/loongarch/attr-model-3.c      |  2 +-
 gcc/testsuite/gcc.target/loongarch/attr-model-4.c      |  2 +-
 .../gcc.target/loongarch/func-call-extreme-1.c         |  6 +++---
 .../gcc.target/loongarch/func-call-extreme-2.c         |  6 +++---
 .../gcc.target/loongarch/func-call-extreme-3.c         |  6 +++---
 .../gcc.target/loongarch/func-call-extreme-4.c         |  6 +++---
 9 files changed, 27 insertions(+), 16 deletions(-)

diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index 6a3321327ea..3b4b28f3bcc 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -264,6 +264,9 @@ const char *const
 loongarch_fp_conditions[16]= {LARCH_FP_CONDITIONS (STRINGIFY)};
 #undef STRINGIFY
 
+static bool
+loongarch_symbol_extreme_p (enum loongarch_symbol_type type);
+
 /* Size of guard page.  */
 #define STACK_CLASH_PROTECTION_GUARD_SIZE \
   (1 << param_stack_clash_protection_guard_size)
@@ -1963,6 +1966,14 @@ loongarch_symbolic_constant_p (rtx x, enum loongarch_symbol_type *symbol_type)
 bool
 loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
 {
+  /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjacent
+     so that the linker can infer the PC of pcalau12i to apply relocations
+     to lu32i.d and lu52i.d.  Otherwise, the results would be incorrect if
+     these four instructions are not in the same 4KiB page.
+     Therefore, macro instructions are used when cmodel=extreme.  */
+  if (loongarch_symbol_extreme_p (type))
+    return false;
+
   if (la_opt_explicit_relocs != EXPLICIT_RELOCS_AUTO)
     return la_opt_explicit_relocs == EXPLICIT_RELOCS_ALWAYS;
 
diff --git a/gcc/testsuite/gcc.target/loongarch/attr-model-1.c b/gcc/testsuite/gcc.target/loongarch/attr-model-1.c
index 916d715b98b..65acb29162c 100644
--- a/gcc/testsuite/gcc.target/loongarch/attr-model-1.c
+++ b/gcc/testsuite/gcc.target/loongarch/attr-model-1.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-mexplicit-relocs -mcmodel=normal -O2" } */
-/* { dg-final { scan-assembler-times "%pc64_hi12" 2 } } */
+/* { dg-final { scan-assembler-times "la\.local\t\\\$r\[0-9\]+,\\\$r15," 2 } } */
 
 #define ATTR_MODEL_TEST
 #include "attr-model-test.c"
diff --git a/gcc/testsuite/gcc.target/loongarch/attr-model-2.c b/gcc/testsuite/gcc.target/loongarch/attr-model-2.c
index a74c795ac3e..cf0f079e39a 100644
--- a/gcc/testsuite/gcc.target/loongarch/attr-model-2.c
+++ b/gcc/testsuite/gcc.target/loongarch/attr-model-2.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-mexplicit-relocs -mcmodel=extreme -O2" } */
-/* { dg-final { scan-assembler-times "%pc64_hi12" 3 } } */
+/* { dg-final { scan-assembler-times "la\.local\t\\\$r\[0-9\]+,\\\$r15," 3 } } */
 
 #define ATTR_MODEL_TEST
 #include "attr-model-test.c"
diff --git a/gcc/testsuite/gcc.target/loongarch/attr-model-3.c b/gcc/testsuite/gcc.target/loongarch/attr-model-3.c
index 5622d508678..7c270d462f7 100644
--- a/gcc/testsuite/gcc.target/loongarch/attr-model-3.c
+++ b/gcc/testsuite/gcc.target/loongarch/attr-model-3.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-mexplicit-relocs=auto -mcmodel=normal -O2" } */
-/* { dg-final { scan-assembler-times "%pc64_hi12" 2 } } */
+/* { dg-final { scan-assembler-times "la\.local\t\\\$r\[0-9\]+,\\\$r15," 2 } } */
 
 #define ATTR_MODEL_TEST
 #include "attr-model-test.c"
diff --git a/gcc/testsuite/gcc.target/loongarch/attr-model-4.c b/gcc/testsuite/gcc.target/loongarch/attr-model-4.c
index 482724bb974..627d630c36d 100644
--- a/gcc/testsuite/gcc.tar