On 2024/1/5 at 7:55 PM, Xi Ruoyao wrote:
On Fri, 2024-01-05 at 18:25 +0800, Xi Ruoyao wrote:
On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote:
On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote:
On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
bool
loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
{
+ /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjacent
+ so that the linker can infer the PC of pcalau12i to apply relocations
+ to lu32i.d and lu52i.d. Otherwise, the results would be incorrect if
+ these four instructions are not in the same 4KiB page.
+ Therefore, macro instructions are used when cmodel=extreme. */
+ if (loongarch_symbol_extreme_p (type))
+ return false;
I think this is a bit strange. With -mexplicit-relocs={auto,always}
we should still use explicit relocs, but coding all 4 instructions
altogether as
"pcalau12i\t%1,%pc_hi20(%2)\n\taddi.d\t%0,$r0,%pc_lo12(%2)\n\tlu32i.d\t%0,%pc64_lo20(%2)\n\tlu52i.d\t%0,%0,%pc64_hi12(%2)"
Give me several hours to try implementing this...
I think there is no difference between macros and these instructions put
together. If we implement it in a split form, I think I can try it through
TARGET_SCHED_MACRO_FUSION_PAIR_P.
We don't need to split the insn. We can just add a "large insn"
containing the assembly output we want.
See the attached patch. Note that TLS LE/LD/GD needs a fix too because
they are basically a variation of GOT addressing.
I've run some small tests and am now trying to bootstrap GCC with
-mcmodel=extreme in BOOT_CFLAGS...
There is a difference:
int x;
int t() { return x; }
pcalau12i t0, %pc_hi20(x)
addi.d t1, r0, %pc_lo12(x)
lu32i.d t1, %pc64_lo20(x)
lu52i.d t1, t1, %pc64_hi12(x)
ldx.w a0, t0, t1
is slightly better than
pcalau12i t0, %pc_hi20(x)
addi.d t1, r0, %pc_lo12(x)
lu32i.d t1, %pc64_lo20(x)
lu52i.d t1, t1, %pc64_hi12(x)
add.d t0, t0, t1
ld.w a0, t0, 0
And generating macros when -mexplicit-relocs=always can puzzle people
(it says "always" :-\ ).
Thumbs up! This method is much better than mine. I learned something,
thank you!
But I still have to test its correctness.
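
For readers following the thread, the page-crossing concern in the patch comment can be illustrated with a small C sketch. It is a simplified model, not the linker's actual relocation code: page_base and page_delta are hypothetical helpers, the addresses are made up, and the real arithmetic for %pc64_lo20 and %pc64_hi12 involves more than this. It only shows that a page-relative delta changes once the PC moves into the next 4KiB page, so a linker that assumes all four instructions share the PC of pcalau12i would compute the wrong upper bits if they were scheduled apart.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: base address of the 4KiB page containing addr. */
uint64_t page_base(uint64_t addr)
{
    return addr & ~UINT64_C(0xfff);
}

/* Simplified model (an assumption, not the linker's real code) of the
   page delta a pcalau12i-style relocation encodes for an instruction at
   pc that refers to dest.  The 0x800 bias accounts for the low 12 bits
   being added back later as a sign-extended immediate. */
int64_t page_delta(uint64_t dest, uint64_t pc)
{
    return (int64_t)(page_base(dest + 0x800) - page_base(pc));
}

/* Self-check with made-up addresses. */
void page_delta_self_check(void)
{
    uint64_t dest = UINT64_C(0x123456789abc);

    /* Two instructions in the same page see the same delta. */
    assert(page_delta(dest, 0x10000) == page_delta(dest, 0x10008));

    /* If the later instruction lands in the next page, the delta computed
       from its own PC is off by one page relative to pcalau12i's. */
    assert(page_delta(dest, 0x10ffc) - page_delta(dest, 0x11004) == 0x1000);
}
```

Emitting the four instructions as one insn keeps them adjacent, so the PC of each one is a fixed offset from the pcalau12i and the linker can recover it.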