ameisen wrote:
> > This PR doesn't fix (if it was intended to) sibling call optimizations.
> > It _does_ change the code generation for them, however - but it maintains
> > all of the stack adjustments as well as the `fp`/`sp` moves.
> > ```c++
> > __attribute__((noinline))
> > int foo0()
> > {
> > volatile int a;
> > return a += 1;
> > }
> >
> > int bar0()
> > {
> > return foo0();
> > }
> > ```
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > with tail calls disabled (`-std=gnu++23 -O3 -march=mips32r6 -fno-PIC -mllvm
> > -mips-tail-calls=0`):
> > ```assembly
> > addiu $sp, $sp, -24
> > sw $ra, 20($sp) # 4-byte Folded Spill
> > sw $fp, 16($sp) # 4-byte Folded Spill
> > move $fp, $sp
> > jal _Z4foo0v
> > nop
> > move $sp, $fp
> > lw $fp, 16($sp) # 4-byte Folded Reload
> > lw $ra, 20($sp) # 4-byte Folded Reload
> > addiu $sp, $sp, 24
> > jrc $ra
> > ```
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > with tail calls enabled (`-std=gnu++23 -O3 -march=mips32r6 -fno-PIC -mllvm
> > -mips-tail-calls=1`):
> > ```assembly
> > addiu $sp, $sp, -8
> > sw $ra, 4($sp) # 4-byte Folded Spill
> > sw $fp, 0($sp) # 4-byte Folded Spill
> > move $fp, $sp
> > move $sp, $fp
> > lw $fp, 0($sp) # 4-byte Folded Reload
> > lw $ra, 4($sp) # 4-byte Folded Reload
> > j _Z4foo0v
> > addiu $sp, $sp, 8
> > ```
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > What is expected:
> > ```assembly
> > j _Z4foo0v
> > nop
> > ```
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > or
> > ```assembly
> > bc _Z4foo0v
> > ```
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > So, it's moving the `j` to the end, but it's not cleaning up the stack
> > changes that were _around_ the jump. Everything in here but the `j` could
> > be removed (with a `nop`, but the `j` should be replaced with a `bc`
> > anyways), but this effectively is generating a rather messed up function.
> > A significant amount of logic for handling this issue is present in
> > `LowerCall` for PPC and for AArch64 (and the rest), but is missing in the
> > MIPS version.
>
> Hey @ameisen, thanks a lot. Yes, I agree, it is missing for MIPS case in the
> ISel Lowering, but I think this should be a separate github issue, and
> separate PR should be created for this. Can you please create an issue with
> the content from this comment? Thanks in advance.
I'm trying to figure out how to word the issue, as without your PR clang
doesn't even turn it into a tail call at all. Since sibling optimizations are
based upon tail call optimizations (they're a subset of them) it complicates an
issue as I'm not sure how to write an issue dependent upon a PR.
https://github.com/llvm/llvm-project/pull/161860
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits