On 25/02/26 7:06 am, [email protected] wrote:
From: Abhishek Dubey <[email protected]>
Move the long branch address space to the bottom of the long
branch stub. This allows uninterrupted disassembly until the
last 8 bytes. Exclude these last bytes from the overall
program length to prevent failure in assembly generation.
Also, align dummy_tramp_addr field with 8-byte boundary.
Following is disassembler output for test program with moved down
dummy_tramp_addr field:
.....
.....
pc:68 left:44 a6 03 08 7c : mtlr 0
pc:72 left:40 bc ff ff 4b : b .-68
pc:76 left:36 a6 02 68 7d : mflr 11
pc:80 left:32 05 00 9f 42 : bcl 20, 31, .+4
pc:84 left:28 a6 02 88 7d : mflr 12
pc:88 left:24 14 00 8c e9 : ld 12, 20(12)
pc:92 left:20 a6 03 89 7d : mtctr 12
pc:96 left:16 a6 03 68 7d : mtlr 11
pc:100 left:12 20 04 80 4e : bctr
pc:104 left:8 c0 34 1d 00 :
Failure log:
Can't disasm instruction at offset 104: c0 34 1d 00 00 00 00 c0
Disassembly logic can truncate at 104, ignoring last 8 bytes.
Update the dummy_tramp_addr field offset calculation from the end
of the program to reflect its new location, for bpf_arch_text_poke()
to update the actual trampoline's address in this field.
All BPF trampoline selftests continue to pass with this patch applied.
Signed-off-by: Abhishek Dubey <[email protected]>
---
arch/powerpc/net/bpf_jit_comp.c | 45 +++++++++++++++++++++++++--------
1 file changed, 34 insertions(+), 11 deletions(-)
diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index 7a78e03d482f..f8f6305b0d9f 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -51,7 +51,9 @@ asm (
void bpf_jit_build_fentry_stubs(u32 *image, struct codegen_context *ctx)
{
- int ool_stub_idx, long_branch_stub_idx;
+ int ool_stub_idx, long_branch_stub_idx, tramp_load_offset;
+ bool tramp_needs_align;
+ u32 tramp_idx;
/*
* Out-of-line stub:
@@ -70,27 +72,45 @@ void bpf_jit_build_fentry_stubs(u32 *image, struct
codegen_context *ctx)
/*
* Long branch stub:
- * .long <dummy_tramp_addr>
* mflr r11
* bcl 20,31,$+4
- * mflr r12
- * ld r12, -8-SZL(r12)
+ * mflr r12 // lr/r12 stores current pc
+ * ld r12, 20(r12) // offset(dummy_tramp_addr) from prev
inst. is 20
* mtctr r12
- * mtlr r11 // needed to retain ftrace ABI
+ * mtlr r11 // needed to retain ftrace ABI
* bctr
+ * nop // Optional, for mem alignment of
dummy_tramp_addr
+ * .long <dummy_tramp_addr>
*/
- if (image)
- *((unsigned long *)&image[ctx->idx]) = (unsigned
long)dummy_tramp;
- ctx->idx += SZL / 4;
long_branch_stub_idx = ctx->idx;
EMIT(PPC_RAW_MFLR(_R11));
EMIT(PPC_RAW_BCL4());
EMIT(PPC_RAW_MFLR(_R12));
- EMIT(PPC_RAW_LL(_R12, _R12, -8-SZL));
+
+ /* Relative offset of dummy_tramp_addr wrt start of long branch stub */
+ tramp_idx = long_branch_stub_idx + 7;
+ /*
+ * Image layout need not be considered 8-byte aligned.
+ * Lower 3 bits must be clear for 8-bytes alignment.
+ * Adjust offset for padding NOP before dummy_tramp_addr
+ */
+ tramp_needs_align = (((unsigned long)&image[tramp_idx]) & 7) != 0;
I would rather check:
is_8byte_aligned = (((unsigned long)&image[tramp_idx]) & 0x7) == 0x4;
and handle alignment when !is_8byte_aligned for better readability.
This alignment handling needs to go under CONFIG_PPC64.
Also, this alignment handling fix has nothing to with the moving around
of dummy_tramp_addr. Have the alignment handled in a separate patch with
a fixes tag for stable releases to pick it...
- Hari