On 17/01/26 4:11 pm, adubey wrote:
On 2026-01-17 15:41, Hari Bathini wrote:
On 14/01/26 5:14 pm, [email protected] wrote:
From: Abhishek Dubey <[email protected]>

In the conventional stack frame, tail_call_cnt is positioned
after the NVR save area (BPF_PPC_STACK_SAVE), whereas in the
trampoline frame it is positioned after the stack alignment
padding. The BPF JIT logic could become complex when the
offset of tail_call_cnt has to be calculated differently for
each frame type. Having the same offset in both frames is the
desired objective.

The trampoline frame does not have a BPF_PPC_STACK_SAVE area.
Introducing one would waste memory, since the area would exist
only to align the offset of tail_call_cnt.
Another challenge is the variable alignment padding sitting at
the bottom of the trampoline frame, which requires additional
handling to compute tail_call_cnt offset.

This patch addresses the above issues by moving tail_call_cnt
to the bottom of the stack frame at offset 0 for both types
of frames. This saves the additional bytes that BPF_PPC_STACK_SAVE
would require in the trampoline frame, and a common offset
computation for tail_call_cnt serves both frames.

The changes in this patch are required by the third patch in the
series, where the 'reference to tail_call_info' of the main frame
is copied into the trampoline frame from the previous frame.

Signed-off-by: Abhishek Dubey <[email protected]>
---
  arch/powerpc/net/bpf_jit.h        |  4 ++++
  arch/powerpc/net/bpf_jit_comp64.c | 31 ++++++++++++++++++++-----------
  2 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index 8334cd667bba..45d419c0ee73 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -72,6 +72,10 @@
      } } while (0)
    #ifdef CONFIG_PPC64
+
+/* for tailcall counter */
+#define BPF_PPC_TAILCALL        8
+
  /* If dummy pass (!image), account for maximum possible instructions */
  #define PPC_LI64(d, i)        do {                          \
      if (!image)                                  \
diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 1fe37128c876..39061cd742c1 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -20,13 +20,15 @@
  #include "bpf_jit.h"
    /*
- * Stack layout:
+ * Stack layout 1:
+ * Layout when setting up our own stack frame.
+ * Note: r1 at bottom, component offsets positive wrt r1.
   * Ensure the top half (upto local_tmp_var) stays consistent
   * with our redzone usage.
   *
   *        [    prev sp        ] <-------------
- *        [   nv gpr save area    ] 6*8        |
   *        [    tail_call_cnt    ] 8        |
+ *        [   nv gpr save area    ] 6*8        |
   *        [    local_tmp_var    ] 24        |
   * fp (r31) -->    [   ebpf stack space    ] upto 512    |
   *        [     frame header    ] 32/112    |
@@ -36,10 +38,12 @@
  /* for gpr non volatile registers BPG_REG_6 to 10 */
  #define BPF_PPC_STACK_SAVE    (6*8)
  /* for bpf JIT code internal usage */
-#define BPF_PPC_STACK_LOCALS    32
+#define BPF_PPC_STACK_LOCALS    24
  /* stack frame excluding BPF stack, ensure this is quadword aligned */
  #define BPF_PPC_STACKFRAME    (STACK_FRAME_MIN_SIZE + \
-                 BPF_PPC_STACK_LOCALS + BPF_PPC_STACK_SAVE)
+                 BPF_PPC_STACK_LOCALS + \
+                 BPF_PPC_STACK_SAVE   + \
+                 BPF_PPC_TAILCALL)
    /* BPF register usage */
  #define TMP_REG_1    (MAX_BPF_JIT_REG + 0)
@@ -87,27 +91,32 @@ static inline bool bpf_has_stack_frame(struct codegen_context *ctx)
  }


  /*
+ * Stack layout 2:
   * When not setting up our own stackframe, the redzone (288 bytes) usage is:
+ * Note: r1 from prev frame. Component offset negative wrt r1.
   *
   *        [    prev sp        ] <-------------
   *        [      ...           ]         |
   * sp (r1) --->    [    stack pointer    ] --------------
- *        [   nv gpr save area    ] 6*8
   *        [    tail_call_cnt    ] 8
+ *        [   nv gpr save area    ] 6*8
   *        [    local_tmp_var    ] 24
   *        [   unused red zone    ] 224
   */

Calling these stack layout 1 & 2 is inappropriate. The stack layout
is essentially the same. The comments just show things with reference
to r1 when the stack is set up explicitly vs. when the redzone is
being used...

Agreed. I am using them as labels to refer to in the comments. Any
better suggestions?

I think the comments could refer to the has-stack-frame case vs. the
redzone case.

- Hari
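
To illustrate why the two comment blocks describe one layout seen
from two reference points, here is a minimal sketch of the offset
arithmetic as read from the diagrams in the patch. It is not code
from the series: tail_call_cnt_offset() is a hypothetical helper,
and STACK_FRAME_MIN_SIZE is assumed to be 32 (the diagram lists
32/112 for the frame header). The other constants are copied from
the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Constants as they appear in the patch. */
#define BPF_PPC_TAILCALL      8
#define BPF_PPC_STACK_SAVE    (6 * 8)
#define BPF_PPC_STACK_LOCALS  24

/* Assumption: 32-byte frame header (the diagram lists 32/112). */
#define STACK_FRAME_MIN_SIZE  32

#define BPF_PPC_STACKFRAME    (STACK_FRAME_MIN_SIZE + \
                               BPF_PPC_STACK_LOCALS + \
                               BPF_PPC_STACK_SAVE   + \
                               BPF_PPC_TAILCALL)

/* The patch comment asks for this to stay quadword aligned. */
static_assert(BPF_PPC_STACKFRAME % 16 == 0,
              "BPF_PPC_STACKFRAME must be quadword aligned");

/*
 * Hypothetical helper (not in the patch): offset of tail_call_cnt
 * relative to r1.
 *
 * In both diagrams tail_call_cnt sits directly below the previous
 * frame's stack pointer. With our own frame, r1 is at the bottom,
 * so the previous sp is frame-size bytes above r1. In the redzone
 * case, r1 still is the previous sp, so the offset is negative.
 */
static inline int tail_call_cnt_offset(bool has_stack_frame,
                                       int bpf_stack_size)
{
    int prev_sp = has_stack_frame ?
                  BPF_PPC_STACKFRAME + bpf_stack_size : 0;

    return prev_sp - BPF_PPC_TAILCALL;
}
```

Under these assumptions BPF_PPC_STACKFRAME stays at 112 bytes, the
same as before the patch (32 + 32 + 48), because the 8 bytes moved
out of BPF_PPC_STACK_LOCALS are accounted for by BPF_PPC_TAILCALL.
Only the reference point for r1 differs between the two cases.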

