On Tue, Sep 04, 2018 at 03:19:52PM +0100, Edward Cree wrote:
> In check_mem_access(), for the PTR_TO_CTX case, after check_ctx_access()
>  has supplied a reg_type, the other members of the register state are set
>  appropriately.  Previously reg.range was set to 0, but as it is in a
>  union with reg.map_ptr, which is larger, upper bytes of the latter were
>  left in place.  This then caused the memcmp() in regsafe() to fail,
>  preventing some branches from being pruned (and occasionally causing the
>  same program to take a varying number of processed insns on repeated
>  verifier runs).
> 
> Signed-off-by: Edward Cree <ec...@solarflare.com>
> ---
> Possibly something might need adding to __mark_reg_unknown() as well to
>  clear map_ptr/range, I'm not sure (though doing so did not affect the
>  processed insn count on the cilium programs).
> 
>  kernel/bpf/verifier.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index f4ff0c569e54..49e4ea66fdd3 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -1640,9 +1640,9 @@ static int check_mem_access(struct bpf_verifier_env 
> *env, int insn_idx, u32 regn
>                       else
>                               mark_reg_known_zero(env, regs,
>                                                   value_regno);
> -                     regs[value_regno].id = 0;
> -                     regs[value_regno].off = 0;
> -                     regs[value_regno].range = 0;
> +                     /* Clear id, off, and union(map_ptr, range) */
> +                     memset(regs + value_regno, 0,
> +                            offsetof(struct bpf_reg_state, var_off));
>                       regs[value_regno].type = reg_type;
>               }

Awesome! Thanks a bunch for tracking it down.
I vaguely remember thinking about overlapping map_ptr with other fields
and not clearing map_ptr explicitly, because it was unnecessary...
Doing a bit of git-archaeology...
Looks like commit f1174f77b50c ("bpf/verifier: rework value tracking")
removed 'imm' from that union, so that __mark_reg_unknown_value()
that was clearing both 'imm' and 'map_ptr' before was no longer happening.
So old sequence:
  mark_reg_unknown_value_and_range(); // which called __mark_reg_unknown_value()
                                      //  which cleared 'imm' (and id|off|range)
  state->regs[value_regno].type = reg_type;
got replaced with
  mark_reg_known_zero();
  state->regs[value_regno].id = 0;
  state->regs[value_regno].off = 0;
  state->regs[value_regno].range = 0;
  state->regs[value_regno].type = reg_type;
which made map_ptr contain junk in upper bits.
I bet the comment "note that reg.[id|off|range] == 0" few lines before
that was deleted by that commit probably caused that bug :)
That comment I added as part of commit 969bf05eb3ce ("bpf: direct packet 
access")
What I was trying to express in that comment that
"mark_reg_unknown_value() that is called right before that comment
also clears id|off|range that are included as part of bigger 'imm' field
that mark_reg_unknown_value() clears, so these three fields don't need
to be cleared separately"
Sorry for confusion that that comment caused and painful debugging.

So would you agree it's fair to add
Fixes: f1174f77b50c ("bpf/verifier: rework value tracking")
?

Also I think it's better to do this memset() in both
__mark_reg_unknown() and in __mark_reg_known()
instead of open coding it here in check_mem_access().

While at it we would also need to adjust this:
static void __mark_reg_const_zero(struct bpf_reg_state *reg)
{
        __mark_reg_known(reg, 0);
        reg->off = 0;
        reg->type = SCALAR_VALUE;
}
since line reg->off = 0; wouldn't make sense after memset() is added
and few other places.

btw the 4 byte hole:
        enum bpf_reg_type          type;                 /*     0     4 */
        /* XXX 4 bytes hole, try to pack */
        union {
                u16                range;                /*           2 */
                struct bpf_map *   map_ptr;              /*           8 */
        };                                               /*     8     8 */
doesn't cause instability issues, since we kzalloc verifier reg state.

How about patch like the following:
------------
>From 422fd975ed78645ab67d2eb50ff6e1ff6fb3de32 Mon Sep 17 00:00:00 2001
From: Alexei Starovoitov <a...@kernel.org>
Date: Tue, 4 Sep 2018 19:13:44 -0700
Subject: [PATCH] bpf/verifier: fix verifier instability

Fixes: f1174f77b50c ("bpf/verifier: rework value tracking")
Debugged-by: Edward Cree <ec...@solarflare.com> 
Signed-off-by: Alexei Starovoitov <a...@kernel.org>
---
 kernel/bpf/verifier.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index f4ff0c569e54..6ff1bac1795d 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -570,7 +570,9 @@ static void __mark_reg_not_init(struct bpf_reg_state *reg);
  */
 static void __mark_reg_known(struct bpf_reg_state *reg, u64 imm)
 {
-       reg->id = 0;
+       /* Clear id, off, and union(map_ptr, range) */
+       memset(((u8 *)reg) + sizeof(reg->type), 0,
+              offsetof(struct bpf_reg_state, var_off) - sizeof(reg->type));
        reg->var_off = tnum_const(imm);
        reg->smin_value = (s64)imm;
        reg->smax_value = (s64)imm;
@@ -589,7 +591,6 @@ static void __mark_reg_known_zero(struct bpf_reg_state *reg)
 static void __mark_reg_const_zero(struct bpf_reg_state *reg)
 {
        __mark_reg_known(reg, 0);
-       reg->off = 0;
        reg->type = SCALAR_VALUE;
 }
 
@@ -700,9 +701,12 @@ static void __mark_reg_unbounded(struct bpf_reg_state *reg)
 /* Mark a register as having a completely unknown (scalar) value. */
 static void __mark_reg_unknown(struct bpf_reg_state *reg)
 {
+       /*
+        * Clear type, id, off, and union(map_ptr, range) and
+        * padding between 'type' and union
+        */
+       memset(reg, 0, offsetof(struct bpf_reg_state, var_off));
        reg->type = SCALAR_VALUE;
-       reg->id = 0;
-       reg->off = 0;
        reg->var_off = tnum_unknown;
        reg->frameno = 0;
        __mark_reg_unbounded(reg);
@@ -1640,9 +1644,6 @@ static int check_mem_access(struct bpf_verifier_env *env, 
int insn_idx, u32 regn
                        else
                                mark_reg_known_zero(env, regs,
                                                    value_regno);
-                       regs[value_regno].id = 0;
-                       regs[value_regno].off = 0;
-                       regs[value_regno].range = 0;
                        regs[value_regno].type = reg_type;
                }
 
@@ -2495,7 +2496,6 @@ static int check_helper_call(struct bpf_verifier_env 
*env, int func_id, int insn
                        regs[BPF_REG_0].type = PTR_TO_MAP_VALUE_OR_NULL;
                /* There is no offset yet applied, variable or fixed */
                mark_reg_known_zero(env, regs, BPF_REG_0);
-               regs[BPF_REG_0].off = 0;
                /* remember map_ptr, so that check_map_access()
                 * can check 'value_size' boundary of memory access
                 * to map element returned from bpf_map_lookup_elem()
-- 
2.17.1


Reply via email to