sk_psock_strp_parse() runs the BPF_PROG_TYPE_SK_SKB stream-parser program
to find the length of the next message. strparser assembles a message out
of several received skbs by chaining them onto the head's frag_list and
recording where to append the next one in strp->skb_nextp:
*strp->skb_nextp = skb;
strp->skb_nextp = &skb->next;
and then calls the parser on the head:
len = (*strp->cb.parse_msg)(strp, head);
The parser is only meant to inspect the skb, but the program may call
bpf_skb_change_tail() -- or the sibling bpf_skb_pull_data(),
bpf_skb_change_head(), bpf_skb_adjust_room(), all allowed for SK_SKB.
Once the head carries a frag_list these go
... -> skb_ensure_writable -> pskb_may_pull -> __pskb_pull_tail
and __pskb_pull_tail() frees the frag_list skbs that strparser still
tracks through skb_nextp:
while ((list = skb_shinfo(skb)->frag_list) != insp) {
skb_shinfo(skb)->frag_list = list->next;
consume_skb(list);
}
strp->skb_nextp now points into a freed sk_buff. The next segment of
the same message arrives in __strp_recv(), which links it with
*strp->skb_nextp = skb, an 8-byte write into the freed skb. The free
and the write happen in different __strp_recv() calls, so the message
has to span at least three segments before it triggers.
BUG: KASAN: slab-use-after-free in __strp_recv+0x447/0xda0
Write of size 8 at addr ffff88810db86140 by task repro/349
Call Trace:
<IRQ>
__strp_recv+0x447/0xda0
__tcp_read_sock+0x13d/0x590
tcp_bpf_strp_read_sock+0x195/0x320
strp_data_ready+0x267/0x340
sk_psock_strp_data_ready+0x1ce/0x350
tcp_data_queue+0x1364/0x2fd0
tcp_rcv_established+0xe07/0x1640
[...]
Allocated by task 349:
skb_clone+0x17b/0x210
__strp_recv+0x2c3/0xda0
__tcp_read_sock+0x13d/0x590
[...]
Freed by task 349:
kmem_cache_free+0x150/0x570
__pskb_pull_tail+0x57b/0xc20
skb_ensure_writable+0x236/0x260
__bpf_skb_change_tail+0x1d4/0x590
sk_skb_change_tail+0x2a/0x40
bpf_prog_1b285dcd6c41373e+0x27/0x30
bpf_prog_run_pin_on_cpu+0xf3/0x260
sk_psock_strp_parse+0x118/0x1e0
__strp_recv+0x4f6/0xda0
[...]
The same resize also leaves the head's length inconsistent with its
frags, so a later __pskb_pull_tail() can instead hit the
BUG_ON(skb_copy_bits(...)) in net/core/skbuff.c.
A stream parser is only meant to measure the next message, not to modify
the packet. Reject a parser whose program can change packet data
(prog->aux->changes_pkt_data) at attach time. The check is shared by
sock_map_prog_update() and sock_map_link_update_prog(), which between them
cover prog attach, link create and link update. Verdict programs are
unaffected and may still modify the skb.
Signed-off-by: Sechang Lim <[email protected]>
---
net/core/sock_map.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index 99e3789492a0..c60ba6d292f9 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -1515,6 +1515,17 @@ static int sock_map_prog_link_lookup(struct bpf_map
*map, struct bpf_prog ***ppr
return 0;
}
+static int sock_map_prog_attach_check(enum bpf_attach_type attach_type,
+ struct bpf_prog *prog)
+{
+ /* A stream parser must not modify the skb, only measure it. */
+ if (prog && attach_type == BPF_SK_SKB_STREAM_PARSER &&
+ prog->aux->changes_pkt_data)
+ return -EINVAL;
+
+ return 0;
+}
+
/* Handle the following four cases:
* prog_attach: prog != NULL, old == NULL, link == NULL
* prog_detach: prog == NULL, old != NULL, link == NULL
@@ -1533,6 +1544,10 @@ static int sock_map_prog_update(struct bpf_map *map,
struct bpf_prog *prog,
if (ret)
return ret;
+ ret = sock_map_prog_attach_check(which, prog);
+ if (ret)
+ return ret;
+
/* for prog_attach/prog_detach/link_attach, return error if a bpf_link
* exists for that prog.
*/
@@ -1776,6 +1791,11 @@ static int sock_map_link_update_prog(struct bpf_link
*link,
ret = -EINVAL;
goto out;
}
+
+ ret = sock_map_prog_attach_check(link->attach_type, prog);
+ if (ret)
+ goto out;
+
if (!sockmap_link->map) {
ret = -ENOLINK;
goto out;
--
2.43.0