> diff --git a/net/core/lwt_bpf.c b/net/core/lwt_bpf.c
> index f71ef82a5f3d..bf588f508b79 100644
> --- a/net/core/lwt_bpf.c
> +++ b/net/core/lwt_bpf.c
> @@ -612,10 +613,16 @@ int bpf_lwt_push_ip_encap(struct sk_buff *skb, void 
> *hdr, u32 len, bool ingress)
>               ipv4 = true;
>               if (unlikely(len < iph->ihl * 4))
>                       return -EINVAL;

Can iph->ihl == 0 slip past this check?  Since len is a u32, iph->ihl * 4
is 0 when iph->ihl is 0, and len < 0 is never true, so the buffer is
accepted without returning -EINVAL.

ip_fast_csum() is later called with this iph->ihl, and on some
architectures (PowerPC, MIPS) that implementation assumes ihl >= 5 and
decrements the loop counter, which can underflow when ihl is 0.

Would an explicit iph->ihl >= 5 guard be needed here, for example:

        if (unlikely(iph->ihl < 5 || len < iph->ihl * 4))
                return -EINVAL;

This was raised in review of v1 and an earlier revision added the
iph->ihl < 5 guard, but the current revision appears to have dropped it
again.

>               is_udp_tunnel = iph->protocol == IPPROTO_UDP;
>               if (unlikely(is_udp_tunnel && len < iph->ihl * 4 + 
> sizeof(struct udphdr)))
>                       return -EINVAL;
>       } else if (iph->version == 6) {
>               ipv4 = false;
>               if (unlikely(len < sizeof(struct ipv6hdr)))
>                       return -EINVAL;
>               is_udp_tunnel = ((struct ipv6hdr *)iph)->nexthdr == NEXTHDR_UDP;
>               if (unlikely(is_udp_tunnel && len < sizeof(struct ipv6hdr) + 
> sizeof(struct udphdr)))
>                       return -EINVAL;
>       } else {
>               return -EINVAL;
>       }
> @@ -637,6 +644,11 @@ int bpf_lwt_push_ip_encap(struct sk_buff *skb, void 
> *hdr, u32 len, bool ingress)
>       if (ingress)
>               skb_postpush_rcsum(skb, iph, len);
>       skb_reset_network_header(skb);
> +     if (is_udp_tunnel) {
> +             size_t iph_sz = ipv4 ? iph->ihl * 4 : sizeof(struct ipv6hdr);
> +
> +             skb_set_transport_header(skb, skb_network_offset(skb) + iph_sz);
> +     }

Can iph->ihl change between the initial length validation and this read?

The hdr pointer, and therefore iph, refers to a buffer provided by the BPF
program, which can be concurrently modified by userspace through a shared
BPF map.  iph->ihl is re-read here to compute iph_sz, and this read happens
before the memcpy() of the header into the skb:

        if (is_udp_tunnel) {
                size_t iph_sz = ipv4 ? iph->ihl * 4 : sizeof(struct ipv6hdr);

                skb_set_transport_header(skb, skb_network_offset(skb) + iph_sz);
        }
        memcpy(skb_network_header(skb), hdr, len);

If iph->ihl is concurrently changed to a larger value such as 15 after the
initial validation, can the transport header offset be set past skb->tail,
leading to out-of-bounds access when drivers later dereference the
transport header for checksum updates?

The suggestion in the v1 discussion was to move skb_set_transport_header()
after the memcpy() so it reads from the copied-in header, but the current
revision still reads iph->ihl before the memcpy().

>       memcpy(skb_network_header(skb), hdr, len);
>       bpf_compute_data_pointers(skb);
>       skb_clear_hash(skb);

---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

CI run summary: https://github.com/kernel-patches/bpf/actions/runs/26830439032

Reply via email to