On Mon, 24 Sep 2018 at 05:12, Daniel Borkmann <dan...@iogearbox.net> wrote: > > Hi Joe, > > couple of comments inline:
Thanks for the review, I'll fix up those bits. > On 09/21/2018 07:10 PM, Joe Stringer wrote: > > This patch adds new BPF helper functions, bpf_sk_lookup_tcp() and > > bpf_sk_lookup_udp() which allows BPF programs to find out if there is a > > socket listening on this host, and returns a socket pointer which the > > BPF program can then access to determine, for instance, whether to > > forward or drop traffic. bpf_sk_lookup_xxx() may take a reference on the > > socket, so when a BPF program makes use of this function, it must > > subsequently pass the returned pointer into the newly added sk_release() > > to return the reference. > > > > By way of example, the following pseudocode would filter inbound > > connections at XDP if there is no corresponding service listening for > > the traffic: > > > > struct bpf_sock_tuple tuple; > > struct bpf_sock_ops *sk; > > > > populate_tuple(ctx, &tuple); // Extract the 5tuple from the packet > > sk = bpf_sk_lookup_tcp(ctx, &tuple, sizeof tuple, netns, 0); > > if (!sk) { > > // Couldn't find a socket listening for this traffic. Drop. > > return TC_ACT_SHOT; > > } > > bpf_sk_release(sk, 0); > > return TC_ACT_OK; > > > > Signed-off-by: Joe Stringer <j...@wand.net.nz> > > > > --- > > > > v2: Rework 'struct bpf_sock_tuple' to allow passing a packet pointer > > Limit netns_id field to 32 bits > > Fix compile error with CONFIG_IPV6 enabled > > Allow direct packet access from helper > > --- > > include/uapi/linux/bpf.h | 57 ++++++++- > > kernel/bpf/verifier.c | 8 +- > > net/core/filter.c | 149 ++++++++++++++++++++++ > > tools/include/uapi/linux/bpf.h | 57 ++++++++- > > tools/testing/selftests/bpf/bpf_helpers.h | 12 ++ > > 5 files changed, 280 insertions(+), 3 deletions(-) > > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > > index aa5ccd2385ed..620adbb09a94 100644 > > --- a/include/uapi/linux/bpf.h > > +++ b/include/uapi/linux/bpf.h > > ... > > +/* bpf_sk_lookup performs the core lookup for different types of sockets, > > + * taking a reference on the socket if it doesn't have the flag > > SOCK_RCU_FREE. > > + * Returns the socket as an 'unsigned long' to simplify the casting in the > > + * callers to satisfy BPF_CALL declarations. > > + */ > > +static unsigned long > > +bpf_sk_lookup(struct sk_buff *skb, struct bpf_sock_tuple *tuple, u32 len, > > + u8 proto, u64 netns_id, u64 flags) > > +{ > > + struct net *caller_net = dev_net(skb->dev); > > For sk_skb programs, are we *always* guaranteed to have a skb->dev assigned? > > This definitely holds true for tc programs, but afaik not for sk_skb ones > where > you enable the helpers below, so this would result in a NULL ptr dereference. I'll update this to take the netns from the skb->sk in those hooks. > > +BPF_CALL_2(bpf_sk_release, struct sock *, sk, u64, flags) > > +{ > > + if (!sock_flag(sk, SOCK_RCU_FREE)) > > + sock_gen_put(sk); > > + > > + if (unlikely(flags)) > > + return -EINVAL; > > I guess it's probably okay to leave here, though I'm wondering whether it's > worth to have a flags part in general in this helper. We need to release the > reference in any case beforehands anyway as we otherwise leak it. Personally, > I'd probably that arg here. I agree, I can't think of a good reason to have optional behaviour on socket release. I'll drop it. > > + return 0; > > +} > > + > > +static const struct bpf_func_proto bpf_sk_release_proto = { > > + .func = bpf_sk_release, > > + .gpl_only = false, > > + .ret_type = RET_INTEGER, > > + .arg1_type = ARG_PTR_TO_SOCKET, > > + .arg2_type = ARG_ANYTHING, > > +}; > > + > > bool bpf_helper_changes_pkt_data(void *func) > > { > > if (func == bpf_skb_vlan_push || > > @@ -5018,6 +5155,12 @@ tc_cls_act_func_proto(enum bpf_func_id func_id, > > const struct bpf_prog *prog) > > case BPF_FUNC_skb_ancestor_cgroup_id: > > return &bpf_skb_ancestor_cgroup_id_proto; > > #endif > > + case BPF_FUNC_sk_lookup_tcp: > > + return &bpf_sk_lookup_tcp_proto; > > + case BPF_FUNC_sk_lookup_udp: > > + return &bpf_sk_lookup_udp_proto; > > + case BPF_FUNC_sk_release: > > + return &bpf_sk_release_proto; > > default: > > return bpf_base_func_proto(func_id); > > } > > @@ -5118,6 +5261,12 @@ sk_skb_func_proto(enum bpf_func_id func_id, const > > struct bpf_prog *prog) > > return &bpf_sk_redirect_hash_proto; > > case BPF_FUNC_get_local_storage: > > return &bpf_get_local_storage_proto; > > + case BPF_FUNC_sk_lookup_tcp: > > + return &bpf_sk_lookup_tcp_proto; > > + case BPF_FUNC_sk_lookup_udp: > > + return &bpf_sk_lookup_udp_proto; > > + case BPF_FUNC_sk_release: > > + return &bpf_sk_release_proto; > > (See comment above.)