Fernando Fernandez Mancera <[email protected]> wrote:
> Since commit d265929930e2 ("netfilter: nf_conncount: reduce unnecessary
> GC") if nftables/iptables connlimit is used without a check for new
> connections there can be duplicated entries tracked.
> 
> Pass the nf_conn struct directly to the nf_conncount API and check
> whether the connection is confirmed or not inside nf_conncount_add(). If
> the connection is confirmed, skip it.

I think there is a bit too much noise here, can this be
split in several chunks?

> -unsigned int nf_conncount_count(struct net *net,
> +unsigned int nf_conncount_count(struct net *net, const struct nf_conn *ct,
>                               struct nf_conncount_data *data,
> -                             const u32 *key,
> -                             const struct nf_conntrack_tuple *tuple,
> -                             const struct nf_conntrack_zone *zone);
> +                             const u32 *key);

nf_conn *ct has pitfalls that I did not realize earlier.
Current code is also buggy in that regard.
Maybe we need additional function to ease this for callers.

See below.
> +static int __nf_conncount_add(const struct nf_conn *ct,
> +                           struct nf_conncount_list *list)
>  {
>       const struct nf_conntrack_tuple_hash *found;
>       struct nf_conncount_tuple *conn, *conn_n;
> +     const struct nf_conntrack_tuple *tuple;
> +     const struct nf_conntrack_zone *zone;
>       struct nf_conn *found_ct;
>       unsigned int collect = 0;
>  
> +     if (!ct || nf_ct_is_confirmed(ct))
> +             return -EINVAL;
> +

When is the caller expected to pass a NULL ct?
Maybe this just misses a comment.

> index fc35a11cdca2..e815c0235b62 100644
> --- a/net/netfilter/nft_connlimit.c
> +++ b/net/netfilter/nft_connlimit.c
> @@ -24,28 +24,35 @@ static inline void nft_connlimit_do_eval(struct 
> nft_connlimit *priv,
>                                        const struct nft_pktinfo *pkt,
>                                        const struct nft_set_ext *ext)
>  {
> -     const struct nf_conntrack_zone *zone = &nf_ct_zone_dflt;
> -     const struct nf_conntrack_tuple *tuple_ptr;
> +     struct nf_conntrack_tuple_hash *h;
>       struct nf_conntrack_tuple tuple;
>       enum ip_conntrack_info ctinfo;
>       const struct nf_conn *ct;
>       unsigned int count;
> -
> -     tuple_ptr = &tuple;
> +     int err;
>  
>       ct = nf_ct_get(pkt->skb, &ctinfo);
> -     if (ct != NULL) {
> -             tuple_ptr = &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple;
> -             zone = nf_ct_zone(ct);
> -     } else if (!nf_ct_get_tuplepr(pkt->skb, skb_network_offset(pkt->skb),
> -                                   nft_pf(pkt), nft_net(pkt), &tuple)) {
> -             regs->verdict.code = NF_DROP;
> -             return;
> +     if (!ct) {
> +             if (!nf_ct_get_tuplepr(pkt->skb, skb_network_offset(pkt->skb),
> +                                    nft_pf(pkt), nft_net(pkt), &tuple)) {
> +                     regs->verdict.code = NF_DROP;
> +                     return;
> +             }
> +
> +             h = nf_conntrack_find_get(nft_net(pkt), &nf_ct_zone_dflt, 
> &tuple);
> +             if (!h) {
> +                     regs->verdict.code = NF_DROP;
> +                     return;
> +             }
> +             ct = nf_ct_tuplehash_to_ctrack(h);

This ct had its refcount incremented.

I also see that this shared copypaste with xtables.
Would it be possible to pass 'const struct sk_buff *'
and let the nf_conncount core handle this internally?

nf_conncount_add(net, pf, skb, priv->list);

which does:
        ct = nf_ct_get(skb, &ctinfo);
        if (ct && !nf_ct_is_template(ct))
                return __nf_conncount_add(ct, list);

        if (!nf_ct_get_tuplepr(skb, skb_network_offset(skb), pf, net,
                                &tuple))
                return -ERR;

        if (ct) /* its a template, so do lookup in right zone */
                zone = nf_ct_zone(ct);
        else
                zone = &nf_ct_zone_dflt;

        h = nf_conntrack_find_get(nft_net(pkt), zone, &tuple);
        if (!h)
                return -ERR;

        ct = nf_ct_tuplehash_to_ctrack(h);

        err = __nf_conncount_add(ct, list);

        nf_ct_put(ct):

        return err;

I.e., the existing nf_conncount_add() becomes a helper that takes
a ct, as you have already proposed, but its renamed and turned into
an internal helper so frontends don't need to duplicate tuple lookup.

Alternatively, no need to rename it and instead add a new API call
that takes the ct argument, e.g. 'nf_conncount_add_ct' or whatever,
and then make nf_conncount_add() internal in a followup patch.

> +             h = nf_conntrack_find_get(net, &nf_ct_zone_dflt, &tuple);

We should not restrict the lookup to the default zone,
but we should follow what the template (if any) says.

I under-estimated the work to get this right, I think this is too much
code to copypaste between nft+xt frontends.  Sorry for making that
suggestion originally.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to