On Wed, Apr 21, 2021 at 10:19:10PM +0200, Alexandr Nedvedicky wrote:
> would it be OK to commit it once bluhm's diff [1] will be in?

Diff can be commited independently when we know that it is correct.

> --------8<---------------8<---------------8<------------------8<--------
> diff --git a/sys/net/pf.c b/sys/net/pf.c
> index 28aa6157552..d4212c21d7b 100644
> --- a/sys/net/pf.c
> +++ b/sys/net/pf.c
> @@ -7366,11 +7366,20 @@ pf_inp_unlink(struct inpcb *inp)
>  void
>  pf_state_key_link_reverse(struct pf_state_key *sk, struct pf_state_key 
> *skrev)
>  {
> -     /* Note that sk and skrev may be equal, then we refcount twice. */
> -     KASSERT(sk->reverse == NULL);
> -     KASSERT(skrev->reverse == NULL);
> -     sk->reverse = pf_state_key_ref(skrev);
> -     skrev->reverse = pf_state_key_ref(sk);
> +     struct pf_state_key *old_reverse;
> +
> +     old_reverse = atomic_cas_ptr(&sk->reverse, NULL, skrev);
> +     if (old_reverse != NULL)
> +             KASSERT(old_reverse == skrev);
> +     else
> +             pf_state_key_ref(skrev);
> +
Can you remove one of the double empty lines?
> +
> +     old_reverse = atomic_cas_ptr(&skrev->reverse, NULL, sk);
> +     if (old_reverse != NULL)
> +             KASSERT(old_reverse == sk);
> +     else
> +             pf_state_key_ref(sk);
>  }
>  
>  #if NPFLOG > 0

In genua code Markus and Haebaert have a more complex solution.
I don't know if we need this and I did not test it.  Just showing
so you can consider it.  A bunch of other functions are also
affected.  If we want it, I can convert it into a working diff.

If we think sashan@'s diff is sufficient, I am fine with that.
Otherwise I would recommend to port the working solution from genua.

void
pf_state_key_link_reverse(struct pf_state_key *sk, struct pf_state_key *skrev)
{
        struct pf_state_key *rev;

        /*
         * 'skrev' points to the statekey from the packet.
         *
         * We try to set 'skrev->reverse' to 'sk', so we can find 'sk' w/o
         * a lookup in the tree in the future.
         * However, both 'skrev->reverse' and 'sk->reverse' might still point
         * to state keys that are about to be removed, so we need to do these
         * steps:
         *
         * 1. If 'skrev->reverse' exists, we do nothing because the linked
         *    state key might be removed concurrently, i.e. the 'removed'
         *    attribute is set and pf_state_key_unlink_reverse() is running.
         *    When the remove is complete skrev->reverse will be set to NULL.
         * 2. Otherwise ('skrev->reverse' is NULL) we try to set
         *    sk->reverse=skrev'. This will fail if sk->reverse also points
         *    to a state key with 'removed' set.
         * 3. Only if step (2) succeeds we can update 'skrev->reverse=sk'.
         *
         * This way both skrev and sk point to each other through the
         * 'reverse' attribute.
         */
        if ((rev = skrev->reverse) != NULL) {
                /* points either to sk, or removed entry */
                KASSERT(rev == sk || rev->removed);
                return;
        }
        /*
         * Grab a reference to sk/skrev, if we succeed, try to set our reverse
         * pointers, if we fail, undo do reference.
         */
        if (!pf_state_key_ref_try(sk))
                return;
        if (!pf_state_key_ref_try(skrev)) {
                pf_state_key_unref(sk);
                return;
        }
        if (atomic_cas_ptr(&sk->reverse, NULL, skrev) != NULL) {
                /* points either to skrev, NULL, or removed entry */
                rev = sk->reverse;
                KASSERT(rev == skrev || rev == NULL || rev->removed);
                pf_state_key_unref(skrev);
                pf_state_key_unref(sk);
                return;
        }
        if (atomic_cas_ptr(&skrev->reverse, NULL, sk) != NULL) {
                rev = skrev->reverse;
                KASSERT(rev == sk || rev == NULL);
                return;
        }
        /* done */
}

int
pf_state_key_ref_try0(struct pf_state_key *sk)
{
        if (!pf_state_key_isvalid(sk))
                return (0);

        return (refcnt_take_try(sk->refcnt));
}

int
refcnt_take_try(struct refcnt *r)
{
        volatile unsigned int refcnt;

        while (1) {
                refcnt = r->refs;

                /* Failed */
                if (refcnt == 0)
                        return (0);

                if (atomic_cas_uint(&r->refs, refcnt, refcnt + 1) == refcnt)
                        return (1);
        }

        return (1);             /* NOTREACHED */
}

Reply via email to