On 07/15/16 05:45, Matthew Macy wrote:
> glebius last commit needs some further re-work.

Hi,

Glebius' commit needs to be backed out, at least the API change that alters the return value of callout_stop() when the callout is scheduled and being serviced. Simply because there is code out there, as Matthew and others have discovered, that is "refcounting" on callout_reset() and expecting that a subsequent callout_stop() will return 1 to "unref".

If you consider backing it out impossible, maybe a fourth return value is needed to distinguish CANCELLED from DRAINING.

Further, getting the callouts straight in the TCP stack is a matter of doing the locking correctly (which some have called "my magic bullet") and not of the return values. In the following revision, https://svnweb.freebsd.org/changeset/base/302768 , I've proposed adding a new callout API that accepts a locking function, so that the callout code can run its cancellation checks at the right place in situations where more than one lock is needed.

Consider this case:

void
tcp_timer_2msl(void *xtp)
{
        struct tcpcb *tp = xtp;
        struct inpcb *inp;
        CURVNET_SET(tp->t_vnet);
#ifdef TCPDEBUG
        int ostate;

        ostate = tp->t_state;
#endif
        INP_INFO_RLOCK(&V_tcbinfo);
        inp = tp->t_inpcb;
        KASSERT(inp != NULL, ("%s: tp %p tp->t_inpcb == NULL", __func__, tp));
        INP_WLOCK(inp);
        tcp_free_sackholes(tp);
        if (callout_pending(&tp->t_timers->tt_2msl) ||
            !callout_active(&tp->t_timers->tt_2msl)) {

Here we have a custom, in-house race check that affects neither the return value of callout_reset() nor that of callout_stop().

                INP_WUNLOCK(tp->t_inpcb);
                INP_INFO_RUNLOCK(&V_tcbinfo);
                CURVNET_RESTORE();
                return;


I propose the following solution:


static void
tcp_timer_2msl_lock(void *xtp, int do_lock)
{
        struct tcpcb *tp = xtp;
        struct inpcb *inp;

        inp = tp->t_inpcb;

        if (do_lock) {
                CURVNET_SET(tp->t_vnet);
                INP_INFO_RLOCK(&V_tcbinfo);
                INP_WLOCK(inp);
        } else {
                INP_WUNLOCK(inp);
                INP_INFO_RUNLOCK(&V_tcbinfo);
                CURVNET_RESTORE();
        }
}


callout_init_lock_function(&callout, &tcp_timer_2msl_lock, CALLOUT_RETURNUNLOCKED);

Then in softclock_call_cc() it will look like this:


        CC_UNLOCK(cc);
        if (c_lock != NULL) {
                if (have locking function)
                        tcp_timer_2msl_lock(c_arg, 1);
                else
                        class->lc_lock(c_lock, lock_status);
                /*
                 * The callout may have been cancelled
                 * while we switched locks.
                 */

Actually, "CC_LOCK(cc)" should be placed in front of the cc_exec_cancel() test to avoid races when testing, setting and clearing this variable, as is done in hps_head.

                if (cc_exec_cancel(cc, direct)) {
                        if (have locking function)
                                tcp_timer_2msl_lock(c_arg, 0);
                        else
                                class->lc_unlock(c_lock);
                        goto skip;
                }
                cc_exec_cancel(cc, direct) = true;

....

skip:
        if ((c_iflags & CALLOUT_RETURNUNLOCKED) == 0) {
                if (have locking function)
                        ...
                else
                        class->lc_unlock(c_lock);
        }

The whole point of this is to make the cancelled check atomic:

1) Lock TCP
2) Lock CC_LOCK()
3) change callout state

--HPS
_______________________________________________
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
