Re: hfsc_deferred race

2017-08-17 Thread Mike Belopuhov
On Tue, Aug 15, 2017 at 17:14 +0200, Mike Belopuhov wrote:
> Hi,
> 
> I've just triggered an assert in hfsc_deferred (a callout) on an
> MP kernel running on an SP virtual machine:
> 
>   panic: kernel diagnostic assertion "HFSC_ENABLED(ifq)" failed: file "/home/mike/src/openbsd/sys/net/hfsc.c", line 950
>   Stopped at  db_enter+0x9:   leave
>       TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
>   *247463  28420      0         0x3          0    0  pfctl
>   db_enter() at db_enter+0x9
>   panic(817f78f0,4,81a3ffc0,8110c140,800c2060,ffff81598b1c) at panic+0x102
>   __assert(81769d93,817d7350,3b6,817d72bd) at __assert+0x35
>   hfsc_deferred(800c2060) at hfsc_deferred+0x9e
>   timeout_run(8004adc8) at timeout_run+0x4c
>   softclock(0) at softclock+0x146
>   softintr_dispatch(0) at softintr_dispatch+0x9f
>   Xsoftclock() at Xsoftclock+0x1f
>   --- interrupt ---
>   end of kernel
>   end trace frame: 0x728d481974c08548, count: 7
>   0x2cfe9c031c9:
>   https://www.openbsd.org/ddb.html describes the minimum info required in bug
>   reports.  Insufficient info makes it difficult to find and fix bugs.
>   ddb{0}> ps
>      PID     TID   PPID    UID  S       FLAGS  WAIT    COMMAND
>   *28420  247463   5000      0  7         0x3          pfctl
> 
> 
> pfctl runs in a loop reloading the ruleset.  At some point we
> disable HFSC on the interface and lose the race with hfsc_deferred
> before re-enabling it.
>

While talking to visa@, I've added this piece that explains where
the race is happening:

When we *disable* hfsc, we put back the prio ifq ops:

203 ifq_attach(struct ifqueue *ifq, const struct ifq_ops *newops, void *opsarg)
204 {
205         struct mbuf_list ml = MBUF_LIST_INITIALIZER();
206         struct mbuf_list free_ml = MBUF_LIST_INITIALIZER();
207         struct mbuf *m;
208         const struct ifq_ops *oldops;
209         void *newq, *oldq;
210
211         newq = newops->ifqop_alloc(ifq->ifq_idx, opsarg);
212
213         mtx_enter(&ifq->ifq_mtx);
214         ifq->ifq_ops->ifqop_purge(ifq, &ml);
215         ifq->ifq_len = 0;
216
217         oldops = ifq->ifq_ops;
218         oldq = ifq->ifq_q;
219
220         ifq->ifq_ops = newops;
221         ifq->ifq_q = newq;
222
223         while ((m = ml_dequeue(&ml)) != NULL) {
224                 m = ifq->ifq_ops->ifqop_enq(ifq, m);
225                 if (m != NULL) {
226                         ifq->ifq_qdrops++;
227                         ml_enqueue(&free_ml, m);
228                 } else
229                         ifq->ifq_len++;
230         }
231         mtx_leave(&ifq->ifq_mtx);
232
233         oldops->ifqop_free(ifq->ifq_idx, oldq);
234
235         ml_purge(&free_ml);
236 }

Line 214 calls hfsc_purge, on line 231 we release the IPL_NET mutex
that protects us from interrupts, and only on line 233 do we call
hfsc_free, which does the timeout_del.  This opens a window of
opportunity after we release the mutex for a network interrupt to
fire and dispatch the softclock softintr before control returns to
ifq_attach.
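
Spelled out as an interleaving (a rough sketch, not verbatim code, and
it assumes the timeout_del ends up in hfsc_free as described above):

/*
 *  ifq_attach() on the reload path           softclock softintr
 *  -------------------------------           ------------------------------
 *  220: ifq->ifq_ops = newops (prio)
 *  231: mtx_leave(&ifq->ifq_mtx)
 *                                            hfsc_deferred(ifp)
 *                                                KASSERT(HFSC_ENABLED(ifq))
 *                                                -> fails, hfsc already detached
 *  233: oldops->ifqop_free() -> hfsc_free()
 *           -> timeout_del(&hif->hif_defer)  too late, the callout already ran
 */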

And visa@ has pointed out another potential race, between ifqop_alloc
on line 211 (aka hfsc_alloc, which does the timeout_add) and line 213
where we grab the IPL_NET mutex before setting ifq_ops on line 220.
Since the NET_LOCK no longer provides interrupt protection from
softclock softintrs, an IPL_NET interrupt can fire in that window and,
given the right timing, trigger hfsc_deferred before we set ifq_ops.
To avoid this race I'm proposing to move the timeout_add from
hfsc_alloc to hfsc_enq_begin, i.e. to arm the timeout only when there
are packets to deal with.
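
In the enqueue path (hfsc_enq() per the hunk below) that would amount
to roughly the following.  This is only a sketch of the idea: the hunk
below is truncated, so the exact tail of it, including the m_pkthdr.len
argument and where exactly the timeout_add lands, is assumed here.

        dm = hfsc_class_enqueue(cl, m);

        /* successfully queued. */
        if (dm != m && hfsc_class_qlength(cl) == 1) {
                hfsc_set_active(hif, cl, m->m_pkthdr.len);
                /* arm the deferred callout now that there is work queued */
                timeout_add(&hif->hif_defer, 1);
        }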

> IFQ has a mechanism to lock the underlying object and I believe this
> is the right tool for this job.  Any other ideas?
> 
> I don't think it's a good idea to hold the mutex (ifq_q_enter and
> ifq_q_leave effectively lock and unlock it) across the ifq_start call,
> so we have to make a concession and run ifq_start before knowing
> whether or not HFSC is attached.  IMO, that's a small price to pay to
> avoid clutter.  The kernel lock assertion is pointless at this point.
> 
> OK?
> 

A new diff combining all three modifications:

diff --git sys/net/hfsc.c sys/net/hfsc.c
index 410bea733c6..b81afd43531 100644
--- sys/net/hfsc.c
+++ sys/net/hfsc.c
@@ -584,14 +584,13 @@ hfsc_idx(unsigned int nqueues, const struct mbuf *m)
 
 void *
 hfsc_alloc(unsigned int idx, void *q)
 {
         struct hfsc_if *hif = q;
+
         KASSERT(idx == 0); /* when hfsc is enabled we only use the first ifq */
         KASSERT(hif != NULL);
-
-        timeout_add(&hif->hif_defer, 1);
         return (hif);
 }
 
 void
 hfsc_free(unsigned int idx, void *q)
@@ -825,12 +824,15 @@ hfsc_enq(struct ifqueue *ifq, struct mbuf *m)
         }
 
         dm = hfsc_class_enqueue(cl, m);
 
         /* successfully queued. */
-        if (dm != m && hfsc_class_qlength(cl) == 1)
+        if (dm != m && hfsc_class_qlength(cl) == 1) {
                 hfsc_set_active(hif, cl, m->m_pkthdr.

Re: hfsc_deferred race

2017-08-16 Thread Mike Belopuhov
On Tue, Aug 15, 2017 at 17:14 +0200, Mike Belopuhov wrote:
> Hi,
> 
> I've just triggered an assert in hfsc_deferred (a callout) on an
> MP kernel running on an SP virtual machine:
> 
>   panic: kernel diagnostic assertion "HFSC_ENABLED(ifq)" failed: file "/home/mike/src/openbsd/sys/net/hfsc.c", line 950
>   Stopped at  db_enter+0x9:   leave
>       TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
>   *247463  28420      0         0x3          0    0  pfctl
>   db_enter() at db_enter+0x9
>   panic(817f78f0,4,81a3ffc0,8110c140,800c2060,ffff81598b1c) at panic+0x102
>   __assert(81769d93,817d7350,3b6,817d72bd) at __assert+0x35
>   hfsc_deferred(800c2060) at hfsc_deferred+0x9e
>   timeout_run(8004adc8) at timeout_run+0x4c
>   softclock(0) at softclock+0x146
>   softintr_dispatch(0) at softintr_dispatch+0x9f
>   Xsoftclock() at Xsoftclock+0x1f
>   --- interrupt ---
>   end of kernel
>   end trace frame: 0x728d481974c08548, count: 7
>   0x2cfe9c031c9:
>   https://www.openbsd.org/ddb.html describes the minimum info required in bug
>   reports.  Insufficient info makes it difficult to find and fix bugs.
>   ddb{0}> ps
>      PID     TID   PPID    UID  S       FLAGS  WAIT    COMMAND
>   *28420  247463   5000      0  7         0x3          pfctl
> 
> 
> pfctl runs in a loop reloading the ruleset.  At some point we
> disable HFSC on the interface and lose the race with hfsc_deferred
> before re-enabling it.
> 
> IFQ has a mechanism to lock the underlying object and I believe this
> is the right tool for this job.  Any other ideas?
> 
> I don't think it's a good idea to hold the mutex (ifq_q_enter and
> ifq_q_leave effectively lock and unlock it) across the ifq_start call,
> so we have to make a concession and run ifq_start before knowing
> whether or not HFSC is attached.  IMO, that's a small price to pay to
> avoid clutter.  The kernel lock assertion is pointless at this point.
> 
> OK?
>

I've been running with this while debugging the issue with the active
class list ("panic: kernel diagnostic assertion" from Aug 12 on bugs@),
and I'm quite confident that it works: I no longer observe the race.

In addition, I've figured out that we can keep the HFSC_ENABLED check,
as there is no issue with bailing out early here:

diff --git sys/net/hfsc.c sys/net/hfsc.c
index 12504267dc5..c51f1406a0b 100644
--- sys/net/hfsc.c
+++ sys/net/hfsc.c
@@ -950,10 +950,13 @@ hfsc_deferred(void *arg)
 {
         struct ifnet *ifp = arg;
         struct ifqueue *ifq = &ifp->if_snd;
         struct hfsc_if *hif;
 
+        if (!HFSC_ENABLED(ifq))
+                return;
+
         if (!ifq_empty(ifq))
                 ifq_start(ifq);
 
         hif = ifq_q_enter(&ifp->if_snd, ifq_hfsc_ops);
         if (hif == NULL)


> diff --git sys/net/hfsc.c sys/net/hfsc.c
> index 410bea733c6..3c5b6f6ef78 100644
> --- sys/net/hfsc.c
> +++ sys/net/hfsc.c
> @@ -944,20 +944,19 @@ hfsc_deferred(void *arg)
>  {
>          struct ifnet *ifp = arg;
>          struct ifqueue *ifq = &ifp->if_snd;
>          struct hfsc_if *hif;
>  
> -        KERNEL_ASSERT_LOCKED();
> -        KASSERT(HFSC_ENABLED(ifq));
> -
>          if (!ifq_empty(ifq))
>                  ifq_start(ifq);
>  
> -        hif = ifq->ifq_q;
> -
> +        hif = ifq_q_enter(&ifp->if_snd, ifq_hfsc_ops);
> +        if (hif == NULL)
> +                return;
>          /* XXX HRTIMER nearest virtual/fit time is likely less than 1/HZ. */
>          timeout_add(&hif->hif_defer, 1);
> +        ifq_q_leave(&ifp->if_snd, hif);
>  }
>  
>  void
>  hfsc_cl_purge(struct hfsc_if *hif, struct hfsc_class *cl, struct mbuf_list *ml)
>  {



hfsc_deferred race

2017-08-15 Thread Mike Belopuhov
Hi,

I've just triggered an assert in hfsc_deferred (a callout) on an
MP kernel running on an SP virtual machine:

  panic: kernel diagnostic assertion "HFSC_ENABLED(ifq)" failed: file "/home/mike/src/openbsd/sys/net/hfsc.c", line 950
  Stopped at  db_enter+0x9:   leave
      TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
  *247463  28420      0         0x3          0    0  pfctl
  db_enter() at db_enter+0x9
  panic(817f78f0,4,81a3ffc0,8110c140,800c2060,ffff81598b1c) at panic+0x102
  __assert(81769d93,817d7350,3b6,817d72bd) at __assert+0x35
  hfsc_deferred(800c2060) at hfsc_deferred+0x9e
  timeout_run(8004adc8) at timeout_run+0x4c
  softclock(0) at softclock+0x146
  softintr_dispatch(0) at softintr_dispatch+0x9f
  Xsoftclock() at Xsoftclock+0x1f
  --- interrupt ---
  end of kernel
  end trace frame: 0x728d481974c08548, count: 7
  0x2cfe9c031c9:
  https://www.openbsd.org/ddb.html describes the minimum info required in bug
  reports.  Insufficient info makes it difficult to find and fix bugs.
  ddb{0}> ps
     PID     TID   PPID    UID  S       FLAGS  WAIT    COMMAND
  *28420  247463   5000      0  7         0x3          pfctl


pfctl runs in a loop reloading the ruleset.  At some point we
disable HFSC on the interface and lose the race with hfsc_deferred
before re-enabling it.

IFQ has a mechanism to lock the underlying object and I believe this
is the right tool for this job.  Any other ideas?
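
For reference, the ifq_q_enter()/ifq_q_leave() pair does roughly this
(paraphrased from sys/net/ifq.c of this vintage, so treat it as a
sketch rather than the exact code):

void *
ifq_q_enter(struct ifqueue *ifq, const struct ifq_ops *ops)
{
        mtx_enter(&ifq->ifq_mtx);
        if (ifq->ifq_ops == ops)
                return (ifq->ifq_q);    /* ifq_mtx stays held until ifq_q_leave() */

        mtx_leave(&ifq->ifq_mtx);
        return (NULL);                  /* a different discipline is attached */
}

void
ifq_q_leave(struct ifqueue *ifq, void *q)
{
        KASSERT(q == ifq->ifq_q);
        mtx_leave(&ifq->ifq_mtx);
}

So a successful ifq_q_enter() both verifies that hfsc is still the
attached discipline and keeps ifq_mtx held across the timeout_add,
which is what serializes hfsc_deferred against ifq_attach.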

I don't think it's a good idea to hold the mutex (ifq_q_enter and
ifq_q_leave effectively lock and unlock it) across the ifq_start call,
so we have to make a concession and run ifq_start before knowing
whether or not HFSC is attached.  IMO, that's a small price to pay to
avoid clutter.  The kernel lock assertion is pointless at this point.

OK?

diff --git sys/net/hfsc.c sys/net/hfsc.c
index 410bea733c6..3c5b6f6ef78 100644
--- sys/net/hfsc.c
+++ sys/net/hfsc.c
@@ -944,20 +944,19 @@ hfsc_deferred(void *arg)
 {
         struct ifnet *ifp = arg;
         struct ifqueue *ifq = &ifp->if_snd;
         struct hfsc_if *hif;
 
-        KERNEL_ASSERT_LOCKED();
-        KASSERT(HFSC_ENABLED(ifq));
-
         if (!ifq_empty(ifq))
                 ifq_start(ifq);
 
-        hif = ifq->ifq_q;
-
+        hif = ifq_q_enter(&ifp->if_snd, ifq_hfsc_ops);
+        if (hif == NULL)
+                return;
         /* XXX HRTIMER nearest virtual/fit time is likely less than 1/HZ. */
         timeout_add(&hif->hif_defer, 1);
+        ifq_q_leave(&ifp->if_snd, hif);
 }
 
 void
 hfsc_cl_purge(struct hfsc_if *hif, struct hfsc_class *cl, struct mbuf_list *ml)
 {