Re: svn commit: r301198 - head/sys/dev/xen/netfront

2017-05-10 Thread John Baldwin
On Tuesday, May 09, 2017 08:36:13 PM Colin Percival wrote:
> On 05/09/17 03:09, Roger Pau Monn� wrote:
> > On Wed, May 03, 2017 at 05:13:40AM +, Colin Percival wrote:
> >> On 06/02/16 04:16, Roger Pau Monn� wrote:
> >>> Author: royger
> >>> Date: Thu Jun  2 11:16:35 2016
> >>> New Revision: 301198
> >>> URL: https://svnweb.freebsd.org/changeset/base/301198
> >>
> >> I think this commit is responsible for panics I'm seeing in EC2 on T2 
> >> family
> >> instances.  [...]
> >> but under high traffic volumes I think a separate thread can already be
> >> running in xn_rxeof, having dropped the RX lock while it passes a packet
> >> up the stack.  This would result in two different threads trying to process
> >> the same set of responses from the ring, with (unsurprisingly) bad results.
> > 
> > Hm, right, xn_rxeof drops the lock while pushing the packet up the stack.
> > There's a "XXX" comment on top of that, could you try to remove the lock
> > dripping and see what happens?
> > 
> > I'm not sure there's any reason to drop the lock here, I very much doubt
> > if_input is going to sleep.
> 
> Judging by
> $ grep -R -B 1 -A 1 if_input /usr/src/sys/dev
> I'm pretty sure that we do indeed need to drop the lock.  If it's possible
> to enter if_input while holding locks, there are a *lot* of network interface
> drivers which are dropping locks unnecessarily...

It depends on how the locks are used.  If a NIC driver uses a single lock for
both TX and RX, then it needs to drop the lock as on the TX side,
if_transmit/if_start will be invoked with various network stack locks held.
However, if a driver uses separate locks for RX and TX and doesn't acquire
the RX lock while holding a TX lock, then it should be safe to hold the lock
across if_input.

A lot of the older 100Mbps PCI NIC drivers only use a single lock and thus
need to drop the lock.  Many multiqueue NIC drivers use separate locks
however and probably don't need to drop them around if_input.

-- 
John Baldwin
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r301198 - head/sys/dev/xen/netfront

2017-05-10 Thread Roger Pau Monné
On Tue, May 09, 2017 at 08:36:13PM +, Colin Percival wrote:
> On 05/09/17 03:09, Roger Pau Monn� wrote:
> > On Wed, May 03, 2017 at 05:13:40AM +, Colin Percival wrote:
> >> On 06/02/16 04:16, Roger Pau Monn� wrote:
> >>> Author: royger
> >>> Date: Thu Jun  2 11:16:35 2016
> >>> New Revision: 301198
> >>> URL: https://svnweb.freebsd.org/changeset/base/301198
> >>
> >> I think this commit is responsible for panics I'm seeing in EC2 on T2 
> >> family
> >> instances.  [...]
> >> but under high traffic volumes I think a separate thread can already be
> >> running in xn_rxeof, having dropped the RX lock while it passes a packet
> >> up the stack.  This would result in two different threads trying to process
> >> the same set of responses from the ring, with (unsurprisingly) bad results.
> > 
> > Hm, right, xn_rxeof drops the lock while pushing the packet up the stack.
> > There's a "XXX" comment on top of that, could you try to remove the lock
> > dripping and see what happens?
> > 
> > I'm not sure there's any reason to drop the lock here, I very much doubt
> > if_input is going to sleep.
> 
> Judging by
> $ grep -R -B 1 -A 1 if_input /usr/src/sys/dev
> I'm pretty sure that we do indeed need to drop the lock.  If it's possible
> to enter if_input while holding locks, there are a *lot* of network interface
> drivers which are dropping locks unnecessarily...
> 
> >> 3. Why xn_ifinit_locked is consuming ring responses.
> > 
> > There might be pending RX packets on the ring, so netfront consumes them and
> > signals netback. In the unlikely event that the RX ring was full when
> > xn_ifinit_locked is called, not doing this would mean the RX queue would get
> > stuck forever, since there's no guarantee netfront will receive event 
> > channel
> > notifications.
> 
> In that case, I'm guessing it would be safe to skip this if another thread is
> already running xn_rxeof and chewing through the packets on the ring?  It
> would be easy to set a flag in xn_rxeof before we drop locks.

Hm, I think it would be better to fix xn_rxeof so that netfront doesn't drop
the lock while poking at the ring, what about the following patch?

It should also prevent needlessly dropping and acquiring the lock for each
packet netfront pushes up the stack.

NB: I've only compile-tested this.

---8<---
diff --git a/sys/dev/xen/netfront/netfront.c b/sys/dev/xen/netfront/netfront.c
index 99ccdaaf5000..482b9e948fde 100644
--- a/sys/dev/xen/netfront/netfront.c
+++ b/sys/dev/xen/netfront/netfront.c
@@ -1161,17 +1161,18 @@ xn_rxeof(struct netfront_rxq *rxq)
struct mbufq mbufq_rxq, mbufq_errq;
int err, work_to_do;
 
-   do {
-   XN_RX_LOCK_ASSERT(rxq);
-   if (!netfront_carrier_ok(np))
-   return;
+   XN_RX_LOCK_ASSERT(rxq);
+
+   if (!netfront_carrier_ok(np))
+   return;
 
-   /* XXX: there should be some sane limit. */
-   mbufq_init(_errq, INT_MAX);
-   mbufq_init(_rxq, INT_MAX);
+   /* XXX: there should be some sane limit. */
+   mbufq_init(_errq, INT_MAX);
+   mbufq_init(_rxq, INT_MAX);
 
-   ifp = np->xn_ifp;
+   ifp = np->xn_ifp;
 
+   do {
rp = rxq->ring.sring->rsp_prod;
rmb();  /* Ensure we see queued responses up to 'rp'. */
 
@@ -1191,7 +1192,7 @@ xn_rxeof(struct netfront_rxq *rxq)
}
 
m->m_pkthdr.rcvif = ifp;
-   if ( rx->flags & NETRXF_data_validated ) {
+   if (rx->flags & NETRXF_data_validated) {
/*
 * According to mbuf(9) the correct way to tell
 * the stack that the checksum of an inbound
@@ -1214,50 +1215,45 @@ xn_rxeof(struct netfront_rxq *rxq)
}
 
(void )mbufq_enqueue(_rxq, m);
-   rxq->ring.rsp_cons = i;
}
 
-   mbufq_drain(_errq);
+   rxq->ring.rsp_cons = i;
 
-   /*
-* Process all the mbufs after the remapping is complete.
-* Break the mbuf chain first though.
-*/
-   while ((m = mbufq_dequeue(_rxq)) != NULL) {
-   if_inc_counter(ifp, IFCOUNTER_IPACKETS, 1);
+   xn_alloc_rx_buffers(rxq);
 
-   /* XXX: Do we really need to drop the rx lock? */
-   XN_RX_UNLOCK(rxq);
+   RING_FINAL_CHECK_FOR_RESPONSES(>ring, work_to_do);
+   } while (work_to_do);
+
+   XN_RX_UNLOCK(rxq);
+   mbufq_drain(_errq);
+   /*
+* Process all the mbufs after the remapping is complete.
+* Break the mbuf chain first though.
+*/
+   while ((m = mbufq_dequeue(_rxq)) != NULL) {
+   if_inc_counter(ifp, IFCOUNTER_IPACKETS, 1);
 #if (defined(INET) || defined(INET6))
-  

Re: svn commit: r301198 - head/sys/dev/xen/netfront

2017-05-09 Thread Colin Percival
On 05/09/17 03:09, Roger Pau Monn� wrote:
> On Wed, May 03, 2017 at 05:13:40AM +, Colin Percival wrote:
>> On 06/02/16 04:16, Roger Pau Monn� wrote:
>>> Author: royger
>>> Date: Thu Jun  2 11:16:35 2016
>>> New Revision: 301198
>>> URL: https://svnweb.freebsd.org/changeset/base/301198
>>
>> I think this commit is responsible for panics I'm seeing in EC2 on T2 family
>> instances.  [...]
>> but under high traffic volumes I think a separate thread can already be
>> running in xn_rxeof, having dropped the RX lock while it passes a packet
>> up the stack.  This would result in two different threads trying to process
>> the same set of responses from the ring, with (unsurprisingly) bad results.
> 
> Hm, right, xn_rxeof drops the lock while pushing the packet up the stack.
> There's a "XXX" comment on top of that, could you try to remove the lock
> dripping and see what happens?
> 
> I'm not sure there's any reason to drop the lock here, I very much doubt
> if_input is going to sleep.

Judging by
$ grep -R -B 1 -A 1 if_input /usr/src/sys/dev
I'm pretty sure that we do indeed need to drop the lock.  If it's possible
to enter if_input while holding locks, there are a *lot* of network interface
drivers which are dropping locks unnecessarily...

>> 3. Why xn_ifinit_locked is consuming ring responses.
> 
> There might be pending RX packets on the ring, so netfront consumes them and
> signals netback. In the unlikely event that the RX ring was full when
> xn_ifinit_locked is called, not doing this would mean the RX queue would get
> stuck forever, since there's no guarantee netfront will receive event channel
> notifications.

In that case, I'm guessing it would be safe to skip this if another thread is
already running xn_rxeof and chewing through the packets on the ring?  It
would be easy to set a flag in xn_rxeof before we drop locks.

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r301198 - head/sys/dev/xen/netfront

2017-05-09 Thread Roger Pau Monné
On Wed, May 03, 2017 at 05:13:40AM +, Colin Percival wrote:
> On 06/02/16 04:16, Roger Pau Monné wrote:
> > Author: royger
> > Date: Thu Jun  2 11:16:35 2016
> > New Revision: 301198
> > URL: https://svnweb.freebsd.org/changeset/base/301198
> 
> I think this commit is responsible for panics I'm seeing in EC2 on T2 family
> instances.  Every time a DHCP request is made, we call into xn_ifinit_locked
> (not sure why -- something to do with making the interface promiscuous?) and
> hit this code
> 
> > @@ -1760,7 +1715,7 @@ xn_ifinit_locked(struct netfront_info *n
> > xn_alloc_rx_buffers(rxq);
> > rxq->ring.sring->rsp_event = rxq->ring.rsp_cons + 1;
> > if (RING_HAS_UNCONSUMED_RESPONSES(>ring))
> > -   taskqueue_enqueue(rxq->tq, >intrtask);
> > +   xn_rxeof(rxq);
> > XN_RX_UNLOCK(rxq);
> > }
> 
> but under high traffic volumes I think a separate thread can already be
> running in xn_rxeof, having dropped the RX lock while it passes a packet
> up the stack.  This would result in two different threads trying to process
> the same set of responses from the ring, with (unsurprisingly) bad results.

Hm, right, xn_rxeof drops the lock while pushing the packet up the stack.
There's a "XXX" comment on top of that, could you try to remove the lock
dripping and see what happens?

I'm not sure there's any reason to drop the lock here, I very much doubt
if_input is going to sleep.

> I'm not 100% sure that this is what's causing the panic, but it's definitely
> happening under high traffic conditions immediately after xn_ifinit_locked is
> called, so I think my speculation is well-founded.
> 
> There are a few things I don't understand here:
> 1. Why DHCP requests are resulting in calls into xn_ifinit_locked.

Maybe DHCP flaps the interface up and down? TBH I have no idea.
Enabling/disabling certain features (CSUM, TSO) will also cause the interface
to reconnect, which might cause incoming packets to get stuck in the RX ring.

> 2. Why the calls into xn_ifinit_locked are only happening on T2 instances
> and not on any of the other EC2 instances I've tried.

Maybe T2 instances are on a more noisy environment? That I'm afraid I have no
idea.

> 3. Why xn_ifinit_locked is consuming ring responses.

There might be pending RX packets on the ring, so netfront consumes them and
signals netback. In the unlikely event that the RX ring was full when
xn_ifinit_locked is called, not doing this would mean the RX queue would get
stuck forever, since there's no guarantee netfront will receive event channel
notifications.

> so I'm not sure what the solution is, but hopefully someone who knows this
> code better will be able to help...

My first try would be to disable dropping the lock in xn_rxeof, I think that is
utterly incorrect. That should prevent multiple consumers from pocking at the
ring at the same time.

Roger.
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r301198 - head/sys/dev/xen/netfront

2017-05-02 Thread Colin Percival
On 06/02/16 04:16, Roger Pau Monné wrote:
> Author: royger
> Date: Thu Jun  2 11:16:35 2016
> New Revision: 301198
> URL: https://svnweb.freebsd.org/changeset/base/301198

I think this commit is responsible for panics I'm seeing in EC2 on T2 family
instances.  Every time a DHCP request is made, we call into xn_ifinit_locked
(not sure why -- something to do with making the interface promiscuous?) and
hit this code

> @@ -1760,7 +1715,7 @@ xn_ifinit_locked(struct netfront_info *n
>   xn_alloc_rx_buffers(rxq);
>   rxq->ring.sring->rsp_event = rxq->ring.rsp_cons + 1;
>   if (RING_HAS_UNCONSUMED_RESPONSES(>ring))
> - taskqueue_enqueue(rxq->tq, >intrtask);
> + xn_rxeof(rxq);
>   XN_RX_UNLOCK(rxq);
>   }

but under high traffic volumes I think a separate thread can already be
running in xn_rxeof, having dropped the RX lock while it passes a packet
up the stack.  This would result in two different threads trying to process
the same set of responses from the ring, with (unsurprisingly) bad results.

I'm not 100% sure that this is what's causing the panic, but it's definitely
happening under high traffic conditions immediately after xn_ifinit_locked is
called, so I think my speculation is well-founded.

There are a few things I don't understand here:
1. Why DHCP requests are resulting in calls into xn_ifinit_locked.
2. Why the calls into xn_ifinit_locked are only happening on T2 instances
and not on any of the other EC2 instances I've tried.
3. Why xn_ifinit_locked is consuming ring responses.
so I'm not sure what the solution is, but hopefully someone who knows this
code better will be able to help...

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"

svn commit: r301198 - head/sys/dev/xen/netfront

2016-06-02 Thread Roger Pau Monné
Author: royger
Date: Thu Jun  2 11:16:35 2016
New Revision: 301198
URL: https://svnweb.freebsd.org/changeset/base/301198

Log:
  xen-netfront: switch to using an interrupt handler
  
  In order to use custom taskqueues we would have to mask the interrupt, which
  is basically what is already done for an interrupt handler, or else we risk
  loosing interrupts. This switches netfront to the same interrupt handling
  that was done before multiqueue support was added.
  
  Reviewed by:  Wei Liu 
  Sponsored by: Citrix Systems R

Modified:
  head/sys/dev/xen/netfront/netfront.c

Modified: head/sys/dev/xen/netfront/netfront.c
==
--- head/sys/dev/xen/netfront/netfront.cThu Jun  2 11:14:26 2016
(r301197)
+++ head/sys/dev/xen/netfront/netfront.cThu Jun  2 11:16:35 2016
(r301198)
@@ -121,9 +121,9 @@ static void xn_alloc_rx_buffers_callout(
 static void xn_release_rx_bufs(struct netfront_rxq *);
 static void xn_release_tx_bufs(struct netfront_txq *);
 
-static void xn_rxq_intr(void *);
-static void xn_txq_intr(void *);
-static int  xn_intr(void *);
+static void xn_rxq_intr(struct netfront_rxq *);
+static void xn_txq_intr(struct netfront_txq *);
+static void xn_intr(void *);
 static inline int xn_count_frags(struct mbuf *m);
 static int xn_assemble_tx_request(struct netfront_txq *, struct mbuf *);
 static int xn_ioctl(struct ifnet *, u_long, caddr_t);
@@ -188,9 +188,6 @@ struct netfront_rxq {
 
struct lro_ctrl lro;
 
-   struct taskqueue*tq;
-   struct task intrtask;
-
struct callout  rx_refill;
 
struct xn_rx_stats  stats;
@@ -214,7 +211,6 @@ struct netfront_txq {
struct buf_ring *br;
 
struct taskqueue*tq;
-   struct task intrtask;
struct task defrtask;
 
boolfull;
@@ -621,9 +617,8 @@ talk_to_backend(device_t dev, struct net
 }
 
 static void
-xn_rxq_tq_intr(void *xrxq, int pending)
+xn_rxq_intr(struct netfront_rxq *rxq)
 {
-   struct netfront_rxq *rxq = xrxq;
 
XN_RX_LOCK(rxq);
xn_rxeof(rxq);
@@ -642,9 +637,8 @@ xn_txq_start(struct netfront_txq *txq)
 }
 
 static void
-xn_txq_tq_intr(void *xtxq, int pending)
+xn_txq_intr(struct netfront_txq *txq)
 {
-   struct netfront_txq *txq = xtxq;
 
XN_TX_LOCK(txq);
if (RING_HAS_UNCONSUMED_RESPONSES(>ring))
@@ -684,8 +678,6 @@ destroy_rxq(struct netfront_rxq *rxq)
 
callout_drain(>rx_refill);
free(rxq->ring.sring, M_DEVBUF);
-   taskqueue_drain_all(rxq->tq);
-   taskqueue_free(rxq->tq);
 }
 
 static void
@@ -749,27 +741,11 @@ setup_rxqs(device_t dev, struct netfront
goto fail_grant_ring;
}
 
-   TASK_INIT(>intrtask, 0, xn_rxq_tq_intr, rxq);
-   rxq->tq = taskqueue_create_fast(rxq->name, M_WAITOK,
-   taskqueue_thread_enqueue, >tq);
-
callout_init(>rx_refill, 1);
-
-   error = taskqueue_start_threads(>tq, 1, PI_NET,
-   "%s rxq %d", device_get_nameunit(dev), rxq->id);
-   if (error != 0) {
-   device_printf(dev, "failed to start rx taskq %d\n",
-   rxq->id);
-   goto fail_start_thread;
-   }
}
 
return (0);
 
-fail_start_thread:
-   gnttab_end_foreign_access_ref(rxq->ring_ref);
-   taskqueue_drain_all(rxq->tq);
-   taskqueue_free(rxq->tq);
 fail_grant_ring:
gnttab_free_grant_references(rxq->gref_head);
free(rxq->ring.sring, M_DEVBUF);
@@ -871,9 +847,8 @@ setup_txqs(device_t dev, struct netfront
txq->br = buf_ring_alloc(NET_TX_RING_SIZE, M_DEVBUF,
M_WAITOK, >lock);
TASK_INIT(>defrtask, 0, xn_txq_tq_deferred, txq);
-   TASK_INIT(>intrtask, 0, xn_txq_tq_intr, txq);
 
-   txq->tq = taskqueue_create_fast(txq->name, M_WAITOK,
+   txq->tq = taskqueue_create(txq->name, M_WAITOK,
taskqueue_thread_enqueue, >tq);
 
error = taskqueue_start_threads(>tq, 1, PI_NET,
@@ -885,10 +860,9 @@ setup_txqs(device_t dev, struct netfront
}
 
error = xen_intr_alloc_and_bind_local_port(dev,
-   xenbus_get_otherend_id(dev), xn_intr, /* handler */ 
NULL,
-   >txq[q],
-   INTR_TYPE_NET | INTR_MPSAFE | INTR_ENTROPY,
-   >xen_intr_handle);
+   xenbus_get_otherend_id(dev), /* filter */ NULL, xn_intr,
+   >txq[q], INTR_TYPE_NET | INTR_MPSAFE | INTR_ENTROPY,
+   >xen_intr_handle);
 
if (error != 0) {
device_printf(dev, "xen_intr_alloc_and_bind_local_port 
failed\n");
@@ 

Re: svn commit: r301198 - head/sys/dev/xen/netfront

2016-06-02 Thread Roger Pau Monné
On Thu, Jun 02, 2016 at 11:16:36AM +, Roger Pau Monné wrote:
> Author: royger
> Date: Thu Jun  2 11:16:35 2016
> New Revision: 301198
> URL: https://svnweb.freebsd.org/changeset/base/301198
> 
> Log:
>   xen-netfront: switch to using an interrupt handler
>   
>   In order to use custom taskqueues we would have to mask the interrupt, which
>   is basically what is already done for an interrupt handler, or else we risk
>   loosing interrupts. This switches netfront to the same interrupt handling
>   that was done before multiqueue support was added.
>   
>   Reviewed by:Wei Liu 
>   Sponsored by:   Citrix Systems R

Differential revision: https://reviews.freebsd.org/D6609
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"