Re: svn commit: r301198 - head/sys/dev/xen/netfront
On Tuesday, May 09, 2017 08:36:13 PM Colin Percival wrote: > On 05/09/17 03:09, Roger Pau Monn� wrote: > > On Wed, May 03, 2017 at 05:13:40AM +, Colin Percival wrote: > >> On 06/02/16 04:16, Roger Pau Monn� wrote: > >>> Author: royger > >>> Date: Thu Jun 2 11:16:35 2016 > >>> New Revision: 301198 > >>> URL: https://svnweb.freebsd.org/changeset/base/301198 > >> > >> I think this commit is responsible for panics I'm seeing in EC2 on T2 > >> family > >> instances. [...] > >> but under high traffic volumes I think a separate thread can already be > >> running in xn_rxeof, having dropped the RX lock while it passes a packet > >> up the stack. This would result in two different threads trying to process > >> the same set of responses from the ring, with (unsurprisingly) bad results. > > > > Hm, right, xn_rxeof drops the lock while pushing the packet up the stack. > > There's a "XXX" comment on top of that, could you try to remove the lock > > dripping and see what happens? > > > > I'm not sure there's any reason to drop the lock here, I very much doubt > > if_input is going to sleep. > > Judging by > $ grep -R -B 1 -A 1 if_input /usr/src/sys/dev > I'm pretty sure that we do indeed need to drop the lock. If it's possible > to enter if_input while holding locks, there are a *lot* of network interface > drivers which are dropping locks unnecessarily... It depends on how the locks are used. If a NIC driver uses a single lock for both TX and RX, then it needs to drop the lock as on the TX side, if_transmit/if_start will be invoked with various network stack locks held. However, if a driver uses separate locks for RX and TX and doesn't acquire the RX lock while holding a TX lock, then it should be safe to hold the lock across if_input. A lot of the older 100Mbps PCI NIC drivers only use a single lock and thus need to drop the lock. Many multiqueue NIC drivers use separate locks however and probably don't need to drop them around if_input. -- John Baldwin ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
Re: svn commit: r301198 - head/sys/dev/xen/netfront
On Tue, May 09, 2017 at 08:36:13PM +, Colin Percival wrote: > On 05/09/17 03:09, Roger Pau Monn� wrote: > > On Wed, May 03, 2017 at 05:13:40AM +, Colin Percival wrote: > >> On 06/02/16 04:16, Roger Pau Monn� wrote: > >>> Author: royger > >>> Date: Thu Jun 2 11:16:35 2016 > >>> New Revision: 301198 > >>> URL: https://svnweb.freebsd.org/changeset/base/301198 > >> > >> I think this commit is responsible for panics I'm seeing in EC2 on T2 > >> family > >> instances. [...] > >> but under high traffic volumes I think a separate thread can already be > >> running in xn_rxeof, having dropped the RX lock while it passes a packet > >> up the stack. This would result in two different threads trying to process > >> the same set of responses from the ring, with (unsurprisingly) bad results. > > > > Hm, right, xn_rxeof drops the lock while pushing the packet up the stack. > > There's a "XXX" comment on top of that, could you try to remove the lock > > dripping and see what happens? > > > > I'm not sure there's any reason to drop the lock here, I very much doubt > > if_input is going to sleep. > > Judging by > $ grep -R -B 1 -A 1 if_input /usr/src/sys/dev > I'm pretty sure that we do indeed need to drop the lock. If it's possible > to enter if_input while holding locks, there are a *lot* of network interface > drivers which are dropping locks unnecessarily... > > >> 3. Why xn_ifinit_locked is consuming ring responses. > > > > There might be pending RX packets on the ring, so netfront consumes them and > > signals netback. In the unlikely event that the RX ring was full when > > xn_ifinit_locked is called, not doing this would mean the RX queue would get > > stuck forever, since there's no guarantee netfront will receive event > > channel > > notifications. > > In that case, I'm guessing it would be safe to skip this if another thread is > already running xn_rxeof and chewing through the packets on the ring? It > would be easy to set a flag in xn_rxeof before we drop locks. Hm, I think it would be better to fix xn_rxeof so that netfront doesn't drop the lock while poking at the ring, what about the following patch? It should also prevent needlessly dropping and acquiring the lock for each packet netfront pushes up the stack. NB: I've only compile-tested this. ---8<--- diff --git a/sys/dev/xen/netfront/netfront.c b/sys/dev/xen/netfront/netfront.c index 99ccdaaf5000..482b9e948fde 100644 --- a/sys/dev/xen/netfront/netfront.c +++ b/sys/dev/xen/netfront/netfront.c @@ -1161,17 +1161,18 @@ xn_rxeof(struct netfront_rxq *rxq) struct mbufq mbufq_rxq, mbufq_errq; int err, work_to_do; - do { - XN_RX_LOCK_ASSERT(rxq); - if (!netfront_carrier_ok(np)) - return; + XN_RX_LOCK_ASSERT(rxq); + + if (!netfront_carrier_ok(np)) + return; - /* XXX: there should be some sane limit. */ - mbufq_init(_errq, INT_MAX); - mbufq_init(_rxq, INT_MAX); + /* XXX: there should be some sane limit. */ + mbufq_init(_errq, INT_MAX); + mbufq_init(_rxq, INT_MAX); - ifp = np->xn_ifp; + ifp = np->xn_ifp; + do { rp = rxq->ring.sring->rsp_prod; rmb(); /* Ensure we see queued responses up to 'rp'. */ @@ -1191,7 +1192,7 @@ xn_rxeof(struct netfront_rxq *rxq) } m->m_pkthdr.rcvif = ifp; - if ( rx->flags & NETRXF_data_validated ) { + if (rx->flags & NETRXF_data_validated) { /* * According to mbuf(9) the correct way to tell * the stack that the checksum of an inbound @@ -1214,50 +1215,45 @@ xn_rxeof(struct netfront_rxq *rxq) } (void )mbufq_enqueue(_rxq, m); - rxq->ring.rsp_cons = i; } - mbufq_drain(_errq); + rxq->ring.rsp_cons = i; - /* -* Process all the mbufs after the remapping is complete. -* Break the mbuf chain first though. -*/ - while ((m = mbufq_dequeue(_rxq)) != NULL) { - if_inc_counter(ifp, IFCOUNTER_IPACKETS, 1); + xn_alloc_rx_buffers(rxq); - /* XXX: Do we really need to drop the rx lock? */ - XN_RX_UNLOCK(rxq); + RING_FINAL_CHECK_FOR_RESPONSES(>ring, work_to_do); + } while (work_to_do); + + XN_RX_UNLOCK(rxq); + mbufq_drain(_errq); + /* +* Process all the mbufs after the remapping is complete. +* Break the mbuf chain first though. +*/ + while ((m = mbufq_dequeue(_rxq)) != NULL) { + if_inc_counter(ifp, IFCOUNTER_IPACKETS, 1); #if (defined(INET) || defined(INET6)) -
Re: svn commit: r301198 - head/sys/dev/xen/netfront
On 05/09/17 03:09, Roger Pau Monn� wrote: > On Wed, May 03, 2017 at 05:13:40AM +, Colin Percival wrote: >> On 06/02/16 04:16, Roger Pau Monn� wrote: >>> Author: royger >>> Date: Thu Jun 2 11:16:35 2016 >>> New Revision: 301198 >>> URL: https://svnweb.freebsd.org/changeset/base/301198 >> >> I think this commit is responsible for panics I'm seeing in EC2 on T2 family >> instances. [...] >> but under high traffic volumes I think a separate thread can already be >> running in xn_rxeof, having dropped the RX lock while it passes a packet >> up the stack. This would result in two different threads trying to process >> the same set of responses from the ring, with (unsurprisingly) bad results. > > Hm, right, xn_rxeof drops the lock while pushing the packet up the stack. > There's a "XXX" comment on top of that, could you try to remove the lock > dripping and see what happens? > > I'm not sure there's any reason to drop the lock here, I very much doubt > if_input is going to sleep. Judging by $ grep -R -B 1 -A 1 if_input /usr/src/sys/dev I'm pretty sure that we do indeed need to drop the lock. If it's possible to enter if_input while holding locks, there are a *lot* of network interface drivers which are dropping locks unnecessarily... >> 3. Why xn_ifinit_locked is consuming ring responses. > > There might be pending RX packets on the ring, so netfront consumes them and > signals netback. In the unlikely event that the RX ring was full when > xn_ifinit_locked is called, not doing this would mean the RX queue would get > stuck forever, since there's no guarantee netfront will receive event channel > notifications. In that case, I'm guessing it would be safe to skip this if another thread is already running xn_rxeof and chewing through the packets on the ring? It would be easy to set a flag in xn_rxeof before we drop locks. -- Colin Percival Security Officer Emeritus, FreeBSD | The power to serve Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
Re: svn commit: r301198 - head/sys/dev/xen/netfront
On Wed, May 03, 2017 at 05:13:40AM +, Colin Percival wrote: > On 06/02/16 04:16, Roger Pau Monné wrote: > > Author: royger > > Date: Thu Jun 2 11:16:35 2016 > > New Revision: 301198 > > URL: https://svnweb.freebsd.org/changeset/base/301198 > > I think this commit is responsible for panics I'm seeing in EC2 on T2 family > instances. Every time a DHCP request is made, we call into xn_ifinit_locked > (not sure why -- something to do with making the interface promiscuous?) and > hit this code > > > @@ -1760,7 +1715,7 @@ xn_ifinit_locked(struct netfront_info *n > > xn_alloc_rx_buffers(rxq); > > rxq->ring.sring->rsp_event = rxq->ring.rsp_cons + 1; > > if (RING_HAS_UNCONSUMED_RESPONSES(>ring)) > > - taskqueue_enqueue(rxq->tq, >intrtask); > > + xn_rxeof(rxq); > > XN_RX_UNLOCK(rxq); > > } > > but under high traffic volumes I think a separate thread can already be > running in xn_rxeof, having dropped the RX lock while it passes a packet > up the stack. This would result in two different threads trying to process > the same set of responses from the ring, with (unsurprisingly) bad results. Hm, right, xn_rxeof drops the lock while pushing the packet up the stack. There's a "XXX" comment on top of that, could you try to remove the lock dripping and see what happens? I'm not sure there's any reason to drop the lock here, I very much doubt if_input is going to sleep. > I'm not 100% sure that this is what's causing the panic, but it's definitely > happening under high traffic conditions immediately after xn_ifinit_locked is > called, so I think my speculation is well-founded. > > There are a few things I don't understand here: > 1. Why DHCP requests are resulting in calls into xn_ifinit_locked. Maybe DHCP flaps the interface up and down? TBH I have no idea. Enabling/disabling certain features (CSUM, TSO) will also cause the interface to reconnect, which might cause incoming packets to get stuck in the RX ring. > 2. Why the calls into xn_ifinit_locked are only happening on T2 instances > and not on any of the other EC2 instances I've tried. Maybe T2 instances are on a more noisy environment? That I'm afraid I have no idea. > 3. Why xn_ifinit_locked is consuming ring responses. There might be pending RX packets on the ring, so netfront consumes them and signals netback. In the unlikely event that the RX ring was full when xn_ifinit_locked is called, not doing this would mean the RX queue would get stuck forever, since there's no guarantee netfront will receive event channel notifications. > so I'm not sure what the solution is, but hopefully someone who knows this > code better will be able to help... My first try would be to disable dropping the lock in xn_rxeof, I think that is utterly incorrect. That should prevent multiple consumers from pocking at the ring at the same time. Roger. ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
Re: svn commit: r301198 - head/sys/dev/xen/netfront
On 06/02/16 04:16, Roger Pau Monné wrote: > Author: royger > Date: Thu Jun 2 11:16:35 2016 > New Revision: 301198 > URL: https://svnweb.freebsd.org/changeset/base/301198 I think this commit is responsible for panics I'm seeing in EC2 on T2 family instances. Every time a DHCP request is made, we call into xn_ifinit_locked (not sure why -- something to do with making the interface promiscuous?) and hit this code > @@ -1760,7 +1715,7 @@ xn_ifinit_locked(struct netfront_info *n > xn_alloc_rx_buffers(rxq); > rxq->ring.sring->rsp_event = rxq->ring.rsp_cons + 1; > if (RING_HAS_UNCONSUMED_RESPONSES(>ring)) > - taskqueue_enqueue(rxq->tq, >intrtask); > + xn_rxeof(rxq); > XN_RX_UNLOCK(rxq); > } but under high traffic volumes I think a separate thread can already be running in xn_rxeof, having dropped the RX lock while it passes a packet up the stack. This would result in two different threads trying to process the same set of responses from the ring, with (unsurprisingly) bad results. I'm not 100% sure that this is what's causing the panic, but it's definitely happening under high traffic conditions immediately after xn_ifinit_locked is called, so I think my speculation is well-founded. There are a few things I don't understand here: 1. Why DHCP requests are resulting in calls into xn_ifinit_locked. 2. Why the calls into xn_ifinit_locked are only happening on T2 instances and not on any of the other EC2 instances I've tried. 3. Why xn_ifinit_locked is consuming ring responses. so I'm not sure what the solution is, but hopefully someone who knows this code better will be able to help... -- Colin Percival Security Officer Emeritus, FreeBSD | The power to serve Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
svn commit: r301198 - head/sys/dev/xen/netfront
Author: royger Date: Thu Jun 2 11:16:35 2016 New Revision: 301198 URL: https://svnweb.freebsd.org/changeset/base/301198 Log: xen-netfront: switch to using an interrupt handler In order to use custom taskqueues we would have to mask the interrupt, which is basically what is already done for an interrupt handler, or else we risk loosing interrupts. This switches netfront to the same interrupt handling that was done before multiqueue support was added. Reviewed by: Wei LiuSponsored by: Citrix Systems R Modified: head/sys/dev/xen/netfront/netfront.c Modified: head/sys/dev/xen/netfront/netfront.c == --- head/sys/dev/xen/netfront/netfront.cThu Jun 2 11:14:26 2016 (r301197) +++ head/sys/dev/xen/netfront/netfront.cThu Jun 2 11:16:35 2016 (r301198) @@ -121,9 +121,9 @@ static void xn_alloc_rx_buffers_callout( static void xn_release_rx_bufs(struct netfront_rxq *); static void xn_release_tx_bufs(struct netfront_txq *); -static void xn_rxq_intr(void *); -static void xn_txq_intr(void *); -static int xn_intr(void *); +static void xn_rxq_intr(struct netfront_rxq *); +static void xn_txq_intr(struct netfront_txq *); +static void xn_intr(void *); static inline int xn_count_frags(struct mbuf *m); static int xn_assemble_tx_request(struct netfront_txq *, struct mbuf *); static int xn_ioctl(struct ifnet *, u_long, caddr_t); @@ -188,9 +188,6 @@ struct netfront_rxq { struct lro_ctrl lro; - struct taskqueue*tq; - struct task intrtask; - struct callout rx_refill; struct xn_rx_stats stats; @@ -214,7 +211,6 @@ struct netfront_txq { struct buf_ring *br; struct taskqueue*tq; - struct task intrtask; struct task defrtask; boolfull; @@ -621,9 +617,8 @@ talk_to_backend(device_t dev, struct net } static void -xn_rxq_tq_intr(void *xrxq, int pending) +xn_rxq_intr(struct netfront_rxq *rxq) { - struct netfront_rxq *rxq = xrxq; XN_RX_LOCK(rxq); xn_rxeof(rxq); @@ -642,9 +637,8 @@ xn_txq_start(struct netfront_txq *txq) } static void -xn_txq_tq_intr(void *xtxq, int pending) +xn_txq_intr(struct netfront_txq *txq) { - struct netfront_txq *txq = xtxq; XN_TX_LOCK(txq); if (RING_HAS_UNCONSUMED_RESPONSES(>ring)) @@ -684,8 +678,6 @@ destroy_rxq(struct netfront_rxq *rxq) callout_drain(>rx_refill); free(rxq->ring.sring, M_DEVBUF); - taskqueue_drain_all(rxq->tq); - taskqueue_free(rxq->tq); } static void @@ -749,27 +741,11 @@ setup_rxqs(device_t dev, struct netfront goto fail_grant_ring; } - TASK_INIT(>intrtask, 0, xn_rxq_tq_intr, rxq); - rxq->tq = taskqueue_create_fast(rxq->name, M_WAITOK, - taskqueue_thread_enqueue, >tq); - callout_init(>rx_refill, 1); - - error = taskqueue_start_threads(>tq, 1, PI_NET, - "%s rxq %d", device_get_nameunit(dev), rxq->id); - if (error != 0) { - device_printf(dev, "failed to start rx taskq %d\n", - rxq->id); - goto fail_start_thread; - } } return (0); -fail_start_thread: - gnttab_end_foreign_access_ref(rxq->ring_ref); - taskqueue_drain_all(rxq->tq); - taskqueue_free(rxq->tq); fail_grant_ring: gnttab_free_grant_references(rxq->gref_head); free(rxq->ring.sring, M_DEVBUF); @@ -871,9 +847,8 @@ setup_txqs(device_t dev, struct netfront txq->br = buf_ring_alloc(NET_TX_RING_SIZE, M_DEVBUF, M_WAITOK, >lock); TASK_INIT(>defrtask, 0, xn_txq_tq_deferred, txq); - TASK_INIT(>intrtask, 0, xn_txq_tq_intr, txq); - txq->tq = taskqueue_create_fast(txq->name, M_WAITOK, + txq->tq = taskqueue_create(txq->name, M_WAITOK, taskqueue_thread_enqueue, >tq); error = taskqueue_start_threads(>tq, 1, PI_NET, @@ -885,10 +860,9 @@ setup_txqs(device_t dev, struct netfront } error = xen_intr_alloc_and_bind_local_port(dev, - xenbus_get_otherend_id(dev), xn_intr, /* handler */ NULL, - >txq[q], - INTR_TYPE_NET | INTR_MPSAFE | INTR_ENTROPY, - >xen_intr_handle); + xenbus_get_otherend_id(dev), /* filter */ NULL, xn_intr, + >txq[q], INTR_TYPE_NET | INTR_MPSAFE | INTR_ENTROPY, + >xen_intr_handle); if (error != 0) { device_printf(dev, "xen_intr_alloc_and_bind_local_port failed\n"); @@
Re: svn commit: r301198 - head/sys/dev/xen/netfront
On Thu, Jun 02, 2016 at 11:16:36AM +, Roger Pau Monné wrote: > Author: royger > Date: Thu Jun 2 11:16:35 2016 > New Revision: 301198 > URL: https://svnweb.freebsd.org/changeset/base/301198 > > Log: > xen-netfront: switch to using an interrupt handler > > In order to use custom taskqueues we would have to mask the interrupt, which > is basically what is already done for an interrupt handler, or else we risk > loosing interrupts. This switches netfront to the same interrupt handling > that was done before multiqueue support was added. > > Reviewed by:Wei Liu> Sponsored by: Citrix Systems R Differential revision: https://reviews.freebsd.org/D6609 ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"