On Thu, Apr 08, 2010 at 02:06:09PM -0700, Jack Vogel wrote:
> Only one device support by em does multiqueue right now, and that is
> Hartwell, 82574.
> 

Thanks for the info.

Mike, here is updated patch. Now UDP bulk TX transfer performance
recovered a lot(about 890Mbps) but it still shows bad numbers
compared to other controllers. For example, bce(4) shows about
958Mbps for the same load.
During the testing I found a strong indication of packet reordering
issue of drbr interface. If I forcibly change to use single TX
queue, em(4) got 950Mbps as it used to be.

Jack, as we talked about possible drbr issue with igb(4), UDP
transfer seems to suffer from packet reordering issue here. Can we
make em(4)/igb(4) use single TX queue until we solve drbr interface
issue? Given that only one em(4) controller supports multiqueue,
dropping multiqueue support for em(4) does not look bad to me.

> Jack
> 
> 
> On Thu, Apr 8, 2010 at 2:05 PM, Mike Tancsa <m...@sentex.net> wrote:
> 
> > At 04:56 PM 4/8/2010, Pyun YongHyeon wrote:
> >
> >> On Thu, Apr 08, 2010 at 02:31:18PM -0400, Mike Tancsa wrote:
> >> > At 02:17 PM 4/8/2010, Pyun YongHyeon wrote:
> >> >
> >> > >Try this patch. It should fix the issue. It seems Jack forgot to
> >> > >strip CRC bytes as old em(4) didn't strip it, probably to
> >> > >workaround silicon bug of old em(4) controllers.
> >> >
> >> > Thanks! The attached patch does indeed fix the dhclient issue.
> >> >
> >> >
> >> > >It seems there are also TX issues here. The system load is too high
> >> > >and sometimes system is not responsive while TX is in progress.
> >> > >Because I initiated TCP bulk transfers, TSO should reduce CPU load
> >> > >a lot but it didn't so I guess it could also be related watchdog
> >> > >timeouts you've seen. I'll see what can be done.
> >> >
> >> > Thanks for looking into that as well!!
> >> >
> >> >         ---Mike
> >> >
> >>
> >> Mike,
> >>
> >> Here is patch I'm working on. This patch fixes high system load and
> >> system is very responsive as before. But it seems there is still
> >> some TX issue here. Bulk UDP performance is very poor(< 700Mbps)
> >> and I have no idea what caused this at this moment.
> >>
> >> BTW, I have trouble to reproduce watchdog timeouts. I'm not sure
> >> whether latest fix from Jack cured it. By chance does your
> >> controller support multi TX/RX queues? You can check whether em(4)
> >> uses multi-queues with "vmstat -i". If em(4) use multi-queue you
> >> may have multiple irq output for em0.
> >>
> >
> > Hi,
> >        I will give it a try later tonight!  This one does not seem to.
> >
> > 0(ich10)# vmstat -i
> > interrupt                          total       rate
> > irq16: uhci0+                         30          0
> > irq18: ehci0 uhci5                158419         17
> > irq19: fwohci0++                      86          0
> > irq21: uhci1                          17          0
> > irq23: uhci3 ehci1                     2          0
> > cpu0: timer                     18570305       1994
> > irq256: igb0                          80          0
> > irq257: igb0                         255          0
> > irq258: igb0                          66          0
> > irq259: igb0                          32          0
> > irq260: igb0                           2          0
> > irq261: igb1                        2679          0
> > irq262: igb1                         998          0
> > irq263: igb1                        2468          0
> > irq264: igb1                        6361          0
> > irq265: igb1                           2          0
> > irq266: em0                        33910          3
> > irq267: ahci1                      15317          1
> > cpu1: timer                     18557074       1993
> > cpu3: timer                     18557168       1993
> > cpu2: timer                     18557108       1993
> > Total                           74462379       7998
> > 0(ich10)#
> >
Index: sys/dev/e1000/if_em.c
===================================================================
--- sys/dev/e1000/if_em.c       (revision 206403)
+++ sys/dev/e1000/if_em.c       (working copy)
@@ -812,6 +812,10 @@
                return (err);
        }
 
+        /* Call cleanup if number of TX descriptors low */
+       if (txr->tx_avail <= EM_TX_CLEANUP_THRESHOLD)
+               em_txeof(txr);
+
        enq = 0;
        if (m == NULL) {
                next = drbr_dequeue(ifp, txr->br);
@@ -834,11 +838,16 @@
                ETHER_BPF_MTAP(ifp, next);
                if ((ifp->if_drv_flags & IFF_DRV_RUNNING) == 0)
                         break;
+               if (txr->tx_avail < EM_MAX_SCATTER) {
+                       ifp->if_drv_flags |= IFF_DRV_OACTIVE;
+                       break;
+               }
                next = drbr_dequeue(ifp, txr->br);
        }
 
        if (enq > 0) {
                 /* Set the watchdog */
+               txr->watchdog_time = ticks;
                 txr->watchdog_check = TRUE;
        }
        return (err);
@@ -864,8 +873,7 @@
        txr = &adapter->tx_rings[i];
 
        if (EM_TX_TRYLOCK(txr)) {
-               if (ifp->if_drv_flags & IFF_DRV_RUNNING)
-                       error = em_mq_start_locked(ifp, txr, m);
+               error = em_mq_start_locked(ifp, txr, m);
                EM_TX_UNLOCK(txr);
        } else 
                error = drbr_enqueue(ifp, txr->br, m);
@@ -909,8 +917,15 @@
        if (!adapter->link_active)
                return;
 
+        /* Call cleanup if number of TX descriptors low */
+       if (txr->tx_avail <= EM_TX_CLEANUP_THRESHOLD)
+               em_txeof(txr);
+
        while (!IFQ_DRV_IS_EMPTY(&ifp->if_snd)) {
-
+               if (txr->tx_avail < EM_MAX_SCATTER) {
+                       ifp->if_drv_flags |= IFF_DRV_OACTIVE;
+                       break;
+               }
                 IFQ_DRV_DEQUEUE(&ifp->if_snd, m_head);
                if (m_head == NULL)
                        break;
@@ -930,7 +945,8 @@
                ETHER_BPF_MTAP(ifp, m_head);
 
                /* Set timeout in case hardware has problems transmitting. */
-               txr->watchdog_check = TRUE;
+               txr->watchdog_time = ticks;
+                txr->watchdog_check = TRUE;
        }
 
        return;
@@ -1427,17 +1443,12 @@
        struct ifnet    *ifp = adapter->ifp;
        struct tx_ring  *txr = adapter->tx_rings;
        struct rx_ring  *rxr = adapter->rx_rings;
-       u32             loop = EM_MAX_LOOP;
-       bool            more_rx, more_tx;
+       bool            more_rx;
 
-
        if (ifp->if_drv_flags & IFF_DRV_RUNNING) {
+               more_rx = em_rxeof(rxr, adapter->rx_process_limit);
                EM_TX_LOCK(txr);
-               do {
-                       more_rx = em_rxeof(rxr, adapter->rx_process_limit);
-                       more_tx = em_txeof(txr);
-               } while (loop-- && (more_rx || more_tx));
-
+               em_txeof(txr);
 #if __FreeBSD_version >= 800000
                if (!drbr_empty(ifp, txr->br))
                        em_mq_start_locked(ifp, txr, NULL);
@@ -1445,10 +1456,9 @@
                if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd))
                        em_start_locked(ifp, txr);
 #endif
-               if (more_rx || more_tx)
+               EM_TX_UNLOCK(txr);
+               if (more_rx)
                        taskqueue_enqueue(adapter->tq, &adapter->que_task);
-
-               EM_TX_UNLOCK(txr);
        }
 
        em_enable_intr(adapter);
@@ -1466,18 +1476,13 @@
 {
        struct tx_ring *txr = arg;
        struct adapter *adapter = txr->adapter;
-       bool            more;
 
        ++txr->tx_irq;
        EM_TX_LOCK(txr);
-       more = em_txeof(txr);
+       em_txeof(txr);
        EM_TX_UNLOCK(txr);
-       if (more)
-               taskqueue_enqueue(txr->tq, &txr->tx_task);
-       else
-               /* Reenable this interrupt */
-               E1000_WRITE_REG(&adapter->hw, E1000_IMS, txr->ims);
-       return;
+       /* Reenable this interrupt */
+       E1000_WRITE_REG(&adapter->hw, E1000_IMS, txr->ims);
 }
 
 /*********************************************************************
@@ -1531,14 +1536,15 @@
 {
        struct rx_ring  *rxr = context;
        struct adapter  *adapter = rxr->adapter;
-       u32             loop = EM_MAX_LOOP;
         bool            more;
 
-        do {
-               more = em_rxeof(rxr, adapter->rx_process_limit);
-        } while (loop-- && more);
-        /* Reenable this interrupt */
-       E1000_WRITE_REG(&adapter->hw, E1000_IMS, rxr->ims);
+       more = em_rxeof(rxr, adapter->rx_process_limit);
+       if (more)
+               taskqueue_enqueue(rxr->tq, &rxr->rx_task);
+       else {
+               /* Reenable this interrupt */
+               E1000_WRITE_REG(&adapter->hw, E1000_IMS, rxr->ims);
+       }
 }
 
 static void
@@ -1547,15 +1553,10 @@
        struct tx_ring  *txr = context;
        struct adapter  *adapter = txr->adapter;
        struct ifnet    *ifp = adapter->ifp;
-       u32             loop = EM_MAX_LOOP;
-        bool            more;
 
        if (!EM_TX_TRYLOCK(txr))
                return;
-       do {
-               more = em_txeof(txr);
-       } while (loop-- && more);
-
+       em_txeof(txr);
 #if __FreeBSD_version >= 800000
        if (!drbr_empty(ifp, txr->br))
                em_mq_start_locked(ifp, txr, NULL);
@@ -1912,12 +1913,7 @@
        bus_dmamap_sync(txr->txdma.dma_tag, txr->txdma.dma_map,
            BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
        E1000_WRITE_REG(&adapter->hw, E1000_TDT(txr->me), i);
-       txr->watchdog_time = ticks;
 
-        /* Call cleanup if number of TX descriptors low */
-       if (txr->tx_avail <= EM_TX_CLEANUP_THRESHOLD)
-               em_txeof(txr);
-
        return (0);
 }
 
@@ -3619,7 +3615,8 @@
                if (last != -1) {
                        eop_desc = &txr->tx_base[last];
                        /* Get new done point */
-                       if (++last == adapter->num_tx_desc) last = 0;
+                       if (++last == adapter->num_tx_desc)
+                               last = 0;
                        done = last;
                } else
                        break;
@@ -4078,7 +4075,7 @@
 em_rxeof(struct rx_ring *rxr, int count)
 {
        struct adapter          *adapter = rxr->adapter;
-       struct ifnet            *ifp = adapter->ifp;;
+       struct ifnet            *ifp = adapter->ifp;
        struct mbuf             *mp, *sendmp;
        u8                      status;
        u16                     len;
@@ -4088,6 +4085,7 @@
 
        EM_RX_LOCK(rxr);
 
+       status = 0;
        for (i = rxr->next_to_check, processed = 0; count != 0;) {
 
                if ((ifp->if_drv_flags & IFF_DRV_RUNNING) == 0)
@@ -4195,7 +4193,11 @@
        rxr->next_to_check = i;
 
        EM_RX_UNLOCK(rxr);
+#ifdef DEVICE_POLLING
        return (rxdone);
+#else
+       return ((status & E1000_RXD_STAT_DD) ? TRUE : FALSE);
+#endif
 }
 
 #ifndef __NO_STRICT_ALIGNMENT
Index: sys/dev/e1000/if_em.h
===================================================================
--- sys/dev/e1000/if_em.h       (revision 206403)
+++ sys/dev/e1000/if_em.h       (working copy)
@@ -223,7 +223,7 @@
 #define HW_DEBUGOUT1(S, A)          if (DEBUG_HW) printf(S "\n", A)
 #define HW_DEBUGOUT2(S, A, B)       if (DEBUG_HW) printf(S "\n", A, B)
 
-#define EM_MAX_SCATTER         64
+#define EM_MAX_SCATTER         32
 #define EM_VFTA_SIZE           128
 #define EM_TSO_SIZE            (65535 + sizeof(struct ether_vlan_header))
 #define EM_TSO_SEG_SIZE                4096    /* Max dma segment size */
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

Reply via email to