Re: RELENG_7 em problems (and RELENG_8)

Mike Tancsa Tue, 17 Aug 2010 12:57:19 -0700

At 02:52 PM 8/17/2010, Pyun YongHyeon wrote:

Here is updated patch for HEAD and stable/8.
http://people.freebsd.org/~yongari/em.csum_tso.20100817.patch


It seems to work as expected under my limited environments. If

Thanks! The patch applies cleanly and all works as expected now! I am no longer able to trigger the bug. I just use the stock unmodified driver normally, so no multi queues.


# vmstat -i
interrupt                          total       rate
irq256: em0                          149          0
irq257: em1                            3          0
irq259: em3                          971          2
irq260: ahci0                       1520          3



em3: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC>
        ether 00:15:17:xx:xx:xx
        inet6 fe80::215:17ff:fexx:xxxx%em3 prefixlen 64 scopeid 0x4
        inet 192.168.xx.xx netmask 0xffffff00 broadcast 192.168.xx.xx
        nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active

e...@pci0:3:0:0: class=0x020000 card=0x34ec8086 chip=0x10d38086 rev=0x00 hdr=0x00

    vendor     = 'Intel Corporation'
    device     = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
    class      = network
    subclass   = ethernet
    cap 01[c8] = powerspec 2  supports D0 D3  current D0
    cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
    cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
    cap 11[a0] = MSI-X supports 5 messages in map 0x1c



patch < em.csum_tso.20100817.patch
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: sys/dev/e1000/if_em.c
|===================================================================
|--- sys/dev/e1000/if_em.c      (revision 211398)
|+++ sys/dev/e1000/if_em.c      (working copy)
--------------------------
Patching file sys/dev/e1000/if_em.c using Plan A...
Hunk #1 succeeded at 237.
Hunk #2 succeeded at 1730.
Hunk #3 succeeded at 1759.
Hunk #4 succeeded at 1930.
Hunk #5 succeeded at 3148.
Hunk #6 succeeded at 3351.
Hunk #7 succeeded at 3533.
Hunk #8 succeeded at 3590.
Hunk #9 succeeded at 3603.
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: sys/dev/e1000/if_em.h
|===================================================================
|--- sys/dev/e1000/if_em.h      (revision 211398)
|+++ sys/dev/e1000/if_em.h      (working copy)
--------------------------
Patching file sys/dev/e1000/if_em.h using Plan A...
Hunk #1 succeeded at 284.
done

        ---Mike

you're using multiple Tx queues with em(4) it would be better to
disable Tx checksum offloading, as the driver always has to create a
new checksum context for each frame. This effectively disables
pipelined Tx data DMA, which in turn greatly slows down Tx
performance for small frames. The driver has to create a new
checksum context when it uses multiple Tx queues because of a
hardware limitation: the controller tracks only the last context
descriptor that was written, so the driver does not know the state
of the checksum context configured in another Tx queue.
Hope this helps.
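
As a rough sketch of the suggestion above, Tx checksum offloading can be
turned off at run time with ifconfig(8). The interface name em3 comes from
the output earlier in this message. The IFACE and DRY_RUN variables and the
run helper are hypothetical conveniences for this example only, and the
script defaults to printing the commands rather than executing them.

```sh
#!/bin/sh
# Sketch only. IFACE, DRY_RUN and run() are illustrative names, not part
# of the patch or the driver. em3 is the interface from the report above.
IFACE="${IFACE:-em3}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands instead of running them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Turn off Tx checksum offloading, as suggested for multiple Tx queues.
run ifconfig "$IFACE" -txcsum

# The earlier workaround quoted below also turned off TSO and Rx checksum
# offloading on the interface.
run ifconfig "$IFACE" -tso -rxcsum -txcsum
```

Running it with DRY_RUN=0 as root would apply the changes. Running plain
"ifconfig em3" afterwards should show the options= line without the
disabled capabilities.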

>
>
>         ---Mike
>
>
> At 03:36 PM 7/2/2010, Pyun YongHyeon wrote:
> >On Fri, Jul 02, 2010 at 01:39:22PM -0400, Mike Tancsa wrote:
> >> Hi Jack,
> >>         Just a followup to the email below. I now saw what appears
> >> to be the same problem on RELENG_8, but on a different nic and with
> >> VLANs.  So not sure if this is a general em problem, a problem
> >> specific to some em NICs, or a TSO problem in general.  The issue
> >> seemed to be triggered when I added a new vlan based on
> >>
> >> e...@pci0:14:0:0:        class=0x020000 card=0x109a15d9
> >> chip=0x109a8086 rev=0x00 hdr=0x00
> >>     vendor     = 'Intel Corporation'
> >>     device     = 'Intel PRO/1000 PL Network Adaptor (82573L)'
> >>     class      = network
> >>     subclass   = ethernet
> >>     cap 01[c8] = powerspec 2  supports D0 D3  current D0
> >>     cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
> >>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
> >>
> >> pci14: <ACPI PCI bus> on pcib5
> >> em3: <Intel(R) PRO/1000 Network Connection 7.0.5> port 0x6000-0x601f
> >> mem 0xe8300000-0xe831ffff irq 17 at device 0.0 on pci14
> >> em3: Using MSI interrupt
> >> em3: [FILTER]
> >> em3: Ethernet address: 00:30:48:9f:eb:81
> >>
> >> em3: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST>
> >> metric 0 mtu 1500
> >>         options=2098<VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC>
> >>         ether 00:30:48:9f:eb:81
> >>         inet 10.255.255.254 netmask 0xfffffffc broadcast 10.255.255.255
> >>         media: Ethernet autoselect (1000baseT <full-duplex>)
> >>         status: active
> >>
> >> I had to disable tso, rxcsum and txcsum in order to see the devices on
> >> the other side of the two vlans trunked off em3.  Unfortunately, the
> >> other sides were switches 100km and 500km away so I didn't have any
> >> tcpdump capabilities to diagnose the issue.  I had already created
> >> one vlan off this NIC and all was fine.  A few weeks later, I added a
> >> new one and I could no longer telnet into the remote switches from
> >> the local machine.... But, I could telnet into the switches from
> >> machines not on the problem box. Hence, it would appear to be a
> >> general TSO issue, no? I disabled tso on the nic (I didn't disable
> >> net.inet.tcp.tso as I forgot about that).. Still nothing. I could
> >> always ping the remote devices, but no tcp services.  I then
> >> remembered this issue from before, so I tried disabling tso on the
> >> NIC. Still nothing. Then I disabled rxcsum and txcsum and I could
> >> then telnet into the remote devices.
> >>
> >> This newly observed issue was from a buildworld on Mon Jun 14
> >> 11:29:12 EDT 2010.
> >>
> >> I will try and recreate the issue locally again to see if I can
> >> trigger the problem on demand.  Any thoughts on what it might be ?
> >> Perhaps an issue specific to certain em nics ?
> >>
> >
> >http://www.freebsd.org/cgi/query-pr.cgi?pr=141843
> >I'm not sure whether you're seeing the same issue though.
> >I didn't have chance to try latest em(4) on stable/7.
>
> --------------------------------------------------------------------
> Mike Tancsa,                                      tel +1 519 651 3400
> Sentex Communications,                            m...@sentex.net
> Providing Internet since 1994                    www.sentex.net
> Cambridge, Ontario Canada                         www.sentex.net/mike
>


--------------------------------------------------------------------
Mike Tancsa,                                      tel +1 519 651 3400
Sentex Communications,                            m...@sentex.net
Providing Internet since 1994                    www.sentex.net
Cambridge, Ontario Canada                         www.sentex.net/mike

_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
