Hi Jack,
        Two quick notes about the new driver.

On the server that was having NIC lockups, so far so good. Saturday AM, the box would take a lot of level 0 dumps as well as do about 70Mb/s of outbound rsync traffic. By now, the NIC would have wedged at least once. So far so good!


On a different, new box, I decided to try HEAD with the new driver, and ran into problems with the onboard NIC:

e...@pci0:0:25:0: class=0x020000 card=0x00368086 chip=0x10f08086 rev=0x06 hdr=0x00
    vendor     = 'Intel Corporation'
    class      = network
    subclass   = ethernet
    cap 01[c8] = powerspec 2  supports D0 D3  current D0
    cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
    cap 13[e0] = PCI Advanced Features: FLR TP

em0: <Intel(R) PRO/1000 Network Connection 7.0.5> port 0xf020-0xf03f mem 0xfe500000-0xfe51ffff,0xfe527000-0xfe527fff irq 20 at device 25.0 on pci0
em0: Using MSI interrupt
em0: [FILTER]
em0: Ethernet address: 70:71:bc:09:5e:aa

This is an Intel-branded desktop board:

acpi0: <INTEL DH55TC> on motherboard

I find I have to disable RX and TX checksum offload (rxcsum and txcsum) on the interface; otherwise there are a lot of retransmits due to missed packets. tcpdump implies the packets are going out, but they never seem to actually get out. The motherboard is at the office on an unmanaged switch right now, so I don't have any stats from the switch, but tcpdump shows a lot of outbound retransmits. Turning off rxcsum and txcsum fixes the problem.
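
For reference, the workaround described above corresponds to the `-rxcsum` and `-txcsum` flags of ifconfig. A minimal Python sketch, assuming the interface is em0 as in the output in this message; `csum_offload_cmd` is a made-up helper name for illustration, and the sketch only prints the command instead of running it:

```python
def csum_offload_cmd(ifname, enable):
    """Build the ifconfig argv that turns RX/TX checksum offload on or off.

    csum_offload_cmd is a hypothetical helper, not part of any tool.
    ifconfig takes "rxcsum"/"txcsum" to enable the offloads and the
    same words with a leading "-" to disable them.
    """
    prefix = "" if enable else "-"
    return ["ifconfig", ifname, prefix + "rxcsum", prefix + "txcsum"]


if __name__ == "__main__":
    # Print the command rather than run it; to apply it, pass the list
    # to subprocess.run(..., check=True) as root on the affected box.
    print(" ".join(csum_offload_cmd("em0", enable=False)))
```

Calling the helper with `enable=True` gives the command that turns both offloads back on, which may be handy when retesting a later driver revision.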

dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.0.8
dev.em.0.%driver: em
dev.em.0.%location: slot=25 function=0 handle=\_SB_.PCI0.GBE_
dev.em.0.%pnpinfo: vendor=0x8086 device=0x10f0 subvendor=0x8086 subdevice=0x0036 class=0x020000
dev.em.0.%parent: pci0
dev.em.0.nvm: -1
dev.em.0.rx_int_delay: 0
dev.em.0.tx_int_delay: 66
dev.em.0.rx_abs_int_delay: 66
dev.em.0.tx_abs_int_delay: 66
dev.em.0.rx_processing_limit: 100
dev.em.0.link_irq: 0
dev.em.0.mbuf_alloc_fail: 0
dev.em.0.cluster_alloc_fail: 0
dev.em.0.dropped: 0
dev.em.0.tx_dma_fail: 0
dev.em.0.rx_overruns: 0
dev.em.0.watchdog_timeouts: 0
dev.em.0.device_control: 1074790976
dev.em.0.rx_control: 67141634
dev.em.0.fc_high_water: 8192
dev.em.0.fc_low_water: 6692
dev.em.0.queue0.txd_head: 15
dev.em.0.queue0.txd_tail: 17
dev.em.0.queue0.tx_irq: 0
dev.em.0.queue0.no_desc_avail: 0
dev.em.0.queue0.rxd_head: 843
dev.em.0.queue0.rxd_tail: 842
dev.em.0.queue0.rx_irq: 0
dev.em.0.mac_stats.excess_coll: 0
dev.em.0.mac_stats.single_coll: 0
dev.em.0.mac_stats.multiple_coll: 0
dev.em.0.mac_stats.late_coll: 0
dev.em.0.mac_stats.collision_count: 0
dev.em.0.mac_stats.symbol_errors: 0
dev.em.0.mac_stats.sequence_errors: 0
dev.em.0.mac_stats.defer_count: 0
dev.em.0.mac_stats.missed_packets: 0
dev.em.0.mac_stats.recv_no_buff: 0
dev.em.0.mac_stats.recv_undersize: 0
dev.em.0.mac_stats.recv_fragmented: 0
dev.em.0.mac_stats.recv_oversize: 0
dev.em.0.mac_stats.recv_jabber: 0
dev.em.0.mac_stats.recv_errs: 0
dev.em.0.mac_stats.crc_errs: 0
dev.em.0.mac_stats.alignment_errs: 0
dev.em.0.mac_stats.coll_ext_errs: 0
dev.em.0.mac_stats.xon_recvd: 80
dev.em.0.mac_stats.xon_txd: 0
dev.em.0.mac_stats.xoff_recvd: 82
dev.em.0.mac_stats.xoff_txd: 0
dev.em.0.mac_stats.total_pkts_recvd: 35697
dev.em.0.mac_stats.good_pkts_recvd: 35535
dev.em.0.mac_stats.bcast_pkts_recvd: 231
dev.em.0.mac_stats.mcast_pkts_recvd: 85
dev.em.0.mac_stats.rx_frames_64: 0
dev.em.0.mac_stats.rx_frames_65_127: 0
dev.em.0.mac_stats.rx_frames_128_255: 0
dev.em.0.mac_stats.rx_frames_256_511: 0
dev.em.0.mac_stats.rx_frames_512_1023: 0
dev.em.0.mac_stats.rx_frames_1024_1522: 0
dev.em.0.mac_stats.good_octets_recvd: 14878015
dev.em.0.mac_stats.good_octets_txd: 14051783
dev.em.0.mac_stats.total_pkts_txd: 45313
dev.em.0.mac_stats.good_pkts_txd: 45313
dev.em.0.mac_stats.bcast_pkts_txd: 3
dev.em.0.mac_stats.mcast_pkts_txd: 5
dev.em.0.mac_stats.tx_frames_64: 0
dev.em.0.mac_stats.tx_frames_65_127: 0
dev.em.0.mac_stats.tx_frames_128_255: 0
dev.em.0.mac_stats.tx_frames_256_511: 0
dev.em.0.mac_stats.tx_frames_512_1023: 0
dev.em.0.mac_stats.tx_frames_1024_1522: 0
dev.em.0.mac_stats.tso_txd: 2788
dev.em.0.mac_stats.tso_ctx_fail: 0
dev.em.0.interrupts.asserts: 48733
dev.em.0.interrupts.rx_pkt_timer: 0
dev.em.0.interrupts.rx_abs_timer: 0
dev.em.0.interrupts.tx_pkt_timer: 0
dev.em.0.interrupts.tx_abs_timer: 0
dev.em.0.interrupts.tx_queue_empty: 0
dev.em.0.interrupts.tx_queue_min_thresh: 0
dev.em.0.interrupts.rx_desc_min_thresh: 0
dev.em.0.interrupts.rx_overrun: 0
dev.em.0.wake: 0



At 08:00 PM 9/26/2010, Jack Vogel wrote:
The system I've had stress tests running on has 82574 LOMs, so I hope it
will solve the problem. I'll see tomorrow morning how things have held
up...

Jack


On Sun, Sep 26, 2010 at 4:43 PM, Mike Tancsa <m...@sentex.net> wrote:
At 06:19 PM 9/26/2010, Jack Vogel wrote:
Your em1 is using MSI, not MSIX, and thus can't have multiple queues. I'm
not sure what's broken from what you show here. I will try to get the new
driver out shortly for you to try.


This particular NIC will wedge under high load. I tried 2 different motherboards and chipsets, with the same behaviour.

       ---Mike


Jack



On Sun, Sep 26, 2010 at 2:57 PM, Mike Tancsa <m...@sentex.net> wrote:
At 06:36 PM 9/24/2010, Jack Vogel wrote:
There is a new revision of the em driver coming next week; it's going through some
stress pounding over the weekend, and if no issues show up I'll put it into HEAD.

Yongari's changes in TX context handling, which affect checksum and TSO,
are added. I've also decided that multiple queues in 82574 are just a source
of problems without a lot of benefit, so it still uses MSIX but with only 3 vectors,
meaning it separates TX and RX but has a single queue.


Thanks, looking forward to trying it out! With respect to the multiple queues, I thought the driver already used just the one on RELENG_8? If not, is there a way to force the existing driver to use just the one queue?

On the box that has the NIC locking up, it shows

e...@pci0:9:0:0: class=0x020000 card=0x34ec8086 chip=0x10d38086 rev=0x00 hdr=0x00
  vendor     = 'Intel Corporation'
  device     = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
  class      = network
  subclass   = ethernet
  cap 01[c8] = powerspec 2  supports D0 D3  current D0
  cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
  cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)

and

vmstat -i shows

irq256: em0                      5129063        353
irq257: em1                       531251         36

In a wedged state, the stats look like

dev.em.1.%desc: Intel(R) PRO/1000 Network Connection 7.0.5
dev.em.1.%driver: em
dev.em.1.%location: slot=0 function=0 handle=\_SB_.PCI0.PEX4.HART
dev.em.1.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x8086 subdevice=0x34ec class=0x020000
dev.em.1.%parent: pci9
dev.em.1.nvm: -1
dev.em.1.rx_int_delay: 0
dev.em.1.tx_int_delay: 66
dev.em.1.rx_abs_int_delay: 66
dev.em.1.tx_abs_int_delay: 66
dev.em.1.rx_processing_limit: 100
dev.em.1.link_irq: 0
dev.em.1.mbuf_alloc_fail: 0
dev.em.1.cluster_alloc_fail: 0
dev.em.1.dropped: 0
dev.em.1.tx_dma_fail: 0
dev.em.1.fc_high_water: 18432
dev.em.1.fc_low_water: 16932
dev.em.1.mac_stats.excess_coll: 0
dev.em.1.mac_stats.symbol_errors: 0
dev.em.1.mac_stats.sequence_errors: 0
dev.em.1.mac_stats.defer_count: 0
dev.em.1.mac_stats.missed_packets: 41522
dev.em.1.mac_stats.recv_no_buff: 19
dev.em.1.mac_stats.recv_errs: 0
dev.em.1.mac_stats.crc_errs: 0
dev.em.1.mac_stats.alignment_errs: 0
dev.em.1.mac_stats.coll_ext_errs: 0
dev.em.1.mac_stats.rx_overruns: 41398
dev.em.1.mac_stats.watchdog_timeouts: 0
dev.em.1.mac_stats.xon_recvd: 0
dev.em.1.mac_stats.xon_txd: 0
dev.em.1.mac_stats.xoff_recvd: 0
dev.em.1.mac_stats.xoff_txd: 0
dev.em.1.mac_stats.total_pkts_recvd: 95229129
dev.em.1.mac_stats.good_pkts_recvd: 95187607
dev.em.1.mac_stats.bcast_pkts_recvd: 79244
dev.em.1.mac_stats.mcast_pkts_recvd: 0
dev.em.1.mac_stats.rx_frames_64: 93680
dev.em.1.mac_stats.rx_frames_65_127: 1516349
dev.em.1.mac_stats.rx_frames_128_255: 4464941
dev.em.1.mac_stats.rx_frames_256_511: 4024
dev.em.1.mac_stats.rx_frames_512_1023: 2096067
dev.em.1.mac_stats.rx_frames_1024_1522: 87012546
dev.em.1.mac_stats.good_octets_recvd: 0
dev.em.1.mac_stats.good_octest_txd: 0
dev.em.1.mac_stats.total_pkts_txd: 66775098
dev.em.1.mac_stats.good_pkts_txd: 66775098
dev.em.1.mac_stats.bcast_pkts_txd: 509
dev.em.1.mac_stats.mcast_pkts_txd: 7
dev.em.1.mac_stats.tx_frames_64: 48038472
dev.em.1.mac_stats.tx_frames_65_127: 13402833
dev.em.1.mac_stats.tx_frames_128_255: 5324413
dev.em.1.mac_stats.tx_frames_256_511: 957
dev.em.1.mac_stats.tx_frames_512_1023: 319
dev.em.1.mac_stats.tx_frames_1024_1522: 8104
dev.em.1.mac_stats.tso_txd: 1069
dev.em.1.mac_stats.tso_ctx_fail: 0
dev.em.1.interrupts.asserts: 0
dev.em.1.interrupts.rx_pkt_timer: 0
dev.em.1.interrupts.rx_abs_timer: 0
dev.em.1.interrupts.tx_pkt_timer: 0
dev.em.1.interrupts.tx_abs_timer: 0
dev.em.1.interrupts.tx_queue_empty: 0
dev.em.1.interrupts.tx_queue_min_thresh: 0
dev.em.1.interrupts.rx_desc_min_thresh: 0
dev.em.1.interrupts.rx_overrun: 0
dev.em.1.host.breaker_tx_pkt: 0
dev.em.1.host.host_tx_pkt_discard: 0
dev.em.1.host.rx_pkt: 0
dev.em.1.host.breaker_rx_pkts: 0
dev.em.1.host.breaker_rx_pkt_drop: 0
dev.em.1.host.tx_good_pkt: 0
dev.em.1.host.breaker_tx_pkt_drop: 0
dev.em.1.host.rx_good_bytes: 0
dev.em.1.host.tx_good_bytes: 0
dev.em.1.host.length_errors: 0
dev.em.1.host.serdes_violation_pkt: 0
dev.em.1.host.header_redir_missed: 0
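
One way to read a dump like the one above is to compare total_pkts_recvd against good_pkts_recvd. A rough Python sketch, assuming the sysctl output has been saved as text; `parse_em_stats` and `rx_gap` are illustrative names, not driver or sysctl interfaces. With the figures above the gap is 95229129 - 95187607 = 41522, the same number as missed_packets, though whether the counters relate that way in general is not something this thread establishes.

```python
def parse_em_stats(text):
    """Turn 'dev.em.N.some.name: value' lines into {'some.name': int}.

    Non-numeric entries such as %desc or %driver are skipped.
    """
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        parts = key.strip().split(".", 3)  # ['dev', 'em', 'N', 'rest']
        if not sep or len(parts) < 4 or parts[:2] != ["dev", "em"]:
            continue
        try:
            stats[parts[3]] = int(value)
        except ValueError:
            continue
    return stats


def rx_gap(stats):
    """Packets the MAC counted in total but not as good."""
    return stats["mac_stats.total_pkts_recvd"] - stats["mac_stats.good_pkts_recvd"]


# A few lines copied from the wedged-state dump above.
SAMPLE = """\
dev.em.1.%desc: Intel(R) PRO/1000 Network Connection 7.0.5
dev.em.1.mac_stats.missed_packets: 41522
dev.em.1.mac_stats.total_pkts_recvd: 95229129
dev.em.1.mac_stats.good_pkts_recvd: 95187607
"""

if __name__ == "__main__":
    stats = parse_em_stats(SAMPLE)
    print(rx_gap(stats), stats["mac_stats.missed_packets"])
```

Running the same comparison on dumps taken before and after a wedge would show whether the gap keeps growing, without having to eyeball the whole list.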

ifconfig down/up just panics or locks up the box when it's in this state. I also have IPMI enabled on this NIC, but it shows the same issue with it disabled.

      ---Mike



--------------------------------------------------------------------
Mike Tancsa,                                      tel +1 519 651 3400
Sentex Communications,                            m...@sentex.net
Providing Internet since 1994                    www.sentex.net
Cambridge, Ontario Canada                         www.sentex.net/mike



_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
