On 11/7/2011 9:24 PM, Jason Wolfe wrote:
On Mon, Oct 31, 2011 at 1:13 AM, Hooman Fazaeli <hoomanfaza...@gmail.com> wrote:


    Attached is a patch for if_em.c. It flushes the interface queue when the
    queue is full and the link is not active. Please note that when this
    happens, the drop counter on the interface still increases, so this will
    trigger your scripts as before. You need to change the scripts slightly,
    as follows:

              check interface TX status
              if (interface TX seems hung) {
                      sleep 5
                      check interface TX status
                      if (interface TX seems hung) {
                              reset the interface
                      }
              }

    For MULTIQUEUE, it just disables the link status check (which is not
    ideal), so please test in non-MULTIQUEUE mode.

    The patch also contains some minor fixups to compile on 7, plus a
    fix from r1.69 which addressed an RX hang problem (the fix was
    later removed in r1.70). I included it for Emil to give it a try.

    Please let me know if you have any problems with the patch.


Hooman,

Unfortunately, one of the servers just had a wedge event a couple of hours ago 
with this patch. To confirm that your changes should cause a recovery within 
the time I'm allowing, here is the current format:

check interface TX status
if (interface TX seems hung) {
    sleep 3
    check packets out
    sleep 2
    check packets out
    if (packets not incrementing) {
        reset the interface
    }
}

I bounced em0 because dropped packets incremented from 1749543 to 1749708 
while the interface was not incrementing packets out.

4:10AM up 6 days, 15:23, 0 users, load averages: 0.02, 0.12, 0.14

em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC>
ether X
inet6 X%em0 prefixlen 64 scopeid 0x1
nd6 options=1<PERFORMNUD>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active

em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC>
ether X
inet6 X%em1 prefixlen 64 scopeid 0x2
nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active

ipfw0: flags=8801<UP,SIMPLEX,MULTICAST> metric 0 mtu 65536

lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
options=3<RXCSUM,TXCSUM>
inet 127.0.0.1 netmask 0xff000000
inet6 ::1 prefixlen 128
inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4
inet X.X.X.X netmask 0xffffffff
inet X.X.X.X netmask 0xffffffff
inet X.X.X.X netmask 0xffffffff
nd6 options=3<PERFORMNUD,ACCEPT_RTADV>

lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC>
ether X
inet X.X.X.X netmask 0xffffff00 broadcast X.X.X.X
inet6 X%lagg0 prefixlen 64 scopeid 0x5
inet6 X prefixlen 64 autoconf
nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
media: Ethernet autoselect
status: active
laggproto loadbalance
laggport: em0 flags=4<ACTIVE>
laggport: em1 flags=4<ACTIVE>

interrupt total rate
irq3: uart1 3810 0
cpu0: timer 1147568087 2000
irq256: em0:rx 0 59779710 104
irq257: em0:tx 0 2771888652 4831
irq258: em0:link 1 0
irq259: em1:rx 0 3736828886 6512
irq260: em1:tx 0 2790566376 4863
irq261: em1:link 27286 0
irq262: mps0 395687386 689
cpu1: timer 1147559894 2000
cpu2: timer 1147559901 2000
cpu3: timer 1147559902 2000
Total 14345029891 25001

13466/4144/17610 mbufs in use (current/cache/total)
2567/2635/5202/5853720 mbuf clusters in use (current/cache/total/max)
2567/633 mbuf+clusters out of packet secondary zone in use (current/cache)
6798/554/7352/2926859 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
35692K/8522K/44214K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

Name Mtu Network Address Ipkts Ierrs Idrop Opkts Oerrs Coll Drop
em0 1500 <Link#1> 00:25:90:2b:e5:75 60747643 0 0 11246408092 0 0 1750763
em0 1500 fe80:1::225:9 fe80:1::225:90ff: 0 - - 4 - - -
em1 1500 <Link#2> 00:25:90:2b:e5:75 11237195776 123950 0 11344722383 0 0 545682
em1 1500 fe80:2::225:9 fe80:2::225:90ff: 0 - - 1 - - -
lagg0 1500 <Link#5> 00:25:90:2b:e5:75 11297850142 0 0 22588666102 2296445 0 0
lagg0 1500 69.164.38.0/2 69.164.38.83 10189108030 - - 22592881776 - - -
lagg0 1500 fe80:5::225:9 fe80:5::225:90ff: 24 - - 28 - - -
lagg0 1500 2607:f4e8:310 2607:f4e8:310:12: 19578 - - 19591 - - -

kern.msgbuf:

Nov 7 04:10:06 cds1033 kernel: Interface is RUNNING and INACTIVE
Nov 7 04:10:06 cds1033 kernel: em0: hw tdh = 558, hw tdt = 558
Nov 7 04:10:06 cds1033 kernel: em0: hw rdh = 889, hw rdt = 888
Nov 7 04:10:06 cds1033 kernel: em0: Tx Queue Status = 0
Nov 7 04:10:06 cds1033 kernel: em0: TX descriptors avail = 1024
Nov 7 04:10:06 cds1033 kernel: em0: Tx Descriptors avail failure = 0
Nov 7 04:10:06 cds1033 kernel: em0: RX discarded packets = 0
Nov 7 04:10:06 cds1033 kernel: em0: RX Next to Check = 889
Nov 7 04:10:06 cds1033 kernel: em0: RX Next to Refresh = 888
Nov 7 04:10:06 cds1033 kernel: em0: Link state: active
Nov 7 04:10:06 cds1033 kernel: Interface is RUNNING and INACTIVE
Nov 7 04:10:06 cds1033 kernel: em1: hw tdh = 837, hw tdt = 837
Nov 7 04:10:06 cds1033 kernel: em1: hw rdh = 198, hw rdt = 189
Nov 7 04:10:06 cds1033 kernel: em1: Tx Queue Status = 0
Nov 7 04:10:06 cds1033 kernel: em1: TX descriptors avail = 1021
Nov 7 04:10:06 cds1033 kernel: em1: Tx Descriptors avail failure = 0
Nov 7 04:10:06 cds1033 kernel: em1: RX discarded packets = 0
Nov 7 04:10:06 cds1033 kernel: em1: RX Next to Check = 303
Nov 7 04:10:06 cds1033 kernel: em1: RX Next to Refresh = 302
Nov 7 04:10:06 cds1033 kernel: em1: Link state: active

net.inet.ip.intr_queue_maxlen: 512
net.inet.ip.intr_queue_drops: 0
dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.2.3
dev.em.0.%driver: em
dev.em.0.%location: slot=0 function=0
dev.em.0.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x15d9 subdevice=0x10d3 class=0x020000
dev.em.0.%parent: pci1
dev.em.0.nvm: -1
dev.em.0.debug: -1
dev.em.0.rx_int_delay: 0
dev.em.0.tx_int_delay: 66
dev.em.0.rx_abs_int_delay: 66
dev.em.0.tx_abs_int_delay: 66
dev.em.0.rx_processing_limit: 100
dev.em.0.flow_control: 0
dev.em.0.eee_control: 0
dev.em.0.link_irq: 1
dev.em.0.mbuf_alloc_fail: 0
dev.em.0.cluster_alloc_fail: 0
dev.em.0.dropped: 0
dev.em.0.tx_dma_fail: 0
dev.em.0.rx_overruns: 0
dev.em.0.watchdog_timeouts: 0
dev.em.0.link_toggles: 3
dev.em.0.device_control: 1074790984
dev.em.0.rx_control: 67141634
dev.em.0.fc_high_water: 18432
dev.em.0.fc_low_water: 16932
dev.em.0.queue0.txd_head: 558
dev.em.0.queue0.txd_tail: 558
dev.em.0.queue0.tx_irq: 2771888613
dev.em.0.queue0.no_desc_avail: 0
dev.em.0.queue0.rxd_head: 102
dev.em.0.queue0.rxd_tail: 101
dev.em.0.queue0.rx_irq: 59779941
dev.em.0.mac_stats.excess_coll: 0
dev.em.0.mac_stats.single_coll: 0
dev.em.0.mac_stats.multiple_coll: 0
dev.em.0.mac_stats.late_coll: 0
dev.em.0.mac_stats.collision_count: 0
dev.em.0.mac_stats.symbol_errors: 0
dev.em.0.mac_stats.sequence_errors: 0
dev.em.0.mac_stats.defer_count: 0
dev.em.0.mac_stats.missed_packets: 0
dev.em.0.mac_stats.recv_no_buff: 0
dev.em.0.mac_stats.recv_undersize: 0
dev.em.0.mac_stats.recv_fragmented: 0
dev.em.0.mac_stats.recv_oversize: 0
dev.em.0.mac_stats.recv_jabber: 0
dev.em.0.mac_stats.recv_errs: 0
dev.em.0.mac_stats.crc_errs: 0
dev.em.0.mac_stats.alignment_errs: 0
dev.em.0.mac_stats.coll_ext_errs: 0
dev.em.0.mac_stats.xon_recvd: 0
dev.em.0.mac_stats.xon_txd: 0
dev.em.0.mac_stats.xoff_recvd: 0
dev.em.0.mac_stats.xoff_txd: 0
dev.em.0.mac_stats.total_pkts_recvd: 60768005
dev.em.0.mac_stats.good_pkts_recvd: 60768005
dev.em.0.mac_stats.bcast_pkts_recvd: 60744833
dev.em.0.mac_stats.mcast_pkts_recvd: 2948
dev.em.0.mac_stats.rx_frames_64: 60744822
dev.em.0.mac_stats.rx_frames_65_127: 3491
dev.em.0.mac_stats.rx_frames_128_255: 19293
dev.em.0.mac_stats.rx_frames_256_511: 364
dev.em.0.mac_stats.rx_frames_512_1023: 35
dev.em.0.mac_stats.rx_frames_1024_1522: 0
dev.em.0.mac_stats.good_octets_recvd: 3890774076
dev.em.0.mac_stats.good_octets_txd: 15295538422911
dev.em.0.mac_stats.total_pkts_txd: 11246435226
dev.em.0.mac_stats.good_pkts_txd: 11246435226
dev.em.0.mac_stats.bcast_pkts_txd: 4
dev.em.0.mac_stats.mcast_pkts_txd: 3830
dev.em.0.mac_stats.tx_frames_64: 60338532
dev.em.0.mac_stats.tx_frames_65_127: 897135811
dev.em.0.mac_stats.tx_frames_128_255: 13933787
dev.em.0.mac_stats.tx_frames_256_511: 22107540
dev.em.0.mac_stats.tx_frames_512_1023: 116374045
dev.em.0.mac_stats.tx_frames_1024_1522: 10136545511
dev.em.0.mac_stats.tso_txd: 0
dev.em.0.mac_stats.tso_ctx_fail: 0
dev.em.0.interrupts.asserts: 3
dev.em.0.interrupts.rx_pkt_timer: 0
dev.em.0.interrupts.rx_abs_timer: 0
dev.em.0.interrupts.tx_pkt_timer: 0
dev.em.0.interrupts.tx_abs_timer: 0
dev.em.0.interrupts.tx_queue_empty: 0
dev.em.0.interrupts.tx_queue_min_thresh: 0
dev.em.0.interrupts.rx_desc_min_thresh: 0
dev.em.0.interrupts.rx_overrun: 0
dev.em.1.%desc: Intel(R) PRO/1000 Network Connection 7.2.3
dev.em.1.%driver: em
dev.em.1.%location: slot=0 function=0
dev.em.1.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x15d9 subdevice=0x10d3 class=0x020000
dev.em.1.%parent: pci2
dev.em.1.nvm: -1
dev.em.1.debug: -1
dev.em.1.rx_int_delay: 0
dev.em.1.tx_int_delay: 66
dev.em.1.rx_abs_int_delay: 66
dev.em.1.tx_abs_int_delay: 66
dev.em.1.rx_processing_limit: 100
dev.em.1.flow_control: 0
dev.em.1.eee_control: 0
dev.em.1.link_irq: 27207
dev.em.1.mbuf_alloc_fail: 0
dev.em.1.cluster_alloc_fail: 0
dev.em.1.dropped: 0
dev.em.1.tx_dma_fail: 0
dev.em.1.rx_overruns: 0
dev.em.1.watchdog_timeouts: 0
dev.em.1.link_toggles: 3
dev.em.1.device_control: 1074790984
dev.em.1.rx_control: 67141634
dev.em.1.fc_high_water: 18432
dev.em.1.fc_low_water: 16932
dev.em.1.queue0.txd_head: 804
dev.em.1.queue0.txd_tail: 804
dev.em.1.queue0.tx_irq: 2790571888
dev.em.1.queue0.no_desc_avail: 0
dev.em.1.queue0.rxd_head: 686
dev.em.1.queue0.rxd_tail: 686
dev.em.1.queue0.rx_irq: 3729597000
dev.em.1.mac_stats.excess_coll: 0
dev.em.1.mac_stats.single_coll: 0
dev.em.1.mac_stats.multiple_coll: 0
dev.em.1.mac_stats.late_coll: 0
dev.em.1.mac_stats.collision_count: 0
dev.em.1.mac_stats.symbol_errors: 0
dev.em.1.mac_stats.sequence_errors: 0
dev.em.1.mac_stats.defer_count: 0
dev.em.1.mac_stats.missed_packets: 123950
dev.em.1.mac_stats.recv_no_buff: 7063
dev.em.1.mac_stats.recv_undersize: 0
dev.em.1.mac_stats.recv_fragmented: 0
dev.em.1.mac_stats.recv_oversize: 0
dev.em.1.mac_stats.recv_jabber: 0
dev.em.1.mac_stats.recv_errs: 0
dev.em.1.mac_stats.crc_errs: 0
dev.em.1.mac_stats.alignment_errs: 0
dev.em.1.mac_stats.coll_ext_errs: 0
dev.em.1.mac_stats.xon_recvd: 0
dev.em.1.mac_stats.xon_txd: 0
dev.em.1.mac_stats.xoff_recvd: 0
dev.em.1.mac_stats.xoff_txd: 0
dev.em.1.mac_stats.total_pkts_recvd: 11237327404
dev.em.1.mac_stats.good_pkts_recvd: 11237203454
dev.em.1.mac_stats.bcast_pkts_recvd: 60741559
dev.em.1.mac_stats.mcast_pkts_recvd: 2935
dev.em.1.mac_stats.rx_frames_64: 4479481614
dev.em.1.mac_stats.rx_frames_65_127: 5124633522
dev.em.1.mac_stats.rx_frames_128_255: 13228214
dev.em.1.mac_stats.rx_frames_256_511: 23321029
dev.em.1.mac_stats.rx_frames_512_1023: 74960116
dev.em.1.mac_stats.rx_frames_1024_1522: 1521578959
dev.em.1.mac_stats.good_octets_recvd: 3024997755633
dev.em.1.mac_stats.good_octets_txd: 15349574964083
dev.em.1.mac_stats.total_pkts_txd: 11344737635
dev.em.1.mac_stats.good_pkts_txd: 11344737635
dev.em.1.mac_stats.bcast_pkts_txd: 2501
dev.em.1.mac_stats.mcast_pkts_txd: 8
dev.em.1.mac_stats.tx_frames_64: 62839963
dev.em.1.mac_stats.tx_frames_65_127: 962219224
dev.em.1.mac_stats.tx_frames_128_255: 14620849
dev.em.1.mac_stats.tx_frames_256_511: 20410906
dev.em.1.mac_stats.tx_frames_512_1023: 117272600
dev.em.1.mac_stats.tx_frames_1024_1522: 10167374093
dev.em.1.mac_stats.tso_txd: 0
dev.em.1.mac_stats.tso_ctx_fail: 0
dev.em.1.interrupts.asserts: 19749
dev.em.1.interrupts.rx_pkt_timer: 1
dev.em.1.interrupts.rx_abs_timer: 0
dev.em.1.interrupts.tx_pkt_timer: 0
dev.em.1.interrupts.tx_abs_timer: 0
dev.em.1.interrupts.tx_queue_empty: 0
dev.em.1.interrupts.tx_queue_min_thresh: 0
dev.em.1.interrupts.rx_desc_min_thresh: 0
dev.em.1.interrupts.rx_overrun: 18
hw.em.rx_hang_fixup: 0
hw.em.eee_setting: 0
hw.em.fc_setting: 0
hw.em.rx_process_limit: 100
hw.em.enable_msix: 1
hw.em.sbp: 0
hw.em.smart_pwr_down: 0
hw.em.txd: 1024
hw.em.rxd: 1024
hw.em.rx_abs_int_delay: 66
hw.em.tx_abs_int_delay: 66
hw.em.rx_int_delay: 0
hw.em.tx_int_delay: 66

Jason

I have allocated more time to the problem and I think I can now explain what
your problem is.

With MSIX disabled, the driver uses a fast interrupt handler (em_irq_fast)
which schedules the rx/tx task and then checks for a link status change. This
implies that the rx/tx task is executed on every link state change. This is
not efficient, as it is a waste of time to start transmission while the link
is down. However, it has the side effect that after a temporary link loss
(active -> inactive -> active), _start is executed and transmission continues
normally. The value of link_toggles (3) clearly indicates that you had such a
transition when the problem occurred.
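
For reference, this is roughly what the legacy interrupt path looks like.
This is a condensed sketch based on the 7.2.x if_em.c, not verbatim driver
code; the rxtx_task/link_task field names are approximate and the stray
interrupt/error handling is omitted:

    /*
     * Condensed sketch of the legacy (non-MSIX) fast interrupt handler.
     * See if_em.c for the authoritative version.
     */
    static int
    em_irq_fast(void *arg)
    {
            struct adapter *adapter = arg;
            u32 reg_icr = E1000_READ_REG(&adapter->hw, E1000_ICR);

            /*
             * The rx/tx task is scheduled for every interrupt cause,
             * including link state changes. That task ends by calling
             * the start routine, so after an inactive->active
             * transition transmission resumes as a side effect.
             */
            taskqueue_enqueue(adapter->tq, &adapter->rxtx_task);

            /* A link status change also wakes the link task. */
            if (reg_icr & (E1000_ICR_RXSEQ | E1000_ICR_LSC)) {
                    adapter->hw.mac.get_link_status = 1;
                    taskqueue_enqueue(taskqueue_fast, &adapter->link_task);
            }
            return (FILTER_HANDLED);
    }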

With MSIX enabled, the link task (em_handle_link) does _not_ trigger
_start when the link changes state from inactive to active (which it should).
If if_snd fills up during a temporary link loss, transmission is
stopped forever and the driver never recovers from that state.
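
The packets get stranded because the start routine bails out while the link
is down. Condensed from the driver's start path (details abbreviated):

    /* Condensed from em_start_locked() in the 7.2.x driver. */
    static void
    em_start_locked(struct ifnet *ifp, struct tx_ring *txr)
    {
            struct adapter *adapter = ifp->if_softc;
            struct mbuf *m_head;

            if (!adapter->link_active)
                    return;         /* whatever sits in if_snd stays there */

            while (!IFQ_DRV_IS_EMPTY(&ifp->if_snd)) {
                    IFQ_DRV_DEQUEUE(&ifp->if_snd, m_head);
                    if (m_head == NULL)
                            break;
                    /* ... encapsulate and hand m_head to the hardware ... */
            }
    }

So unless something calls the start routine again after the link comes back
up, if_snd never drains.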

The last patch should have reduced the frequency of the problem,
but it assumes that every IFQ_ENQUEUE is followed by an if_start,
which is not a valid assumption.

If you are willing to test, I can prepare another patch for you to fix
the issue in a different and more reliable way.
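
Roughly, the idea would be to have the link task kick the start routine once
the link is active again. A minimal sketch of what I have in mind (this is
only my assumption about the shape of the fix, not the patch itself; ifp and
txr are assumed to be in scope as elsewhere in the driver):

    /*
     * Sketch only: at the end of em_handle_link(), after
     * em_update_link_status() has run, restart transmission when the
     * link is active again so packets queued during the outage drain.
     */
    if (adapter->link_active) {
            EM_TX_LOCK(txr);
            if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd))
                    em_start_locked(ifp, txr);
            EM_TX_UNLOCK(txr);
    }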

_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
