Hi,

                I am getting Tx hangs with the e1000e-1.0.15 driver. The logs
are below.

 

Feb 10 06:05:11 1265762111 kernel: e1000: eth4: e1000_clean_tx_irq: Detected
Tx Unit Hang

Feb 10 06:05:11 1265762111 kernel:   Tx Queue             <0>

Feb 10 06:05:11 1265762111 kernel:   TDH                  <e1>

Feb 10 06:05:11 1265762111 kernel:   TDT                  <cc>

Feb 10 06:05:11 1265762111 kernel:   next_to_use          <cc>

Feb 10 06:05:11 1265762111 kernel:   next_to_clean        <e0>

Feb 10 06:05:11 1265762111 kernel: buffer_info[next_to_clean]

Feb 10 06:05:11 1265762111 kernel:   time_stamp           <56300a18>

Feb 10 06:05:11 1265762111 kernel:   next_to_watch        <e4>

Feb 10 06:05:11 1265762111 kernel:   jiffies              <56300b51>

Feb 10 06:05:11 1265762111 kernel:   next_to_watch.status <0>

Feb 10 06:05:13 1265762113 kernel: e1000: eth4: e1000_clean_tx_irq: Detected
Tx Unit Hang

Feb 10 06:05:13 1265762113 kernel:   Tx Queue             <0>

Feb 10 06:05:13 1265762113 kernel:   TDH                  <e1>

Feb 10 06:05:13 1265762113 kernel:   TDT                  <cc>

Feb 10 06:05:13 1265762113 kernel:   next_to_use          <cc>

Feb 10 06:05:13 1265762113 kernel:   next_to_clean        <e0>

Feb 10 06:05:13 1265762113 kernel: buffer_info[next_to_clean]

Feb 10 06:05:13 1265762113 kernel:   time_stamp           <56300a18>

Feb 10 06:05:13 1265762113 kernel:   next_to_watch        <e4>

Feb 10 06:05:13 1265762113 kernel:   jiffies              <56300d45>

Feb 10 06:05:13 1265762113 kernel:   next_to_watch.status <0>

Feb 10 06:05:15 1265762115 kernel: e1000: eth4: e1000_clean_tx_irq: Detected
Tx Unit Hang

Feb 10 06:05:15 1265762115 kernel:   Tx Queue             <0>

Feb 10 06:05:15 1265762115 kernel:   TDH                  <e1>

Feb 10 06:05:15 1265762115 kernel:   TDT                  <cc>

Feb 10 06:05:15 1265762115 kernel:   next_to_use          <cc>

Feb 10 06:05:15 1265762115 kernel:   next_to_clean        <e0>

Feb 10 06:05:15 1265762115 kernel: buffer_info[next_to_clean]

Feb 10 06:05:15 1265762115 kernel:   time_stamp           <56300a18>

Feb 10 06:05:15 1265762115 kernel:   next_to_watch        <e4>

Feb 10 06:05:15 1265762115 kernel:   jiffies              <56300f39>

Feb 10 06:05:15 1265762115 kernel:   next_to_watch.status <0>

Feb 10 06:05:17 1265762117 kernel: e1000: eth4: e1000_clean_tx_irq: Detected
Tx Unit Hang

Feb 10 06:05:17 1265762117 kernel:   Tx Queue             <0>

Feb 10 06:05:17 1265762117 kernel:   TDH                  <e1>

Feb 10 06:05:17 1265762117 kernel:   TDT                  <cc>

Feb 10 06:05:17 1265762117 kernel:   next_to_use          <cc>

Feb 10 06:05:17 1265762117 kernel:   next_to_clean        <e0>

Feb 10 06:05:17 1265762117 kernel: buffer_info[next_to_clean]

Feb 10 06:05:17 1265762117 kernel:   time_stamp           <56300a18>

Feb 10 06:05:17 1265762117 kernel:   next_to_watch        <e4>

Feb 10 06:05:17 1265762117 kernel:   jiffies              <5630112d>

Feb 10 06:05:17 1265762117 kernel:   next_to_watch.status <0>

Feb 10 06:05:17 1265762117 kernel: NETDEV WATCHDOG: eth4: transmit timed out

Feb 10 06:05:21 1265762121 kernel: e1000: eth4: e1000_watchdog_task: NIC
Link is Up 1000 Mbps Full Duplex, Flow Control: RX

Feb 10 06:06:45 1265762205 kernel: e1000: eth4: e1000_clean_tx_irq: Detected
Tx Unit Hang

Feb 10 06:06:45 1265762205 kernel:   Tx Queue             <0>

Feb 10 06:06:45 1265762205 kernel:   TDH                  <c7>

Feb 10 06:06:45 1265762205 kernel:   TDT                  <b2>

Feb 10 06:06:45 1265762205 kernel:   next_to_use          <b2>

Feb 10 06:06:45 1265762205 kernel:   next_to_clean        <c6>

Feb 10 06:06:45 1265762205 kernel: buffer_info[next_to_clean]

Feb 10 06:06:45 1265762205 kernel:   time_stamp           <563065bc>

Feb 10 06:06:45 1265762205 kernel:   next_to_watch        <ca>

Feb 10 06:06:45 1265762205 kernel:   jiffies              <563066d4>

Feb 10 06:06:45 1265762205 kernel:   next_to_watch.status <0>

Feb 10 06:06:47 1265762207 kernel: e1000: eth4: e1000_clean_tx_irq: Detected
Tx Unit Hang

Feb 10 06:06:47 1265762207 kernel:   Tx Queue             <0>

Feb 10 06:06:47 1265762207 kernel:   TDH                  <c7>

Feb 10 06:06:47 1265762207 kernel:   TDT                  <b2>

Feb 10 06:06:47 1265762207 kernel:   next_to_use          <b2>

Feb 10 06:06:47 1265762207 kernel:   next_to_clean        <c6>

Feb 10 06:06:47 1265762207 kernel: buffer_info[next_to_clean]

Feb 10 06:06:47 1265762207 kernel:   time_stamp           <563065bc>

Feb 10 06:06:47 1265762207 kernel:   next_to_watch        <ca>

Feb 10 06:06:47 1265762207 kernel:   jiffies              <563068c8>

Feb 10 06:06:47 1265762207 kernel:   next_to_watch.status <0>

Feb 10 06:06:49 1265762209 kernel: e1000: eth4: e1000_clean_tx_irq: Detected
Tx Unit Hang

Feb 10 06:06:49 1265762209 kernel:   Tx Queue             <0>

Feb 10 06:06:49 1265762209 kernel:   TDH                  <c7>

Feb 10 06:06:49 1265762209 kernel:   TDT                  <b2>

Feb 10 06:06:49 1265762209 kernel:   next_to_use          <b2>

Feb 10 06:06:49 1265762209 kernel:   next_to_clean        <c6>

Feb 10 06:06:49 1265762209 kernel: buffer_info[next_to_clean]

Feb 10 06:06:49 1265762209 kernel:   time_stamp           <563065bc>

Feb 10 06:06:49 1265762209 kernel:   next_to_watch        <ca>

Feb 10 06:06:49 1265762209 kernel:   jiffies              <56306abc>

Feb 10 06:06:49 1265762209 kernel:   next_to_watch.status <0>

Feb 10 06:06:51 1265762211 kernel: NETDEV WATCHDOG: eth4: transmit timed out

Feb 10 06:06:54 1265762214 kernel: e1000: eth4: e1000_watchdog_task: NIC
Link is Up 1000 Mbps Full Duplex, Flow Control: RX

Feb 10 06:12:36 1265762556 kernel: e1000: eth4: e1000_clean_tx_irq: Detected
Tx Unit Hang

Feb 10 06:12:36 1265762556 kernel:   Tx Queue             <0>

Feb 10 06:12:36 1265762556 kernel:   TDH                  <95>

Feb 10 06:12:36 1265762556 kernel:   TDT                  <80>

Feb 10 06:12:36 1265762556 kernel:   next_to_use          <80>

Feb 10 06:12:36 1265762556 kernel:   next_to_clean        <94>

Feb 10 06:12:36 1265762556 kernel: buffer_info[next_to_clean]

Feb 10 06:12:36 1265762556 kernel:   time_stamp           <5631bcf8>

Feb 10 06:12:36 1265762556 kernel:   next_to_watch        <98>

Feb 10 06:12:36 1265762556 kernel:   jiffies              <5631bdf3>

Feb 10 06:12:36 1265762556 kernel:   next_to_watch.status <0>

Feb 10 06:12:38 1265762558 kernel: e1000: eth4: e1000_clean_tx_irq: Detected
Tx Unit Hang

Feb 10 06:12:38 1265762558 kernel:   Tx Queue             <0>

Feb 10 06:12:38 1265762558 kernel:   TDH                  <95>

Feb 10 06:12:38 1265762558 kernel:   TDT                  <80>

Feb 10 06:12:38 1265762558 kernel:   next_to_use          <80>

Feb 10 06:12:38 1265762558 kernel:   next_to_clean        <94>

Feb 10 06:12:38 1265762558 kernel: buffer_info[next_to_clean]

Feb 10 06:12:38 1265762558 kernel:   time_stamp           <5631bcf8>

Feb 10 06:12:38 1265762558 kernel:   next_to_watch        <98>

Feb 10 06:12:38 1265762558 kernel:   jiffies              <5631bfe7>

Feb 10 06:12:38 1265762558 kernel:   next_to_watch.status <0>

Feb 10 06:12:40 1265762560 kernel: e1000: eth4: e1000_clean_tx_irq: Detected
Tx Unit Hang

Feb 10 06:12:40 1265762560 kernel:   Tx Queue             <0>

Feb 10 06:12:40 1265762560 kernel:   TDH                  <95>

Feb 10 06:12:40 1265762560 kernel:   TDT                  <80>

Feb 10 06:12:40 1265762560 kernel:   next_to_use          <80>

Feb 10 06:12:40 1265762560 kernel:   next_to_clean        <94>

Feb 10 06:12:40 1265762560 kernel: buffer_info[next_to_clean]

Feb 10 06:12:40 1265762560 kernel:   time_stamp           <5631bcf8>

Feb 10 06:12:40 1265762560 kernel:   next_to_watch        <98>

Feb 10 06:12:40 1265762560 kernel:   jiffies              <5631c1db>

Feb 10 06:12:40 1265762560 kernel:   next_to_watch.status <0>

Feb 10 06:12:42 1265762562 kernel: e1000: eth4: e1000_clean_tx_irq: Detected
Tx Unit Hang

Feb 10 06:12:42 1265762562 kernel:   Tx Queue             <0>

Feb 10 06:12:42 1265762562 kernel:   TDH                  <95>

Feb 10 06:12:42 1265762562 kernel:   TDT                  <80>

Feb 10 06:12:42 1265762562 kernel:   next_to_use          <80>

Feb 10 06:12:42 1265762562 kernel:   next_to_clean        <94>

Feb 10 06:12:42 1265762562 kernel: buffer_info[next_to_clean]

Feb 10 06:12:42 1265762562 kernel:   time_stamp           <5631bcf8>

Feb 10 06:12:42 1265762562 kernel:   next_to_watch        <98>

Feb 10 06:12:42 1265762562 kernel:   jiffies              <5631c3cf>

Feb 10 06:12:42 1265762562 kernel:   next_to_watch.status <0>

Feb 10 06:12:44 1265762564 kernel: NETDEV WATCHDOG: eth4: transmit timed out

Feb 10 06:12:47 1265762567 kernel: e1000: eth4: e1000_watchdog_task: NIC
Link is Up 1000 Mbps Full Duplex, Flow Control: RX
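
As a reading aid for the dumps above, here is a minimal sketch (not part of the driver or of the original report) that turns one hang dump into a pending-descriptor count and a stall time. It rests on two assumptions: that the Tx ring held 2048 descriptors when the hang occurred, as in the "ethtool -g eth4" output later in this message, and that HZ is 250, inferred from consecutive dumps logged 2 seconds apart whose jiffies values differ by 0x1f4 (500). The function name decode_hang and its parameters are hypothetical.

```python
def decode_hang(next_to_use, next_to_clean, time_stamp, jiffies,
                ring_size=2048, hz=250):
    """Summarise one 'Detected Tx Unit Hang' dump.

    ring_size and hz are assumptions (see the text above), not values
    printed in the dump itself.
    """
    # Descriptors queued by the driver but not yet cleaned, with ring wrap-around.
    pending = (next_to_use - next_to_clean) % ring_size
    # Age of the oldest uncleaned buffer in jiffies (32-bit counter in the dump).
    stalled_jiffies = (jiffies - time_stamp) & 0xFFFFFFFF
    return pending, stalled_jiffies, stalled_jiffies / hz


if __name__ == "__main__":
    # Values copied from the first dump (06:05:11).
    pending, ticks, seconds = decode_hang(0xCC, 0xE0, 0x56300A18, 0x56300B51)
    print(f"pending={pending} stalled={ticks} jiffies (~{seconds:.2f} s)")
```

For the first dump this gives 2028 pending descriptors and a stall of 313 jiffies, roughly 1.25 s under the HZ assumption.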

 

                lspci output

 

                [r...@manage1 /root]# lspci_ether

05:00.0 Ethernet controller: Intel Corporation: Unknown device 105e (rev 06)
- (E1000_DEV_ID_82571EB_COPPER)

05:00.1 Ethernet controller: Intel Corporation: Unknown device 105e (rev 06)
- (E1000_DEV_ID_82571EB_COPPER)

06:00.0 Ethernet controller: Intel Corporation: Unknown device 105e (rev 06)
- (E1000_DEV_ID_82571EB_COPPER)

06:00.1 Ethernet controller: Intel Corporation: Unknown device 105e (rev 06)
- (E1000_DEV_ID_82571EB_COPPER)

07:00.0 Ethernet controller: Intel Corporation: Unknown device 105e (rev 06)
- (E1000_DEV_ID_82571EB_COPPER)

07:00.1 Ethernet controller: Intel Corporation: Unknown device 105e (rev 06)
- (E1000_DEV_ID_82571EB_COPPER)

08:00.0 Ethernet controller: Intel Corporation: Unknown device 105e (rev 06)
- (E1000_DEV_ID_82571EB_COPPER)

08:00.1 Ethernet controller: Intel Corporation: Unknown device 105e (rev 06)
- (E1000_DEV_ID_82571EB_COPPER)

0d:00.0 Ethernet controller: Intel Corporation: Unknown device 1096 (rev 01)
- (E1000_DEV_ID_80003ES2LAN_COPPER_DPT)

0d:00.1 Ethernet controller: Intel Corporation: Unknown device 1096 (rev 01)
- (E1000_DEV_ID_80003ES2LAN_COPPER_DPT)

0f:00.0 Ethernet controller: Intel Corporation: Unknown device 105f (rev 06)
- (E1000_DEV_ID_82571EB_FIBER)

0f:00.1 Ethernet controller: Intel Corporation: Unknown device 105f (rev 06)
- (E1000_DEV_ID_82571EB_FIBER)

 

                modprobe e1000e logs

 

Feb 10 13:11:58 1265825518 kernel: e1000e: Intel(R) PRO/1000 Network Driver
- 1.0.15-NAPI

Feb 10 13:11:58 1265825518 kernel: e1000e: Copyright(c) 1999 - 2009 Intel
Corporation.

Feb 10 13:11:58 1265825518 kernel: 0000:0f:00.0: eth0: (PCI
Express:2.5GB/s:Width x4) 00:90:fb:1c:b9:fe

Feb 10 13:11:58 1265825518 kernel: 0000:0f:00.0: eth0: Intel(R) PRO/1000
Network Connection

Feb 10 13:11:58 1265825518 kernel: 0000:0f:00.0: eth0: MAC: 1, PHY: 1, PBA
No: c85839-002

Feb 10 13:11:58 1265825518 kernel: 0000:0f:00.1: eth1: (PCI
Express:2.5GB/s:Width x4) 00:90:fb:1c:b9:ff

Feb 10 13:11:58 1265825518 kernel: 0000:0f:00.1: eth1: Intel(R) PRO/1000
Network Connection

Feb 10 13:11:58 1265825518 kernel: 0000:0f:00.1: eth1: MAC: 1, PHY: 1, PBA
No: c85839-002

Feb 10 13:11:58 1265825518 kernel: 0000:0d:00.0: eth2: (PCI
Express:2.5GB/s:Width x4) 00:90:fb:26:48:c8

Feb 10 13:11:58 1265825518 kernel: 0000:0d:00.0: eth2: Intel(R) PRO/1000
Network Connection

Feb 10 13:11:58 1265825518 kernel: 0000:0d:00.0: eth2: MAC: 6, PHY: 5, PBA
No: ffffff-0ff

Feb 10 13:11:58 1265825518 kernel: 0000:0d:00.1: eth3: (PCI
Express:2.5GB/s:Width x4) 00:90:fb:26:48:c9

Feb 10 13:11:58 1265825518 kernel: 0000:0d:00.1: eth3: Intel(R) PRO/1000
Network Connection

Feb 10 13:11:58 1265825518 kernel: 0000:0d:00.1: eth3: MAC: 6, PHY: 5, PBA
No: ffffff-0ff

Feb 10 13:11:58 1265825518 kernel: 0000:08:00.0: eth4: (PCI
Express:2.5GB/s:Width x4) 00:90:fb:25:e5:60

Feb 10 13:11:58 1265825518 kernel: 0000:08:00.0: eth4: Intel(R) PRO/1000
Network Connection

Feb 10 13:11:58 1265825518 kernel: 0000:08:00.0: eth4: MAC: 1, PHY: 4, PBA
No: 484020-010

Feb 10 13:11:58 1265825518 kernel: 0000:08:00.1: eth5: (PCI
Express:2.5GB/s:Width x4) 00:90:fb:25:e5:61

Feb 10 13:11:58 1265825518 kernel: 0000:08:00.1: eth5: Intel(R) PRO/1000
Network Connection

Feb 10 13:11:58 1265825518 kernel: 0000:08:00.1: eth5: MAC: 1, PHY: 4, PBA
No: 484020-010

Feb 10 13:11:58 1265825518 kernel: 0000:07:00.0: eth6: (PCI
Express:2.5GB/s:Width x4) 00:90:fb:25:e5:62

Feb 10 13:11:58 1265825518 kernel: 0000:07:00.0: eth6: Intel(R) PRO/1000
Network Connection

Feb 10 13:11:58 1265825518 kernel: 0000:07:00.0: eth6: MAC: 1, PHY: 4, PBA
No: 484020-011

Feb 10 13:11:58 1265825518 kernel: 0000:07:00.1: eth7: (PCI
Express:2.5GB/s:Width x4) 00:90:fb:25:e5:63

Feb 10 13:11:58 1265825518 kernel: 0000:07:00.1: eth7: Intel(R) PRO/1000
Network Connection

Feb 10 13:11:58 1265825518 kernel: 0000:07:00.1: eth7: MAC: 1, PHY: 4, PBA
No: 484020-011

Feb 10 13:11:58 1265825518 kernel: 0000:06:00.0: eth8: (PCI
Express:2.5GB/s:Width x4) 00:90:fb:25:5a:18

Feb 10 13:11:58 1265825518 kernel: 0000:06:00.0: eth8: Intel(R) PRO/1000
Network Connection

Feb 10 13:11:58 1265825518 kernel: 0000:06:00.0: eth8: MAC: 1, PHY: 4, PBA
No: c83246-002

Feb 10 13:11:58 1265825518 kernel: 0000:06:00.1: eth9: (PCI
Express:2.5GB/s:Width x4) 00:90:fb:25:5a:19

Feb 10 13:11:58 1265825518 kernel: 0000:06:00.1: eth9: Intel(R) PRO/1000
Network Connection

Feb 10 13:11:58 1265825518 kernel: 0000:06:00.1: eth9: MAC: 1, PHY: 4, PBA
No: c83246-002

Feb 10 13:11:58 1265825518 kernel: 0000:05:00.0: eth10: (PCI
Express:2.5GB/s:Width x4) 00:90:fb:25:5a:1a

Feb 10 13:11:58 1265825518 kernel: 0000:05:00.0: eth10: Intel(R) PRO/1000
Network Connection

Feb 10 13:11:58 1265825518 kernel: 0000:05:00.0: eth10: MAC: 1, PHY: 4, PBA
No: c83246-002

Feb 10 13:11:58 1265825518 kernel: 0000:05:00.1: eth11: (PCI
Express:2.5GB/s:Width x4) 00:90:fb:25:5a:1b

Feb 10 13:11:58 1265825518 kernel: 0000:05:00.1: eth11: Intel(R) PRO/1000
Network Connection

Feb 10 13:11:58 1265825518 kernel: 0000:05:00.1: eth11: MAC: 1, PHY: 4, PBA
No: c83246-002

 

 

                ethtool -g eth4

                

                                Ring parameters for eth4:

Pre-set maximums:

RX:             4096

RX Mini:        0

RX Jumbo:       0

TX:             4096

Current hardware settings:

RX:             2048

RX Mini:        0

RX Jumbo:       0

TX:             2048

 

                ethtool -k eth4

Offload parameters for eth4:

Cannot get device udp large send offload settings: Operation not supported

Cannot get device generic segmentation offload settings: Operation not
supported

rx-checksumming: on

tx-checksumming: on

scatter-gather: on

tcp segmentation offload: on

udp fragmentation offload: off

generic segmentation offload: off

 

                System Info:

                                Running kernel - 2.6.16.-13-1

                                Openswan - 2.4.9 with klips

                                cat /proc/interrupts

           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
  0:   40087329        273        274        274        274        273        273        241  IO-APIC-edge   timer
  2:          0          0          0          0          0          0          0          0  XT-PIC         cascade
  4:         10          0          0          0          1          0          0          0  IO-APIC-edge   serial
  8:       3393          1          0          0          0          0          0          0  IO-APIC-edge   rtc
 66:         63          0          0      80096          0          0          0          0  PCI-MSI        eth0
 74:         63          0          0      80096          0          0          0          0  PCI-MSI        eth1
 82:      80158          0          0          0          0          0          0          0  PCI-MSI        eth2
 90:      80158          0          0          0          0          0          0          0  PCI-MSI        eth3
 98:        256          0    5594913          0  168731027          0          0          0  PCI-MSI        eth4
106:        130          0    6517103          0          0  255948447          0          0  PCI-MSI        eth5
114:         64          0     100789          0          0          0          0          0  PCI-MSI        eth6
122:         68          0      87466          0          0          0          0          0  PCI-MSI        eth7
130:        252          0          0     466626          0          0          0          0  PCI-MSI        eth8
138:      30033          0          0    4989635          0          0          0          0  PCI-MSI        eth9
146:         62          0          0      80096          0          0          0          0  PCI-MSI        eth10
153:     557669          0          1          0          0          0          0          0  IO-APIC-level  libata
154:         62          0          0      80096          0          0          0          0  PCI-MSI        eth11
NMI:          0          0          0          0          0          0          0          0
LOC:   40086777   40087580   40087468   40087495   40083411   40083410   40086663   40086021
ERR:          0
MIS:          0

 

                This machine is an IPsec gateway, and we are using Openswan
2.4.9 with KLIPS for VPN.

                The likely trigger for this hang is a fragmented UDP packet
with a data size of 32560 bytes, coming in or going out on eth4 over the VPN
tunnel (eth4 <-> ipsec0 <-> eth5).

                Without the VPN tunnel, I do not observe the hangs with UDP
packets of the same size.
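
To make the suspected trigger concrete, the following sketch estimates how many IPv4 fragments such a datagram produces and provides a small sender for reproduction. It assumes that "datasize" means the UDP payload, that the MTU is 1500 bytes, and that the IPv4 header is 20 bytes with no options; IPsec overhead inside the tunnel is not modelled. The function names, the destination address, and the port are placeholders, not values from this setup.

```python
import socket


def ipv4_fragments(udp_payload, mtu=1500, ip_header=20, udp_header=8):
    """Return (fragment_count, ip_payload_bytes_in_last_fragment).

    mtu and the header sizes are assumptions; tunnel overhead is ignored.
    """
    # Every fragment except the last must carry a multiple of 8 payload bytes.
    per_fragment = (mtu - ip_header) // 8 * 8
    total = udp_payload + udp_header
    count = -(-total // per_fragment)  # ceiling division
    last = total - (count - 1) * per_fragment
    return count, last


def send_datagram(host, port, size=32560):
    """Send a single UDP datagram of `size` zero bytes; returns bytes sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(b"\x00" * size, (host, port))


if __name__ == "__main__":
    print(ipv4_fragments(32560))
    # To reproduce, call e.g. send_datagram("<peer behind the tunnel>", 9)
    # with a real peer address; the host and port here are placeholders.
```

Under these assumptions a 32560-byte payload splits into 23 fragments, the last of which carries only 8 bytes of IP payload.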

                Let me know if you need more information on this.

 

Rgds,

Nishit Shah.        

 

_______________________________________________
E1000-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit
http://communities.intel.com/community/wired
