Hi Peter,
thanks for your quick response. The second dump is from driver version
2.0.44.x. The first one is reproducible with both 2.0.44.x and 2.0.72.4-NAPI.
Thanks and best regards
Wolfgang
-----Original Message-----
From: Waskiewicz Jr, Peter P [mailto:[email protected]]
Sent: Tue 6/8/2010 6:19 PM
To: Deuringer, Wolfgang [NETPWR/EMBCO/DE]
Cc: [email protected]
Subject: RE: ixgbe: schedule while atomic bug
Hi Wolfgang,
I'm no longer the primary contact for ixgbe. I've copied our engineering
team for better coverage.
Basically I think the first operation when enabling IP forwarding is
faulting due to a race when disabling LRO. Which driver version are you
using for the bug you're reproducing? Is it also 2.0.44.x? I don't know
why the change in the kernel pre-emption model fixes this, but I'm wondering
if the change is helping serialize some of our tasklets in the driver. I
know we recently fixed some races with some of our tasklets, but I don't
know if that driver has been released yet.
PJ Waskiewicz
[email protected]
From: [email protected] [mailto:[email protected]]
Sent: Tuesday, June 08, 2010 5:06 AM
To: Waskiewicz Jr, Peter P
Subject: ixgbe: schedule while atomic bug
Hi Peter,
I have seen a reproducible bug in the ixgbe driver when I execute the
command "r...@atca7360:~# echo 1 > /proc/sys/net/ipv4/ip_forward".
The bug output is as follows:
Jan 21 12:59:40 localhost BUG: scheduling while atomic: bash/6136/0x00000002
Jan 21 12:59:40 localhost Modules linked in: x_tables ip6_tables ip_tables sctp binfmt_misc boardctrl pram atca7xxx_sfmem softdog gen_probe cfi_probe cfi_util cfi_cmdset_0001
Jan 21 12:59:40 localhost Pid: 6136, comm: bash Not tainted 2.6.27.39-grsec #1
Jan 21 12:59:40 localhost
Jan 21 12:59:40 localhost Call Trace:
Jan 21 12:59:40 localhost [<ffffffff807088dc>] schedule+0xdc/0x31d
Jan 21 12:59:40 localhost [<ffffffff802011c5>] __switch_to+0x215/0x3c0
Jan 21 12:59:40 localhost [<ffffffff8023bc14>] lock_timer_base+0x34/0x70
Jan 21 12:59:40 localhost [<ffffffff8023be47>] __mod_timer+0xc7/0xe0
Jan 21 12:59:40 localhost [<ffffffff80709389>] schedule_timeout+0x79/0xf0
Jan 21 12:59:40 localhost [<ffffffff8023ba80>] process_timeout+0x0/0x80
Jan 21 12:59:40 localhost [<ffffffff8023c00d>] msleep+0x1d/0x40
Jan 21 12:59:40 localhost [<ffffffff804e6e46>] ixgbe_setup_mac_link_smartspeed+0xa6/0x1b0
Jan 21 12:59:40 localhost [<ffffffff804d7fc0>] ixgbe_up_complete+0x840/0xd20
Jan 21 12:59:40 localhost [<ffffffff804da550>] ixgbe_reinit_locked+0x80/0xb0
Jan 21 12:59:40 localhost [<ffffffff804df9f0>] ixgbe_set_flags+0xe0/0xf0
Jan 21 12:59:40 localhost [<ffffffff8061ed0d>] dev_disable_lro+0x5d/0x80
Jan 21 12:59:40 localhost [<ffffffff806725e7>] devinet_sysctl_forward+0x187/0x1c0
Jan 21 12:59:40 localhost [<ffffffff806ec84e>] net_ctl_permissions+0xe/0x40
Jan 21 12:59:40 localhost [<ffffffff8030e481>] proc_sys_call_handler+0xe1/0xf0
Jan 21 12:59:40 localhost [<ffffffff802b5d1b>] vfs_write+0xcb/0x170
Jan 21 12:59:40 localhost [<ffffffff802b5f64>] sys_write+0x64/0x130
Jan 21 12:59:40 localhost [<ffffffff80202c3b>] system_call_done+0x0/0x5
Jan 21 12:59:40 localhost
Jan 21 12:59:40 localhost ixgbe: fabric1: ixgbe_watchdog_task: NIC Link is Up 10 Gbps, Flow Control: RX/TX
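For reading the first line of such a report, here is a minimal Python sketch that splits the "comm/pid/preempt_count" triple and decodes the low fields of the count. It assumes the bit layout commonly documented for 2.6-era kernels (low byte = preemption-disable nesting, next byte = softirq nesting); the function name decode_bug_line and the mask constants are illustrative, not taken from the driver or the kernel tree.

```python
import re

# Assumed preempt_count layout for 2.6-era kernels; verify against
# include/linux/hardirq.h in the tree you are actually running.
PREEMPT_MASK = 0x000000FF   # nesting depth of preemption-disable sections
SOFTIRQ_MASK = 0x0000FF00   # softirq / bottom-half-disable nesting

BUG_RE = re.compile(
    r"BUG: scheduling while atomic: "
    r"(?P<comm>\S+)/(?P<pid>\d+)/(?P<count>0x[0-9a-fA-F]+)"
)


def decode_bug_line(line):
    """Return comm, pid and the low preempt_count fields, or None if no match."""
    m = BUG_RE.search(line)
    if m is None:
        return None
    count = int(m.group("count"), 16)
    return {
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
        "preempt_count": count,
        "preempt_depth": count & PREEMPT_MASK,
        "softirq_count": (count & SOFTIRQ_MASK) >> 8,
    }


if __name__ == "__main__":
    sample = ("Jan 21 12:59:40 localhost BUG: scheduling while atomic: "
              "bash/6136/0x00000002")
    print(decode_bug_line(sample))
```

For the line reported here this yields a preemption depth of 2 and a softirq count of 0, which would be consistent with a sleep attempted from process context while preemption is disabled (for example under a spinlock), though the log alone does not show which lock or section is involved.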
Some system details:
r...@atca7360:~# uname -a
Linux atca7360.centellis_2k.com 2.6.27.39-grsec #1 SMP PREEMPT Mon Jun 7 12:25:28 CEST 2010 x86_64 x86_64 x86_64 GNU/Linux
Driver details:
r...@atca7360:~# ethtool -i fabric1
driver: ixgbe
version: 2.0.72.4-NAPI
firmware-version: 2.177-15
bus-info: 0000:04:00.0
Kernel configuration:
...
CONFIG_X86_SMP=y
CONFIG_X86_64_SMP=y
CONFIG_X86_HT=y
...
CONFIG_NUMA=y
...
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
# CONFIG_PREEMPT_RCU is not set
# CONFIG_PREEMPT_SOFTIRQS is not set
# CONFIG_PREEMPT_HARDIRQS is not set
# CONFIG_PREEMPT_TRACER is not set
...
# CONFIG_IP1000 is not set
CONFIG_IGB=y
CONFIG_IGB_LRO=y
# CONFIG_NS83820 is not set
# CONFIG_CHELSIO_T3 is not set
CONFIG_IXGBE=y
The thread at
http://kerneltrap.org/mailarchive/linux-netdev/2009/7/16/6213453/thread
says that switching the preemption model to voluntary solves the problem. I
have changed my kernel to voluntary preemption as well, and this solves the
problem here, too. Could you please tell me why this problem occurs?
Will there be a driver update in the near future that solves the problem?
Or do I need to stick to "preempt voluntary"?
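To confirm which preemption model a given kernel was built with, the config excerpt above can be checked mechanically. The following is a small sketch in Python; the function name preempt_model is made up for illustration, and it only looks at the three options quoted in this mail (CONFIG_PREEMPT_NONE, CONFIG_PREEMPT_VOLUNTARY, CONFIG_PREEMPT).

```python
def preempt_model(config_text):
    """Return the name of the enabled preemption option, or None if none is set.

    Only the three mutually exclusive options quoted in this mail are checked.
    """
    candidates = (
        "CONFIG_PREEMPT_NONE",
        "CONFIG_PREEMPT_VOLUNTARY",
        "CONFIG_PREEMPT",
    )
    for line in config_text.splitlines():
        line = line.strip()
        # "# CONFIG_FOO is not set" lines and blanks carry no enabled option.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key in candidates and value == "y":
            return key
    return None


if __name__ == "__main__":
    sample = """\
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
# CONFIG_PREEMPT_RCU is not set
"""
    print(preempt_model(sample))
```

One caveat when comparing the two builds: on kernels of this generation built without CONFIG_PREEMPT, preempt_disable() compiles to nothing, so the "scheduling while atomic" check may simply stop firing. Whether that is what happens here, rather than the race itself going away, is an assumption and is not confirmed anywhere in this thread.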
At the customer site we see a similar crash with ixgbe driver 2.0.44. I hope
that it has the same root cause:
Jun 7 19:56:13 localhost BUG: scheduling while atomic: bash/4429/0x00000002
Jun 7 19:56:13 localhost Modules linked in: ixgbe x_tables ip6_tables ip_tables sctp binfmt_misc boardctrl softdog
Jun 7 19:56:13 localhost Pid: 4429, comm: bash Not tainted 2.6.27.39-grsec #1
Jun 7 19:56:13 localhost
Jun 7 19:56:13 localhost Call Trace:
Jun 7 19:56:13 localhost [<ffffffff806eedd4>] schedule+0xd4/0x305
Jun 7 19:56:13 localhost [<ffffffff802824c5>] __alloc_pages_internal+0xe5/0x590
Jun 7 19:56:13 localhost [<ffffffff8023be34>] lock_timer_base+0x34/0x70
Jun 7 19:56:13 localhost [<ffffffff8023c067>] __mod_timer+0xc7/0xe0
Jun 7 19:56:13 localhost [<ffffffff806ef879>] schedule_timeout+0x79/0xf0
Jun 7 19:56:13 localhost [<ffffffff8023bca0>] process_timeout+0x0/0x80
Jun 7 19:56:13 localhost [<ffffffff8023c22d>] msleep+0x1d/0x40
Jun 7 19:56:13 localhost [<ffffffffa0009549>] ixgbe_down+0xe9/0x3d0 [ixgbe]
Jun 7 19:56:13 localhost [<ffffffffa000c038>] ixgbe_reinit_locked+0x78/0xb0 [ixgbe]
Jun 7 19:56:13 localhost [<ffffffffa0010666>] ixgbe_set_flags+0x56/0xb0 [ixgbe]
Jun 7 19:56:13 localhost [<ffffffff80605cfd>] dev_disable_lro+0x5d/0x80
Jun 7 19:56:13 localhost [<ffffffff80658ba7>] devinet_sysctl_forward+0x187/0x1c0
Jun 7 19:56:13 localhost [<ffffffff806d2d0e>] net_ctl_permissions+0xe/0x40
Jun 7 19:56:13 localhost [<ffffffff8030f511>] proc_sys_call_handler+0xe1/0xf0
Jun 7 19:56:13 localhost [<ffffffff802b6dbb>] vfs_write+0xcb/0x170
Jun 7 19:56:13 localhost [<ffffffff802b7004>] sys_write+0x64/0x130
Jun 7 19:56:13 localhost [<ffffffff80202c3b>] system_call_done+0x0/0x5
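Both traces reach msleep() by way of dev_disable_lro() and ixgbe_reinit_locked(), and differ only in which driver function calls msleep(). A rough Python sketch for pulling that out of a pasted trace is shown below. The helper names frame_symbols and caller_of are hypothetical; the pattern tolerates entries whose symbol ended up on the line after its address, and it takes the trace at face value even though entries in traces like these can be stale.

```python
import re

# An address in [<...>] followed (possibly after a line break) by symbol+off/size.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def frame_symbols(trace_text):
    """Return the symbol names of all frames, in the order they are printed."""
    return [m.group(1) for m in FRAME_RE.finditer(trace_text)]


def caller_of(symbols, callee):
    """Return the frame printed right after `callee` (its caller in a
    top-down trace), or None if `callee` is absent or is the last frame."""
    for i, sym in enumerate(symbols[:-1]):
        if sym == callee:
            return symbols[i + 1]
    return None


if __name__ == "__main__":
    sample = """\
Jun 7 19:56:13 localhost [<ffffffff806ef879>] schedule_timeout+0x79/0xf0
Jun 7 19:56:13 localhost [<ffffffff8023c22d>] msleep+0x1d/0x40
Jun 7 19:56:13 localhost [<ffffffffa0009549>] ixgbe_down+0xe9/0x3d0 [ixgbe]
Jun 7 19:56:13 localhost [<ffffffffa000c038>] ixgbe_reinit_locked+0x78/0xb0
[ixgbe]
"""
    print(caller_of(frame_symbols(sample), "msleep"))
```

Applied to the two reports in this mail, the caller of msleep() comes out as ixgbe_setup_mac_link_smartspeed in the first trace and ixgbe_down in the second.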
Thanks and best regards
Wolfgang
-----------------------------
Wolfgang Deuringer
Embedded Computing GmbH
Emerson Network Power
Tel. +49 (0)89 9608-2228
[email protected]
www.emersonnetworkpower.com/embeddedcomputing
Emerson Network Power - Embedded Computing GmbH, Lilienthalstr. 15, D- 85579
Neubiberg/Landkreis Muenchen, Deutschland /Germany. Geschaeftsfuehrer Kai
Holz, Amtsgericht Muenchen, HRB 171431 / VAT/USt.-ID: DE 127472241
_______________________________________________
E1000-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit
http://communities.intel.com/community/wired