Hi Peter,

Thanks for your immediate response. The second dump is from driver version
2.0.44.x. The first one is reproducible with both 2.0.44.x and 2.0.72.4-NAPI.

Thanks and best regards

     Wolfgang

-----Original Message-----
From: Waskiewicz Jr, Peter P [mailto:[email protected]]
Sent: Tue 6/8/2010 6:19 PM
To: Deuringer, Wolfgang [NETPWR/EMBCO/DE]
Cc: [email protected]
Subject: RE: ixgbe: schedule while atomic bug 
 
Hi Wolfgang,

 

I'm no longer the primary contact for ixgbe.  I've copied our engineering
team for better coverage.

 

Basically I think the first operation when enabling IP forwarding is
faulting due to a race when disabling LRO.  Which driver version are you
using for the bug you're reproducing?  Is it also 2.0.44.x?  I don't know
why the change in the kernel pre-emption model fixes this, but I'm wondering
if the change is helping serialize some of our tasklets in the driver.  I
know we recently fixed some races with some of our tasklets, but I don't
know if that driver has been released yet.

 

PJ Waskiewicz

[email protected]

 

From: [email protected] [mailto:[email protected]]

Sent: Tuesday, June 08, 2010 5:06 AM
To: Waskiewicz Jr, Peter P
Subject: ixgbe: schedule while atomic bug 

 

Hi Peter,

I have seen a reproducible bug in the ixgbe driver when I execute the
command "r...@atca7360:~# echo 1 > /proc/sys/net/ipv4/ip_forward".

The bug is as follows:

Jan 21 12:59:40 localhost BUG: scheduling while atomic: bash/6136/0x00000002
Jan 21 12:59:40 localhost Modules linked in: x_tables ip6_tables ip_tables sctp binfmt_misc boardctrl pram atca7xxx_sfmem softdog gen_probe cfi_probe cfi_util cfi_cmdset_0001
Jan 21 12:59:40 localhost Pid: 6136, comm: bash Not tainted 2.6.27.39-grsec #1
Jan 21 12:59:40 localhost
Jan 21 12:59:40 localhost Call Trace:
Jan 21 12:59:40 localhost [<ffffffff807088dc>] schedule+0xdc/0x31d
Jan 21 12:59:40 localhost [<ffffffff802011c5>] __switch_to+0x215/0x3c0
Jan 21 12:59:40 localhost [<ffffffff8023bc14>] lock_timer_base+0x34/0x70
Jan 21 12:59:40 localhost [<ffffffff8023be47>] __mod_timer+0xc7/0xe0
Jan 21 12:59:40 localhost [<ffffffff80709389>] schedule_timeout+0x79/0xf0
Jan 21 12:59:40 localhost [<ffffffff8023ba80>] process_timeout+0x0/0x80
Jan 21 12:59:40 localhost [<ffffffff8023c00d>] msleep+0x1d/0x40
Jan 21 12:59:40 localhost [<ffffffff804e6e46>] ixgbe_setup_mac_link_smartspeed+0xa6/0x1b0
Jan 21 12:59:40 localhost [<ffffffff804d7fc0>] ixgbe_up_complete+0x840/0xd20
Jan 21 12:59:40 localhost [<ffffffff804da550>] ixgbe_reinit_locked+0x80/0xb0
Jan 21 12:59:40 localhost [<ffffffff804df9f0>] ixgbe_set_flags+0xe0/0xf0
Jan 21 12:59:40 localhost [<ffffffff8061ed0d>] dev_disable_lro+0x5d/0x80
Jan 21 12:59:40 localhost [<ffffffff806725e7>] devinet_sysctl_forward+0x187/0x1c0
Jan 21 12:59:40 localhost [<ffffffff806ec84e>] net_ctl_permissions+0xe/0x40
Jan 21 12:59:40 localhost [<ffffffff8030e481>] proc_sys_call_handler+0xe1/0xf0
Jan 21 12:59:40 localhost [<ffffffff802b5d1b>] vfs_write+0xcb/0x170
Jan 21 12:59:40 localhost [<ffffffff802b5f64>] sys_write+0x64/0x130
Jan 21 12:59:40 localhost [<ffffffff80202c3b>] system_call_done+0x0/0x5
Jan 21 12:59:40 localhost
Jan 21 12:59:40 localhost ixgbe: fabric1: ixgbe_watchdog_task: NIC Link is Up 10 Gbps, Flow Control: RX/TX
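
The trailing hex value in the "scheduling while atomic" line is the task's
preempt_count at the moment schedule() was entered. The following is a minimal
sketch for decoding it, assuming the usual 2.6-era bit layout (low 8 bits
preemption depth, next 8 bits softirq count, hardirq count above that). The
field widths, the function name decode_preempt_count and the dictionary keys
are assumptions made for illustration, not something stated in this thread.

```python
import re

# Assumed bit layout (2.6-era include/linux/hardirq.h); widths may differ
# between kernel versions and configurations.
PREEMPT_BITS = 8
SOFTIRQ_BITS = 8
HARDIRQ_BITS = 12  # assumption: default width


def decode_preempt_count(bug_line):
    """Split the preempt_count from a 'scheduling while atomic' line into fields."""
    m = re.search(r"scheduling while atomic: (\S+)/(\d+)/0x([0-9a-fA-F]+)", bug_line)
    if m is None:
        raise ValueError("not a 'scheduling while atomic' line")
    count = int(m.group(3), 16)
    softirq_shift = PREEMPT_BITS
    hardirq_shift = PREEMPT_BITS + SOFTIRQ_BITS
    return {
        "comm": m.group(1),
        "pid": int(m.group(2)),
        "preempt_depth": count & ((1 << PREEMPT_BITS) - 1),
        "softirq_count": (count >> softirq_shift) & ((1 << SOFTIRQ_BITS) - 1),
        "hardirq_count": (count >> hardirq_shift) & ((1 << HARDIRQ_BITS) - 1),
    }


if __name__ == "__main__":
    line = "Jan 21 12:59:40 localhost BUG: scheduling while atomic: bash/6136/0x00000002"
    print(decode_preempt_count(line))
```

Under these assumptions, 0x00000002 decodes to a non-zero preemption depth with
zero softirq and hardirq counts, which would suggest that msleep() was reached
with preemption disabled (for example under a spinlock) rather than from
interrupt context. Kernels built without CONFIG_PREEMPT generally do not track
the preemption depth for spinlocks, which may be why the message disappears
with voluntary preemption; that is an inference, not something confirmed here.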

Some system details:

r...@atca7360:~# uname -a

Linux atca7360.centellis_2k.com 2.6.27.39-grsec #1 SMP PREEMPT Mon Jun 7 12:25:28 CEST 2010 x86_64 x86_64 x86_64 GNU/Linux

Driver details:

r...@atca7360:~# ethtool -i fabric1

driver: ixgbe

version: 2.0.72.4-NAPI

firmware-version: 2.177-15

bus-info: 0000:04:00.0

Kernel configuration:

...

CONFIG_X86_SMP=y

CONFIG_X86_64_SMP=y

CONFIG_X86_HT=y

...

CONFIG_NUMA=y

...

# CONFIG_PREEMPT_NONE is not set

# CONFIG_PREEMPT_VOLUNTARY is not set

CONFIG_PREEMPT=y

# CONFIG_PREEMPT_RCU is not set

# CONFIG_PREEMPT_SOFTIRQS is not set

# CONFIG_PREEMPT_HARDIRQS is not set

# CONFIG_PREEMPT_TRACER is not set

...

# CONFIG_IP1000 is not set

CONFIG_IGB=y

CONFIG_IGB_LRO=y

# CONFIG_NS83820 is not set

# CONFIG_CHELSIO_T3 is not set

CONFIG_IXGBE=y

According to
http://kerneltrap.org/mailarchive/linux-netdev/2009/7/16/6213453/thread,
switching the kernel preemption model to "Voluntary" solves the problem. I
have changed my kernel to voluntary preemption as well, and this solves the
problem here, too. Could you please tell me why this problem occurs?

Will there be a driver update in the near future which solves the problem?
Or do I need to stick to "preempt voluntary"?

At the customer site we see a similar crash with ixgbe driver 2.0.44. I hope
that this has the same root cause.

Jun  7 19:56:13 localhost BUG: scheduling while atomic: bash/4429/0x00000002
Jun  7 19:56:13 localhost Modules linked in: ixgbe x_tables ip6_tables ip_tables sctp binfmt_misc boardctrl softdog
Jun  7 19:56:13 localhost Pid: 4429, comm: bash Not tainted 2.6.27.39-grsec #1
Jun  7 19:56:13 localhost
Jun  7 19:56:13 localhost Call Trace:
Jun  7 19:56:13 localhost [<ffffffff806eedd4>] schedule+0xd4/0x305
Jun  7 19:56:13 localhost [<ffffffff802824c5>] __alloc_pages_internal+0xe5/0x590
Jun  7 19:56:13 localhost [<ffffffff8023be34>] lock_timer_base+0x34/0x70
Jun  7 19:56:13 localhost [<ffffffff8023c067>] __mod_timer+0xc7/0xe0
Jun  7 19:56:13 localhost [<ffffffff806ef879>] schedule_timeout+0x79/0xf0
Jun  7 19:56:13 localhost [<ffffffff8023bca0>] process_timeout+0x0/0x80
Jun  7 19:56:13 localhost [<ffffffff8023c22d>] msleep+0x1d/0x40
Jun  7 19:56:13 localhost [<ffffffffa0009549>] ixgbe_down+0xe9/0x3d0 [ixgbe]
Jun  7 19:56:13 localhost [<ffffffffa000c038>] ixgbe_reinit_locked+0x78/0xb0 [ixgbe]
Jun  7 19:56:13 localhost [<ffffffffa0010666>] ixgbe_set_flags+0x56/0xb0 [ixgbe]
Jun  7 19:56:13 localhost [<ffffffff80605cfd>] dev_disable_lro+0x5d/0x80
Jun  7 19:56:13 localhost [<ffffffff80658ba7>] devinet_sysctl_forward+0x187/0x1c0
Jun  7 19:56:13 localhost [<ffffffff806d2d0e>] net_ctl_permissions+0xe/0x40
Jun  7 19:56:13 localhost [<ffffffff8030f511>] proc_sys_call_handler+0xe1/0xf0
Jun  7 19:56:13 localhost [<ffffffff802b6dbb>] vfs_write+0xcb/0x170
Jun  7 19:56:13 localhost [<ffffffff802b7004>] sys_write+0x64/0x130
Jun  7 19:56:13 localhost [<ffffffff80202c3b>] system_call_done+0x0/0x5
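
To check whether the two dumps plausibly share a root cause, the call chains
can be compared mechanically. The sketch below extracts the symbol names from
each trace, reports which function called msleep(), and lists the frames both
traces have in common at the outer end. The helper names (frames, caller_of,
common_tail) are hypothetical, and the two sample strings are abridged
excerpts of the dumps in this mail.

```python
import re

# Matches one frame such as "[<ffffffff8023c00d>] msleep+0x1d/0x40"
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+([A-Za-z_]\w*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def frames(trace_text):
    """Return the symbol names of a call trace, innermost frame first."""
    return FRAME_RE.findall(trace_text)


def caller_of(trace_text, callee="msleep"):
    """Return the frame listed directly after `callee`, or None if absent."""
    names = frames(trace_text)
    if callee not in names:
        return None
    idx = names.index(callee)
    return names[idx + 1] if idx + 1 < len(names) else None


def common_tail(trace_a, trace_b):
    """Return the outermost frames shared by both traces, innermost first."""
    fa, fb = frames(trace_a), frames(trace_b)
    tail = []
    while fa and fb and fa[-1] == fb[-1]:
        tail.append(fa.pop())
        fb.pop()
    return tail[::-1]


# Abridged excerpts of the two dumps (msleep down to dev_disable_lro).
FIRST = """
[<ffffffff8023c00d>] msleep+0x1d/0x40
[<ffffffff804e6e46>] ixgbe_setup_mac_link_smartspeed+0xa6/0x1b0
[<ffffffff804d7fc0>] ixgbe_up_complete+0x840/0xd20
[<ffffffff804da550>] ixgbe_reinit_locked+0x80/0xb0
[<ffffffff804df9f0>] ixgbe_set_flags+0xe0/0xf0
[<ffffffff8061ed0d>] dev_disable_lro+0x5d/0x80
"""

SECOND = """
[<ffffffff8023c22d>] msleep+0x1d/0x40
[<ffffffffa0009549>] ixgbe_down+0xe9/0x3d0 [ixgbe]
[<ffffffffa000c038>] ixgbe_reinit_locked+0x78/0xb0 [ixgbe]
[<ffffffffa0010666>] ixgbe_set_flags+0x56/0xb0 [ixgbe]
[<ffffffff80605cfd>] dev_disable_lro+0x5d/0x80
"""

if __name__ == "__main__":
    print(caller_of(FIRST), caller_of(SECOND))
    print(common_tail(FIRST, SECOND))
```

On these excerpts, msleep() is reached from different ixgbe functions in the
two dumps, but both go through the same dev_disable_lro, ixgbe_set_flags and
ixgbe_reinit_locked path. That is consistent with a common cause in the
LRO-disable path, although it does not prove one.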

Thanks and best regards

        Wolfgang 

-----------------------------

Wolfgang Deuringer 

Embedded Computing GmbH

Emerson Network Power

Tel. +49 (0)89 9608-2228 

[email protected]

http://www.emersonnetworkpower.com/embeddedcomputing

Emerson Network Power - Embedded Computing GmbH, Lilienthalstr. 15, D- 85579
Neubiberg/Landkreis Muenchen, Deutschland /Germany. Geschaeftsfuehrer Kai
Holz, Amtsgericht Muenchen, HRB 171431 / VAT/USt.-ID: DE 127472241


_______________________________________________
E1000-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit
http://communities.intel.com/community/wired
