Re: [ipw3945-devel] 2.6.24-rc5-mm1 -- INFO: possible circular locking dependency detected -- pm-suspend/5800 is trying to acquire lock
On Dec 18, 2007 9:58 PM, Zhu Yi [EMAIL PROTECTED] wrote:
> On Tue, 2007-12-18 at 15:57 +0100, Johannes Berg wrote:
> > Thanks. This is a bug in iwlwifi. The problem is actually another case
> > where my workqueue debugging with lockdep is triggering a warning :))
> >
> > Here's the thing:
> >
> > iwl3945_cancel_deferred_work does
> > cancel_delayed_work_sync(&priv->init_alive_start);
> > (which is the (&(&priv->init_alive_start)->work) lock)
> >
> > but it is called from within a locked section of
> > mutex_lock(&priv->mutex);
> > (locked from iwl3945_pci_suspend)
> >
> > On the other hand, the task that runs from the init_alive_start
> > workqueue is iwl3945_bg_init_alive_start() which will lock the same
> > mutex.
> >
> > So the deadlock condition is that you can be in
> > cancel_delayed_work_sync() above while the mutex is locked, and be
> > waiting for iwl3945_bg_init_alive_start() which tries to lock the mutex.
>
> Thanks for the analysis. Miles, please try the attached patch. I'll send
> a patch for both 3945 and 4965 to linux-wireless later.

I tested it and it looks good here. Thanks!

Miles

--
To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
2.6.24-rc5-mm1 -- INFO: possible circular locking dependency detected -- pm-suspend/5800 is trying to acquire lock
I have only seen this happen once, and cannot reproduce it. I'll keep trying, though.

Dec 16 22:10:48 syntropy kernel: [ 231.718023] ===
Dec 16 22:10:48 syntropy kernel: [ 231.718025] [ INFO: possible circular locking dependency detected ]
Dec 16 22:10:48 syntropy kernel: [ 231.718028] 2.6.24-rc5-mm1 #7
Dec 16 22:10:48 syntropy kernel: [ 231.718029] ---
Dec 16 22:10:48 syntropy kernel: [ 231.718032] pm-suspend/5800 is trying to acquire lock:
Dec 16 22:10:48 syntropy kernel: [ 231.718034] (&(&priv->init_alive_start)->work){--..}, at: [__cancel_work_timer+0x8a/0x17f] __cancel_work_timer+0x8a/0x17f
Dec 16 22:10:48 syntropy kernel: [ 231.718045]
Dec 16 22:10:48 syntropy kernel: [ 231.718046] but task is already holding lock:
Dec 16 22:10:48 syntropy kernel: [ 231.718047] (&priv->mutex){--..}, at: [f8a587e7] iwl3945_pci_suspend+0x1d/0x65 [iwl3945]
Dec 16 22:10:48 syntropy kernel: [ 231.718065]
Dec 16 22:10:48 syntropy kernel: [ 231.718066] which lock already depends on the new lock.
Dec 16 22:10:48 syntropy kernel: [ 231.718067]
Dec 16 22:10:48 syntropy kernel: [ 231.718068]
Dec 16 22:10:48 syntropy kernel: [ 231.718069] the existing dependency chain (in reverse order) is:
Dec 16 22:10:48 syntropy kernel: [ 231.718071]
Dec 16 22:10:48 syntropy kernel: [ 231.718072] -> #1 (&priv->mutex){--..}:
Dec 16 22:10:48 syntropy kernel: [ 231.718075] [__lock_acquire+0xa17/0xbf4] __lock_acquire+0xa17/0xbf4
Dec 16 22:10:48 syntropy kernel: [ 231.718083] [mac80211:lock_acquire+0x76/0x1d8] lock_acquire+0x76/0x9d
Dec 16 22:10:48 syntropy kernel: [ 231.718088] [pcmcia:mutex_lock_nested+0xf7/0xd7d] mutex_lock_nested+0xf7/0x294
Dec 16 22:10:48 syntropy kernel: [ 231.718096] [f8a56ff7] iwl3945_bg_init_alive_start+0x2d/0x1d7 [iwl3945]
Dec 16 22:10:48 syntropy kernel: [ 231.718109] [run_workqueue+0xbb/0x18b] run_workqueue+0xbb/0x18b
Dec 16 22:10:48 syntropy kernel: [ 231.718115] [worker_thread+0xbe/0xcd] worker_thread+0xbe/0xcd
Dec 16 22:10:48 syntropy kernel: [ 231.718121] [kthread+0x3b/0x61] kthread+0x3b/0x61
Dec 16 22:10:48 syntropy kernel: [ 231.718126] [kernel_thread_helper+0x7/0x10] kernel_thread_helper+0x7/0x10
Dec 16 22:10:48 syntropy kernel: [ 231.718133] [] 0x
Dec 16 22:10:48 syntropy kernel: [ 231.718145]
Dec 16 22:10:48 syntropy kernel: [ 231.718146] -> #0 (&(&priv->init_alive_start)->work){--..}:
Dec 16 22:10:48 syntropy kernel: [ 231.718149] [__lock_acquire+0x93e/0xbf4] __lock_acquire+0x93e/0xbf4
Dec 16 22:10:48 syntropy kernel: [ 231.718155] [mac80211:lock_acquire+0x76/0x1d8] lock_acquire+0x76/0x9d
Dec 16 22:10:48 syntropy kernel: [ 231.718161] [__cancel_work_timer+0xb3/0x17f] __cancel_work_timer+0xb3/0x17f
Dec 16 22:10:48 syntropy kernel: [ 231.718167] [iwl3945:cancel_delayed_work_sync+0xb/0x0d] cancel_delayed_work_sync+0xb/0xd
Dec 16 22:10:48 syntropy kernel: [ 231.718173] [f8a542cb] __iwl3945_down+0x51/0x310 [iwl3945]
Dec 16 22:10:48 syntropy kernel: [ 231.718184] [f8a587f7] iwl3945_pci_suspend+0x2d/0x65 [iwl3945]
Dec 16 22:10:48 syntropy kernel: [ 231.718196] [pci_device_suspend+0x1b/0x4b] pci_device_suspend+0x1b/0x4b
Dec 16 22:10:48 syntropy kernel: [ 231.718203] [device_suspend+0x17e/0x259] device_suspend+0x17e/0x259
Dec 16 22:10:48 syntropy kernel: [ 231.718210] [suspend_devices_and_enter+0x3d/0x138] suspend_devices_and_enter+0x3d/0x138
Dec 16 22:10:48 syntropy kernel: [ 231.718217] [enter_state+0x121/0x17d] enter_state+0x121/0x17d
Dec 16 22:10:48 syntropy kernel: [ 231.718222] [state_store+0x96/0xac] state_store+0x96/0xac
Dec 16 22:10:48 syntropy kernel: [ 231.718228] [kobj_attr_store+0x1a/0x22] kobj_attr_store+0x1a/0x22
Dec 16 22:10:48 syntropy kernel: [ 231.718234] [sysfs_write_file+0xb8/0xe3] sysfs_write_file+0xb8/0xe3
Dec 16 22:10:48 syntropy kernel: [ 231.718242] [vfs_write+0xa4/0x120] vfs_write+0xa4/0x120
Dec 16 22:10:48 syntropy kernel: [ 231.718248] [sys_write+0x3b/0x60] sys_write+0x3b/0x60
Dec 16 22:10:48 syntropy kernel: [ 231.718254] [sysenter_past_esp+0x6b/0xc1] sysenter_past_esp+0x6b/0xc1
Dec 16 22:10:48 syntropy kernel: [ 231.718259] [] 0x
Dec 16 22:10:48 syntropy kernel: [ 231.718269]
Dec 16 22:10:48 syntropy kernel: [ 231.718269] other info that might help us debug this:
Dec 16 22:10:48 syntropy kernel: [ 231.718271]
Dec 16 22:10:48 syntropy kernel: [ 231.718272] 4 locks held by pm-suspend/5800:
Dec 16 22:10:48 syntropy kernel: [ 231.718274] #0: (&buffer->mutex){--..}, at: [sysfs_write_file+0x25/0xe3] sysfs_write_file+0x25/0xe3
Dec 16 22:10:48 syntropy kernel: [ 231.718280] #1: (pm_mutex){--..}, at: [enter_state+0x166/0x17d] enter_state+0x166/0x17d
Dec 16 22:10:48 syntropy kernel: [ 231.718286] #2: (dpm_mtx){--..}, at: [device_suspend+0x2b/0x259] device_suspend+0x2b/0x259
Dec 16 22:10:48 syntropy kernel: [ 231.718291] #3: (&priv->mutex){--..}, at: [f8a587e7] iwl3945_pci_suspend+0x1d
Re: 2.6.24-rc5-mm1 -- INFO: possible circular locking dependency detected -- pm-suspend/5800 is trying to acquire lock
On Tue, 2007-12-18 at 09:03 -0500, Miles Lane wrote:
> I have only seen this happen once, and cannot reproduce it. I'll keep
> trying, though.
>
> Dec 16 22:10:48 syntropy kernel: [ 231.718023] ===

Do you have a version that isn't line-wrapped before I try to unwrap it?

Thanks,
johannes
Re: 2.6.24-rc5-mm1 -- INFO: possible circular locking dependency detected -- pm-suspend/5800 is trying to acquire lock
> Sorry. GMail doesn't support sending unwrapped text, as far as I can
> tell. I will send the log segment to you as an attachment. Also, when I
> sent my .config inline to Andrew recently, it tripped his spam filter.
> I'll attach it as well.

Thanks. This is a bug in iwlwifi. The problem is actually another case where my workqueue debugging with lockdep is triggering a warning :))

Here's the thing:

iwl3945_cancel_deferred_work does
cancel_delayed_work_sync(&priv->init_alive_start);
(which is the (&(&priv->init_alive_start)->work) lock)

but it is called from within a locked section of
mutex_lock(&priv->mutex);
(locked from iwl3945_pci_suspend)

On the other hand, the task that runs from the init_alive_start workqueue is iwl3945_bg_init_alive_start() which will lock the same mutex.

So the deadlock condition is that you can be in cancel_delayed_work_sync() above while the mutex is locked, and be waiting for iwl3945_bg_init_alive_start() which tries to lock the mutex.

johannes
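The "which lock already depends on the new lock" wording in the report can be read as a cycle check on a graph of observed lock orderings. The following is only a toy user-space model of that idea, not lockdep's actual implementation; the enum names and `acquire_while_holding()` are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes standing in for the two locks in the report. */
enum lock_class { PRIV_MUTEX, INIT_ALIVE_WORK, NR_CLASSES };

/* dep[a][b] is true once some task has acquired b while holding a. */
static bool dep[NR_CLASSES][NR_CLASSES];

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_CLASSES; next++)
		if (dep[from][next] && !seen[next] && reachable(next, to, seen))
			return true;
	return false;
}

/*
 * Record "acquired was taken while held was held".
 * Returns true if that ordering would close a cycle (possible deadlock);
 * in that case the new edge is not recorded.
 */
static bool acquire_while_holding(int held, int acquired)
{
	bool seen[NR_CLASSES];

	memset(seen, 0, sizeof(seen));
	if (reachable(acquired, held, seen))
		return true;	/* an existing chain already runs the other way */
	dep[held][acquired] = true;
	return false;
}
```

In this model the worker thread first records `acquire_while_holding(INIT_ALIVE_WORK, PRIV_MUTEX)` (chain entry #1 in the log), and the suspend path then attempting `acquire_while_holding(PRIV_MUTEX, INIT_ALIVE_WORK)` is the step that gets flagged.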
2.6.24-rc5-mm1 - IPv6 throws section mismatches.
On Thu, 13 Dec 2007 02:40:50 PST, Andrew Morton said:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc5/2.6.24-rc5-mm1/

git-net.patch (I'm guessing one of Daniel's commits, but not sure which one) causes some complaints:

  LD      vmlinux.o
  MODPOST vmlinux.o
WARNING: vmlinux.o(.init.text+0x2263f): Section mismatch: reference to .exit.text:tcpv6_exit (between 'inet6_init' and 'ac6_proc_init')
WARNING: vmlinux.o(.init.text+0x22644): Section mismatch: reference to .exit.text:udplitev6_exit (between 'inet6_init' and 'ac6_proc_init')
WARNING: vmlinux.o(.init.text+0x22649): Section mismatch: reference to .exit.text:udpv6_exit (between 'inet6_init' and 'ac6_proc_init')
WARNING: vmlinux.o(.init.text+0x22658): Section mismatch: reference to .exit.text:addrconf_cleanup (between 'inet6_init' and 'ac6_proc_init')
WARNING: vmlinux.o(.init.text+0x226bc): Section mismatch: reference to .exit.text:rawv6_exit (between 'inet6_init' and 'ac6_proc_init')

Looks like the problem is that tcpv6_exit and friends are called from net/ipv6/af_inet6.c:inet6_init() - which is declared as:

static int __init inet6_init(void)

I can see how calling an __exit from an __init would be Bad Juju...
Re: 2.6.24-rc5-mm1 - IPv6 throws section mismatches.
[EMAIL PROTECTED] wrote:
> On Thu, 13 Dec 2007 02:40:50 PST, Andrew Morton said:
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc5/2.6.24-rc5-mm1/
>
> git-net.patch (I'm guessing one of Daniel's commits, but not sure which
> one) causes some complaints:
>
>   LD      vmlinux.o
>   MODPOST vmlinux.o
> WARNING: vmlinux.o(.init.text+0x2263f): Section mismatch: reference to .exit.text:tcpv6_exit (between 'inet6_init' and 'ac6_proc_init')
> WARNING: vmlinux.o(.init.text+0x22644): Section mismatch: reference to .exit.text:udplitev6_exit (between 'inet6_init' and 'ac6_proc_init')
> WARNING: vmlinux.o(.init.text+0x22649): Section mismatch: reference to .exit.text:udpv6_exit (between 'inet6_init' and 'ac6_proc_init')
> WARNING: vmlinux.o(.init.text+0x22658): Section mismatch: reference to .exit.text:addrconf_cleanup (between 'inet6_init' and 'ac6_proc_init')
> WARNING: vmlinux.o(.init.text+0x226bc): Section mismatch: reference to .exit.text:rawv6_exit (between 'inet6_init' and 'ac6_proc_init')
>
> Looks like the problem is that tcpv6_exit and friends are called from
> net/ipv6/af_inet6.c:inet6_init() - which is declared as:
>
> static int __init inet6_init(void)
>
> I can see how calling an __exit from an __init would be Bad Juju...

Yep, thanks Valdis for pointing that out. I sent a patch several days ago which fixes that to DaveM, and he applied it to the latest net-2.6.25.
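For readers unfamiliar with this class of warning: one common way to resolve it (not necessarily what the patch applied to net-2.6.25 did) is to drop the exit-only annotation from cleanup helpers that the init error path also calls. The sketch below is a user-space illustration under that assumption; `INIT_TEXT` and `EXIT_TEXT` are empty, hypothetical stand-ins for the kernel's `__init` and `__exit`, and all function names are invented.

```c
#include <assert.h>

/*
 * Hypothetical stand-ins for the kernel's __init/__exit annotations. They
 * are empty here so the sketch builds in user space. In the kernel, code
 * marked __exit may be discarded when built in, so __init code must not
 * reference it.
 */
#define INIT_TEXT
#define EXIT_TEXT

static int nr_registered;	/* how many hypothetical protocols are live */

static int proto_init(int should_fail)
{
	if (should_fail)
		return -1;
	nr_registered++;
	return 0;
}

/* No EXIT_TEXT marker: the init error path below needs this to stay around. */
static void proto_cleanup(void)
{
	nr_registered--;
}

static int INIT_TEXT stack_init(int second_fails)
{
	int err;

	err = proto_init(0);
	if (err)
		goto out;
	err = proto_init(second_fails);
	if (err)
		goto out_first;
	return 0;

out_first:
	proto_cleanup();	/* fine: the helper is not exit-only */
out:
	return err;
}

/* Only the unload entry point itself is exit-only. */
static void EXIT_TEXT stack_exit(void)
{
	proto_cleanup();
	proto_cleanup();
}
```

The design point is that unwinding on a failed init and tearing down on unload share the same helpers, so those helpers have to live in a section that survives both cases.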
Re: [ipw3945-devel] 2.6.24-rc5-mm1 -- INFO: possible circular locking dependency detected -- pm-suspend/5800 is trying to acquire lock
On Tue, 2007-12-18 at 15:57 +0100, Johannes Berg wrote:
> Thanks. This is a bug in iwlwifi. The problem is actually another case
> where my workqueue debugging with lockdep is triggering a warning :))
>
> Here's the thing:
>
> iwl3945_cancel_deferred_work does
> cancel_delayed_work_sync(&priv->init_alive_start);
> (which is the (&(&priv->init_alive_start)->work) lock)
>
> but it is called from within a locked section of
> mutex_lock(&priv->mutex);
> (locked from iwl3945_pci_suspend)
>
> On the other hand, the task that runs from the init_alive_start
> workqueue is iwl3945_bg_init_alive_start() which will lock the same
> mutex.
>
> So the deadlock condition is that you can be in
> cancel_delayed_work_sync() above while the mutex is locked, and be
> waiting for iwl3945_bg_init_alive_start() which tries to lock the mutex.

Thanks for the analysis. Miles, please try the attached patch. I'll send a patch for both 3945 and 4965 to linux-wireless later.

Thanks,
-yi

diff --git a/drivers/net/wireless/iwlwifi/iwl3945-base.c b/drivers/net/wireless/iwlwifi/iwl3945-base.c
index 88cf035..f0303e8 100644
--- a/drivers/net/wireless/iwlwifi/iwl3945-base.c
+++ b/drivers/net/wireless/iwlwifi/iwl3945-base.c
@@ -6355,8 +6355,6 @@ static void __iwl3945_down(struct iwl3945_priv *priv)
 	/* Unblock any waiting calls */
 	wake_up_interruptible_all(&priv->wait_command_queue);
 
-	iwl3945_cancel_deferred_work(priv);
-
 	/* Wipe out the EXIT_PENDING status bit if we are not actually
 	 * exiting the module */
 	if (!exit_pending)
@@ -6431,6 +6429,8 @@ static void iwl3945_down(struct iwl3945_priv *priv)
 	mutex_lock(&priv->mutex);
 	__iwl3945_down(priv);
 	mutex_unlock(&priv->mutex);
+
+	iwl3945_cancel_deferred_work(priv);
 }
 
 #define MAX_HW_RESTARTS 5
@@ -8739,10 +8739,9 @@ static void iwl3945_pci_remove(struct pci_dev *pdev)
 
 	IWL_DEBUG_INFO("*** UNLOAD DRIVER ***\n");
 
-	mutex_lock(&priv->mutex);
 	set_bit(STATUS_EXIT_PENDING, &priv->status);
-	__iwl3945_down(priv);
-	mutex_unlock(&priv->mutex);
+
+	iwl3945_down(priv);
 
 	/* Free MAC hash list for ADHOC */
 	for (i = 0; i < IWL_IBSS_MAC_HASH_SIZE; i++) {
@@ -8801,12 +8800,10 @@ static int iwl3945_pci_suspend(struct pci_dev *pdev, pm_message_t state)
 {
 	struct iwl3945_priv *priv = pci_get_drvdata(pdev);
 
-	mutex_lock(&priv->mutex);
-
 	set_bit(STATUS_IN_SUSPEND, &priv->status);
 
 	/* Take down the device; powers it off, etc. */
-	__iwl3945_down(priv);
+	iwl3945_down(priv);
 
 	if (priv->mac80211_registered)
 		ieee80211_stop_queues(priv->hw);
@@ -8815,8 +8812,6 @@ static int iwl3945_pci_suspend(struct pci_dev *pdev, pm_message_t state)
 	pci_disable_device(pdev);
 	pci_set_power_state(pdev, PCI_D3hot);
 
-	mutex_unlock(&priv->mutex);
-
 	return 0;
 }
 
@@ -8874,8 +8869,6 @@ static int iwl3945_pci_resume(struct pci_dev *pdev)
 
 	printk(KERN_INFO "Coming out of suspend...\n");
 
-	mutex_lock(&priv->mutex);
-
 	pci_set_power_state(pdev, PCI_D0);
 	err = pci_enable_device(pdev);
 	pci_restore_state(pdev);
@@ -8889,7 +8882,6 @@ static int iwl3945_pci_resume(struct pci_dev *pdev)
 	pci_write_config_byte(pdev, 0x41, 0x00);
 
 	iwl3945_resume(priv);
-	mutex_unlock(&priv->mutex);
 
 	return 0;
 }
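The ordering change in the patch above (tear down under the mutex, then wait for the deferred work only after the mutex is released) can be sketched in user space with pthreads. This is an analogy under stated assumptions, not driver code: `pthread_join()` stands in for the waiting half of `cancel_delayed_work_sync()`, and `priv_mutex`, `device_up`, and `down_then_wait()` are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t priv_mutex = PTHREAD_MUTEX_INITIALIZER;
static int alive_started;	/* set by the worker under priv_mutex */
static int device_up = 1;	/* cleared by the teardown path */

/* Analogue of iwl3945_bg_init_alive_start(): the work item needs the mutex. */
static void *bg_init_alive_start(void *unused)
{
	(void)unused;
	pthread_mutex_lock(&priv_mutex);
	alive_started = 1;
	pthread_mutex_unlock(&priv_mutex);
	return NULL;
}

/*
 * Analogue of the patched iwl3945_down(): do the teardown under the mutex,
 * then wait for the worker only after the mutex has been dropped.
 * Returns 1 if the worker ran to completion, -1 on a pthread error.
 */
static int down_then_wait(void)
{
	pthread_t worker;

	if (pthread_create(&worker, NULL, bg_init_alive_start, NULL))
		return -1;

	pthread_mutex_lock(&priv_mutex);
	device_up = 0;		/* stands in for __iwl3945_down() */
	pthread_mutex_unlock(&priv_mutex);

	/*
	 * Waiting here is safe. Waiting while still holding priv_mutex
	 * could block forever if the worker is itself blocked on that
	 * same mutex, which is the situation lockdep warned about.
	 */
	if (pthread_join(worker, NULL))
		return -1;

	return alive_started;
}
```

Moving the `pthread_join()` call above the `pthread_mutex_unlock()` reproduces the pre-patch ordering and can hang depending on scheduling, which matches the report being seen only once.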
Re: 2.6.24-rc5-mm1 -- inconsistent {in-softirq-W} -> {softirq-on-R} usage.
From: Andrew Morton [EMAIL PROTECTED]
Date: Fri, 14 Dec 2007 15:36:33 -0800

> The networking bug looks to be around sock_i_ino()'s taking of
> sk_callback_lock with softirq's enabled. Perhaps this will fix it.

One should be suspicious of any case where write_lock is performed on sk->sk_callback_lock in softint context. And that's the only way this can trigger, so this patch is wrong.

Generally, sock_orphan() and sock_graft() are the only primary places where sk->sk_callback_lock is acquired as a writer. And these should be invoked only from process context.

Perhaps there is some exception to this in some specialized layer such as SUNRPC, which are the only other spots I see potentially doing sk->sk_callback_lock write acquires in softint context, which as stated should not be done. OCFS2 and ISCSI seem to be following the rules in their write lock calls on this lock.
Re: 2.6.24-rc5-mm1 -- inconsistent {in-softirq-W} -> {softirq-on-R} usage.
On Fri, 14 Dec 2007 22:58:24 -0500 Miles Lane [EMAIL PROTECTED] wrote:
> On Dec 14, 2007 6:36 PM, Andrew Morton [EMAIL PROTECTED] wrote:
> > On Fri, 14 Dec 2007 17:13:21 -0500 Miles Lane [EMAIL PROTECTED] wrote:
> > > Sorry Andrew, I don't know who to forward this problem to.
> > > I tried running: find /proc | xargs cat
> > > and got this:
> > >
> > > =
> > > [ INFO: inconsistent lock state ]
> > > 2.6.24-rc5-mm1 #26
> > > -
> > > inconsistent {in-softirq-W} -> {softirq-on-R} usage.
> > > cat/6944 [HC0[0]:SC0[0]:HE1:SE1] takes:
> > > BUG: unable to handle kernel paging request at virtual address 0f1eff0b
> > > printing ip: c01fe64d *pde =
> > > Oops: [#1] PREEMPT SMP
> > > last sysfs file: /sys/block/sda/sda3/stat
> > > Modules linked in: aes_generic i915 drm rfcomm l2cap bluetooth cpufreq_stats cpufreq_conservative cpufreq_performance sbs sbshc dm_crypt sbp2 parport_pc lp parport pcmcia arc4 ecb crypto_blkcipher cryptomgr crypto_algapi tifm_7xx1 tifm_core yenta_socket rsrc_nonstatic pcmcia_core iwl3945 iTCO_wdt iTCO_vendor_support watchdog_core watchdog_dev snd_hda_intel mac80211 snd_pcm_oss snd_mixer_oss cfg80211 snd_pcm sky2 snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device snd soundcore snd_page_alloc shpchp pci_hotplug firewire_ohci firewire_core crc_itu_t ata_generic piix ide_core
> > >
> > > Pid: 6944, comm: cat Not tainted (2.6.24-rc5-mm1 #26)
> > > EIP: 0060:[c01fe64d] EFLAGS: 00210097 CPU: 0
> > > EIP is at strnlen+0x9/0x1c
> > > EAX: 0f1eff0b EBX: 0f1eff0b ECX: 0f1eff0b EDX: fffe
> > > ESI: c05b74f6 EDI: d6267d94 EBP: d6267cc8 ESP: d6267cc8
> > > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> > > Process cat (pid: 6944, ti=d6267000 task=d5a09000 task.ti=d6267000)
> > > Stack: d6267cfc c01fdd22 0400 c05b74f4 0001 c05b78f4 c048f503 0400 d5a09000 0002 d6267d0c c01fdf41 d6267d94 db68c04a d6267d74 c012ae81 d6267d94 0028 c05b89f7 00200046
> > > Call Trace:
> > > [c0108eb2] show_trace_log_lvl+0x12/0x25
> > > [c0108f4f] show_stack_log_lvl+0x8a/0x95
> > > [c0108fe4] show_registers+0x8a/0x1bd
> > > [c010922f] die+0x118/0x1dc
> > > [c03cf706] do_page_fault+0x5a4/0x681
> > > [c03cdd72] error_code+0x72/0x78
> > > [c01fdd22] vsnprintf+0x277/0x40e
> > > [c01fdf41] vscnprintf+0xe/0x1d
> > > [c012ae81] vprintk+0xcb/0x2f3
> > > [c012b0be] printk+0x15/0x17
> > > [c0145e55] print_lock_name+0x4e/0xa2
> > > [c0146099] print_lock+0xe/0x3a
> > > [c01464cf] print_usage_bug+0xbc/0x117
> > > [c0146fb6] mark_lock+0x2e7/0x3fe
> > > [c0147b9a] __lock_acquire+0x498/0xbf4
> > > [c014836c] lock_acquire+0x76/0x9d
> > > [c03cd6d2] _read_lock+0x23/0x32
> > > [c03491ae] sock_i_ino+0x14/0x30
> > > [c03c88ed] packet_seq_show+0x22/0x75
> > > [c019b41a] seq_read+0x19d/0x26f
> > > [c01b0ded] proc_reg_read+0x60/0x74
> > > [c01854aa] vfs_read+0x8a/0x106
> > > [c01858a8] sys_read+0x3b/0x60
> > > [c0107cea] sysenter_past_esp+0x6b/0xc1
> > > ===
> > > Code: 01 00 00 00 4f 89 fa 5f 89 d0 5d c3 55 85 c9 89 e5 57 89 c7 89 d0 74 05 f2 ae 75 01 4f 89 f8 5f 5d c3 55 89 c1 89 e5 89 c8 eb 06 80 38 00 74 07 40 4a 83 fa ff 75 f4 29 c8 5d c3 90 90 90 55 83
> > > EIP: [c01fe64d] strnlen+0x9/0x1c SS:ESP 0068:d6267cc8
> > > note: cat[6944] exited with preempt_count 4
> >
> > I'd say you hit a networking locking bug and then when trying to report
> > that bug, lockdep crashed.
> >
> > The networking bug looks to be around sock_i_ino()'s taking of
> > sk_callback_lock with softirq's enabled. Perhaps this will fix it.
> >
> > diff -puN net/core/sock.c~a net/core/sock.c
> > --- a/net/core/sock.c~a
> > +++ a/net/core/sock.c
> > @@ -1115,9 +1115,9 @@ int sock_i_uid(struct sock *sk)
> >  {
> >  	int uid;
> >  
> > -	read_lock(&sk->sk_callback_lock);
> > +	read_lock_bh(&sk->sk_callback_lock);
> >  	uid = sk->sk_socket ? SOCK_INODE(sk->sk_socket)->i_uid : 0;
> > -	read_unlock(&sk->sk_callback_lock);
> > +	read_unlock_bh(&sk->sk_callback_lock);
> >  	return uid;
> >  }
> >  
> > @@ -1125,9 +1125,9 @@ unsigned long sock_i_ino(struct sock *sk
> >  {
> >  	unsigned long ino;
> >  
> > -	read_lock(&sk->sk_callback_lock);
> > +	read_lock_bh(&sk->sk_callback_lock);
> >  	ino = sk->sk_socket ? SOCK_INODE(sk->sk_socket)->i_ino : 0;
> > -	read_unlock(&sk->sk_callback_lock);
> > +	read_unlock_bh(&sk->sk_callback_lock);
> >  	return ino;
> >  }
> >  
> > _
>
> I applied the patch and then tried my test again. This time my system
> locked up. Perhaps I should open a new thread for this, since the
> problem looks pretty different.
> Dec 14 21:32:55 feargod kernel: process `cat' is using deprecated sysctl (syscall) net.ipv6.neigh.default.retrans_time; Use net.ipv6.neigh.default.retrans_time_ms instead.
> Dec 14 21:32:55 feargod kernel:
> Dec 14 21:32:55 feargod kernel: =
> Dec 14 21:32:55 feargod kernel: [ BUG: bad unlock balance detected! ]
> Dec 14 21:32:55 feargod kernel
Re: 2.6.24-rc5-mm1 -- inconsistent {in-softirq-W} -> {softirq-on-R} usage.
Andrew Morton wrote, On 12/15/2007 12:13 PM:
> On Fri, 14 Dec 2007 22:58:24 -0500 Miles Lane [EMAIL PROTECTED] wrote:
> > On Dec 14, 2007 6:36 PM, Andrew Morton [EMAIL PROTECTED] wrote:
> > > On Fri, 14 Dec 2007 17:13:21 -0500 Miles Lane [EMAIL PROTECTED] wrote:
...
> > > > Pid: 6944, comm: cat Not tainted (2.6.24-rc5-mm1 #26)
...
> > > > note: cat[6944] exited with preempt_count 4
...
> > Dec 14 21:32:55 feargod kernel: cat/6180 is trying to release lock
...
> Can you suggest a way in which others can reproduce this?

Buy a cat?

Jarek P.
Re: 2.6.24-rc5-mm1 -- inconsistent {in-softirq-W} -> {softirq-on-R} usage.
Andrew Morton wrote, On 12/15/2007 12:13 PM:
> On Fri, 14 Dec 2007 22:58:24 -0500 Miles Lane [EMAIL PROTECTED] wrote:
...
> > I applied the patch and then tried my test again. This time my system
> > locked up. Perhaps I should open a new thread for this, since the
> > problem looks pretty different.
> >
> > Dec 14 21:32:55 feargod kernel: process `cat' is using deprecated sysctl (syscall) net.ipv6.neigh.default.retrans_time; Use net.ipv6.neigh.default.retrans_time_ms instead.
> > Dec 14 21:32:55 feargod kernel:
> > Dec 14 21:32:55 feargod kernel: =
> > Dec 14 21:32:55 feargod kernel: [ BUG: bad unlock balance detected! ]
> > Dec 14 21:32:55 feargod kernel: -
> > Dec 14 21:32:55 feargod kernel: cat/6180 is trying to release lock (kkk�H3��) at:
> > Dec 14 21:32:55 feargod kernel: [packet_seq_stop+0xe/0x10] packet_seq_stop+0xe/0x10
> > Dec 14 21:32:55 feargod kernel: but there are no more locks to release!
> > Dec 14 21:32:55 feargod kernel:
> > Dec 14 21:32:55 feargod kernel: other info that might help us debug this:
> > Dec 14 21:32:55 feargod kernel: 2 locks held by cat/6180:
> > Dec 14 21:32:55 feargod kernel: #0: (&p->lock){--..}, at: [crypto_algapi:seq_read+0x25/0x191c1] seq_read+0x25/0x26f
> > Dec 14 21:32:55 feargod kernel: #1: (&net->packet.sklist_lock){-.--}, at: [packet_seq_start+0x14/0x4d] packet_seq_start+0x14/0x4d

Miles, I haven't checked this yet, but since there were some considerable changes, including to locking, you could try reverting some of these last 3 patches to net/packet by Denis:

http://git.kernel.org/?p=linux/kernel/git/davem/net-2.6.25.git;a=history;f=net/packet/af_packet.c;h=485af5691d64270a02322925a6ccfad9d02b7f78;hb=HEAD

Regards,
Jarek P.
Re: 2.6.24-rc5-mm1: cat /proc/net/packet - oops
Hello,

As one of my usual tests I run the following script:

for i in `find /proc -type f`; do echo -n "cat $i > /dev/null ... "; cat $i > /dev/null; echo "done"; done

This time the culprit is /proc/net/packet. The cat process gets killed

$ cat /proc/net/packet
Segmentation fault

and that gets lost in lots of messages from the script, but for some reason there is no info in syslog (why?). I could capture the oops only after I issued sysrq-7 or greater. That's why I didn't catch the oops earlier. I found it because the bug makes my sparc64 box need a hardware reset most of the time it happens and produces an oops 2 screens long. x86 kills the cat process but the system is still usable and running fine.

Bisection points to:

git-ubi.patch
GOOD
#
git-net.patch
BAD
ipsec-fix-reversed-icmp6-policy-check.patch

but this seems to be far from precise :)

$ grep ^commit git-net.patch | wc -l
361

Not sure if this is important, but when bisecting the mm tree the oops got shorter at some point, so maybe some other patch is also involved.

This one is from x86:

[ 194.508398] BUG: unable to handle kernel paging request at virtual address bd47
[ 194.508412] printing eip: c0135d59 *pde =
[ 194.508419] Oops: [#1] PREEMPT
[ 194.508424] last sysfs file: /devices/pci:00/:00:01.0/:01:05.0/resource
[ 194.508428] Modules linked in: usbhid hid orinoco_cs orinoco hermes pcmcia firmware_class uhci_hcd ehci_hcd usbcore psmouse yenta_socket rsrc_nonstatic rtc 8139too
[ 194.508443]
[ 194.508447] Pid: 5368, comm: cat Not tainted (2.6.24-rc5 #9)
[ 194.508450] EIP: 0060:[c0135d59] EFLAGS: 00210046 CPU: 0
[ 194.508466] EIP is at __lock_acquire+0x5b/0xfc4
[ 194.508469] EAX: 0022 EBX: 00200246 ECX: bd43 EDX: 0002
[ 194.508472] ESI: bd43 EDI: EBP: d816ce80 ESP: d816ce14
[ 194.508475] DS: 007b ES: 007b FS: GS: 0033 SS: 0068
[ 194.508479] Process cat (pid: 5368, ti=d816c000 task=d826a000 task.ti=d816c000)
[ 194.508481] Stack: c0135a21 d826a000 d816ce38 c0135697 d826a000 c0146ded
[ 194.508490] c1304f98 0002 bd43 0001 d826a000 d816cec0 c013681d
[ 194.508498] 0006 0003 c03daa08 0001 0044 02ad 0005
[ 194.508506] Call Trace:
[ 194.508508] [c01035d8] show_trace_log_lvl+0x1a/0x30
[ 194.508518] [c0103693] show_stack_log_lvl+0xa5/0xca
[ 194.508523] [c0103787] show_registers+0xcf/0x23f
[ 194.508528] [c0103a04] die+0x10d/0x1f5
[ 194.508532] [c0110cee] do_page_fault+0x27e/0x5f0
[ 194.508540] [c034684a] error_code+0x6a/0x70
[ 194.508550] [c0136d20] lock_acquire+0x5e/0x76
[ 194.508555] [c03461a6] _read_lock+0x35/0x42
[ 194.508560] [c02d957a] sock_i_ino+0x14/0x30
[ 194.508568] [c032c7e8] packet_seq_show+0x19/0xa0
[ 194.508576] [c0179f5c] seq_read+0x19a/0x29e
[ 194.508583] [c0191b25] proc_reg_read+0x57/0x78
[ 194.508590] [c0161c8a] vfs_read+0x89/0x11d
[ 194.508596] [c0162054] sys_read+0x3d/0x64
[ 194.508600] [c010261a] sysenter_past_esp+0x5f/0xa5
[ 194.508605] ===
[ 194.508607] Code: c0 85 c0 0f 84 64 03 00 00 9c 58 f6 c4 02 0f 85 b8 07 00 00 83 ff 07 0f 87 de 07 00 00 85 ff 8d 76 00 0f 85 4f 03 00 00 8b 4d c0 8b 71 04 85 f6 0f 84 41 03 00 00 89 f0 e8 d8 d7 ff ff 85 c0 0f
[ 194.508651] EIP: [c0135d59] __lock_acquire+0x5b/0xfc4 SS:ESP 0068:d816ce14
[ 194.508660] note: cat[5368] exited with preempt_count 2

.config attached.
Regards,

Mariusz

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.24-rc5
# Sun Dec 16 00:22:27 2007
#
# CONFIG_64BIT is not set
CONFIG_X86_32=y
# CONFIG_X86_64 is not set
CONFIG_X86=y
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_QUICKLIST=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_DMI=y
# CONFIG_RWSEM_GENERIC_SPINLOCK is not set
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_GENERIC_CALIBRATE_DELAY=y
# CONFIG_GENERIC_TIME_VSYSCALL is not set
CONFIG_ARCH_SUPPORTS_OPROFILE=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
# CONFIG_ZONE_DMA32 is not set
CONFIG_ARCH_POPULATES_NODE_MAP=y
# CONFIG_AUDIT_ARCH is not set
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_KTIME_SCALAR=y
CONFIG_DEFCONFIG_LIST=/lib/modules/$UNAME_RELEASE/.config

#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION=
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
#
Re: 2.6.24-rc5-mm1: cat /proc/net/packet - oops
Mariusz Kozlowski [EMAIL PROTECTED] wrote:
> git-ubi.patch
> GOOD
> #
> git-net.patch
> BAD
> ipsec-fix-reversed-icmp6-policy-check.patch
>
> but this seems to be far from precise :)

I suspect namespace borkage. But just because you pin-pointed my patch I'll try to track it down :)

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: 2.6.24-rc5-mm1 -- inconsistent {in-softirq-W} -> {softirq-on-R} usage.
Andrew Morton [EMAIL PROTECTED] wrote:
> My suspicion is that you've hit bad breakage in networking and lockdep
> just isn't sufficiently robust to handle what it's being given.
>
> Can you suggest a way in which others can reproduce this?

I can reproduce this now. I suspect it's to do with the namespace work. I'll let you know when I have dug further.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: 2.6.24-rc5-mm1 -- inconsistent {in-softirq-W} -> {softirq-on-R} usage.
On Sat, 15 Dec 2007 15:55:09 -0500 Miles Lane [EMAIL PROTECTED] wrote:
> On Dec 15, 2007 3:13 PM, Miles Lane [EMAIL PROTECTED] wrote:
> > On Dec 15, 2007 6:13 AM, Andrew Morton [EMAIL PROTECTED] wrote:
> > > On Fri, 14 Dec 2007 22:58:24 -0500 Miles Lane [EMAIL PROTECTED] wrote:
> > > > On Dec 14, 2007 6:36 PM, Andrew Morton [EMAIL PROTECTED] wrote:
> > > > > On Fri, 14 Dec 2007 17:13:21 -0500 Miles Lane [EMAIL PROTECTED] wrote:
> > > > > > Sorry Andrew, I don't know who to forward this problem to.
> > > > > > I tried running: find /proc | xargs cat
> > > > > > and got this:
[...]
> > > > I applied
Re: 2.6.24-rc5-mm1
Hi Andrew,

I hit this just now. Not sure if I can reproduce it though.

WARNING: at net/ipv4/tcp_input.c:2533 tcp_fastretrans_alert()
Pid: 4624, comm: yield Not tainted 2.6.24-rc5-mm1 #5
 [c010582a] show_trace_log_lvl+0x12/0x22
 [c0105847] show_trace+0xd/0xf
 [c0105959] dump_stack+0x57/0x5e
 [c03db95b] tcp_fastretrans_alert+0xde/0x5bd
 [c03dcab2] tcp_ack+0x236/0x2e4
 [c03dea01] tcp_rcv_established+0x51e/0x5c0
 [c03e56f1] tcp_v4_do_rcv+0x22/0xc4
 [c03e5c49] tcp_v4_rcv+0x4b6/0x7f5
 [c03cd5ad] ip_local_deliver_finish+0xb9/0x169
 [c03cd68a] ip_local_deliver+0x2d/0x34
 [c03cd91d] ip_rcv_finish+0x28c/0x2ab
 [c03cdb16] ip_rcv+0x1da/0x204
 [c03b800a] netif_receive_skb+0x23c/0x26f
 [c02db326] tg3_rx+0x246/0x353
 [c02db4ac] tg3_poll_work+0x79/0x86
 [c02db4e8] tg3_poll+0x2f/0x16f
 [c03b822b] net_rx_action+0xbb/0x1a8
 [c0129596] __do_softirq+0x73/0xe6
 [c0129642] do_softirq+0x39/0x51
 [c01296c0] irq_exit+0x47/0x49
 [c01064f4] do_IRQ+0x55/0x69
 [c0105492] common_interrupt+0x2e/0x34
 ===

--
regards,
Dhaval
Re: 2.6.24-rc5-mm1
From: Herbert Xu [EMAIL PROTECTED]
Date: Fri, 14 Dec 2007 10:08:07 +0800

> [UDP]: Move udp_stats_in6 into net/ipv4/udp.c
>
> Now that external users may increment the counters directly, we need
> to ensure that udp_stats_in6 is always available. Otherwise we'd
> either have to require the external users to be built as modules or
> ipv6 to be built-in.
>
> This isn't too bad because udp_stats_in6 is just a pair of pointers
> plus an EXPORT, e.g., just 40 (16 + 24) bytes on x86-64.
>
> Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Applied.
Re: 2.6.24-rc5-mm1 -- inconsistent {in-softirq-W} -> {softirq-on-R} usage.
On Fri, 14 Dec 2007 17:13:21 -0500 Miles Lane [EMAIL PROTECTED] wrote:

> Sorry Andrew, I don't know who to forward this problem to.
>
> I tried running: find /proc | xargs cat
> and got this:
>
> =
> [ INFO: inconsistent lock state ]
> 2.6.24-rc5-mm1 #26
> -
> inconsistent {in-softirq-W} -> {softirq-on-R} usage.
> cat/6944 [HC0[0]:SC0[0]:HE1:SE1] takes:
> BUG: unable to handle kernel paging request at virtual address 0f1eff0b
> printing ip: c01fe64d *pde =
> Oops: [#1] PREEMPT SMP
> last sysfs file: /sys/block/sda/sda3/stat
> Modules linked in: aes_generic i915 drm rfcomm l2cap bluetooth cpufreq_stats cpufreq_conservative cpufreq_performance sbs sbshc dm_crypt sbp2 parport_pc lp parport pcmcia arc4 ecb crypto_blkcipher cryptomgr crypto_algapi tifm_7xx1 tifm_core yenta_socket rsrc_nonstatic pcmcia_core iwl3945 iTCO_wdt iTCO_vendor_support watchdog_core watchdog_dev snd_hda_intel mac80211 snd_pcm_oss snd_mixer_oss cfg80211 snd_pcm sky2 snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device snd soundcore snd_page_alloc shpchp pci_hotplug firewire_ohci firewire_core crc_itu_t ata_generic piix ide_core
>
> Pid: 6944, comm: cat Not tainted (2.6.24-rc5-mm1 #26)
> EIP: 0060:[c01fe64d] EFLAGS: 00210097 CPU: 0
> EIP is at strnlen+0x9/0x1c
> EAX: 0f1eff0b EBX: 0f1eff0b ECX: 0f1eff0b EDX: fffe
> ESI: c05b74f6 EDI: d6267d94 EBP: d6267cc8 ESP: d6267cc8
> DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> Process cat (pid: 6944, ti=d6267000 task=d5a09000 task.ti=d6267000)
> Stack: d6267cfc c01fdd22 0400 c05b74f4 0001 c05b78f4 c048f503 0400 d5a09000 0002 d6267d0c c01fdf41 d6267d94 db68c04a d6267d74 c012ae81 d6267d94 0028 c05b89f7 00200046
> Call Trace:
>  [c0108eb2] show_trace_log_lvl+0x12/0x25
>  [c0108f4f] show_stack_log_lvl+0x8a/0x95
>  [c0108fe4] show_registers+0x8a/0x1bd
>  [c010922f] die+0x118/0x1dc
>  [c03cf706] do_page_fault+0x5a4/0x681
>  [c03cdd72] error_code+0x72/0x78
>  [c01fdd22] vsnprintf+0x277/0x40e
>  [c01fdf41] vscnprintf+0xe/0x1d
>  [c012ae81] vprintk+0xcb/0x2f3
>  [c012b0be] printk+0x15/0x17
>  [c0145e55] print_lock_name+0x4e/0xa2
>  [c0146099] print_lock+0xe/0x3a
>  [c01464cf] print_usage_bug+0xbc/0x117
>  [c0146fb6] mark_lock+0x2e7/0x3fe
>  [c0147b9a] __lock_acquire+0x498/0xbf4
>  [c014836c] lock_acquire+0x76/0x9d
>  [c03cd6d2] _read_lock+0x23/0x32
>  [c03491ae] sock_i_ino+0x14/0x30
>  [c03c88ed] packet_seq_show+0x22/0x75
>  [c019b41a] seq_read+0x19d/0x26f
>  [c01b0ded] proc_reg_read+0x60/0x74
>  [c01854aa] vfs_read+0x8a/0x106
>  [c01858a8] sys_read+0x3b/0x60
>  [c0107cea] sysenter_past_esp+0x6b/0xc1
>  ===
> Code: 01 00 00 00 4f 89 fa 5f 89 d0 5d c3 55 85 c9 89 e5 57 89 c7 89 d0 74 05 f2 ae 75 01 4f 89 f8 5f 5d c3 55 89 c1 89 e5 89 c8 eb 06 80 38 00 74 07 40 4a 83 fa ff 75 f4 29 c8 5d c3 90 90 90 55 83
> EIP: [c01fe64d] strnlen+0x9/0x1c SS:ESP 0068:d6267cc8
> note: cat[6944] exited with preempt_count 4

I'd say you hit a networking locking bug and then when trying to report that bug, lockdep crashed.

The networking bug looks to be around sock_i_ino()'s taking of sk_callback_lock with softirqs enabled. Perhaps this will fix it.

diff -puN net/core/sock.c~a net/core/sock.c
--- a/net/core/sock.c~a
+++ a/net/core/sock.c
@@ -1115,9 +1115,9 @@ int sock_i_uid(struct sock *sk)
 {
 	int uid;
 
-	read_lock(&sk->sk_callback_lock);
+	read_lock_bh(&sk->sk_callback_lock);
 	uid = sk->sk_socket ? SOCK_INODE(sk->sk_socket)->i_uid : 0;
-	read_unlock(&sk->sk_callback_lock);
+	read_unlock_bh(&sk->sk_callback_lock);
 	return uid;
 }
 
@@ -1125,9 +1125,9 @@ unsigned long sock_i_ino(struct sock *sk
 {
 	unsigned long ino;
 
-	read_lock(&sk->sk_callback_lock);
+	read_lock_bh(&sk->sk_callback_lock);
 	ino = sk->sk_socket ? SOCK_INODE(sk->sk_socket)->i_ino : 0;
-	read_unlock(&sk->sk_callback_lock);
+	read_unlock_bh(&sk->sk_callback_lock);
 	return ino;
 }
 
_
Re: 2.6.24-rc5-mm1 -- inconsistent {in-softirq-W} -> {softirq-on-R} usage.
On Fri, 14 Dec 2007 22:58:24 -0500 Miles Lane [EMAIL PROTECTED] wrote: On Dec 14, 2007 6:36 PM, Andrew Morton [EMAIL PROTECTED] wrote: On Fri, 14 Dec 2007 17:13:21 -0500 Miles Lane [EMAIL PROTECTED] wrote: Sorry Andrew, I don't know who to forward this problem to. I tried running: find /proc | xargs cat and got this: = [ INFO: inconsistent lock state ] 2.6.24-rc5-mm1 #26 - inconsistent {in-softirq-W} - {softirq-on-R} usage. cat/6944 [HC0[0]:SC0[0]:HE1:SE1] takes: BUG: unable to handle kernel paging request at virtual address 0f1eff0b printing ip: c01fe64d *pde = Oops: [#1] PREEMPT SMP last sysfs file: /sys/block/sda/sda3/stat Modules linked in: aes_generic i915 drm rfcomm l2cap bluetooth cpufreq_stats cpufreq_conservative cpufreq_performance sbs sbshc dm_crypt sbp2 parport_pc lp parport pcmcia arc4 ecb crypto_blkcipher cryptomgr crypto_algapi tifm_7xx1 tifm_core yenta_socket rsrc_nonstatic pcmcia_core iwl3945 iTCO_wdt iTCO_vendor_support watchdog_core watchdog_dev snd_hda_intel mac80211 snd_pcm_oss snd_mixer_oss cfg80211 snd_pcm sky2 snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device snd soundcore snd_page_alloc shpchp pci_hotplug firewire_ohci firewire_core crc_itu_t ata_generic piix ide_core Pid: 6944, comm: cat Not tainted (2.6.24-rc5-mm1 #26) EIP: 0060:[c01fe64d] EFLAGS: 00210097 CPU: 0 EIP is at strnlen+0x9/0x1c EAX: 0f1eff0b EBX: 0f1eff0b ECX: 0f1eff0b EDX: fffe ESI: c05b74f6 EDI: d6267d94 EBP: d6267cc8 ESP: d6267cc8 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 Process cat (pid: 6944, ti=d6267000 task=d5a09000 task.ti=d6267000) Stack: d6267cfc c01fdd22 0400 c05b74f4 0001 c05b78f4 c048f503 0400 d5a09000 0002 d6267d0c c01fdf41 d6267d94 db68c04a d6267d74 c012ae81 d6267d94 0028 c05b89f7 00200046 Call Trace: [c0108eb2] show_trace_log_lvl+0x12/0x25 [c0108f4f] show_stack_log_lvl+0x8a/0x95 [c0108fe4] show_registers+0x8a/0x1bd [c010922f] die+0x118/0x1dc [c03cf706] do_page_fault+0x5a4/0x681 [c03cdd72] 
error_code+0x72/0x78 [c01fdd22] vsnprintf+0x277/0x40e [c01fdf41] vscnprintf+0xe/0x1d [c012ae81] vprintk+0xcb/0x2f3 [c012b0be] printk+0x15/0x17 [c0145e55] print_lock_name+0x4e/0xa2 [c0146099] print_lock+0xe/0x3a [c01464cf] print_usage_bug+0xbc/0x117 [c0146fb6] mark_lock+0x2e7/0x3fe [c0147b9a] __lock_acquire+0x498/0xbf4 [c014836c] lock_acquire+0x76/0x9d [c03cd6d2] _read_lock+0x23/0x32 [c03491ae] sock_i_ino+0x14/0x30 [c03c88ed] packet_seq_show+0x22/0x75 [c019b41a] seq_read+0x19d/0x26f [c01b0ded] proc_reg_read+0x60/0x74 [c01854aa] vfs_read+0x8a/0x106 [c01858a8] sys_read+0x3b/0x60 [c0107cea] sysenter_past_esp+0x6b/0xc1 === Code: 01 00 00 00 4f 89 fa 5f 89 d0 5d c3 55 85 c9 89 e5 57 89 c7 89 d0 74 05 f2 ae 75 01 4f 89 f8 5f 5d c3 55 89 c1 89 e5 89 c8 eb 06 80 38 00 74 07 40 4a 83 fa ff 75 f4 29 c8 5d c3 90 90 90 55 83 EIP: [c01fe64d] strnlen+0x9/0x1c SS:ESP 0068:d6267cc8 note: cat[6944] exited with preempt_count 4 I'd say you hit a networking locking bug and then when trying to report that bug, lockdep crashed. The networking bug looks to be around sock_i_ino()'s taking of sk_callback_lock with softirq's enabled. Perhaps this will fix it. diff -puN net/core/sock.c~a net/core/sock.c --- a/net/core/sock.c~a +++ a/net/core/sock.c @@ -1115,9 +1115,9 @@ int sock_i_uid(struct sock *sk) { int uid; - read_lock(sk-sk_callback_lock); + read_lock_bh(sk-sk_callback_lock); uid = sk-sk_socket ? SOCK_INODE(sk-sk_socket)-i_uid : 0; - read_unlock(sk-sk_callback_lock); + read_unlock_bh(sk-sk_callback_lock); return uid; } @@ -1125,9 +1125,9 @@ unsigned long sock_i_ino(struct sock *sk { unsigned long ino; - read_lock(sk-sk_callback_lock); + read_lock_bh(sk-sk_callback_lock); ino = sk-sk_socket ? SOCK_INODE(sk-sk_socket)-i_ino : 0; - read_unlock(sk-sk_callback_lock); + read_unlock_bh(sk-sk_callback_lock); return ino; } _ I applied the patch and then tried my test again. This time my system locked up. Perhaps I should open a new thread for this, since the problem looks pretty different. 
Dec 14 21:32:55 feargod kernel: process `cat' is using deprecated sysctl (syscall) net.ipv6.neigh.default.retrans_time; Use net.ipv6.neigh.default.retrans_time_ms instead.
Dec 14 21:32:55 feargod kernel:
Dec 14 21:32:55 feargod kernel: =
Dec 14 21:32:55 feargod kernel: [ BUG: bad unlock balance detected! ]
Dec 14 21:32:55 feargod kernel
Re: 2.6.24-rc5-mm1 -- inconsistent {in-softirq-W} -> {softirq-on-R} usage.
Andrew Morton [EMAIL PROTECTED] wrote:
>
> I'd say you hit a networking locking bug and then when trying to report
> that bug, lockdep crashed.
>
> The networking bug looks to be around sock_i_ino()'s taking of
> sk_callback_lock with softirq's enabled. Perhaps this will fix it.
>
> diff -puN net/core/sock.c~a net/core/sock.c
> --- a/net/core/sock.c~a
> +++ a/net/core/sock.c
> @@ -1115,9 +1115,9 @@ int sock_i_uid(struct sock *sk)
>  {
>  	int uid;
>
> -	read_lock(&sk->sk_callback_lock);
> +	read_lock_bh(&sk->sk_callback_lock);
>  	uid = sk->sk_socket ? SOCK_INODE(sk->sk_socket)->i_uid : 0;
> -	read_unlock(&sk->sk_callback_lock);
> +	read_unlock_bh(&sk->sk_callback_lock);

The callback lock is never taken for writing in BH context so this shouldn't be needed.

Let's fix the lockdep checker first and then decide what's broken in networking, OK?

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
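
[Editor's sketch] For readers unfamiliar with the report format: "{in-softirq-W} -> {softirq-on-R}" means lockdep believes the lock class was at some point write-locked from softirq context and is now being read-locked with softirqs enabled. If both really happen, a softirq could interrupt the reader on the same CPU and spin on the write lock forever. The userspace model below only illustrates that rule; the type and function names are hypothetical and it is not lockdep's implementation. Herbert's objection above is precisely that the first usage should never be recorded for sk_callback_lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical usage bits, loosely named after the states in the report. */
enum usage_model {
	USED_IN_SOFTIRQ_W = 1 << 0,	/* write-locked while running in softirq */
	SOFTIRQ_ON_R      = 1 << 1,	/* read-locked with softirqs enabled */
};

/* One record per lock class: every usage ever observed, OR-ed together. */
struct lock_class_model {
	unsigned int usage;
};

/*
 * Record a newly observed usage. Returns false when the accumulated
 * history now contains both conflicting usages, which is the situation
 * reported as "inconsistent lock state".
 */
static bool mark_usage(struct lock_class_model *lc, enum usage_model bit)
{
	assert(lc != NULL);
	lc->usage |= bit;

	bool w_in_softirq = (lc->usage & USED_IN_SOFTIRQ_W) != 0;
	bool r_softirq_on = (lc->usage & SOFTIRQ_ON_R) != 0;

	return !(w_in_softirq && r_softirq_on);
}
```

In this model, recording SOFTIRQ_ON_R alone is always accepted; the complaint only appears once USED_IN_SOFTIRQ_W has also been recorded for the same class, in either order.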
Re: 2.6.24-rc5-mm1
Hi, My config does not link any more: ... CHK include/linux/compile.h UPD include/linux/compile.h CC init/version.o LD init/built-in.o LD .tmp_vmlinux1 net/built-in.o: In function `xs_udp_data_ready': /home/peifferp/containers/kernel/linux-2.6.24-rc5-mm1/net/sunrpc/xprtsock.c:842: undefined reference to `udp_stats_in6' /home/peifferp/containers/kernel/linux-2.6.24-rc5-mm1/net/sunrpc/xprtsock.c:846: undefined reference to `udp_stats_in6' make[1]: *** [.tmp_vmlinux1] Error 1 make: *** [sub-make] Error 2 After a first look, udp_stats_in6 seems to be defined in ipv6 (file net/ipv6/udp.c) but I have CONFIG_IPV6=m and CONFIG_SUNRPC=y So, SUNRPC uses something defined in a module in my case ? ... looking more, this dependency seems to have been introduced by the patch [UDP]: Restore missing inDatagrams increments ( http://thread.gmane.org/gmane.linux.network/79716/focus=79831 ) (I cc netdev) I don't know what is the right way to fix this ... ? P. Andrew Morton wrote: ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc5/2.6.24-rc5-mm1/ - If something goes wrong with a PCI device's probing or initialisation, try reverting pci-disable-decoding-during-sizing-of-bars.patch. - git-sched was dropped due to breaking suspend-to-RAM. - git-block has been restored after having had a few problems - git-newsetup.patch was dropped due to conflicts with git-x86 - git-perfmon.patch is still dropped for the same reason - git-kgdb.patch is still dropped for the same reason - Please do try to cc the correct developer and mailing list when reporting problems - I'm just buried in bugs over here. Boilerplate: - See the `hot-fixes' directory for any important updates to this patchset. - To fetch an -mm tree using git, use (for example) git-fetch git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git tag v2.6.16-rc2-mm1 git-checkout -b local-v2.6.16-rc2-mm1 v2.6.16-rc2-mm1 - -mm kernel commit activity can be reviewed by subscribing to the mm-commits mailing list. 
echo subscribe mm-commits | mail [EMAIL PROTECTED] - If you hit a bug in -mm and it is not obvious which patch caused it, it is most valuable if you can perform a bisection search to identify which patch introduced the bug. Instructions for this process are at http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt But beware that this process takes some time (around ten rebuilds and reboots), so consider reporting the bug first and if we cannot immediately identify the faulty patch, then perform the bisection search. - When reporting bugs, please try to Cc: the relevant maintainer and mailing list on any email. - When reporting bugs in this kernel via email, please also rewrite the email Subject: in some manner to reflect the nature of the bug. Some developers filter by Subject: when looking for messages to read. - Occasional snapshots of the -mm lineup are uploaded to ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/ and are announced on the mm-commits list. These probably are at least compilable. - More-than-daily -mm snapshots may be found at http://userweb.kernel.org/~akpm/mmotm/. These are almost certainly not compileable. 
Changes since 2.6.24-rc4-mm1: origin.patch git-acpi.patch git-alsa.patch git-agpgart.patch git-arm.patch git-arm-master.patch git-avr32.patch git-cpufreq.patch git-powerpc.patch git-drm.patch git-dvb.patch git-hwmon.patch git-gfs2-nmw.patch git-hid.patch git-hrt.patch git-ieee1394.patch git-infiniband.patch git-input.patch git-jfs.patch git-kbuild.patch git-kvm.patch git-lblnet.patch git-leds.patch git-libata-all.patch git-md-accel.patch git-mips.patch git-mmc.patch git-mtd.patch git-ubi.patch git-net.patch git-netdev-all.patch git-battery.patch git-nfs.patch git-nfsd.patch git-ocfs2.patch git-s390.patch git-sh.patch git-scsi-misc.patch git-scsi-rc-fixes.patch git-block.patch git-unionfs.patch git-v9fs.patch git-watchdog.patch git-wireless.patch git-ipwireless_cs.patch git-x86.patch git-xfs.patch git-cryptodev.patch git-xtensa.patch git trees -aio-only-account-i-o-wait-time-in-read_events-if-there-are-active-requests.patch -fix-cloneclone_newpid.patch -rtc-assure-proper-memory-ordering-with-respect-to-rtc_dev_busy-flag.patch -ufs-fix-nexstep-dir-block-size.patch -ufs-fix-nexstep-dir-block-size-checkpatch-fixes.patch -aoe-properly-initialise-the-request_queues-backing_dev_info.patch -mm-backing-devc-fix-percpu_counter_destroy-call-bug-in-bdi_init.patch -add-export_symbolksize.patch -spi-use-mutex-not-semaphore.patch -spi-at25-driver-is-for-eeprom-not-flash.patch -spi-simplify-spi_sync-calling-convention.patch -spi-use-simplified-spi_sync-calling-convention.patch -spi-initial-bf54x-spi
Re: 2.6.24-rc5-mm1
The problem comes from the new macro UDPX_INC_STATS_BH introduced by Herbert, which was a nice addition to increment the correct UDP MIB depending on the socket family, but unfortunately the use of this macro from kernel code (I mean code not compiled as module) requires that IPv6 is also compiled in kernel (CONFIG_IPv6=y) in order to have udp_stats_in6 defined at link time. Benjamin Pierre Peiffer wrote: Hi, My config does not link any more: ... CHK include/linux/compile.h UPD include/linux/compile.h CC init/version.o LD init/built-in.o LD .tmp_vmlinux1 net/built-in.o: In function `xs_udp_data_ready': /home/peifferp/containers/kernel/linux-2.6.24-rc5-mm1/net/sunrpc/xprtsock.c:842: undefined reference to `udp_stats_in6' /home/peifferp/containers/kernel/linux-2.6.24-rc5-mm1/net/sunrpc/xprtsock.c:846: undefined reference to `udp_stats_in6' make[1]: *** [.tmp_vmlinux1] Error 1 make: *** [sub-make] Error 2 After a first look, udp_stats_in6 seems to be defined in ipv6 (file net/ipv6/udp.c) but I have CONFIG_IPV6=m and CONFIG_SUNRPC=y So, SUNRPC uses something defined in a module in my case ? ... looking more, this dependency seems to have been introduced by the patch [UDP]: Restore missing inDatagrams increments ( http://thread.gmane.org/gmane.linux.network/79716/focus=79831 ) (I cc netdev) I don't know what is the right way to fix this ... ? P. Andrew Morton wrote: ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc5/2.6.24-rc5-mm1/ - If something goes wrong with a PCI device's probing or initialisation, try reverting pci-disable-decoding-during-sizing-of-bars.patch. - git-sched was dropped due to breaking suspend-to-RAM. 
Re: 2.6.24-rc5-mm1
On Thu, Dec 13, 2007 at 04:01:34PM +0100, Benjamin Thery wrote:
> The problem comes from the new macro UDPX_INC_STATS_BH introduced by
> Herbert, which was a nice addition to increment the correct UDP MIB
> depending on the socket family, but unfortunately the use of this
> macro from kernel code (I mean code not compiled as module) requires
> that IPv6 is also compiled in kernel (CONFIG_IPv6=y) in order to have
> udp_stats_in6 defined at link time.
>
> Benjamin
>
> Pierre Peiffer wrote:
> > Hi,
> >
> > My config does not link any more:
> > ...
> > CHK include/linux/compile.h
> > UPD include/linux/compile.h
> > CC init/version.o
> > LD init/built-in.o
> > LD .tmp_vmlinux1
> > net/built-in.o: In function `xs_udp_data_ready':
> > /home/peifferp/containers/kernel/linux-2.6.24-rc5-mm1/net/sunrpc/xprtsock.c:842: undefined reference to `udp_stats_in6'
> > /home/peifferp/containers/kernel/linux-2.6.24-rc5-mm1/net/sunrpc/xprtsock.c:846: undefined reference to `udp_stats_in6'
> > make[1]: *** [.tmp_vmlinux1] Error 1
> > make: *** [sub-make] Error 2
> >
> > After a first look, udp_stats_in6 seems to be defined in ipv6 (file
> > net/ipv6/udp.c) but I have
> > CONFIG_IPV6=m
> > and
> > CONFIG_SUNRPC=y
> >
> > So, SUNRPC uses something defined in a module in my case ?
> >
> > ... looking more, this dependency seems to have been introduced by the
> > patch [UDP]: Restore missing inDatagrams increments
> > ( http://thread.gmane.org/gmane.linux.network/79716/focus=79831 )
> > (I cc netdev)
> >
> > I don't know what is the right way to fix this ... ?

you might do select IPV6 in the SUNRPC section of fs/Kconfig, however select is evil...

--
Regards/Gruß,
Boris.
Re: 2.6.24-rc5-mm1
On Thu, Dec 13, 2007 at 05:07:44PM +0100, Borislav Petkov wrote:
> On Thu, Dec 13, 2007 at 04:01:34PM +0100, Benjamin Thery wrote:
> > The problem comes from the new macro UDPX_INC_STATS_BH introduced by
> > Herbert, which was a nice addition to increment the correct UDP MIB
> > depending on the socket family, but unfortunately the use of this
> > macro from kernel code (I mean code not compiled as module) requires
> > that IPv6 is also compiled in kernel (CONFIG_IPv6=y) in order to have
> > udp_stats_in6 defined at link time.
> >
> > Benjamin
> >
> > Pierre Peiffer wrote:
> > > Hi,
> > >
> > > My config does not link any more:
> > > ...
> > > CHK include/linux/compile.h
> > > UPD include/linux/compile.h
> > > CC init/version.o
> > > LD init/built-in.o
> > > LD .tmp_vmlinux1
> > > net/built-in.o: In function `xs_udp_data_ready':
> > > /home/peifferp/containers/kernel/linux-2.6.24-rc5-mm1/net/sunrpc/xprtsock.c:842: undefined reference to `udp_stats_in6'
> > > /home/peifferp/containers/kernel/linux-2.6.24-rc5-mm1/net/sunrpc/xprtsock.c:846: undefined reference to `udp_stats_in6'
> > > make[1]: *** [.tmp_vmlinux1] Error 1
> > > make: *** [sub-make] Error 2
> > >
> > > After a first look, udp_stats_in6 seems to be defined in ipv6 (file
> > > net/ipv6/udp.c) but I have
> > > CONFIG_IPV6=m
> > > and
> > > CONFIG_SUNRPC=y
> > >
> > > So, SUNRPC uses something defined in a module in my case ?
> > >
> > > ... looking more, this dependency seems to have been introduced by the
> > > patch [UDP]: Restore missing inDatagrams increments
> > > ( http://thread.gmane.org/gmane.linux.network/79716/focus=79831 )
> > > (I cc netdev)
> > >
> > > I don't know what is the right way to fix this ... ?
>
> you might do select IPV6 in the SUNRPC section of fs/Kconfig, however
> select is evil...

select itself isn't evil.

Nonsensical selects like the one you suggest (sunrpc does not require IPV6) are evil.

> Regards/Gruß,
> Boris.

cu
Adrian

--
"Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
Re: 2.6.24-rc5-mm1
From: Benjamin Thery [EMAIL PROTECTED] Date: Thu, 13 Dec 2007 16:01:34 +0100 The problem comes from the new macro UDPX_INC_STATS_BH introduced by Herbert, which was a nice addition to increment the correct UDP MIB depending on the socket family, but unfortunately the use of this macro from kernel code (I mean code not compiled as module) requires that IPv6 is also compiled in kernel (CONFIG_IPv6=y) in order to have udp_stats_in6 defined at link time. Herbert, please take a look at this, thanks! Benjamin Pierre Peiffer wrote: Hi, My config does not link any more: ... CHK include/linux/compile.h UPD include/linux/compile.h CC init/version.o LD init/built-in.o LD .tmp_vmlinux1 net/built-in.o: In function `xs_udp_data_ready': /home/peifferp/containers/kernel/linux-2.6.24-rc5-mm1/net/sunrpc/xprtsock.c:842: undefined reference to `udp_stats_in6' /home/peifferp/containers/kernel/linux-2.6.24-rc5-mm1/net/sunrpc/xprtsock.c:846: undefined reference to `udp_stats_in6' make[1]: *** [.tmp_vmlinux1] Error 1 make: *** [sub-make] Error 2 After a first look, udp_stats_in6 seems to be defined in ipv6 (file net/ipv6/udp.c) but I have CONFIG_IPV6=m and CONFIG_SUNRPC=y So, SUNRPC uses something defined in a module in my case ? ... looking more, this dependency seems to have been introduced by the patch [UDP]: Restore missing inDatagrams increments ( http://thread.gmane.org/gmane.linux.network/79716/focus=79831 ) (I cc netdev) I don't know what is the right way to fix this ... ? P. Andrew Morton wrote: ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc5/2.6.24-rc5-mm1/ - If something goes wrong with a PCI device's probing or initialisation, try reverting pci-disable-decoding-during-sizing-of-bars.patch. - git-sched was dropped due to breaking suspend-to-RAM. 
Re: 2.6.24-rc5-mm1 regression - kernel warning on tcp_fastretrans_alert()
On Thu, 13 Dec 2007 20:26:21 +0530 Kamalesh Babulal [EMAIL PROTECTED] wrote:

> Hi Andrew,

Hi. Please do try to cc netdev@vger.kernel.org on net-related problems. Doing so will often save multiple hours latency and will optimise away one entire email (ie: this one).

> Following call trace is seen in 2.6.24-rc5-mm1 kernel also, it was reported
> for 2.6.24-rc4-mm1 kernel http://lkml.org/lkml/2007/12/6/22
>
> ls21b kernel: [ 7530.313408] WARNING: at net/ipv4/tcp_input.c:2533 tcp_fastretrans_alert()
> ls21b kernel: [ 7530.354051] Pid: 0, comm: swapper Not tainted 2.6.24-rc5-mm1 #1
> ls21b kernel: [ 7530.389487]
> ls21b kernel: [ 7530.389488] Call Trace:
> ls21b kernel: [ 7530.413030] <IRQ> [80482374] tcp_fastretrans_alert+0x127/0xdaf
> ls21b kernel: [ 7530.454295] [804850cd] tcp_ack+0xf2f/0x10fe
> ls21b kernel: [ 7530.485066] [80488503] tcp_rcv_established+0x695/0x79a
> ls21b kernel: [ 7530.521542] [8025c46a] trace_hardirqs_off+0x39/0xdc
> ls21b kernel: [ 7530.556468] [8048eb70] tcp_v4_do_rcv+0x37/0x3e1
> ls21b kernel: [ 7530.589317] [80491764] tcp_v4_rcv+0xac7/0xb93
> ls21b kernel: [ 7530.621126] [80472c40] ip_local_deliver_finish+0x54/0x20f
> ls21b kernel: [ 7530.659168] [80472d20] ip_local_deliver_finish+0x134/0x20f
> ls21b kernel: [ 7530.697724] [804732cc] ip_local_deliver+0x72/0x7a
> ls21b kernel: [ 7530.731609] [80472b7c] ip_rcv_finish+0x3c0/0x430
> ls21b kernel: [ 7530.764977] [8044d9a6] netif_receive_skb+0x10e/0x44d
> ls21b kernel: [ 7530.800422] [80473223] ip_rcv+0x326/0x35d
> ls21b kernel: [ 7530.830148] [8044dc77] netif_receive_skb+0x3df/0x44d
> ls21b kernel: [ 7530.865603] [8814d44a] :bnx2:bnx2_poll+0x1262/0x14a4
> ls21b kernel: [ 7530.901039] [8034817d] __next_cpu+0x19/0x28
> ls21b kernel: [ 7530.931805] [802323a1] find_busiest_group+0x252/0x6da
> ls21b kernel: [ 7530.967768] [8025c46a] trace_hardirqs_off+0x39/0xdc
> ls21b kernel: [ 7531.002693] [8025c46a] trace_hardirqs_off+0x39/0xdc
> ls21b kernel: [ 7531.037612] [8025c21f] check_chain_key+0x9c/0x15f
> ls21b kernel: [ 7531.071501] [8026012b] __lock_acquire+0xdee/0xf06
> ls21b kernel: [ 7531.105386] [80450476] net_rx_action+0x75/0x234
> ls21b kernel: [ 7531.138233] [80450476] net_rx_action+0x75/0x234
> ls21b kernel: [ 7531.171074] [804504ed] net_rx_action+0xec/0x234
> ls21b kernel: [ 7531.203920] [80243f02] __do_softirq+0x5f/0xe3
> ls21b kernel: [ 7531.235721] [8020d5cc] call_softirq+0x1c/0x28
> ls21b kernel: [ 7531.267528] [8020ecdf] do_softirq+0x45/0x108
> ls21b kernel: [ 7531.298811] [80243ea1] irq_exit+0x4e/0x50
> ls21b kernel: [ 7531.328540] [8020ef3d] do_IRQ+0x171/0x194
> ls21b kernel: [ 7531.358267] [8020c8c6] ret_from_intr+0x0/0xf
> ls21b kernel: [ 7531.389549] <EOI> [8020b1ec] default_idle+0x58/0x8a
> ls21b kernel: [ 7531.425096] [8020b1ea] default_idle+0x56/0x8a
> ls21b kernel: [ 7531.456900] [8020b194] default_idle+0x0/0x8a
> ls21b kernel: [ 7531.488186] [8020b2d3] cpu_idle+0xb5/0xec
> ls21b kernel: [ 7531.517913] [802226f4] start_secondary+0x3ca/0x3da

That is

	if (WARN_ON(!tp->sacked_out && tp->fackets_out))
		tp->fackets_out = 0;
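
[Editor's sketch] The check Andrew quotes warns when fackets_out is non-zero although sacked_out is zero, and then resets fackets_out so processing can continue. A minimal userspace model of that "warn and repair" pattern follows; struct tp_model, warn_on() and fix_fackets() are hypothetical stand-ins, not the kernel's struct tcp_sock or WARN_ON().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in holding only the two counters named in the check. */
struct tp_model {
	unsigned int sacked_out;
	unsigned int fackets_out;
};

/* Userspace stand-in for WARN_ON(): report the condition and return it. */
static bool warn_on(bool cond, const char *what)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", what);
	return cond;
}

/* Returns true if the inconsistent state was seen and repaired. */
static bool fix_fackets(struct tp_model *tp)
{
	assert(tp != NULL);
	if (warn_on(!tp->sacked_out && tp->fackets_out,
		    "fackets_out set while sacked_out is zero")) {
		tp->fackets_out = 0;
		return true;
	}
	return false;
}
```

The repair hides the symptom only; the warning exists so that whatever left the two counters inconsistent can still be tracked down.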
Re: 2.6.24-rc5-mm1
On Thu, Dec 13, 2007 at 09:45:54AM -0800, David Miller wrote: From: Benjamin Thery [EMAIL PROTECTED] Date: Thu, 13 Dec 2007 16:01:34 +0100 The problem comes from the new macro UDPX_INC_STATS_BH introduced by Herbert, which was a nice addition to increment the correct UDP MIB depending on the socket family, but unfortunately the use of this macro from kernel code (I mean code not compiled as module) requires that IPv6 is also compiled in kernel (CONFIG_IPV6=y) in order to have udp_stats_in6 defined at link time. Herbert, please take a look at this, thanks! OK, let's just move udp_stats_in6 into net/ipv4/udp.c. It's only 40 bytes or less. [UDP]: Move udp_stats_in6 into net/ipv4/udp.c Now that external users may increment the counters directly, we need to ensure that udp_stats_in6 is always available. Otherwise we'd either have to require the external users to be built as modules or ipv6 to be built-in. This isn't too bad because udp_stats_in6 is just a pair of pointers plus an EXPORT, e.g., just 40 (16 + 24) bytes on x86-64.
Signed-off-by: Herbert Xu [EMAIL PROTECTED]

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 9ed6393..3d60215 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -112,6 +112,9 @@
 DEFINE_SNMP_STAT(struct udp_mib, udp_statistics) __read_mostly;
 EXPORT_SYMBOL(udp_statistics);
 
+DEFINE_SNMP_STAT(struct udp_mib, udp_stats_in6) __read_mostly;
+EXPORT_SYMBOL(udp_stats_in6);
+
 struct hlist_head udp_hash[UDP_HTABLE_SIZE];
 DEFINE_RWLOCK(udp_hash_lock);
 
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 8cbdcc9..7db5a9d 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -51,9 +51,6 @@
 #include <linux/seq_file.h>
 #include "udp_impl.h"
 
-DEFINE_SNMP_STAT(struct udp_mib, udp_stats_in6) __read_mostly;
-EXPORT_SYMBOL(udp_stats_in6);
-
 static inline int udp_v6_get_port(struct sock *sk, unsigned short snum)
 {
 	return udp_get_port(sk, snum, ipv6_rcv_saddr_equal);

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
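The link-time dependency Benjamin describes can be shown in miniature with plain C. This is a rough sketch under stated assumptions, not the kernel's macro: enum family, struct mib, udp_stats_v4, udp_stats_v6, UDPX_INC_SKETCH and count_datagram() are all hypothetical names invented for the illustration. What it shows is that a macro which picks a counter by family expands to code naming both counters at every call site, so both objects must be defined somewhere that is always linked in.

```c
#include <assert.h>

/* Hypothetical stand-ins for AF_INET / AF_INET6. */
enum family { FAMILY_INET, FAMILY_INET6 };

/* Hypothetical, heavily reduced stand-in for a UDP MIB. */
struct mib {
    unsigned long in_datagrams;
};

/*
 * Both counter objects live in the same, always-built translation unit.
 * If udp_stats_v6 were defined only in an optional object file, every
 * user of the macro below would fail to link whenever that object was
 * left out, even if the IPv6 branch were never taken at run time.
 */
static struct mib udp_stats_v4;
static struct mib udp_stats_v6;

/* Family-dispatching increment: the expansion references BOTH symbols. */
#define UDPX_INC_SKETCH(family, field)          \
    do {                                        \
        if ((family) == FAMILY_INET)            \
            udp_stats_v4.field++;               \
        else                                    \
            udp_stats_v6.field++;               \
    } while (0)

/* Example caller: counts one received datagram for the given family. */
static void count_datagram(enum family f)
{
    UDPX_INC_SKETCH(f, in_datagrams);
}
```

Moving the definition next to the IPv4 one, as the patch does for udp_stats_in6, corresponds to keeping udp_stats_v6 in the always-built file here: callers no longer care whether the IPv6 code is built in, built as a module, or absent.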