Re: Kernel NFS lockd freezes notebook on shutdown (Linux 2.6.22-rc1 + CFS v12)
On 05/20, [EMAIL PROTECTED] wrote:
>
> I've done some more tests and quite frankly I think this is really related
> to the dreaded ''fglrx.ko'' module. It seems to me that it is much easier
> to reproduce the problem if that damn module is loaded. It does use a
> workqueue. Then there is another driver, ipw3945, loaded, and it requires
> running the binary-only ''ipw3945d'' daemon just to start using the
> wireless driver ...
>
> Either way, both of these kernel modules are workqueue users.
>
> Btw, I also tested the kernel (compiled from the same source) on a
> different laptop (EVO N800v), single core, Pentium M 2GHz. The kernel does
> not freeze on shutdown, and even looping the nfs kernel stop/start does not
> cause any kernel panic as it does on the nx9420 (Dual Core) laptop. And
> that is with or without any patch from Oleg applied. :((

Great. Even if not a bugfix, this patch is a reasonable cleanup anyway.

Thank you very much for the additional testing and report!

Oleg.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Kernel NFS lockd freezes notebook on shutdown (Linux 2.6.22-rc1 + CFS v12)
On Sun, May 20, 2007 at 01:37:13PM +0300, [EMAIL PROTECTED] wrote:
> Hello Oleg,
>
> I've done some more tests and quite frankly I think this is really related
> to the dreaded ''fglrx.ko'' module. It seems to me that it is much easier
> to reproduce the problem if that damn module is loaded. It does use a
> workqueue. Then there is another driver, ipw3945, loaded, and it requires
> running the binary-only ''ipw3945d'' daemon just to start using the
> wireless driver ...
>
> Either way, both of these kernel modules are workqueue users.

Have you ever been able to reproduce the problem on a kernel that never
had those modules loaded?

--b.
Re: Kernel NFS lockd freezes notebook on shutdown (Linux 2.6.22-rc1 + CFS v12)
Hello Oleg,

I've done some more tests and quite frankly I think this is really related
to the dreaded ''fglrx.ko'' module. It seems to me that it is much easier
to reproduce the problem if that damn module is loaded. It does use a
workqueue. Then there is another driver, ipw3945, loaded, and it requires
running the binary-only ''ipw3945d'' daemon just to start using the
wireless driver ...

Either way, both of these kernel modules are workqueue users.

Btw, I also tested the kernel (compiled from the same source) on a
different laptop (EVO N800v), single core, Pentium M 2GHz. The kernel does
not freeze on shutdown, and even looping the nfs kernel stop/start does not
cause any kernel panic as it does on the nx9420 (Dual Core) laptop. And
that is with or without any patch from Oleg applied. :((

I think this time I really need to stop here; the kernel was tainted for a
reason. :(((

Thank you both, Oleg and Andrew.

Zilvinas "Lucky ATI fglrx owner" Valinskas

On Sat, 19 May 2007, Oleg Nesterov wrote:

> On 05/18, Zilvinas Valinskas wrote:
> >
> > On Thu, 2007-05-17 at 22:45 +0400, Oleg Nesterov wrote:
> > >
> > > However, I can't understand why cleanup_workqueue_thread() hangs anyway.
> > > It shouldn't. Looks like rpciod/1 was preempted, and can't get CPU.
> > > According to kernel-nfs-freeze.log it is TASK_RUNNING. Strange.
> > >
> > > It is very sad, because this code was supposed to be cleaned up anyway,
> > > but if it is really buggy, it would be great to know why.
> >
> > Can this be related to :
> >
> > CONFIG_PREEMPT=y
>
> Yes, but this preemption should be very unlikely, yet it happens every
> time for you, strange. lockd in turn spins with preemption enabled, but
> somehow rpciod/1 can't make progress. system_state == SYSTEM_HALT, but
> this shouldn't affect preempt_schedule_irq(). So I think there is
> something else.
>
> > workqueue.objdump - without any patch.
>
> So it hangs waiting for cwq->thread == NULL, as expected. OK.
>
> I still can't see how this code could be wrong, but it is bad anyway and
> should be changed. The 2nd patch was done more than a month ago, but was
> delayed for some stupid reasons. I'll send it today.
>
> Still, it is not clear to me what happens, and you have other crashes with
> nfs stop/start
>
> 	http://marc.info/?l=linux-kernel&m=117939027602591
> 	http://marc.info/?l=linux-kernel&m=117939257630947
>
> which probably need some attention.
>
> Thanks!
>
> Oleg.
Re: Kernel NFS lockd freezes notebook on shutdown (Linux 2.6.22-rc1 + CFS v12)
On 05/18, Zilvinas Valinskas wrote:
>
> On Thu, 2007-05-17 at 22:45 +0400, Oleg Nesterov wrote:
> >
> > However, I can't understand why cleanup_workqueue_thread() hangs anyway.
> > It shouldn't. Looks like rpciod/1 was preempted, and can't get CPU.
> > According to kernel-nfs-freeze.log it is TASK_RUNNING. Strange.
> >
> > It is very sad, because this code was supposed to be cleaned up anyway,
> > but if it is really buggy, it would be great to know why.
>
> Can this be related to :
>
> CONFIG_PREEMPT=y

Yes, but this preemption should be very unlikely, yet it happens every
time for you, strange. lockd in turn spins with preemption enabled, but
somehow rpciod/1 can't make progress. system_state == SYSTEM_HALT, but
this shouldn't affect preempt_schedule_irq(). So I think there is
something else.

> workqueue.objdump - without any patch.

So it hangs waiting for cwq->thread == NULL, as expected. OK.

I still can't see how this code could be wrong, but it is bad anyway and
should be changed. The 2nd patch was done more than a month ago, but was
delayed for some stupid reasons. I'll send it today.

Still, it is not clear to me what happens, and you have other crashes with
nfs stop/start

	http://marc.info/?l=linux-kernel&m=117939027602591
	http://marc.info/?l=linux-kernel&m=117939257630947

which probably need some attention.

Thanks!

Oleg.
Re: Kernel NFS lockd freezes notebook on shutdown (Linux 2.6.22-rc1 + CFS v12)
On Fri, 18 May 2007 15:17:36 +0300 Zilvinas Valinskas <[EMAIL PROTECTED]> wrote:

> Have found this in dmesg (well earlier, because of initcall_debug). I've
> never noticed it during boot (it scrolls away too fast). Anyway -
>
> [7.841871] NetLabel: Initializing
> [7.841983] NetLabel: domain hash size = 128
> [7.842095] NetLabel: protocols = UNLABELED CIPSOv4
> [7.842219] NetLabel: unlabeled traffic allowed by default
> [7.842338] BUG: at include/linux/slub_def.h:77 kmalloc_index()
> [7.842451]
> [7.842452] Call Trace:
> [7.842677] [8029215c] get_slab+0x1cc/0x260
> [7.842791] [8029229d] __kmalloc+0xd/0x80
> [7.842907] [802219ee] cache_k8_northbridges+0x7e/0x100
> [7.843024] [8062bd13] gart_iommu_init+0x33/0x5b0
> [7.843140] [8049f836] netlbl_unlabel_acceptflg_set+0x86/0xf0
> [7.843255] [80626f49] pci_iommu_init+0x9/0x20
> [7.843370] [806216d7] kernel_init+0x157/0x330
> [7.843485] [8020b0f8] child_rip+0xa/0x12
> [7.843601] [80373fd8] acpi_ds_init_one_object+0x0/0x7c
> [7.843715] [80621580] kernel_init+0x0/0x330
> [7.843829] [8020b0ee] child_rip+0x0/0x12
> [7.843941]
> [7.844056] PCI-GART: No AMD northbridge found.

yup, thanks - the below patch will be in this evening's batch -> Linus.


From: Ben Collins <[EMAIL PROTECTED]>

kmalloc for flush_words resulted in zero size allocation when no
k8_northbridges existed. Short circuit the code path for this case. Also
remove unneeded zeroing of num_k8_northbridges just after checking if it is
zero.

Signed-off-by: Ben Collins <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Cc: Dave Jones <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 arch/x86_64/kernel/k8.c |    7 ++-
 1 files changed, 6 insertions(+), 1 deletion(-)

diff -puN arch/x86_64/kernel/k8.c~avoid-zero-size-allocation-in-cache_k8_northbridges arch/x86_64/kernel/k8.c
--- a/arch/x86_64/kernel/k8.c~avoid-zero-size-allocation-in-cache_k8_northbridges
+++ a/arch/x86_64/kernel/k8.c
@@ -39,10 +39,10 @@ int cache_k8_northbridges(void)
 {
 	int i;
 	struct pci_dev *dev;
+
 	if (num_k8_northbridges)
 		return 0;
 
-	num_k8_northbridges = 0;
 	dev = NULL;
 	while ((dev = next_k8_northbridge(dev)) != NULL)
 		num_k8_northbridges++;
@@ -52,6 +52,11 @@ int cache_k8_northbridges(void)
 	if (!k8_northbridges)
 		return -ENOMEM;
 
+	if (!num_k8_northbridges) {
+		k8_northbridges[0] = NULL;
+		return 0;
+	}
+
 	flush_words = kmalloc(num_k8_northbridges * sizeof(u32), GFP_KERNEL);
 	if (!flush_words) {
 		kfree(k8_northbridges);
_

> Does this backtrace look sane ? Hmm, netlabel code mixes with
> acpi_ds_init_one_object() ... Strange.

Backtraces can be pretty messy nowadays. CONFIG_FRAME_POINTER helps
improve them.
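For readers following along outside the kernel tree, here is a minimal
user-space sketch of the short-circuit pattern the patch above applies: do
the always-needed allocation first, then return early when the count is
zero so that no zero-size request reaches the allocator. Everything named
here (struct nb_cache, nb_cache_init(), nb_cache_free()) is hypothetical
and only illustrates the idea; malloc() stands in for kmalloc(), and the
"count + 1" sizing of the device array is an assumption inferred from the
patch storing a terminating NULL in slot 0. It is not the k8.c code.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the cached northbridge state. */
struct nb_cache {
	int count;             /* number of devices found */
	void **devices;        /* NULL-terminated, count + 1 slots (assumed) */
	uint32_t *flush_words; /* one word per device, NULL when count == 0 */
};

/*
 * Returns 0 on success and -ENOMEM on allocation failure. When count is
 * zero the second allocation is skipped entirely, which is the
 * short circuit described in the commit message.
 */
int nb_cache_init(struct nb_cache *c, int count)
{
	int i;

	c->count = count;
	c->flush_words = NULL;

	/* Always at least one slot, for the terminating NULL. */
	c->devices = malloc((size_t)(count + 1) * sizeof(*c->devices));
	if (!c->devices)
		return -ENOMEM;

	if (!count) {
		c->devices[0] = NULL;
		return 0;	/* nothing to flush, so no zero-size request */
	}

	c->flush_words = malloc((size_t)count * sizeof(*c->flush_words));
	if (!c->flush_words) {
		free(c->devices);
		c->devices = NULL;
		return -ENOMEM;
	}

	/* Placeholder contents; a real caller would store device pointers. */
	for (i = 0; i < count; i++) {
		c->devices[i] = NULL;
		c->flush_words[i] = 0;
	}
	c->devices[count] = NULL;
	return 0;
}

/* Release both arrays; safe to call after the count == 0 path too. */
void nb_cache_free(struct nb_cache *c)
{
	free(c->flush_words);
	free(c->devices);
	c->flush_words = NULL;
	c->devices = NULL;
	c->count = 0;
}
```

Calling nb_cache_init(&c, 0) leaves c.flush_words NULL and c.devices[0]
NULL, which mirrors the "no AMD northbridge found" case from the dmesg
output quoted above.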
Re: Kernel NFS lockd freezes notebook on shutdown (Linux 2.6.22-rc1 + CFS v12)
Hello,

Have found this in dmesg (well earlier, because of initcall_debug). I've
never noticed it during boot (it scrolls away too fast). Anyway -

[7.841871] NetLabel: Initializing
[7.841983] NetLabel: domain hash size = 128
[7.842095] NetLabel: protocols = UNLABELED CIPSOv4
[7.842219] NetLabel: unlabeled traffic allowed by default
[7.842338] BUG: at include/linux/slub_def.h:77 kmalloc_index()
[7.842451]
[7.842452] Call Trace:
[7.842677] [8029215c] get_slab+0x1cc/0x260
[7.842791] [8029229d] __kmalloc+0xd/0x80
[7.842907] [802219ee] cache_k8_northbridges+0x7e/0x100
[7.843024] [8062bd13] gart_iommu_init+0x33/0x5b0
[7.843140] [8049f836] netlbl_unlabel_acceptflg_set+0x86/0xf0
[7.843255] [80626f49] pci_iommu_init+0x9/0x20
[7.843370] [806216d7] kernel_init+0x157/0x330
[7.843485] [8020b0f8] child_rip+0xa/0x12
[7.843601] [80373fd8] acpi_ds_init_one_object+0x0/0x7c
[7.843715] [80621580] kernel_init+0x0/0x330
[7.843829] [8020b0ee] child_rip+0x0/0x12
[7.843941]
[7.844056] PCI-GART: No AMD northbridge found.

Does this backtrace look sane ? Hmm, netlabel code mixes with
acpi_ds_init_one_object() ... Strange.

On Wed, 2007-05-16 at 12:15 -0700, Andrew Morton wrote:
> On Wed, 16 May 2007 21:00:41 +0300
> Zilvinas Valinskas <[EMAIL PROTECTED]> wrote:
>
> > Hello,
> >
> > In short, on shutdown my laptop is always freezing now. I was able to
> > capture the 'sysrq-P' (hit that several times) and sysrq-T outputs. Please
> > see .config and log messages at http://barclay.balt.net/~zilvinas/oops/
> >
> > Kernel version I had built according to git is :
> >
> > [EMAIL PROTECTED]:/projects/linux-amd64.git$ git describe HEAD
> > v2.6.22-rc1-29-gfaa8b6c
> >
> > On top of that I have CFS v12 applied (no other changes otherwise).
> > Please note that there is ''fglrx.ko'' loaded and the kernel is tainted
> > because of that (feel free to ignore the report ...).
> >
> > Anyway, 'sysrq-P' always shows that the PC is stuck at (NFS lockd?) and
> > the same backtrace is always shown. 'sysrq-t' output is in the
> > 'kernel-nfs-freeze.log' file (did not want to post it here).
> >
> > Pid: 3652, comm: lockd Tainted: P 2.6.22-rc1-cfs-v12 #1
> >
> > [8024a5a0] wq_barrier_func+0x0/0x10
> > [8024a7e5] destroy_workqueue+0x75/0xa0
> > [8833cd34] :sunrpc:rpciod_down+0xf4/0x170
> > [8836dd74] :lockd:lockd+0x244/0x300
> > [80233e1f] schedule_tail+0x3f/0xb0
> > [8020b0f8] child_rip+0xa/0x12
> > [8836db30] :lockd:lockd+0x0/0x300
> > [8836db30] :lockd:lockd+0x0/0x300
> > [8020b0ee] child_rip+0x0/0x12
> >
> > Hope this helps. Thanks in advance for any advice on how to solve the
> > problem ! For now I am back to '2.6.21.1-cfs-v10'.
> >
>
> Thanks for the report. I'm thinking "Oleg".
Re: Kernel NFS lockd freezes notebook on shutdown (Linux 2.6.22-rc1 + CFS v12)
Hello Oleg,

On Thu, 2007-05-17 at 22:45 +0400, Oleg Nesterov wrote:
> Hello Zilvinas,
>
> On 05/17, Zilvinas Valinskas wrote:
> >
> > Patch seems to help and it seems kernel doesn't freeze anymore. I've
> > booted new kernel and did :
>
> OK, thank you very much. So, we have some other problems, and I _think_
> that workqueue.c is not the source of them.

You are welcome. I wish I could determine and fix the problem myself. I
will try to help debug the problem as long as there is any progress or
there are ideas to try out.

> However, I can't understand why cleanup_workqueue_thread() hangs anyway.
> It shouldn't. Looks like rpciod/1 was preempted, and can't get CPU.
> According to kernel-nfs-freeze.log it is TASK_RUNNING. Strange.
>
> It is very sad, because this code was supposed to be cleaned up anyway,
> but if it is really buggy, it would be great to know why.

Can this be related to :

CONFIG_PREEMPT=y
# CONFIG_PREEMPT_BKL is not set

> Perhaps, we can understand the problem with your help. Could you please
> revert the patch I sent, and send me (privately) the output of
>
> 	objdump -d kernel/workqueue.o

I have uploaded files at http://barclay.balt.net/~zilvinas/oops/

workqueue.objdump - without any patch.
workqueue+oleg-old.objdump - with older patch Oleg sent on Thu, 17 May.
workqueue+oleg-new.objdump - with the newest patch from Oleg applied.

For what it's worth, I am using Debian/Unstable

$ gcc -v
Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --with-gxx-include-dir=/usr/include/c++/4.1.3 --program-suffix=-4.1 --enable-__cxa_atexit --enable-clocale=gnu --enable-libstdcxx-debug --enable-mpfr --enable-checking=release x86_64-linux-gnu
Thread model: posix
gcc version 4.1.3 20070514 (prerelease) (Debian 4.1.2-7)

$ ld -V
GNU ld (GNU Binutils for Debian) 2.17.50.20070426
  Supported emulations:
   elf_x86_64
   elf_i386
   i386linux

> ? I doubt very much I'll see something interesting, but who knows...
>
> Thanks!
>
> Oleg.
>
Re: Kernel NFS lockd freezes notebook on shutdown (Linux 2.6.22-rc1 + CFS v12)
Hello Zilvinas,

On 05/17, Zilvinas Valinskas wrote:
>
> Patch seems to help and it seems kernel doesn't freeze anymore. I've
> booted new kernel and did :

OK, thank you very much. So, we have some other problems, and I _think_
that workqueue.c is not the source of them.

However, I can't understand why cleanup_workqueue_thread() hangs anyway.
It shouldn't. Looks like rpciod/1 was preempted, and can't get CPU.
According to kernel-nfs-freeze.log it is TASK_RUNNING. Strange.

It is very sad, because this code was supposed to be cleaned up anyway,
but if it is really buggy, it would be great to know why.

Perhaps, we can understand the problem with your help. Could you please
revert the patch I sent, and send me (privately) the output of

	objdump -d kernel/workqueue.o

? I doubt very much I'll see something interesting, but who knows...

Thanks!

Oleg.
Re: Kernel NFS lockd freezes notebook on shutdown (Linux 2.6.22-rc1 + CFS v12)
And another crash, achieved by running the following in the shell. I ran
it several times, as seen from dmesg:

$ op=stop; sudo /etc/init.d/nfs-common $op; \
  sudo /etc/init.d/nfs-kernel-server $op; \
  op=start; sudo /etc/init.d/nfs-common $op; \
  sudo /etc/init.d/nfs-kernel-server $op;

Repeat several times ;)

The dmesg output:

May 17 11:36:23 zv kernel: [ 613.071050] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
May 17 11:36:23 zv kernel: [ 613.071082] NFSD: starting 90-second grace period
May 17 11:36:25 zv kernel: [ 615.639312] nfsd: last server has exited
May 17 11:36:25 zv kernel: [ 615.639322] nfsd: unexporting all filesystems
May 17 11:36:25 zv kernel: [ 615.838746] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
May 17 11:36:25 zv kernel: [ 615.838782] NFSD: starting 90-second grace period
May 17 11:36:26 zv kernel: [ 616.464554] nfsd: last server has exited
May 17 11:36:26 zv kernel: [ 616.464563] nfsd: unexporting all filesystems
May 17 11:36:26 zv kernel: [ 616.468219] RPC: failed to contact local rpcbind server (errno 5).
May 17 11:36:26 zv kernel: [ 616.669736] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
May 17 11:36:26 zv kernel: [ 616.669771] NFSD: starting 90-second grace period
May 17 11:36:27 zv kernel: [ 617.200592] nfsd: last server has exited
May 17 11:36:27 zv kernel: [ 617.200601] nfsd: unexporting all filesystems
May 17 11:36:27 zv kernel: [ 617.202565] RPC: failed to contact local rpcbind server (errno 5).
May 17 11:36:27 zv kernel: [ 617.409917] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
May 17 11:36:27 zv kernel: [ 617.409948] NFSD: starting 90-second grace period
May 17 11:36:27 zv kernel: [ 617.872937] nfsd: last server has exited
May 17 11:36:27 zv kernel: [ 617.872945] nfsd: unexporting all filesystems
May 17 11:36:27 zv kernel: [ 617.877526] RPC: failed to contact local rpcbind server (errno 5).
May 17 11:36:28 zv kernel: [ 618.084212] PGD 21f9e067 PUD 3b8bf067 PMD 0
May 17 11:36:28 zv kernel: [ 618.084224] CPU 0
May 17 11:36:28 zv kernel: [ 618.084227] Modules linked in: fglrx(P) nfs ipv6 nfsd exportfs lockd nfs_acl sunrpc ppdev lp autofs4 deflate zlib_deflate twofish twofish_common camellia serpent blowfish des cbc ecb blkcipher aes xcbc sha256 sha1 crypto_null af_key piix ide_core dm_crypt dm_snapshot dm_mirror dm_mod sbp2 loop coretemp cpufreq_conservative cpufreq_stats acpi_cpufreq freq_table pcmcia snd_hda_intel usbhid snd_pcm_oss snd_mixer_oss pl2303 ipw3945 yenta_socket snd_pcm ohci1394 ieee1394 tifm_7xx1 joydev snd_timer usbserial tsdev tpm_infineon sdhci rsrc_nonstatic ieee80211 ieee80211_crypt parport_pc snd tpm fw_ohci fw_core parport firmware_class iTCO_wdt iTCO_vendor_support sg psmouse pcmcia_core tg3 mmc_core crc_itu_t tifm_core pcspkr tpm_bios soundcore snd_page_alloc intel_agp sr_mod serio_raw ehci_hcd uhci_hcd cdrom evdev
May 17 11:36:28 zv kernel: [ 618.084327] Pid: 5560, comm: rpc.nfsd Tainted: P 2.6.22-rc1-cfs-v12 #2
May 17 11:36:28 zv kernel: [ 618.084332] RIP: 0010:[] [] kobject_cleanup+0x24/0xa0
May 17 11:36:28 zv kernel: [ 618.084342] RSP: 0018:8100210bdd08 EFLAGS: 00010202
May 17 11:36:28 zv kernel: [ 618.084347] RAX: 0001 RBX: 810021c7d688 RCX: 804c4be0
May 17 11:36:28 zv kernel: [ 618.084353] RDX: RSI: 80341f40 RDI: 810021c7d688
May 17 11:36:28 zv kernel: [ 618.084358] RBP: 80341f40 R08: R09:
May 17 11:36:28 zv kernel: [ 618.084362] R10: 0001 R11: R12: 0010
May 17 11:36:28 zv kernel: [ 618.084367] R13: 810001fe6270 R14: 88382941 R15:
May 17 11:36:28 zv kernel: [ 618.084374] FS: 2ab11a0db6f0() GS:80603000() knlGS:0000
May 17 11:36:28 zv kernel: [ 618.084379] CS: 0010 DS: ES: CR0: 8005003b
May 17 11:36:28 zv kernel: [ 618.084384] CR2: 0010 CR3: 38d4f000 CR4: 06e0
May 17 11:36:28 zv kernel: [ 618.084390] Process rpc.nfsd (pid: 5560, threadinfo 8100210bc000, task 8100266a6000)
May 17 11:36:28 zv kernel: [ 618.084394] Stack: 0287 810021c7d6a4 80341f40 81003bf98378
May 17 11:36:28 zv kernel: [ 618.084405] 810001fe6270 80342fff 81003bf98378 810038bf0f50
May 17 11:36:28 zv kernel: [ 618.084414] 81003bf98370 802e59ec 88382941 810021250100
May 17 11:36:28 zv kernel: [ 618.084422] Call Trace:
May 17 11:36:28 zv kernel: [ 618.084432] [] kobject_release+0x0/0x10
May 17 11:36:28 zv kernel: [ 618.084440] [] kref_put+0x3f/0x80
May 17 11:36:28 zv kernel: [ 618.084449] [] sysfs_hash_and_remove+0x14c/0x160
May 17 11:36:28 zv kernel: [ 618.084460] [] sysfs_slab_alias+0x71/0xa0
May 17 11:36:28 zv
Re: Kernel NFS lockd freezes notebook on shutdown (Linux 2.6.22-rc1 + CFS v12)
Hello Oleg, Andrew,

Sure, no problem Oleg. Compiling now; reboot will follow with results.

Thank you both!

On Thu, 2007-05-17 at 02:55 +0400, Oleg Nesterov wrote:
> Zilvinas, could you try the patch below?
>
> It is a shot in the dark. I hope I'll suggest something better tomorrow.
>
> Oleg.
Re: Kernel NFS lockd freezes notebook on shutdown (Linux 2.6.22-rc1 + CFS v12)
Hello Oleg,

The patch seems to help: the kernel doesn't freeze anymore. I booted the new kernel and ran:

#1 $ sudo /etc/init.d/nfs-kernel-server stop
#2 $ sudo /etc/init.d/nfs-common stop

Previously it was enough to run '#1' to freeze the kernel. This time, with your patch applied, #1 and #2 worked fine. So far so good.

I don't know why, but I then ran #1 & #2 several times, and the result was an Oops (the kernel is tainted). Oops from dmesg:

[ 429.103734] usb 1-5.4: link qh8-0601/81003ebac320 start 7 [1/2 us]
[ 436.009276] nfsd: last server has exited
[ 436.009410] nfsd: unexporting all filesystems
[ 436.011395] RPC: failed to contact local rpcbind server (errno 5).
[ 460.950495] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
[ 460.950659] NFSD: starting 90-second grace period
[ 615.796112] nfsd: last server has exited
[ 615.796121] nfsd: unexporting all filesystems
[ 615.800976] RPC: failed to contact local rpcbind server (errno 5).
[ 619.444368] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
[ 619.03] NFSD: starting 90-second grace period
[ 620.576730] nfsd: last server has exited
[ 620.576739] nfsd: unexporting all filesystems
[ 620.581036] RPC: failed to contact local rpcbind server (errno 5).
[ 621.606324] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
[ 621.606359] NFSD: starting 90-second grace period
[ 622.561989] nfsd: last server has exited
[ 622.561999] nfsd: unexporting all filesystems
[ 623.639396] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
[ 623.639430] NFSD: starting 90-second grace period
[ 623.639487] Unable to handle kernel paging request at RIP:
[ 623.639492] [8041c47f] __kfree_skb+0x9f/0x150
[ 623.639504] PGD 203067 PUD 0
[ 623.639510] Oops: 0002 [1] PREEMPT SMP
[ 623.639515] CPU 0
[ 623.639519] Modules linked in: fglrx(P) nfs nfsd exportfs lockd nfs_acl sunrpc ppdev lp autofs4 ipw3945 ieee80211 ieee80211_crypt ipv6 deflate zlib_deflate twofish twofish_common camellia serpent blowfish des cbc ecb blkcipher aes xcbc sha256 sha1 crypto_null af_key piix ide_core dm_crypt dm_snapshot dm_mirror dm_mod sbp2 loop coretemp cpufreq_conservative cpufreq_stats acpi_cpufreq freq_table usbhid pl2303 ohci1394 ieee1394 usbserial pcmcia firmware_class snd_hda_intel snd_pcm_oss snd_mixer_oss sdhci snd_pcm joydev iTCO_wdt fw_ohci fw_core mmc_core snd_timer tg3 sg snd yenta_socket rsrc_nonstatic pcmcia_core crc_itu_t iTCO_vendor_support tifm_7xx1 tsdev parport_pc parport intel_agp tpm_infineon tpm tpm_bios uhci_hcd sr_mod tifm_core ehci_hcd psmouse soundcore snd_page_alloc pcspkr serio_raw evdev cdrom
[ 623.639616] Pid: 616, comm: udevd Tainted: P 2.6.22-rc1-cfs-v12 #2
[ 623.639622] RIP: 0010:[8041c47f] [8041c47f] __kfree_skb+0x9f/0x150
[ 623.639631] RSP: 0018:81003ed87be8 EFLAGS: 00010286
[ 623.639635] RAX: 81003f2144a0 RBX: RCX:
[ 623.639641] RDX: 0130 RSI: 8100285eb400 RDI:
[ 623.639646] RBP: 8100285eb400 R08: 0050eaf0 R09:
[ 623.639651] R10: R11: 0246 R12: 81003f214400
[ 623.639656] R13: 81003ed87ee8 R14: 8100285eb440 R15:
[ 623.639662] FS: 2b0370c18e00() GS:80603000() knlGS:
[ 623.639667] CS: 0010 DS: ES: CR0: 8005003b
[ 623.639672] CR2: CR3: 3ed6c000 CR4: 06e0
[ 623.639678] Process udevd (pid: 616, threadinfo 81003ed86000, task 81003ecd)
[ 623.639682] Stack: 81003ed87ee8 81003ed87e68 8100285eb400 8043b6a6
[ 623.639694] 0001 810001ff7b80 0050 81003ed87db8
[ 623.639702]
[ 623.639709] Call Trace:
[ 623.639719] [8043b6a6] netlink_recvmsg+0x176/0x3a0
[ 623.639739] [80415b80] sock_recvmsg+0x150/0x170
[ 623.639754] [8024e760] autoremove_wake_function+0x0/0x30
[ 623.639768] [802a531e] core_sys_select+0x26e/0x350
[ 623.639785] [802a9f05] __d_lookup+0x165/0x180
[ 623.639797] [80416f8e] sys_recvfrom+0xfe/0x190
[ 623.639807] [8024e969] remove_wait_queue+0x19/0x60
[ 623.639823] [802a5874] sys_select+0x44/0x1c0
[ 623.639836] [8020a2ae] system_call+0x7e/0x83
[ 623.639849]
[ 623.639851]
[ 623.639852] Code: f0 ff 0f 0f 94 c0 84 c0 75 27 66 c7 85 a8 00 00 00 00 00 66
[ 623.639871] RIP [8041c47f] __kfree_skb+0x9f/0x150
[ 623.639878] RSP 81003ed87be8
[ 623.639881] CR2:

Hmm, I've got something different now :(

On Thu, 2007-05-17 at 02:55 +0400, Oleg Nesterov wrote:
> On Wed, 16 May 2007 21:00:41 +0300
> Zilvinas Valinskas <[EMAIL PROTECTED]> wrote:
> >
> > In short, on shutdown
Re: Kernel NFS lockd freezes notebook on shutdown (Linux 2.6.22-rc1 + CFS v12)
And another crash, achieved by running the following in the shell. I ran it several times, as can be seen from dmesg:

$ op=stop; sudo /etc/init.d/nfs-common $op; \
  sudo /etc/init.d/nfs-kernel-server $op; \
  op=start; sudo /etc/init.d/nfs-common $op; \
  sudo /etc/init.d/nfs-kernel-server $op;

Repeat several times ;)

The dmesg output:

May 17 11:36:23 zv kernel: [ 613.071050] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
May 17 11:36:23 zv kernel: [ 613.071082] NFSD: starting 90-second grace period
May 17 11:36:25 zv kernel: [ 615.639312] nfsd: last server has exited
May 17 11:36:25 zv kernel: [ 615.639322] nfsd: unexporting all filesystems
May 17 11:36:25 zv kernel: [ 615.838746] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
May 17 11:36:25 zv kernel: [ 615.838782] NFSD: starting 90-second grace period
May 17 11:36:26 zv kernel: [ 616.464554] nfsd: last server has exited
May 17 11:36:26 zv kernel: [ 616.464563] nfsd: unexporting all filesystems
May 17 11:36:26 zv kernel: [ 616.468219] RPC: failed to contact local rpcbind server (errno 5).
May 17 11:36:26 zv kernel: [ 616.669736] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
May 17 11:36:26 zv kernel: [ 616.669771] NFSD: starting 90-second grace period
May 17 11:36:27 zv kernel: [ 617.200592] nfsd: last server has exited
May 17 11:36:27 zv kernel: [ 617.200601] nfsd: unexporting all filesystems
May 17 11:36:27 zv kernel: [ 617.202565] RPC: failed to contact local rpcbind server (errno 5).
May 17 11:36:27 zv kernel: [ 617.409917] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
May 17 11:36:27 zv kernel: [ 617.409948] NFSD: starting 90-second grace period
May 17 11:36:27 zv kernel: [ 617.872937] nfsd: last server has exited
May 17 11:36:27 zv kernel: [ 617.872945] nfsd: unexporting all filesystems
May 17 11:36:27 zv kernel: [ 617.877526] RPC: failed to contact local rpcbind server (errno 5).
May 17 11:36:28 zv kernel: [ 618.084212] PGD 21f9e067 PUD 3b8bf067 PMD 0
May 17 11:36:28 zv kernel: [ 618.084224] CPU 0
May 17 11:36:28 zv kernel: [ 618.084227] Modules linked in: fglrx(P) nfs ipv6 nfsd exportfs lockd nfs_acl sunrpc ppdev lp autofs4 deflate zlib_deflate twofish twofish_common camellia serpent blowfish des cbc ecb blkcipher aes xcbc sha256 sha1 crypto_null af_key piix ide_core dm_crypt dm_snapshot dm_mirror dm_mod sbp2 loop coretemp cpufreq_conservative cpufreq_stats acpi_cpufreq freq_table pcmcia snd_hda_intel usbhid snd_pcm_oss snd_mixer_oss pl2303 ipw3945 yenta_socket snd_pcm ohci1394 ieee1394 tifm_7xx1 joydev snd_timer usbserial tsdev tpm_infineon sdhci rsrc_nonstatic ieee80211 ieee80211_crypt parport_pc snd tpm fw_ohci fw_core parport firmware_class iTCO_wdt iTCO_vendor_support sg psmouse pcmcia_core tg3 mmc_core crc_itu_t tifm_core pcspkr tpm_bios soundcore snd_page_alloc intel_agp sr_mod serio_raw ehci_hcd uhci_hcd cdrom evdev
May 17 11:36:28 zv kernel: [ 618.084327] Pid: 5560, comm: rpc.nfsd Tainted: P 2.6.22-rc1-cfs-v12 #2
May 17 11:36:28 zv kernel: [ 618.084332] RIP: 0010:[80341ec4] [80341ec4] kobject_cleanup+0x24/0xa0
May 17 11:36:28 zv kernel: [ 618.084342] RSP: 0018:8100210bdd08 EFLAGS: 00010202
May 17 11:36:28 zv kernel: [ 618.084347] RAX: 0001 RBX: 810021c7d688 RCX: 804c4be0
May 17 11:36:28 zv kernel: [ 618.084353] RDX: RSI: 80341f40 RDI: 810021c7d688
May 17 11:36:28 zv kernel: [ 618.084358] RBP: 80341f40 R08: R09:
May 17 11:36:28 zv kernel: [ 618.084362] R10: 0001 R11: R12: 0010
May 17 11:36:28 zv kernel: [ 618.084367] R13: 810001fe6270 R14: 88382941 R15:
May 17 11:36:28 zv kernel: [ 618.084374] FS: 2ab11a0db6f0() GS:80603000() knlGS:00 00
May 17 11:36:28 zv kernel: [ 618.084379] CS: 0010 DS: ES: CR0: 8005003b
May 17 11:36:28 zv kernel: [ 618.084384] CR2: 0010 CR3: 38d4f000 CR4: 06e0
May 17 11:36:28 zv kernel: [ 618.084390] Process rpc.nfsd (pid: 5560, threadinfo 8100210bc000, task 8100266a6000)
May 17 11:36:28 zv kernel: [ 618.084394] Stack: 0287 810021c7d6a4 80341f40 81003bf98378
May 17 11:36:28 zv kernel: [ 618.084405] 810001fe6270 80342fff 81003bf98378 810038bf0f50
May 17 11:36:28 zv kernel: [ 618.084414] 81003bf98370 802e59ec 88382941 810021250100
May 17 11:36:28 zv kernel: [ 618.084422] Call Trace:
May 17 11:36:28 zv kernel: [ 618.084432] [80341f40] kobject_release+0x0/0x10
May 17 11:36:28 zv kernel: [ 618.084440] [80342fff] kref_put+0x3f/0x80
May 17 11:36:28 zv kernel: [ 618.084449] [802e59ec] sysfs_hash_and_remove+0x14c/0x160
May 17 11:36:28 zv kernel: [ 618.084460] [] sysfs_slab_alias+0x71/0xa0
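The one-liner from this report can be wrapped in a small loop for repeated runs. This is only a sketch: the function name cycle_nfs and the RUN variable are made up for illustration, and the init script paths are the ones quoted in the report (they may differ on other distributions).

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): repeat the stop/start cycle.
# RUN is the command prefix; set RUN=echo for a dry run that only prints
# the commands instead of executing them.
RUN=${RUN:-sudo}

cycle_nfs() {
    count=${1:-5}          # number of stop/start cycles, default 5
    i=1
    while [ "$i" -le "$count" ]; do
        # Same order as in the report: nfs-common first, then the server.
        for op in stop start; do
            $RUN /etc/init.d/nfs-common "$op"
            $RUN /etc/init.d/nfs-kernel-server "$op"
        done
        i=$((i + 1))
    done
}
```

Calling `cycle_nfs 10` and then checking `dmesg` would repeat the sequence ten times; with `RUN=echo` nothing is actually stopped or started.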
Re: Kernel NFS lockd freezes notebook on shutdown (Linux 2.6.22-rc1 + CFS v12)
Hello Zilvinas,

On 05/17, Zilvinas Valinskas wrote:
>
> Patch seems to help and it seems kernel doesn't freeze anymore. I've booted new
> kernel and did :

OK, thank you very much. So, we have some other problems, and I _think_ that workqueue.c is not the source of them.

However, I can't understand why cleanup_workqueue_thread() hangs anyway. It shouldn't. It looks like rpciod/1 was preempted and can't get the CPU. According to kernel-nfs-freeze.log it is TASK_RUNNING. Strange.

It is very sad, because this code was supposed to be cleaned up anyway, but if it is really buggy, it would be great to know why. Perhaps we can understand the problem with your help. Could you please revert the patch I sent, and send me (privately) the output of objdump -d kernel/workqueue.o ? I doubt very much I'll see something interesting, but who knows...

Thanks!

Oleg.
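If the full disassembly is too long to read comfortably, the part belonging to a single function can be filtered out of it. A minimal sketch, assuming GNU objdump's usual "address <symbol>:" header lines; disasm_func is a hypothetical helper name, not something from the thread.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): read `objdump -d` output on
# stdin and print only the block that belongs to the symbol given as $1.
disasm_func() {
    awk -v sym="$1" '
        # A function header looks like: 0000000000000350 <name>:
        # Start printing at the wanted header, stop at the next header.
        /^[0-9a-f]+ <[^>]+>:$/ { printing = ($2 == ("<" sym ">:")) }
        printing
    '
}

# Usage (assumes a built kernel tree in the current directory):
#   objdump -d kernel/workqueue.o | disasm_func cleanup_workqueue_thread
```

The whole-file output requested above is still the safer thing to send, since the interesting code may have been inlined into a different function.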
Re: Kernel NFS lockd freezes notebook on shutdown (Linux 2.6.22-rc1 + CFS v12)
On Wed, 16 May 2007 21:00:41 +0300
Zilvinas Valinskas <[EMAIL PROTECTED]> wrote:
>
> In short, on shutdown my laptop is always freezing now. I was able to
> capture the 'sysrq-P' (hit that several times), sysrq-T outputs. Please
> see .config and log messages at http://barclay.balt.net/~zilvinas/oops/
>
> Kernel version I had built according git is :
>
> [EMAIL PROTECTED]:/projects/linux-amd64.git$ git describe HEAD
> v2.6.22-rc1-29-gfaa8b6c
>
> On top of that I have CFS v12 applied (no other changes otherwise).
> Please note that there is ''fglrx.ko'' loaded and kernel is tainted
> because of that (feel free to ignore the report ...).
>
> Anyway, 'sysrq-P' always show that PC is stuck at (NFS lockd?) and it is
> always the same backtrace is shown. 'sysrq-t' output is in
> 'kernel-nfs-freeze.log' file (did not want to post it here).
>
> Pid: 3652, comm: lockd Tainted: P 2.6.22-rc1-cfs-v12 #1
>
> [8024a5a0] wq_barrier_func+0x0/0x10
> [8024a7e5] destroy_workqueue+0x75/0xa0
> [8833cd34] :sunrpc:rpciod_down+0xf4/0x170
> [8836dd74] :lockd:lockd+0x244/0x300
> [80233e1f] schedule_tail+0x3f/0xb0
> [8020b0f8] child_rip+0xa/0x12
> [8836db30] :lockd:lockd+0x0/0x300
> [8836db30] :lockd:lockd+0x0/0x300
> [8020b0ee] child_rip+0x0/0x12
>
> Hope this helps. Thanks in advance for any advice how to solve problem !
> For now I am back to '2.6.21.1-cfs-v10'.
>

Nice, thanks.

Zilvinas, could you try the patch below?

It is a shot in the dark. I hope I'll suggest something better tomorrow.

Oleg.

--- OLD/kernel/workqueue.c~	2007-05-17 00:15:37.0 +0400
+++ OLD/kernel/workqueue.c	2007-05-17 02:51:15.0 +0400
@@ -752,16 +752,25 @@ static void cleanup_workqueue_thread(str
 	spin_unlock_irq(&cwq->lock);
 
 	if (alive) {
+		int n;
+
 		wait_for_completion(&barr.done);
-		while (unlikely(cwq->thread != NULL))
-			cpu_relax();
-		/*
-		 * Wait until cwq->thread unlocks cwq->lock,
-		 * it won't touch *cwq after that.
-		 */
-		smp_rmb();
-		spin_unlock_wait(&cwq->lock);
+		for (n = 0;; ++n) {
+			spin_lock_irq(&cwq->lock);
+			alive = (cwq->thread != NULL);
+			spin_unlock_irq(&cwq->lock);
+
+			if (!alive)
+				break;
+
+			if (n > 1000) {
+				printk(KERN_CRIT "ERR!! wq: %s\n", cwq->wq->name);
+				break;
+			}
+
+			yield();
+		}
 	}
 }
Re: Kernel NFS lockd freezes notebook on shutdown (Linux 2.6.22-rc1 + CFS v12)
On Wed, 16 May 2007 21:00:41 +0300
Zilvinas Valinskas <[EMAIL PROTECTED]> wrote:

> Hello,
>
> In short, on shutdown my laptop is always freezing now. I was able to
> capture the 'sysrq-P' (hit that several times), sysrq-T outputs. Please
> see .config and log messages at http://barclay.balt.net/~zilvinas/oops/
>
> Kernel version I had built according git is :
>
> [EMAIL PROTECTED]:/projects/linux-amd64.git$ git describe HEAD
> v2.6.22-rc1-29-gfaa8b6c
>
> On top of that I have CFS v12 applied (no other changes otherwise).
> Please note that there is ''fglrx.ko'' loaded and kernel is tainted
> because of that (feel free to ignore the report ...).
>
> Anyway, 'sysrq-P' always show that PC is stuck at (NFS lockd?) and it is
> always the same backtrace is shown. 'sysrq-t' output is in
> 'kernel-nfs-freeze.log' file (did not want to post it here).
>
> Pid: 3652, comm: lockd Tainted: P 2.6.22-rc1-cfs-v12 #1
>
> [8024a5a0] wq_barrier_func+0x0/0x10
> [8024a7e5] destroy_workqueue+0x75/0xa0
> [8833cd34] :sunrpc:rpciod_down+0xf4/0x170
> [8836dd74] :lockd:lockd+0x244/0x300
> [80233e1f] schedule_tail+0x3f/0xb0
> [8020b0f8] child_rip+0xa/0x12
> [8836db30] :lockd:lockd+0x0/0x300
> [8836db30] :lockd:lockd+0x0/0x300
> [8020b0ee] child_rip+0x0/0x12
>
> Hope this helps. Thanks in advance for any advice how to solve problem !
> For now I am back to '2.6.21.1-cfs-v10'.
>

Thanks for the report. I'm thinking "Oleg".
Kernel NFS lockd freezes notebook on shutdown (Linux 2.6.22-rc1 + CFS v12)
Hello,

In short, on shutdown my laptop is always freezing now. I was able to
capture the 'sysrq-P' (hit that several times), sysrq-T outputs. Please
see .config and log messages at http://barclay.balt.net/~zilvinas/oops/

Kernel version I had built according git is :

[EMAIL PROTECTED]:/projects/linux-amd64.git$ git describe HEAD
v2.6.22-rc1-29-gfaa8b6c

On top of that I have CFS v12 applied (no other changes otherwise).
Please note that there is ''fglrx.ko'' loaded and kernel is tainted
because of that (feel free to ignore the report ...).

Anyway, 'sysrq-P' always show that PC is stuck at (NFS lockd?) and it is
always the same backtrace is shown. 'sysrq-t' output is in
'kernel-nfs-freeze.log' file (did not want to post it here).

Pid: 3652, comm: lockd Tainted: P 2.6.22-rc1-cfs-v12 #1

[8024a5a0] wq_barrier_func+0x0/0x10
[8024a7e5] destroy_workqueue+0x75/0xa0
[8833cd34] :sunrpc:rpciod_down+0xf4/0x170
[8836dd74] :lockd:lockd+0x244/0x300
[80233e1f] schedule_tail+0x3f/0xb0
[8020b0f8] child_rip+0xa/0x12
[8836db30] :lockd:lockd+0x0/0x300
[8836db30] :lockd:lockd+0x0/0x300
[8020b0ee] child_rip+0x0/0x12

Hope this helps. Thanks in advance for any advice how to solve problem !
For now I am back to '2.6.21.1-cfs-v10'.