Re: [Kernel-packages] [Bug 1348670] Re: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010, set_nfsv4_acl_one+0x21/0xb0 [nfsd]
Frank: I think it's better to file a new bug about this. Do include the actual log message. I'm looking at the source code for 3.13.0-66.108 and the fix is still in place, so it can't be *exactly* the same problem as before. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1348670 Title: BUG: unable to handle kernel NULL pointer dereference at 0010, set_nfsv4_acl_one+0x21/0xb0 [nfsd] Status in linux package in Ubuntu: Invalid Status in linux source package in Lucid: Invalid Status in linux source package in Precise: Fix Released Status in linux source package in Trusty: Fix Released Status in linux source package in Utopic: Invalid Status in linux package in Debian: Fix Released Bug description: I've seen this happen twice in the last 8 days on an NFS server running Ubuntu precise and kernels 3.2.0-65.98-generic (on the first occasion) and 3.2.0-67.101-generic (the second time), amd64. This never happened before in several months of operation; until 2014-07-01 this server was running an older 3.2.0 kernel. When this error appears in the logs, the system stops answering NFS RPCs (e.g., "rpcinfo -u localhost nfs 3" hangs) and a reboot is necessary to restore NFS service. A more detailed stack trace follows. Looking at the source code (fs/nfsd/vfs.c:set_nfsv4_acl_one()) I see that the call posix_acl_xattr_size(pacl->a_count) is not preceded by a check that pacl != NULL. Could this be related to the following entry in the changelog for 3.2.0-65.98? 
* NFSD: Call ->set_acl with a NULL ACL structure if no entries - LP: #1328154 Jul 24 10:12:53 server kernel: [575939.742131] IP: [] set_nfsv4_acl_one+0x21/0xb0 [nfsd] Jul 24 10:12:53 server kernel: [575939.742131] PGD c243bb067 PUD c2400a067 PMD 0 Jul 24 10:12:53 server kernel: [575939.742131] Oops: [#1] SMP Jul 24 10:12:53 server kernel: [575939.742131] CPU 3 Jul 24 10:12:53 server kernel: [575939.742131] Modules linked in: usblp btrfs zlib_deflate libcrc32c ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs reiserfs ext2 cts openafs(P) xt_tcpudp ipmi_si ipmi_devintf ipmi_msghandler iptable_filter ip_tables x_tables autofs4 bnep parport_pc rfcomm bluetooth ppdev binfmt_misc rpcsec_gss_krb5 nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc xfs dm_crypt bridge stp psmouse hpilo sp5100_tco i2c_piix4 amd64_edac_mod hpwdt edac_core k10temp edac_mce_amd joydev serio_raw acpi_power_meter mac_hid lp parport raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq async_tx raid1 raid0 multipath linear radeon ttm drm_kms_helper drm osst usbhid hid st ch i2c_algo_bit pata_atiixp hpsa bnx2 Jul 24 10:12:53 server kernel: [575939.742131] Jul 24 10:12:53 server kernel: [575939.742131] Pid: 2523, comm: nfsd Tainted: P O 3.2.0-67-generic #101-Ubuntu HP ProLiant DL385 G7 Jul 24 10:12:53 server kernel: [575939.742131] RIP: 0010:[] [] set_nfsv4_acl_one+0x21/0xb0 [nfsd] Jul 24 10:12:53 server kernel: [575939.742131] RSP: 0018:880422085ce0 EFLAGS: 00010282 Jul 24 10:12:53 server kernel: [575939.742131] RAX: 4000 RBX: 880e29b16cc0 RCX: 013cc2cc Jul 24 10:12:53 server kernel: [575939.742131] RDX: a0583374 RSI: RDI: 880e29b16cc0 Jul 24 10:12:53 server kernel: [575939.742131] RBP: 880422085d10 R08: ea002cdf3b80 R09: a055c4af Jul 24 10:12:53 server kernel: [575939.742131] R10: 880b37ceed00 R11: 4004 R12: Jul 24 10:12:53 server kernel: [575939.742131] R13: 8807f56418c0 R14: R15: 880c2268d180 Jul 24 10:12:53 server kernel: [575939.742131] FS: 7fafd700() GS:88103fc8() 
knlGS: Jul 24 10:12:53 server kernel: [575939.742131] CS: 0010 DS: ES: CR0: 8005003b Jul 24 10:12:53 server kernel: [575939.742131] CR2: 0010 CR3: 000c22d6c000 CR4: 06e0 Jul 24 10:12:53 server kernel: [575939.742131] DR0: DR1: DR2: Jul 24 10:12:53 server kernel: [575939.742131] DR3: DR6: 0ff0 DR7: 0400 Jul 24 10:12:53 server kernel: [575939.742131] Process nfsd (pid: 2523, threadinfo 880422084000, task 880425964500) Jul 24 10:12:53 server kernel: [575939.742131] Stack: Jul 24 10:12:53 server kernel: [575939.742131] 880c2268d040 880e29b16cc0 8807f56418c0 Jul 24 10:12:53 server kernel: [575939.742131] 880c2268d180 880422085d50 a055d5e3 Jul 24 10:12:53 server kernel: [575939.742131] 880b37cee840 880c22684000 880c2268d040 Jul 24 10:12:53 server kernel: [575939.742131] Call Trace: Jul 24 10:12:53 server kernel: [575939.742131] [] nfsd4_set_nfs4_acl+0x143/0x
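A side note on the fault address, offered as a sketch rather than a statement about the exact Ubuntu source. The subject line reports the dereference at 0x10, and the description points at pacl->a_count being read without a NULL check. If struct posix_acl in this kernel series begins with a union of a reference count and an RCU head (two pointers on amd64) followed by a_count, then a_count sits at offset 0x10, and reading it through a NULL pacl would fault at that address. That layout is an assumption based on my recollection of include/linux/posix_acl.h. The ctypes model below mirrors it with fixed-width fields; every class and function name in it is illustrative.

```python
import ctypes

class RcuHead(ctypes.Structure):
    # Assumed amd64 layout of struct rcu_head: two 8-byte pointers.
    _fields_ = [("next", ctypes.c_uint64), ("func", ctypes.c_uint64)]

class RefcountOrRcu(ctypes.Union):
    # Assumed anonymous union at the start of struct posix_acl.
    _fields_ = [("a_refcount", ctypes.c_int32), ("a_rcu", RcuHead)]

class PosixAcl(ctypes.Structure):
    # Only the fields needed to locate a_count are modelled.
    _fields_ = [("u", RefcountOrRcu), ("a_count", ctypes.c_uint32)]

def fault_address(base, field):
    """Address touched when reading `field` through a pointer equal to `base`."""
    return base + field.offset

NULL = 0
addr = fault_address(NULL, PosixAcl.a_count)
print(hex(addr))
```

If the assumed layout holds, the printed address matches the one in the oops, which is consistent with the missing pacl != NULL check suggested in the report.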
Re: [Kernel-packages] [Bug 1504003] Re: Regression: spurious wakeup on HP EliteBook 850 G1
* Christopher M. Penalver [2015-10-09 09:45:00 +]: > Sergio Gelato, please perform the apport-collect as previously > requested. I tried. apport-collect refused to run, asking me to install python-apport. I did that. apport-collect still refused to run, with the same request. I will not be trying again. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1504003 Title: Regression: spurious wakeup on HP EliteBook 850 G1 Status in linux package in Ubuntu: Incomplete Bug description: Starting with Ubuntu kernel 3.13.0-44.73, the system spuriously reboots instead of powering off. Kernels up to 3.13.0-43.72 are not affected. The problem is 100% reproducible. It has also been mentioned in https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1346269/comments/15 but that bug report was originally about different hardware (Lynx Point, not Lynx Point-LP) and should not be hijacked to discuss the EliteBook 8x0. The same holds for bug #1320282. Kernel 3.19.0-30 is also affected. The problem persists if the system is booted with xhci_hcd.quirks=0x2000 (XHCI_SPURIOUS_REBOOT) but goes away if one boots with xhci_hcd.quirks=0x4 (XHCI_SPURIOUS_WAKEUP). I've almost completed a bisection between 3.13.0-43.72 and 3.13.0-44.73. The remaining commits include both "xhci: Switch only Intel Lynx Point-LP ports to EHCI on shutdown." and "xhci: no switching back on non-ULT Haswell", however both the hardware configuration (Lynx Point-LP, 8086:9c31) and the above mentioned tests with quirk bit settings strongly suggest it's the latter commit that is at fault and that on this particular hardware (ULT Haswell) one needs *both* XHCI_SPURIOUS_REBOOT (automatically enabled by all the kernels I mentioned) and XHCI_SPURIOUS_WAKEUP. 
I have only intermittent access to an affected system and limited interest in spending more time on this now that a workaround (run a newer kernel with xhci_hcd.quirks=0x4) is known. However, my findings so far may be valuable to others. A look at the current linux-stable source tree on git.kernel.org suggests that the problem has yet to be addressed upstream (unless it's fixed/masked by some other change elsewhere in the code; I haven't actually tested the latest upstream kernel). The output of lspci -vvvnn (not -vnvn, sorry; I hope it's good enough) is attached. We've tried BIOS revisions 1.30 and 1.33. (1.33 is the latest, and 1.30 was a critical update so anything older is probably not worth supporting.) To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1504003/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1504003] Re: Regression: spurious wakeup on HP EliteBook 850 G1
This problem does not manifest itself in log files. The relevant hardware information (lspci output) is already attached. -- https://bugs.launchpad.net/bugs/1504003 Status in linux package in Ubuntu: Incomplete
[Kernel-packages] [Bug 1504003] Re: Regression: spurious wakeup on HP EliteBook 850 G1
** Attachment removed: "lspci -vvvnn output on EliteBook 850 G1" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1504003/+attachment/4488282/+files/lspci-vvvnn.log ** Attachment added: "lspci -vvvnn output on HP EliteBook 850 G1" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1504003/+attachment/4488303/+files/lspci-vvvnn.log -- https://bugs.launchpad.net/bugs/1504003 Status in linux package in Ubuntu: Incomplete
[Kernel-packages] [Bug 1504003] [NEW] Regression: spurious wakeup on HP EliteBook 850 G1
Public bug reported:

Starting with Ubuntu kernel 3.13.0-44.73, the system spuriously reboots instead of powering off. Kernels up to 3.13.0-43.72 are not affected. The problem is 100% reproducible.

It has also been mentioned in https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1346269/comments/15 but that bug report was originally about different hardware (Lynx Point, not Lynx Point-LP) and should not be hijacked to discuss the EliteBook 8x0. The same holds for bug #1320282.

Kernel 3.19.0-30 is also affected. The problem persists if the system is booted with xhci_hcd.quirks=0x2000 (XHCI_SPURIOUS_REBOOT) but goes away if one boots with xhci_hcd.quirks=0x4 (XHCI_SPURIOUS_WAKEUP).

I've almost completed a bisection between 3.13.0-43.72 and 3.13.0-44.73. The remaining commits include both "xhci: Switch only Intel Lynx Point-LP ports to EHCI on shutdown." and "xhci: no switching back on non-ULT Haswell"; however, both the hardware configuration (Lynx Point-LP, 8086:9c31) and the above-mentioned tests with quirk bit settings strongly suggest it's the latter commit that is at fault and that on this particular hardware (ULT Haswell) one needs *both* XHCI_SPURIOUS_REBOOT (automatically enabled by all the kernels I mentioned) and XHCI_SPURIOUS_WAKEUP.

I have only intermittent access to an affected system and limited interest in spending more time on this now that a workaround (run a newer kernel with xhci_hcd.quirks=0x4) is known. However, my findings so far may be valuable to others. A look at the current linux-stable source tree on git.kernel.org suggests that the problem has yet to be addressed upstream (unless it's fixed/masked by some other change elsewhere in the code; I haven't actually tested the latest upstream kernel).

The output of lspci -vvvnn (not -vnvn, sorry; I hope it's good enough) is attached. We've tried BIOS revisions 1.30 and 1.33. (1.33 is the latest, and 1.30 was a critical update so anything older is probably not worth supporting.)
** Affects: linux (Ubuntu) Importance: Undecided Status: New ** Tags: amd64 trusty ** Attachment added: "lspci -vvvnn output on EliteBook 850 G1" https://bugs.launchpad.net/bugs/1504003/+attachment/4488282/+files/lspci-vvvnn.log
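The bisection mentioned in this report can be sketched as a binary search over an ordered list of commits. The sketch assumes that exactly one commit introduces the regression and that every later build is also bad. The commit labels and the is_bad predicate are hypothetical; in practice each call to the predicate stands for building and booting a kernel and checking whether power-off turns into a reboot.

```python
def first_bad(commits, is_bad):
    """Return the first commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, the last one is bad,
    and once a commit is bad every later one is bad too.
    """
    lo, hi = 0, len(commits) - 1   # invariant: commits[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid               # culprit is mid or earlier
        else:
            lo = mid + 1           # culprit is after mid
    return commits[lo]

# Hypothetical commit labels standing in for the changes between two builds.
commits = ["c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8"]
culprit = "c6"
tested = []

def is_bad(commit):
    tested.append(commit)          # each call stands for one build-and-boot cycle
    return commits.index(commit) >= commits.index(culprit)

result = first_bad(commits, is_bad)
print(result, len(tested))
```

With eight candidate commits the search needs only three test boots, which is why stopping a bisection with two commits left costs at most one more build to settle.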
[Kernel-packages] [Bug 1423472] Re: Logs flooded with "nfs4_schedule_state_manager: kthread_run: -12"
So far, this particular symptom has been seen exactly once. The host it was observed on was reinstalled from scratch with trusty a few weeks ago following a hard disk failure, so no dist-upgrade was involved. I did some NFS-client-related tuning on this and many other machines this week, so it's conceivable that this caused new code paths to be exercised (although the changes were rather benign: a longer credential timeout in rpc.gssd, an explicit port number for nfs.nfs_callback_tcpport, a smaller value for auth_rpcgss.key_expire_timeo, and only the rpc.gssd change had actually taken effect on that host at the time of the incident). I've looked at the source code for kthread_run (or rather the function behind that macro). The error is the result of a memory allocation failure. What caused the kernel to run out of memory (this machine has 32GB of RAM, by the way) last night is probably unknowable at this point, and need not have had anything to do with NFS. *This* bug report is only about the fact that the issuing of that particular error message (from fs/nfs/nfs4state.c:nfs4_schedule_state_manager()) is not rate-limited (neither in 3.13 nor in the linux-stable tree at git.kernel.org), which put an undesirable load on my syslog infrastructure. That should be easy to fix: it's what pr_warn_ratelimited() is for. I cannot reproduce the symptom at will, so I won't actually test the kernel from vivid: any negative result would be inconclusive. Since I know from reading the source code that the message is still not rate-limited upstream, I assume that kernel-bug-exists-upstream is the right choice. ** Tags added: kernel-bug-exists-upstream ** Changed in: linux (Ubuntu) Status: Incomplete => Confirmed -- https://bugs.launchpad.net/bugs/1423472 Status in linux package in Ubuntu: Confirmed
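To give a feel for what rate limiting would buy here, the following is a user-space model of interval-and-burst rate limiting in the spirit of pr_warn_ratelimited(). It is not the kernel implementation. The 5-second interval and burst of 10 are assumptions based on my recollection of the kernel defaults, and the RateLimit class is purely illustrative. The simulation feeds it messages at the reported 2.65 kHz for ten simulated seconds.

```python
class RateLimit:
    """Allow at most `burst` events per `interval` seconds; count the rest."""

    def __init__(self, interval=5.0, burst=10):
        self.interval = interval      # assumed default window length, seconds
        self.burst = burst            # assumed default messages per window
        self.window_start = None
        self.printed = 0
        self.suppressed = 0

    def allow(self, now):
        # Open a new window on first use or when the current one has expired.
        if self.window_start is None or now - self.window_start >= self.interval:
            self.window_start = now
            self.printed = 0
        if self.printed < self.burst:
            self.printed += 1
            return True
        self.suppressed += 1
        return False

RATE_HZ = 2650                        # 2.65 kHz, as reported in the bug
SECONDS = 10
limiter = RateLimit()
emitted = sum(limiter.allow(i / RATE_HZ) for i in range(RATE_HZ * SECONDS))
print(emitted, limiter.suppressed)
```

Under these assumed parameters only a few dozen of the tens of thousands of messages would reach syslog, which is the difference between a noisy log and a flooded one.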
[Kernel-packages] [Bug 1423472] Re: Logs flooded with "nfs4_schedule_state_manager: kthread_run: -12"
Logs too big for inclusion (the problem was log flooding). Also, they would be missed by apport-collect because /var had been filled by an earlier, not necessarily related, problem; the only full copy of the logs is on a remote syslog server which does not run Ubuntu. ** Changed in: linux (Ubuntu) Status: Incomplete => Confirmed -- https://bugs.launchpad.net/bugs/1423472 Status in linux package in Ubuntu: Confirmed
[Kernel-packages] [Bug 1423472] [NEW] Logs flooded with "nfs4_schedule_state_manager: kthread_run: -12"
Public bug reported: An NFSv4 client running kernel 3.13.0-44-generic #73-Ubuntu (amd64) suddenly started spewing nfs4_schedule_state_manager: kthread_run: -12 log messages at an average rate of 2.65 kHz. It did not stop until I rebooted it. At the very least that message needs to be rate-limited. (Doesn't seem to be fixed upstream yet.) As for the underlying problem, -12 is -ENOMEM. I'm afraid I have no idea why the kernel ran out of memory at that point. Will follow up if the problem ever recurs. This bug report is mainly about the lack of rate limiting. ** Affects: linux (Ubuntu) Importance: Undecided Status: New -- https://bugs.launchpad.net/bugs/1423472 Status in linux package in Ubuntu: New
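As a quick cross-check of the two numbers in this report, the snippet below decodes the error code and scales the reported message rate to an hourly volume. Errno numbering is platform-specific; the lookup assumes it runs on Linux, where 12 maps to ENOMEM. The variable names are illustrative.

```python
import errno
import os

code = -12                          # value printed in the log message
name = errno.errorcode[-code]       # symbolic name for errno 12 on this platform
print(name, os.strerror(-code))

rate_hz = 2650                      # 2.65 kHz, as reported
per_hour = rate_hz * 3600           # messages per hour at that sustained rate
print(per_hour)
```

At that rate the message count reaches the millions within an hour, which explains the load on the syslog infrastructure independently of what caused the allocation failure.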
Re: [Kernel-packages] [Bug 1396961] Re: WARNING: wrong connector dpms state (Intel 82946GZ/GL, 8086:2972)
* Christopher M. Penalver [2014-11-27 20:21:56 +]: > Sergio Gelato, could you please specify why exactly you cannot run the > apport-collect? The authorization page doesn't work in my default browser (lynx); I get an error page after authentication. I've tried opening it in Firefox but didn't get anywhere either. I'd be happy to run apport-cli with the output sent to a local file, then upload that as an attachment (in case the apport output I attached when I filed the bug isn't enough). -- https://bugs.launchpad.net/bugs/1396961 Status in “linux” package in Ubuntu: Incomplete
[Kernel-packages] [Bug 1396961] Re: WARNING: wrong connector dpms state (Intel 82946GZ/GL, 8086:2972)
I'm unable to run apport-collect but the bug description already contains detailed information from the logs. ** Changed in: linux (Ubuntu) Status: Incomplete => Confirmed ** Tags added: utopic -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1396961 Title: WARNING: wrong connector dpms state (Intel 82946GZ/GL, 8086:2972) Status in “linux” package in Ubuntu: Confirmed Bug description: First seen after upgrading from precise to trusty, reproducible with both trusty (3.13.0-40) and backported utopic (3.16.0-25) kernels on the following hardware/BIOS: LENOVO 963673G/LENOVO, BIOS 2QKT29AUS 07/29/2008 . The information that follows is mostly from booting 3.13.0 with drm.debug=4. (The apport attachment is from 3.16.0-25.33, however.) Symptom: at variable intervals (minutes to hours), starting immediately after boot, the kernel logs a pair of WARNING messages from drivers/gpu/drm/i915/intel_display.c:4197 and drivers/gpu/drm/i915/intel_display.c:4199: wrong connector dpms state active connector not linked to encoder The computer uses Intel integrated graphics: 00:02.0 VGA compatible controller [0300]: Intel Corporation 82946GZ/GL Integrated Graphics Controller [8086:2972] (rev 02) (prog-if 00 [VGA controller]) Subsystem: Lenovo Device [17aa:300b] Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR- [disabled] Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit- Address: fee0300c Data: 4122 Capabilities: [d0] Power Management version 2 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-) Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME- Kernel driver in use: i915 The only attached display, via a DVI cable, is an Acer: [51.628] (II) intel(0): EDID vendor "ACR", prod id 2032 It looks to me like the kernel is of two minds 
as to whether connector 10 (VGA-2) is connected or not. Eventually it concludes, correctly, that it is disconnected, but intel_sdvo_detect() thinks that it is connected and ->get_hw_state() is fooled by this. The warnings reflect this confusion. There is no obvious loss of function but the warnings are noisy and I'd like to get rid of them. Here is a subset of kernel messages that seem relevant: [...] Nov 19 15:27:09 s1mt02 kernel: [0.584452] vesafb: mode is 1600x1200x32, line length=6400, pages=0 Nov 19 15:27:09 s1mt02 kernel: [0.584454] vesafb: scrolling: redraw Nov 19 15:27:09 s1mt02 kernel: [0.584456] vesafb: Truecolor: size=8:8:8:8, shift=24:16:8:0 Nov 19 15:27:09 s1mt02 kernel: [0.585182] vesafb: framebuffer at 0xc000, mapped to 0xc9000118, using 7552k, total 7552k Nov 19 15:27:09 s1mt02 kernel: [0.662752] Console: switching to colour frame buffer device 200x75 [...] Nov 19 15:27:10 s1mt02 kernel: [ 10.529074] [drm] Initialized drm 1.1.0 20060810 Nov 19 15:27:10 s1mt02 kernel: [ 11.342875] [drm:intel_detect_pch], No PCH found. Nov 19 15:27:10 s1mt02 kernel: [ 11.342882] [drm] Memory usable by graphics device = 512M Nov 19 15:27:10 s1mt02 kernel: [ 11.342885] checking generic (c000 76) vs hw (c000 1000) Nov 19 15:27:10 s1mt02 kernel: [ 11.342887] fb: conflicting fb hw usage inteldrmfb vs VESA VGA - removing generic driver Nov 19 15:27:10 s1mt02 kernel: [ 11.342919] Console: switching to colour dummy device 80x25 Nov 19 15:27:10 s1mt02 kernel: [ 11.392088] i915 :00:02.0: irq 43 for MSI/MSI-X Nov 19 15:27:10 s1mt02 kernel: [ 11.392106] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013). Nov 19 15:27:10 s1mt02 kernel: [ 11.392109] [drm] Driver supports precise vblank timestamp query.
Nov 19 15:27:10 s1mt02 kernel: [ 11.392113] [drm:init_vbt_defaults], Set default to SSC at 100MHz Nov 19 15:27:10 s1mt02 kernel: [ 11.392119] [drm:intel_parse_bios], Using VBT from OpRegion: $VBT BROADWATER-G d Nov 19 15:27:10 s1mt02 kernel: [ 11.392123] [drm:parse_general_features], BDB_GENERAL_FEATURES int_tv_support 1 int_crt_support 1 lvds_use_ssc 0 lvds_ssc_freq 96 display_clock_mode 0 fdi_rx_polarity_inverted 0 Nov 19 15:27:10 s1mt02 kernel: [ 11.392129] [drm:parse_general_definitions], crt_ddc_bus_pin: 2 Nov 19 15:27:10 s1mt02 kernel: [ 11.392135] [drm:parse_sdvo_panel_data], Found SDVO panel mode in BIOS VBT tables: Nov 19 15:27:10 s1mt02 kernel: [ 11.392139] [drm:drm_mode_debug_printmodeline], Modeline 0:"1600x1200" 0 162000 1600 1664 1856 2160 1200 1201 1204 1250 0x8 0xa Nov 19 15:27:10 s1mt02 kernel: [ 11.392146] [drm:parse_sdvo_device_mapping], the SDVO device with slave addr 70 is found on SDVOB port Nov 19 15:
[Kernel-packages] [Bug 1396961] [NEW] WARNING: wrong connector dpms state (Intel 82946GZ/GL, 8086:2972)
Public bug reported: First seen after upgrading from precise to trusty, reproducible with both trusty (3.13.0-40) and backported utopic (3.16.0-25) kernels on the following hardware/BIOS: LENOVO 963673G/LENOVO, BIOS 2QKT29AUS 07/29/2008 . The information that follows is mostly from booting 3.13.0 with drm.debug=4. (The apport attachment is from 3.16.0-25.33, however.) Symptom: at variable intervals (minutes to hours), starting immediately after boot, the kernel logs a pair of WARNING messages from drivers/gpu/drm/i915/intel_display.c:4197 and drivers/gpu/drm/i915/intel_display.c:4199: wrong connector dpms state active connector not linked to encoder The computer uses Intel integrated graphics: 00:02.0 VGA compatible controller [0300]: Intel Corporation 82946GZ/GL Integrated Graphics Controller [8086:2972] (rev 02) (prog-if 00 [VGA controller]) Subsystem: Lenovo Device [17aa:300b] Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR- [disabled] Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit- Address: fee0300c Data: 4122 Capabilities: [d0] Power Management version 2 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-) Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME- Kernel driver in use: i915 The only attached display, via a DVI cable, is an Acer: [51.628] (II) intel(0): EDID vendor "ACR", prod id 2032 It looks to me like the kernel is of two minds as to whether connector 10 (VGA-2) is connected or not. Eventually it concludes, correctly, that it is disconnected, but intel_sdvo_detect() thinks that it is connected and ->get_hw_state() is fooled by this. The warnings reflect this confusion. There is no obvious loss of function but the warnings are noisy and I'd like to get rid of them. Here is a subset of kernel messages that seem relevant: [...] 
Nov 19 15:27:09 s1mt02 kernel: [0.584452] vesafb: mode is 1600x1200x32, line length=6400, pages=0 Nov 19 15:27:09 s1mt02 kernel: [0.584454] vesafb: scrolling: redraw Nov 19 15:27:09 s1mt02 kernel: [0.584456] vesafb: Truecolor: size=8:8:8:8, shift=24:16:8:0 Nov 19 15:27:09 s1mt02 kernel: [0.585182] vesafb: framebuffer at 0xc000, mapped to 0xc9000118, using 7552k, total 7552k Nov 19 15:27:09 s1mt02 kernel: [0.662752] Console: switching to colour frame buffer device 200x75 [...] Nov 19 15:27:10 s1mt02 kernel: [ 10.529074] [drm] Initialized drm 1.1.0 20060810 Nov 19 15:27:10 s1mt02 kernel: [ 11.342875] [drm:intel_detect_pch], No PCH found. Nov 19 15:27:10 s1mt02 kernel: [ 11.342882] [drm] Memory usable by graphics device = 512M Nov 19 15:27:10 s1mt02 kernel: [ 11.342885] checking generic (c000 76) vs hw (c000 1000) Nov 19 15:27:10 s1mt02 kernel: [ 11.342887] fb: conflicting fb hw usage inteldrmfb vs VESA VGA - removing generic driver Nov 19 15:27:10 s1mt02 kernel: [ 11.342919] Console: switching to colour dummy device 80x25 Nov 19 15:27:10 s1mt02 kernel: [ 11.392088] i915 :00:02.0: irq 43 for MSI/MSI-X Nov 19 15:27:10 s1mt02 kernel: [ 11.392106] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013). Nov 19 15:27:10 s1mt02 kernel: [ 11.392109] [drm] Driver supports precise vblank timestamp query. 
Nov 19 15:27:10 s1mt02 kernel: [ 11.392113] [drm:init_vbt_defaults], Set default to SSC at 100MHz Nov 19 15:27:10 s1mt02 kernel: [ 11.392119] [drm:intel_parse_bios], Using VBT from OpRegion: $VBT BROADWATER-G d Nov 19 15:27:10 s1mt02 kernel: [ 11.392123] [drm:parse_general_features], BDB_GENERAL_FEATURES int_tv_support 1 int_crt_support 1 lvds_use_ssc 0 lvds_ssc_freq 96 display_clock_mode 0 fdi_rx_polarity_inverted 0 Nov 19 15:27:10 s1mt02 kernel: [ 11.392129] [drm:parse_general_definitions], crt_ddc_bus_pin: 2 Nov 19 15:27:10 s1mt02 kernel: [ 11.392135] [drm:parse_sdvo_panel_data], Found SDVO panel mode in BIOS VBT tables: Nov 19 15:27:10 s1mt02 kernel: [ 11.392139] [drm:drm_mode_debug_printmodeline], Modeline 0:"1600x1200" 0 162000 1600 1664 1856 2160 1200 1201 1204 1250 0x8 0xa Nov 19 15:27:10 s1mt02 kernel: [ 11.392146] [drm:parse_sdvo_device_mapping], the SDVO device with slave addr 70 is found on SDVOB port Nov 19 15:27:10 s1mt02 kernel: [ 11.392150] [drm:parse_sdvo_device_mapping], SDVO device: dvo=1, addr=70, wiring=1, ddc_pin=29, i2c_pin=5 Nov 19 15:27:10 s1mt02 kernel: [ 11.392154] [drm:parse_sdvo_device_mapping], the SDVO device with slave addr 70 is found on SDVOB port Nov 19 15:27:10 s1mt02 kernel: [ 11.392158] [drm:parse_sdvo_device_mapping], Maybe one SDVO port is shared by two SDVO device. Nov 19 15:27:10 s1mt02 kernel: [ 11.392162] [drm:parse_mipi], No MIPI BDB found Nov 19 15:27:10 s1mt02 kernel: [ 11.392177] [drm:intel_dsm_pci_probe], no _DSM method for intel device
[Kernel-packages] [Bug 1348670] Re: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010, set_nfsv4_acl_one+0x21/0xb0 [nfsd]
** Tags removed: verification-needed-precise ** Tags added: verification-done-precise -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1348670 Title: BUG: unable to handle kernel NULL pointer dereference at 0010, set_nfsv4_acl_one+0x21/0xb0 [nfsd] Status in “linux” package in Ubuntu: Invalid Status in “linux” source package in Lucid: Invalid Status in “linux” source package in Precise: Fix Committed Status in “linux” source package in Trusty: Fix Committed Status in “linux” source package in Utopic: Invalid Status in “linux” package in Debian: Fix Released Bug description: I've seen this happen twice in the last 8 days on an NFS server running Ubuntu precise and kernels 3.2.0-65.98-generic (on the first occasion) and 3.2.0-67.101-generic (the second time), amd64. This never happened before in several months of operation; until 2014-07-01 this server was running an older 3.2.0 kernel. When this error appears in the logs, the system stops answering NFS RPCs (e.g., "rpcinfo -u localhost nfs 3" hangs) and a reboot is necessary to restore NFS service. A more detailed stack trace follows. Looking at the source code (fs/nfsd/vfs.c:set_nfsv4_acl_one()) I see that the call posix_acl_xattr_size(pacl->a_count) is not preceded by a check that pacl != NULL. Could this be related to the following entry in the changelog for 3.2.0-65.98? 
* NFSD: Call ->set_acl with a NULL ACL structure if no entries - LP: #1328154 Jul 24 10:12:53 server kernel: [575939.742131] IP: [] set_nfsv4_acl_one+0x21/0xb0 [nfsd] Jul 24 10:12:53 server kernel: [575939.742131] PGD c243bb067 PUD c2400a067 PMD 0 Jul 24 10:12:53 server kernel: [575939.742131] Oops: [#1] SMP Jul 24 10:12:53 server kernel: [575939.742131] CPU 3 Jul 24 10:12:53 server kernel: [575939.742131] Modules linked in: usblp btrfs zlib_deflate libcrc32c ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs reiserfs ext2 cts openafs(P) xt_tcpudp ipmi_si ipmi_devintf ipmi_msghandler iptable_filter ip_tables x_tables autofs4 bnep parport_pc rfcomm bluetooth ppdev binfmt_misc rpcsec_gss_krb5 nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc xfs dm_crypt bridge stp psmouse hpilo sp5100_tco i2c_piix4 amd64_edac_mod hpwdt edac_core k10temp edac_mce_amd joydev serio_raw acpi_power_meter mac_hid lp parport raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq async_tx raid1 raid0 multipath linear radeon ttm drm_kms_helper drm osst usbhid hid st ch i2c_algo_bit pata_atiixp hpsa bnx2 Jul 24 10:12:53 server kernel: [575939.742131] Jul 24 10:12:53 server kernel: [575939.742131] Pid: 2523, comm: nfsd Tainted: P O 3.2.0-67-generic #101-Ubuntu HP ProLiant DL385 G7 Jul 24 10:12:53 server kernel: [575939.742131] RIP: 0010:[] [] set_nfsv4_acl_one+0x21/0xb0 [nfsd] Jul 24 10:12:53 server kernel: [575939.742131] RSP: 0018:880422085ce0 EFLAGS: 00010282 Jul 24 10:12:53 server kernel: [575939.742131] RAX: 4000 RBX: 880e29b16cc0 RCX: 013cc2cc Jul 24 10:12:53 server kernel: [575939.742131] RDX: a0583374 RSI: RDI: 880e29b16cc0 Jul 24 10:12:53 server kernel: [575939.742131] RBP: 880422085d10 R08: ea002cdf3b80 R09: a055c4af Jul 24 10:12:53 server kernel: [575939.742131] R10: 880b37ceed00 R11: 4004 R12: Jul 24 10:12:53 server kernel: [575939.742131] R13: 8807f56418c0 R14: R15: 880c2268d180 Jul 24 10:12:53 server kernel: [575939.742131] FS: 7fafd700() GS:88103fc8() 
knlGS: Jul 24 10:12:53 server kernel: [575939.742131] CS: 0010 DS: ES: CR0: 8005003b Jul 24 10:12:53 server kernel: [575939.742131] CR2: 0010 CR3: 000c22d6c000 CR4: 06e0 Jul 24 10:12:53 server kernel: [575939.742131] DR0: DR1: DR2: Jul 24 10:12:53 server kernel: [575939.742131] DR3: DR6: 0ff0 DR7: 0400 Jul 24 10:12:53 server kernel: [575939.742131] Process nfsd (pid: 2523, threadinfo 880422084000, task 880425964500) Jul 24 10:12:53 server kernel: [575939.742131] Stack: Jul 24 10:12:53 server kernel: [575939.742131] 880c2268d040 880e29b16cc0 8807f56418c0 Jul 24 10:12:53 server kernel: [575939.742131] 880c2268d180 880422085d50 a055d5e3 Jul 24 10:12:53 server kernel: [575939.742131] 880b37cee840 880c22684000 880c2268d040 Jul 24 10:12:53 server kernel: [575939.742131] Call Trace: Jul 24 10:12:53 server kernel: [575939.742131] [] nfsd4_set_nfs4_acl+0x143/0x150 [nfsd] Jul 24 10:12:53 server kernel: [575939.742131] [] nfsd4_setattr+0xd4/0x130 [nfsd] Jul 24 10:12:53 server kern
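The missing check described in the bug report (posix_acl_xattr_size(pacl->a_count) called without first testing pacl != NULL) can be sketched in plain userspace C. This is a simplified illustration under stated assumptions, not the kernel source and not the fix that was committed: struct acl_sketch, xattr_size_sketch() and acl_buffer_size() are hypothetical names, and the struct layout and the 4-byte and 8-byte sizes are assumptions chosen so that the count field lands at offset 0x10.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t, offsetof, NULL */
#include <stdint.h>   /* uint64_t */

/* Hypothetical stand-in for the kernel's struct posix_acl.
 * The layout is an assumption made for this sketch only. */
struct acl_sketch {
    uint64_t bookkeeping[2];  /* assumed 16 bytes ahead of the count */
    unsigned int a_count;     /* number of ACL entries */
};

/* In this sketch the count field sits 0x10 bytes into the struct. */
static_assert(offsetof(struct acl_sketch, a_count) == 0x10,
              "a_count is expected at offset 0x10 in this sketch");

/* Stand-in for a size helper: assumed 4-byte header plus 8 bytes per entry. */
static size_t xattr_size_sketch(unsigned int count)
{
    return 4u + (size_t)count * 8u;
}

/* Guarded variant: test the pointer before reading a_count.
 * A NULL ACL is treated as "no entries" and needs no buffer. */
size_t acl_buffer_size(const struct acl_sketch *pacl)
{
    if (pacl == NULL)
        return 0;
    return xattr_size_sketch(pacl->a_count);
}
```

Called with a NULL pointer, acl_buffer_size() returns 0 instead of reading a_count. Without the guard, the read would go through a NULL pointer to a field at offset 0x10, which is consistent with the faulting address 0000000000000010 in the bug title.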
Re: [Kernel-packages] [Bug 1348670] Re: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010, set_nfsv4_acl_one+0x21/0xb0 [nfsd]
* ScHRiLL [2014-10-02 14:01:59 +]:
> Same error with new kernel. It was fine and working without a hiccup
> and today the same issue reverted. 16:01:16 up 6 days

Please check changelog.Debian.gz before jumping to conclusions. The fix is not in #103 because it was committed too late in the cycle. Enable precise-proposed, install 70.105 and try again.

> Using
> 3.2.0-69-generic #103-Ubuntu SMP Tue Sep 2 05:02:14 UTC 2014 x86_64 x86_64
> x86_64 GNU/Linux

-- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1348670 Title: BUG: unable to handle kernel NULL pointer dereference at 0010, set_nfsv4_acl_one+0x21/0xb0 [nfsd] Status in “linux” package in Ubuntu: Invalid Status in “linux” source package in Lucid: Invalid Status in “linux” source package in Precise: Fix Committed Status in “linux” source package in Trusty: Fix Committed Status in “linux” source package in Utopic: Invalid Status in “linux” package in Debian: Fix Released Bug description: I've seen this happen twice in the last 8 days on an NFS server running Ubuntu precise and kernels 3.2.0-65.98-generic (on the first occasion) and 3.2.0-67.101-generic (the second time), amd64. This never happened before in several months of operation; until 2014-07-01 this server was running an older 3.2.0 kernel. When this error appears in the logs, the system stops answering NFS RPCs (e.g., "rpcinfo -u localhost nfs 3" hangs) and a reboot is necessary to restore NFS service. A more detailed stack trace follows. Looking at the source code (fs/nfsd/vfs.c:set_nfsv4_acl_one()) I see that the call posix_acl_xattr_size(pacl->a_count) is not preceded by a check that pacl != NULL. Could this be related to the following entry in the changelog for 3.2.0-65.98? 
* NFSD: Call ->set_acl with a NULL ACL structure if no entries - LP: #1328154 Jul 24 10:12:53 server kernel: [575939.742131] IP: [] set_nfsv4_acl_one+0x21/0xb0 [nfsd] Jul 24 10:12:53 server kernel: [575939.742131] PGD c243bb067 PUD c2400a067 PMD 0 Jul 24 10:12:53 server kernel: [575939.742131] Oops: [#1] SMP Jul 24 10:12:53 server kernel: [575939.742131] CPU 3 Jul 24 10:12:53 server kernel: [575939.742131] Modules linked in: usblp btrfs zlib_deflate libcrc32c ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs reiserfs ext2 cts openafs(P) xt_tcpudp ipmi_si ipmi_devintf ipmi_msghandler iptable_filter ip_tables x_tables autofs4 bnep parport_pc rfcomm bluetooth ppdev binfmt_misc rpcsec_gss_krb5 nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc xfs dm_crypt bridge stp psmouse hpilo sp5100_tco i2c_piix4 amd64_edac_mod hpwdt edac_core k10temp edac_mce_amd joydev serio_raw acpi_power_meter mac_hid lp parport raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq async_tx raid1 raid0 multipath linear radeon ttm drm_kms_helper drm osst usbhid hid st ch i2c_algo_bit pata_atiixp hpsa bnx2 Jul 24 10:12:53 server kernel: [575939.742131] Jul 24 10:12:53 server kernel: [575939.742131] Pid: 2523, comm: nfsd Tainted: P O 3.2.0-67-generic #101-Ubuntu HP ProLiant DL385 G7 Jul 24 10:12:53 server kernel: [575939.742131] RIP: 0010:[] [] set_nfsv4_acl_one+0x21/0xb0 [nfsd] Jul 24 10:12:53 server kernel: [575939.742131] RSP: 0018:880422085ce0 EFLAGS: 00010282 Jul 24 10:12:53 server kernel: [575939.742131] RAX: 4000 RBX: 880e29b16cc0 RCX: 013cc2cc Jul 24 10:12:53 server kernel: [575939.742131] RDX: a0583374 RSI: RDI: 880e29b16cc0 Jul 24 10:12:53 server kernel: [575939.742131] RBP: 880422085d10 R08: ea002cdf3b80 R09: a055c4af Jul 24 10:12:53 server kernel: [575939.742131] R10: 880b37ceed00 R11: 4004 R12: Jul 24 10:12:53 server kernel: [575939.742131] R13: 8807f56418c0 R14: R15: 880c2268d180 Jul 24 10:12:53 server kernel: [575939.742131] FS: 7fafd700() GS:88103fc8() 
knlGS: Jul 24 10:12:53 server kernel: [575939.742131] CS: 0010 DS: ES: CR0: 8005003b Jul 24 10:12:53 server kernel: [575939.742131] CR2: 0010 CR3: 000c22d6c000 CR4: 06e0 Jul 24 10:12:53 server kernel: [575939.742131] DR0: DR1: DR2: Jul 24 10:12:53 server kernel: [575939.742131] DR3: DR6: 0ff0 DR7: 0400 Jul 24 10:12:53 server kernel: [575939.742131] Process nfsd (pid: 2523, threadinfo 880422084000, task 880425964500) Jul 24 10:12:53 server kernel: [575939.742131] Stack: Jul 24 10:12:53 server kernel: [575939.742131] 880c2268d040 880e29b16cc0 8807f56418c0 Jul 24 10:12:53 server kernel: [575939.742131] 880c2268d180 880422085d50 a055d5e3 Jul
[Kernel-packages] [Bug 1348670] Re: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010, set_nfsv4_acl_one+0x21/0xb0 [nfsd]
** Tags removed: verification-needed-trusty ** Tags added: verification-done-trusty -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1348670 Title: BUG: unable to handle kernel NULL pointer dereference at 0010, set_nfsv4_acl_one+0x21/0xb0 [nfsd] Status in “linux” package in Ubuntu: Invalid Status in “linux” source package in Lucid: Invalid Status in “linux” source package in Precise: Fix Committed Status in “linux” source package in Trusty: Fix Committed Status in “linux” source package in Utopic: Invalid Status in “linux” package in Debian: New Bug description: I've seen this happen twice in the last 8 days on an NFS server running Ubuntu precise and kernels 3.2.0-65.98-generic (on the first occasion) and 3.2.0-67.101-generic (the second time), amd64. This never happened before in several months of operation; until 2014-07-01 this server was running an older 3.2.0 kernel. When this error appears in the logs, the system stops answering NFS RPCs (e.g., "rpcinfo -u localhost nfs 3" hangs) and a reboot is necessary to restore NFS service. A more detailed stack trace follows. Looking at the source code (fs/nfsd/vfs.c:set_nfsv4_acl_one()) I see that the call posix_acl_xattr_size(pacl->a_count) is not preceded by a check that pacl != NULL. Could this be related to the following entry in the changelog for 3.2.0-65.98? 
* NFSD: Call ->set_acl with a NULL ACL structure if no entries - LP: #1328154 Jul 24 10:12:53 server kernel: [575939.742131] IP: [] set_nfsv4_acl_one+0x21/0xb0 [nfsd] Jul 24 10:12:53 server kernel: [575939.742131] PGD c243bb067 PUD c2400a067 PMD 0 Jul 24 10:12:53 server kernel: [575939.742131] Oops: [#1] SMP Jul 24 10:12:53 server kernel: [575939.742131] CPU 3 Jul 24 10:12:53 server kernel: [575939.742131] Modules linked in: usblp btrfs zlib_deflate libcrc32c ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs reiserfs ext2 cts openafs(P) xt_tcpudp ipmi_si ipmi_devintf ipmi_msghandler iptable_filter ip_tables x_tables autofs4 bnep parport_pc rfcomm bluetooth ppdev binfmt_misc rpcsec_gss_krb5 nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc xfs dm_crypt bridge stp psmouse hpilo sp5100_tco i2c_piix4 amd64_edac_mod hpwdt edac_core k10temp edac_mce_amd joydev serio_raw acpi_power_meter mac_hid lp parport raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq async_tx raid1 raid0 multipath linear radeon ttm drm_kms_helper drm osst usbhid hid st ch i2c_algo_bit pata_atiixp hpsa bnx2 Jul 24 10:12:53 server kernel: [575939.742131] Jul 24 10:12:53 server kernel: [575939.742131] Pid: 2523, comm: nfsd Tainted: P O 3.2.0-67-generic #101-Ubuntu HP ProLiant DL385 G7 Jul 24 10:12:53 server kernel: [575939.742131] RIP: 0010:[] [] set_nfsv4_acl_one+0x21/0xb0 [nfsd] Jul 24 10:12:53 server kernel: [575939.742131] RSP: 0018:880422085ce0 EFLAGS: 00010282 Jul 24 10:12:53 server kernel: [575939.742131] RAX: 4000 RBX: 880e29b16cc0 RCX: 013cc2cc Jul 24 10:12:53 server kernel: [575939.742131] RDX: a0583374 RSI: RDI: 880e29b16cc0 Jul 24 10:12:53 server kernel: [575939.742131] RBP: 880422085d10 R08: ea002cdf3b80 R09: a055c4af Jul 24 10:12:53 server kernel: [575939.742131] R10: 880b37ceed00 R11: 4004 R12: Jul 24 10:12:53 server kernel: [575939.742131] R13: 8807f56418c0 R14: R15: 880c2268d180 Jul 24 10:12:53 server kernel: [575939.742131] FS: 7fafd700() GS:88103fc8() 
knlGS: Jul 24 10:12:53 server kernel: [575939.742131] CS: 0010 DS: ES: CR0: 8005003b Jul 24 10:12:53 server kernel: [575939.742131] CR2: 0010 CR3: 000c22d6c000 CR4: 06e0 Jul 24 10:12:53 server kernel: [575939.742131] DR0: DR1: DR2: Jul 24 10:12:53 server kernel: [575939.742131] DR3: DR6: 0ff0 DR7: 0400 Jul 24 10:12:53 server kernel: [575939.742131] Process nfsd (pid: 2523, threadinfo 880422084000, task 880425964500) Jul 24 10:12:53 server kernel: [575939.742131] Stack: Jul 24 10:12:53 server kernel: [575939.742131] 880c2268d040 880e29b16cc0 8807f56418c0 Jul 24 10:12:53 server kernel: [575939.742131] 880c2268d180 880422085d50 a055d5e3 Jul 24 10:12:53 server kernel: [575939.742131] 880b37cee840 880c22684000 880c2268d040 Jul 24 10:12:53 server kernel: [575939.742131] Call Trace: Jul 24 10:12:53 server kernel: [575939.742131] [] nfsd4_set_nfs4_acl+0x143/0x150 [nfsd] Jul 24 10:12:53 server kernel: [575939.742131] [] nfsd4_setattr+0xd4/0x130 [nfsd] Jul 24 10:12:53 server kernel: [575939
[Kernel-packages] [Bug 1365869] Re: After upgrade to 3.13.0-35.62, rpc.gssd complains about missing /run/rpc_pipefs/gssd/clntXX/info
** Tags removed: verification-needed-trusty ** Tags added: verification-done-trusty -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1365869 Title: After upgrade to 3.13.0-35.62, rpc.gssd complains about missing /run/rpc_pipefs/gssd/clntXX/info Status in “linux” package in Ubuntu: Fix Released Status in “linux” source package in Trusty: Fix Committed Bug description: The following changes in 3.13.0-35.62: * sunrpc: create a new dummy pipe for gssd to hold open - LP: #1327563 * sunrpc: replace sunrpc_net->gssd_running flag with a more reliable check - LP: #1327563 * nfs: check if gssd is running before attempting to use krb5i auth in SETCLIENTID call - LP: #1327563 are causing rpc.gssd to fill syslog with messages of the form ERROR: can't open /run/rpc_pipefs/gssd/clntXX/info: No such file or directory The problem was discussed last December in https://bugzilla.redhat.com/show_bug.cgi?id=1037793 where the resolution was to include the following three patches: http://marc.info/?l=linux-nfs&m=138624689302466&w=2 http://marc.info/?l=linux-nfs&m=138624684502447&w=2 http://marc.info/?l=linux-nfs&m=138624684502447&w=2 These patches are already in the upstream kernel (since 3.14). I suggest cherry-picking them for 3.13. Commit hashes from the 3.14 branch: 3396f92f8be606ea485b0a82d4e7749a448b013b e2f0c83a9de331d9352185ca3642616c13127539 23e66ba97127ff3b064d4c6c5138aa34eafc492f To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1365869/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1365869] Re: After upgrade to 3.13.0-35.62, rpc.gssd complains about missing /run/rpc_pipefs/gssd/clntXX/info
I've built, installed and tested the kernel described in comment #4. It does what I expected it to do: * file /run/rpc_pipefs/gssd/clntXX/info now exists; * rpc.gssd no longer complains. No adverse side effects so far. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1365869 Title: After upgrade to 3.13.0-35.62, rpc.gssd complains about missing /run/rpc_pipefs/gssd/clntXX/info Status in “linux” package in Ubuntu: Fix Released Status in “linux” source package in Trusty: In Progress Bug description: The following changes in 3.13.0-35.62: * sunrpc: create a new dummy pipe for gssd to hold open - LP: #1327563 * sunrpc: replace sunrpc_net->gssd_running flag with a more reliable check - LP: #1327563 * nfs: check if gssd is running before attempting to use krb5i auth in SETCLIENTID call - LP: #1327563 are causing rpc.gssd to fill syslog with messages of the form ERROR: can't open /run/rpc_pipefs/gssd/clntXX/info: No such file or directory The problem was discussed last December in https://bugzilla.redhat.com/show_bug.cgi?id=1037793 where the resolution was to include the following three patches: http://marc.info/?l=linux-nfs&m=138624689302466&w=2 http://marc.info/?l=linux-nfs&m=138624684502447&w=2 http://marc.info/?l=linux-nfs&m=138624684502447&w=2 These patches are already in the upstream kernel (since 3.14). I suggest cherry-picking them for 3.13. Commit hashes from the 3.14 branch: 3396f92f8be606ea485b0a82d4e7749a448b013b e2f0c83a9de331d9352185ca3642616c13127539 23e66ba97127ff3b064d4c6c5138aa34eafc492f
[Kernel-packages] [Bug 1365869] Re: After upgrade to 3.13.0-35.62, rpc.gssd complains about missing /run/rpc_pipefs/gssd/clntXX/info
Won't run apport-collect. (Not really needed for this particular bug.) ** Changed in: linux (Ubuntu) Status: Incomplete => Confirmed -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1365869 Title: After upgrade to 3.13.0-35.62, rpc.gssd complains about missing /run/rpc_pipefs/gssd/clntXX/info Status in “linux” package in Ubuntu: Confirmed Bug description: The following changes in 3.13.0-35.62: * sunrpc: create a new dummy pipe for gssd to hold open - LP: #1327563 * sunrpc: replace sunrpc_net->gssd_running flag with a more reliable check - LP: #1327563 * nfs: check if gssd is running before attempting to use krb5i auth in SETCLIENTID call - LP: #1327563 are causing rpc.gssd to fill syslog with messages of the form ERROR: can't open /run/rpc_pipefs/gssd/clntXX/info: No such file or directory The problem was discussed last December in https://bugzilla.redhat.com/show_bug.cgi?id=1037793 where the resolution was to include the following three patches: http://marc.info/?l=linux-nfs&m=138624689302466&w=2 http://marc.info/?l=linux-nfs&m=138624684502447&w=2 http://marc.info/?l=linux-nfs&m=138624684502447&w=2 These patches are already in the upstream kernel (since 3.14). I suggest cherry-picking them for 3.13. Commit hashes from the 3.14 branch: 3396f92f8be606ea485b0a82d4e7749a448b013b e2f0c83a9de331d9352185ca3642616c13127539 23e66ba97127ff3b064d4c6c5138aa34eafc492f
[Kernel-packages] [Bug 1365869] Re: After upgrade to 3.13.0-35.62, rpc.gssd complains about missing /run/rpc_pipefs/gssd/clntXX/info
** Tags removed: fixed-in-upstream-kernel
** Tags added: kernel-fixed-upstream

** Description changed:

  The following changes in 3.13.0-35.62:
- * sunrpc: create a new dummy pipe for gssd to hold open
-- LP: #1327563
- * sunrpc: replace sunrpc_net->gssd_running flag with a more reliable check
-- LP: #1327563
- * nfs: check if gssd is running before attempting to use krb5i auth in SETCLIENTID call
-- LP: #1327563
+ * sunrpc: create a new dummy pipe for gssd to hold open
+ - LP: #1327563
+ * sunrpc: replace sunrpc_net->gssd_running flag with a more reliable check
+ - LP: #1327563
+ * nfs: check if gssd is running before attempting to use krb5i auth in SETCLIENTID call
+ - LP: #1327563
  are causing rpc.gssd to fill syslog with messages of the form
  ERROR: can't open /run/rpc_pipefs/gssd/clntXX/info: No such file or directory
- The problem was discussed last winter in https://bugzilla.redhat.com/show_bug.cgi?id=1037793
+ The problem was discussed last December in https://bugzilla.redhat.com/show_bug.cgi?id=1037793
  where the resolution was to include the following three patches:
  http://marc.info/?l=linux-nfs&m=138624689302466&w=2
  http://marc.info/?l=linux-nfs&m=138624684502447&w=2
  http://marc.info/?l=linux-nfs&m=138624684502447&w=2
  These patches are already in the upstream kernel (since 3.14). I suggest cherry-picking them for 3.13. Commit hashes from the 3.14 branch:
- 3396f92f8be606ea485b0a82d4e7749a448b013b
- e2f0c83a9de331d9352185ca3642616c13127539
- 23e66ba97127ff3b064d4c6c5138aa34eafc492f
+ 3396f92f8be606ea485b0a82d4e7749a448b013b
+ e2f0c83a9de331d9352185ca3642616c13127539
+ 23e66ba97127ff3b064d4c6c5138aa34eafc492f

-- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. 
https://bugs.launchpad.net/bugs/1365869 Title: After upgrade to 3.13.0-35.62, rpc.gssd complains about missing /run/rpc_pipefs/gssd/clntXX/info Status in “linux” package in Ubuntu: Incomplete Bug description: The following changes in 3.13.0-35.62: * sunrpc: create a new dummy pipe for gssd to hold open - LP: #1327563 * sunrpc: replace sunrpc_net->gssd_running flag with a more reliable check - LP: #1327563 * nfs: check if gssd is running before attempting to use krb5i auth in SETCLIENTID call - LP: #1327563 are causing rpc.gssd to fill syslog with messages of the form ERROR: can't open /run/rpc_pipefs/gssd/clntXX/info: No such file or directory The problem was discussed last December in https://bugzilla.redhat.com/show_bug.cgi?id=1037793 where the resolution was to include the following three patches: http://marc.info/?l=linux-nfs&m=138624689302466&w=2 http://marc.info/?l=linux-nfs&m=138624684502447&w=2 http://marc.info/?l=linux-nfs&m=138624684502447&w=2 These patches are already in the upstream kernel (since 3.14). I suggest cherry-picking them for 3.13. Commit hashes from the 3.14 branch: 3396f92f8be606ea485b0a82d4e7749a448b013b e2f0c83a9de331d9352185ca3642616c13127539 23e66ba97127ff3b064d4c6c5138aa34eafc492f
[Kernel-packages] [Bug 1365869] [NEW] After upgrade to 3.13.0-35.62, rpc.gssd complains about missing /run/rpc_pipefs/gssd/clntXX/info
Public bug reported: The following changes in 3.13.0-35.62: * sunrpc: create a new dummy pipe for gssd to hold open - LP: #1327563 * sunrpc: replace sunrpc_net->gssd_running flag with a more reliable check - LP: #1327563 * nfs: check if gssd is running before attempting to use krb5i auth in SETCLIENTID call - LP: #1327563 are causing rpc.gssd to fill syslog with messages of the form ERROR: can't open /run/rpc_pipefs/gssd/clntXX/info: No such file or directory The problem was discussed last December in https://bugzilla.redhat.com/show_bug.cgi?id=1037793 where the resolution was to include the following three patches: http://marc.info/?l=linux-nfs&m=138624689302466&w=2 http://marc.info/?l=linux-nfs&m=138624684502447&w=2 http://marc.info/?l=linux-nfs&m=138624684502447&w=2 These patches are already in the upstream kernel (since 3.14). I suggest cherry-picking them for 3.13. Commit hashes from the 3.14 branch: 3396f92f8be606ea485b0a82d4e7749a448b013b e2f0c83a9de331d9352185ca3642616c13127539 23e66ba97127ff3b064d4c6c5138aa34eafc492f ** Affects: linux (Ubuntu) Importance: Undecided Status: Incomplete ** Tags: cherry-pick kernel-fixed-upstream kernel-fs regression-update trusty -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. 
https://bugs.launchpad.net/bugs/1365869 Title: After upgrade to 3.13.0-35.62, rpc.gssd complains about missing /run/rpc_pipefs/gssd/clntXX/info Status in “linux” package in Ubuntu: Incomplete Bug description: The following changes in 3.13.0-35.62: * sunrpc: create a new dummy pipe for gssd to hold open - LP: #1327563 * sunrpc: replace sunrpc_net->gssd_running flag with a more reliable check - LP: #1327563 * nfs: check if gssd is running before attempting to use krb5i auth in SETCLIENTID call - LP: #1327563 are causing rpc.gssd to fill syslog with messages of the form ERROR: can't open /run/rpc_pipefs/gssd/clntXX/info: No such file or directory The problem was discussed last December in https://bugzilla.redhat.com/show_bug.cgi?id=1037793 where the resolution was to include the following three patches: http://marc.info/?l=linux-nfs&m=138624689302466&w=2 http://marc.info/?l=linux-nfs&m=138624684502447&w=2 http://marc.info/?l=linux-nfs&m=138624684502447&w=2 These patches are already in the upstream kernel (since 3.14). I suggest cherry-picking them for 3.13. Commit hashes from the 3.14 branch: 3396f92f8be606ea485b0a82d4e7749a448b013b e2f0c83a9de331d9352185ca3642616c13127539 23e66ba97127ff3b064d4c6c5138aa34eafc492f
[Kernel-packages] [Bug 1348670] Re: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010, set_nfsv4_acl_one+0x21/0xb0 [nfsd]
I'm a bit confused by Tim's changes in #16. The bug affects kernels up to and including 3.13 (trusty). I'll take his word that it also affects lucid, but what does a status of Invalid mean?

** Tags removed: regression-update
** Tags added: regression-updatekernel-fixed-upstream

** Tags removed: regression-updatekernel-fixed-upstream
** Tags added: kernel-fixed-upstream regression-update

Status in “linux” package in Ubuntu:
  Invalid
Status in “linux” source package in Lucid:
  In Progress
Status in “linux” package in Debian:
  New
Re: [Kernel-packages] [Bug 1348670] Re: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010, set_nfsv4_acl_one+0x21/0xb0 [nfsd]
* Joseph Salisbury [2014-09-03 19:46:04 +]:
> Also, has the patch in comment #12 been sent upstream for inclusion in
> the mainline/stable kernel?

The affected code was refactored out of existence in kernel 3.14. As such, my patch is inapplicable to 3.14 and later. The replacement set_acl methods in the various filesystem drivers generally are coded to cope with a NULL argument; I didn't conduct an exhaustive search but I looked at a few and didn't notice anything problematic.

Given the above, I see no need to actually test kernel 3.17. Will tag.

Status in “linux” package in Ubuntu:
  Invalid
Status in “linux” source package in Lucid:
  In Progress
Status in “linux” package in Debian:
  New
Re: [Kernel-packages] [Bug 1348670] Re: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010, set_nfsv4_acl_one+0x21/0xb0 [nfsd]
* Sergio Gelato [2014-08-22 07:29:32 -]:
> I'm now testing my one-line patch from comment #5 on top of 3.2.0-67.101
> (amd64, generic kernel flavour). So far it doesn't seem to make things
> worse, but since I don't have a sure-fire way of triggering the bug it
> may take a while to get experimental confirmation that it cures the issue.

I've now got >9 days of uptime on two NFS servers with that patch (both servers had been previously affected by the bug) without any trouble; not a single nfsd thread has been lost.

Unfortunately the fix didn't make it into 3.2.0-68.102 so I'm having to build my own kernels once more. What are the chances of this fix (or an equivalent/better one, of course) being included in 3.2.63?

I'm attaching the patch again in diff form for clarity and convenience.

** Patch added: "nfsd-fix-acl-null-pointer-deref.patch"
   https://bugs.launchpad.net/bugs/1348670/+attachment/4192076/+files/nfsd-fix-acl-null-pointer-deref.patch

Status in “linux” package in Ubuntu:
  Confirmed
Status in “linux” package in Debian:
  New
Re: [Kernel-packages] [Bug 1348670] Re: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010, set_nfsv4_acl_one+0x21/0xb0 [nfsd]
I'm now testing my one-line patch from comment #5 on top of 3.2.0-67.101 (amd64, generic kernel flavour). So far it doesn't seem to make things worse, but since I don't have a sure-fire way of triggering the bug it may take a while to get experimental confirmation that it cures the issue.

(I'm reasonably confident about it based on my reading of the source code, however. The various set_acl methods in 3.14 seem to be doing the same thing as that patch.)

Status in “linux” package in Ubuntu:
  Confirmed
Status in “linux” package in Debian:
  New
Re: [Kernel-packages] [Bug 1348670] Re: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010, set_nfsv4_acl_one+0x21/0xb0 [nfsd]
* Michiel [2014-08-07 10:58:29 -]:
> Any hints at a workaround in the meantime? It's especially nasty since a
> dead NFS server locks up the clients completely.

I'd say either test my suggested patch (I'm on holiday and haven't gotten around to testing, but since it only modifies the code path that triggers the bug you should be pretty safe from side effects) or try nfsd.ko from an older kernel.

Status in “linux” package in Ubuntu:
  Confirmed
Status in “linux” package in Debian:
  New
[Kernel-packages] [Bug 1348670] Re: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010, set_nfsv4_acl_one+0x21/0xb0 [nfsd]
** Bug watch added: Debian Bug tracker #754420
   http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=754420

** Also affects: linux (Debian) via
   http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=754420
   Importance: Unknown
       Status: Unknown

Status in “linux” package in Ubuntu:
  Confirmed
Status in “linux” package in Debian:
  Unknown

Bug description:
  I've seen this happen twice in the last 8 days on an NFS server running Ubuntu precise and kernels 3.2.0-65.98-generic (on the first occasion) and 3.2.0-67.101-generic (the second time), amd64. This never happened before in several months of operation; until 2014-07-01 this server was running an older 3.2.0 kernel.

  When this error appears in the logs, the system stops answering NFS RPCs (e.g., "rpcinfo -u localhost nfs 3" hangs) and a reboot is necessary to restore NFS service.

  A more detailed stack trace follows. Looking at the source code (fs/nfsd/vfs.c:set_nfsv4_acl_one()) I see that the call posix_acl_xattr_size(pacl->a_count) is not preceded by a check that pacl != NULL. Could this be related to the following entry in the changelog for 3.2.0-65.98?

    * NFSD: Call ->set_acl with a NULL ACL structure if no entries
      - LP: #1328154

  Jul 24 10:12:53 server kernel: [575939.742131] IP: [] set_nfsv4_acl_one+0x21/0xb0 [nfsd]
  Jul 24 10:12:53 server kernel: [575939.742131] PGD c243bb067 PUD c2400a067 PMD 0
  Jul 24 10:12:53 server kernel: [575939.742131] Oops: [#1] SMP
  Jul 24 10:12:53 server kernel: [575939.742131] CPU 3
  Jul 24 10:12:53 server kernel: [575939.742131] Modules linked in: usblp btrfs zlib_deflate libcrc32c ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs reiserfs ext2 cts openafs(P) xt_tcpudp ipmi_si ipmi_devintf ipmi_msghandler iptable_filter ip_tables x_tables autofs4 bnep parport_pc rfcomm bluetooth ppdev binfmt_misc rpcsec_gss_krb5 nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc xfs dm_crypt bridge stp psmouse hpilo sp5100_tco i2c_piix4 amd64_edac_mod hpwdt edac_core k10temp edac_mce_amd joydev serio_raw acpi_power_meter mac_hid lp parport raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq async_tx raid1 raid0 multipath linear radeon ttm drm_kms_helper drm osst usbhid hid st ch i2c_algo_bit pata_atiixp hpsa bnx2
  Jul 24 10:12:53 server kernel: [575939.742131]
  Jul 24 10:12:53 server kernel: [575939.742131] Pid: 2523, comm: nfsd Tainted: P O 3.2.0-67-generic #101-Ubuntu HP ProLiant DL385 G7
  Jul 24 10:12:53 server kernel: [575939.742131] RIP: 0010:[] [] set_nfsv4_acl_one+0x21/0xb0 [nfsd]
  Jul 24 10:12:53 server kernel: [575939.742131] RSP: 0018:880422085ce0 EFLAGS: 00010282
  Jul 24 10:12:53 server kernel: [575939.742131] RAX: 4000 RBX: 880e29b16cc0 RCX: 013cc2cc
  Jul 24 10:12:53 server kernel: [575939.742131] RDX: a0583374 RSI: RDI: 880e29b16cc0
  Jul 24 10:12:53 server kernel: [575939.742131] RBP: 880422085d10 R08: ea002cdf3b80 R09: a055c4af
  Jul 24 10:12:53 server kernel: [575939.742131] R10: 880b37ceed00 R11: 4004 R12:
  Jul 24 10:12:53 server kernel: [575939.742131] R13: 8807f56418c0 R14: R15: 880c2268d180
  Jul 24 10:12:53 server kernel: [575939.742131] FS: 7fafd700() GS:88103fc8() knlGS:
  Jul 24 10:12:53 server kernel: [575939.742131] CS: 0010 DS: ES: CR0: 8005003b
  Jul 24 10:12:53 server kernel: [575939.742131] CR2: 0010 CR3: 000c22d6c000 CR4: 06e0
  Jul 24 10:12:53 server kernel: [575939.742131] DR0: DR1: DR2:
  Jul 24 10:12:53 server kernel: [575939.742131] DR3: DR6: 0ff0 DR7: 0400
  Jul 24 10:12:53 server kernel: [575939.742131] Process nfsd (pid: 2523, threadinfo 880422084000, task 880425964500)
  Jul 24 10:12:53 server kernel: [575939.742131] Stack:
  Jul 24 10:12:53 server kernel: [575939.742131] 880c2268d040 880e29b16cc0 8807f56418c0
  Jul 24 10:12:53 server kernel: [575939.742131] 880c2268d180 880422085d50 a055d5e3
  Jul 24 10:12:53 server kernel: [575939.742131] 880b37cee840 880c22684000 880c2268d040
  Jul 24 10:12:53 server kernel: [575939.742131] Call Trace:
  Jul 24 10:12:53 server kernel: [575939.742131] [] nfsd4_set_nfs4_acl+0x143/0x150 [nfsd]
  Jul 24 10:12:53 server kernel: [575939.742131] [] nfsd4_setattr+0xd4/0x130 [nfsd]
  Jul 24 10:12:53 server kernel: [575939.742131] [] nfsd4_proc_compound+0x518/0x6e0 [nfsd]
  Jul
Re: [Kernel-packages] [Bug 1348670] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010, set_nfsv4_acl_one+0x21/0xb0 [nfsd]
* Sergio Gelato [2014-07-25 14:23:03 -]:
> Could this be related to the following entry in the changelog
> for 3.2.0-65.98?
>
> * NFSD: Call ->set_acl with a NULL ACL structure if no entries
> - LP: #1328154

Yes, I think that's it. That change allows posix_state_to_acl() to return NULL in some cases, and the pre-3.14 set_nfsv4_acl() code doesn't guard against being passed a NULL for the pacl argument. From a brief perusal of the sources I think this affects kernels 3.13 (trusty) and older.

A quick fix might be to add

	if (!pacl)
		return vfs_setxattr(dentry, key, NULL, 0, 0);

at the beginning of set_nfsv4_acl_one(). Note I haven't tested this yet.

Status in “linux” package in Ubuntu:
  Confirmed
[Kernel-packages] [Bug 1348670] Re: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010, set_nfsv4_acl_one+0x21/0xb0 [nfsd]
Can't run apport-collect on this server.

** Changed in: linux (Ubuntu)
   Status: Incomplete => Confirmed

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1348670

Title: BUG: unable to handle kernel NULL pointer dereference at 0010, set_nfsv4_acl_one+0x21/0xb0 [nfsd]

Status in “linux” package in Ubuntu: Confirmed
[Kernel-packages] [Bug 992678] Re: 3.2.0-24-virtual rejects sec=krb5p NFS mount option
*** This bug is a duplicate of bug 769527 ***
    https://bugs.launchpad.net/bugs/769527

** This bug has been marked a duplicate of bug 769527
   Missing rpcsec_gss_krb5 module

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/992678

Title: 3.2.0-24-virtual rejects sec=krb5p NFS mount option

Status in “linux” package in Ubuntu: Confirmed

Bug description:

  # uname -a
  Linux client 3.2.0-24-virtual #37-Ubuntu SMP Wed Apr 25 12:51:49 UTC 2012 i686 i686 i386 GNU/Linux

  # mount -v -t nfs -o sec=krb5p server:/srv/nfs/k /mnt
  mount.nfs: timeout set for Tue May 1 17:46:30 2012
  mount.nfs: trying text-based options 'sec=krb5p,vers=4,addr=192.168.0.10,clientaddr=192.168.0.20'
  mount.nfs: mount(2): Invalid argument
  mount.nfs: an incorrect mount option was specified

If I boot the same VM using 3.2.0-24-generic-pae (no other changes), the mount succeeds. sec=sys NFS mounts work just fine with either kernel.

After each failed mount attempt, dmesg says:

  gss_create: Pseudoflavor 390005 not found!
  RPC: Couldn't create auth handle (flavor 390005)

Additional info:

  # cat /proc/version_signature
  Ubuntu 3.2.0-24.37-virtual 3.2.14

lspci -vnvn produces no output. This is running as a guest on a Debian squeeze Xen 4.0.1 host.

The auth_rpcgss module is loaded:

  # lsmod
  Module                  Size  Used by
  autofs4                27969  31
  xt_tcpudp              12531  14
  nf_conntrack_ipv4      19084  5
  nf_defrag_ipv4         12649  1 nf_conntrack_ipv4
  xt_state               12514  5
  nf_conntrack           73847  2 nf_conntrack_ipv4,xt_state
  iptable_filter         12706  1
  ip_tables              18106  1 iptable_filter
  x_tables               21974  4 xt_tcpudp,xt_state,iptable_filter,ip_tables
  nfsd                  229850  2
  nfs                   307289  0
  lockd                  78804  2 nfsd,nfs
  fscache                50642  1 nfs
  auth_rpcgss            39597  2 nfsd,nfs
  nfs_acl                12771  2 nfsd,nfs
  sunrpc                205647  6 nfsd,nfs,lockd,auth_rpcgss,nfs_acl
  ext2                   67987  1
  lp                     17455  0
  parport                40930  1 lp

Comparing the kernel config files in /boot shows only these differences:

  -CONFIG_VERSION_SIGNATURE="Ubuntu 3.2.0-24.37-generic-pae 3.2.14"
  +CONFIG_VERSION_SIGNATURE="Ubuntu 3.2.0-24.37-virtual 3.2.14"
  -CONFIG_PHYSICAL_START=0x100
  +CONFIG_PHYSICAL_START=0x10
  -CONFIG_PHYSICAL_ALIGN=0x100
  +CONFIG_PHYSICAL_ALIGN=0x10
  -CONFIG_INTEL_IDLE=y
  +# CONFIG_INTEL_IDLE is not set

dmesg says:

  gss_create: Pseudoflavor 390005 not found!
  RPC: Couldn't create auth handle (flavor 390005)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/992678/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
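The lsmod check in the report above can also be done programmatically. What follows is only a minimal sketch, assuming the usual /proc/modules layout in which each line starts with the module name followed by a space; the function names module_listed() and module_loaded() are hypothetical and not part of any existing tool.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `name` is the first field of any line
 * read from `fp` (the layout assumed for /proc/modules), 0 otherwise. */
int module_listed(FILE *fp, const char *name)
{
    char line[512];
    size_t len = strlen(name);
    int at_line_start = 1;   /* guards against matching inside an over-long line */

    while (fgets(line, sizeof line, fp) != NULL) {
        if (at_line_start &&
            strncmp(line, name, len) == 0 && line[len] == ' ')
            return 1;
        /* The next chunk starts a new line only if this one ended in '\n'. */
        at_line_start = (strchr(line, '\n') != NULL);
    }
    return 0;
}

/* Hypothetical wrapper: 1 = loaded, 0 = not loaded, -1 = list unreadable. */
int module_loaded(const char *name)
{
    FILE *fp = fopen("/proc/modules", "r");
    int found;

    if (fp == NULL)
        return -1;
    found = module_listed(fp, name);
    fclose(fp);
    return found;
}
```

On the system described in the report, one would expect module_loaded("auth_rpcgss") to return 1 and module_loaded("rpcsec_gss_krb5") to return 0. The exact-field match matters because a plain substring search for "nfs" would also match nfsd and nfs_acl.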
[Kernel-packages] [Bug 1348670] [NEW] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010, set_nfsv4_acl_one+0x21/0xb0 [nfsd]
Public bug reported:

I've seen this happen twice in the last 8 days on an NFS server running Ubuntu precise and kernels 3.2.0-65.98-generic (on the first occasion) and 3.2.0-67.101-generic (the second time), amd64. This never happened before in several months of operation; until 2014-07-01 this server was running an older 3.2.0 kernel.

When this error appears in the logs, the system stops answering NFS RPCs (e.g., "rpcinfo -u localhost nfs 3" hangs) and a reboot is necessary to restore NFS service.

A more detailed stack trace follows. Looking at the source code (fs/nfsd/vfs.c:set_nfsv4_acl_one()) I see that the call posix_acl_xattr_size(pacl->a_count) is not preceded by a check that pacl != NULL. Could this be related to the following entry in the changelog for 3.2.0-65.98?

  * NFSD: Call ->set_acl with a NULL ACL structure if no entries
    - LP: #1328154

Jul 24 10:12:53 server kernel: [575939.742131] IP: [] set_nfsv4_acl_one+0x21/0xb0 [nfsd]
Jul 24 10:12:53 server kernel: [575939.742131] PGD c243bb067 PUD c2400a067 PMD 0
Jul 24 10:12:53 server kernel: [575939.742131] Oops: [#1] SMP
Jul 24 10:12:53 server kernel: [575939.742131] CPU 3
Jul 24 10:12:53 server kernel: [575939.742131] Modules linked in: usblp btrfs zlib_deflate libcrc32c ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs reiserfs ext2 cts openafs(P) xt_tcpudp ipmi_si ipmi_devintf ipmi_msghandler iptable_filter ip_tables x_tables autofs4 bnep parport_pc rfcomm bluetooth ppdev binfmt_misc rpcsec_gss_krb5 nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc xfs dm_crypt bridge stp psmouse hpilo sp5100_tco i2c_piix4 amd64_edac_mod hpwdt edac_core k10temp edac_mce_amd joydev serio_raw acpi_power_meter mac_hid lp parport raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq async_tx raid1 raid0 multipath linear radeon ttm drm_kms_helper drm osst usbhid hid st ch i2c_algo_bit pata_atiixp hpsa bnx2
Jul 24 10:12:53 server kernel: [575939.742131]
Jul 24 10:12:53 server kernel: [575939.742131] Pid: 2523, comm: nfsd Tainted: P O 3.2.0-67-generic #101-Ubuntu HP ProLiant DL385 G7
Jul 24 10:12:53 server kernel: [575939.742131] RIP: 0010:[] [] set_nfsv4_acl_one+0x21/0xb0 [nfsd]
Jul 24 10:12:53 server kernel: [575939.742131] RSP: 0018:880422085ce0 EFLAGS: 00010282
Jul 24 10:12:53 server kernel: [575939.742131] RAX: 4000 RBX: 880e29b16cc0 RCX: 013cc2cc
Jul 24 10:12:53 server kernel: [575939.742131] RDX: a0583374 RSI: RDI: 880e29b16cc0
Jul 24 10:12:53 server kernel: [575939.742131] RBP: 880422085d10 R08: ea002cdf3b80 R09: a055c4af
Jul 24 10:12:53 server kernel: [575939.742131] R10: 880b37ceed00 R11: 4004 R12:
Jul 24 10:12:53 server kernel: [575939.742131] R13: 8807f56418c0 R14: R15: 880c2268d180
Jul 24 10:12:53 server kernel: [575939.742131] FS: 7fafd700() GS:88103fc8() knlGS:
Jul 24 10:12:53 server kernel: [575939.742131] CS: 0010 DS: ES: CR0: 8005003b
Jul 24 10:12:53 server kernel: [575939.742131] CR2: 0010 CR3: 000c22d6c000 CR4: 06e0
Jul 24 10:12:53 server kernel: [575939.742131] DR0: DR1: DR2:
Jul 24 10:12:53 server kernel: [575939.742131] DR3: DR6: 0ff0 DR7: 0400
Jul 24 10:12:53 server kernel: [575939.742131] Process nfsd (pid: 2523, threadinfo 880422084000, task 880425964500)
Jul 24 10:12:53 server kernel: [575939.742131] Stack:
Jul 24 10:12:53 server kernel: [575939.742131] 880c2268d040 880e29b16cc0 8807f56418c0
Jul 24 10:12:53 server kernel: [575939.742131] 880c2268d180 880422085d50 a055d5e3
Jul 24 10:12:53 server kernel: [575939.742131] 880b37cee840 880c22684000 880c2268d040
Jul 24 10:12:53 server kernel: [575939.742131] Call Trace:
Jul 24 10:12:53 server kernel: [575939.742131] [] nfsd4_set_nfs4_acl+0x143/0x150 [nfsd]
Jul 24 10:12:53 server kernel: [575939.742131] [] nfsd4_setattr+0xd4/0x130 [nfsd]
Jul 24 10:12:53 server kernel: [575939.742131] [] nfsd4_proc_compound+0x518/0x6e0 [nfsd]
Jul 24 10:12:53 server kernel: [575939.742131] [] nfsd_dispatch+0xeb/0x230 [nfsd]
Jul 24 10:12:53 server kernel: [575939.742131] [] svc_process_common+0x345/0x690 [sunrpc]
Jul 24 10:12:53 server kernel: [575939.742131] [] ? try_to_wake_up+0x200/0x200
Jul 24 10:12:53 server kernel: [575939.742131] [] svc_process+0x102/0x150 [sunrpc]
Jul 24 10:12:53 server kernel: [575939.742131] [] nfsd+0xbd/0x160 [nfsd]
Jul 24 10:12:53 server kernel: [575939.742131] [] ? nfsd_startup+0xf0/0xf0 [nfsd]
Jul 24 10:12:53 server kernel: [575939.742131] [] kthread+0x8c/0xa0
Jul 24 10:12:53 server kernel: [575939.742131] [] kernel_thread_helper+0x4/0x10
Jul 24 10:12:53 server kernel: [575939.742131] [] ? flush_kthread_worker+
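The report above describes a fault at address 0x10 inside set_nfsv4_acl_one() and suspects that pacl->a_count is read while pacl is NULL. The following user-space sketch only illustrates that reasoning; it is not the kernel code and not the fix that was shipped. The structure mock_posix_acl, its 16-byte header, the size constants, and the function mock_acl_xattr_size() are all assumptions made for illustration. The point is that reading a field that sits 16 bytes into a structure through a NULL pointer touches address 0x10, and that an explicit NULL check avoids the access.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct posix_acl. The 16-byte
 * header is an assumption standing in for whatever precedes a_count. */
struct mock_posix_acl {
    unsigned char header[16];
    unsigned int  a_count;      /* number of ACL entries */
};

/* Hypothetical sizes, chosen for illustration only. */
enum { MOCK_XATTR_HEADER = 4, MOCK_XATTR_ENTRY = 8 };

/* Guarded size computation: a NULL ACL is treated as "no entries" and
 * yields 0 instead of dereferencing pacl to read a_count. */
size_t mock_acl_xattr_size(const struct mock_posix_acl *pacl)
{
    if (pacl == NULL)
        return 0;
    return MOCK_XATTR_HEADER + (size_t)pacl->a_count * MOCK_XATTR_ENTRY;
}
```

Under these assumptions, offsetof(struct mock_posix_acl, a_count) evaluates to 0x10, which is the same value the oops shows in CR2. Whether the real structure in 3.2.0-67 has that exact layout would need to be confirmed against the kernel source.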
[Kernel-packages] [Bug 1348670] Re: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010, set_nfsv4_acl_one+0x21/0xb0 [nfsd]
** Attachment added: "lspci-vnvn.log"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1348670/+attachment/4162784/+files/lspci-vnvn.log

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1348670

Title: BUG: unable to handle kernel NULL pointer dereference at 0010, set_nfsv4_acl_one+0x21/0xb0 [nfsd]

Status in “linux” package in Ubuntu: Incomplete
Re: [Kernel-packages] [Bug 769527] Re: Missing rpcsec_gss_krb5 module
* Tim Gardner [2014-05-13 19:04:45 -]:
> Marking verification-done-trusty.
>
> dpkg-deb --contents linux-image-3.13.0-26-generic_3.13.0-26.48_amd64.deb |grep rpcsec_gss_krb5
> -rw-r--r-- root/root 50260 2014-05-07 18:38 ./lib/modules/3.13.0-26-generic/kernel/net/sunrpc/auth_gss/rpcsec_gss_krb5.ko
>
> ** Tags removed: verification-needed-trusty
> ** Tags added: verification-done-trusty

Confirmed for i386 as well, by actually installing the kernel, booting into it and exercising the functionality of rpcsec_gss_krb5. Thanks.

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/769527

Title: Missing rpcsec_gss_krb5 module

Status in “linux” package in Ubuntu: Fix Committed
Status in “linux” source package in Trusty: Fix Committed
Status in “linux” source package in Utopic: Fix Committed

Bug description:

Binary package hint: linux-image-2.6.38-8-virtual

The module rpcsec_gss_krb5 is missing:

  dumaresq@borgil:/lib/modules/2.6.38-8-virtual/kernel/net/sunrpc/auth_gss$ ls -l
  total 72
  -rw-r--r-- 1 root root 72336 2011-04-11 01:06 auth_rpcgss.ko
  dumaresq@borgil:/lib/modules/2.6.38-8-virtual/kernel/net/sunrpc/auth_gss$

Whereas on my "real" machine:

  root@middleearth:/lib/modules/2.6.38-8-server/kernel/net/sunrpc/auth_gss# ls -l
  total 124
  -rw-r--r-- 1 root root 72336 2011-04-11 00:46 auth_rpcgss.ko
  -rw-r--r-- 1 root root 51776 2011-04-11 00:46 rpcsec_gss_krb5.ko
  root@middleearth:/lib/modules/2.6.38-8-server/kernel/net/sunrpc/auth_gss#

uname -a
  Linux borgil 2.6.38-8-virtual #42-Ubuntu SMP Mon Apr 11 04:06:34 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

cat /proc/version_signature
  Ubuntu 2.6.38-8.42-virtual 2.6.38.2

dmesg
[0.00] Initializing cgroup subsys cpuset
[0.00] Initializing cgroup subsys cpu
[0.00] Linux version 2.6.38-8-virtual (buildd@allspice) (gcc version 4.5.2 (Ubuntu/Linaro 4.5.2-8ubuntu3) ) #42-Ubuntu SMP Mon Apr 11 04:06:34 UTC 2011 (Ubuntu 2.6.38-8.42-virtual
2.6.38.2) [0.00] Command line: root=UUID=210a1bd7-248d-4f8d-aa03-6efa9927332e ro quiet splash [0.00] BIOS-provided physical RAM map: [0.00] BIOS-e820: - 0009fc00 (usable) [0.00] BIOS-e820: 0009fc00 - 000a (reserved) [0.00] BIOS-e820: 000f - 0010 (reserved) [0.00] BIOS-e820: 0010 - 3fffd000 (usable) [0.00] BIOS-e820: 3fffd000 - 4000 (reserved) [0.00] BIOS-e820: feffc000 - ff00 (reserved) [0.00] BIOS-e820: fffc - 0001 (reserved) [0.00] NX (Execute Disable) protection: active [0.00] DMI 2.4 present. [0.00] DMI: Bochs Bochs, BIOS Bochs 01/01/2007 [0.00] e820 update range: - 0001 (usable) ==> (reserved) [0.00] e820 remove range: 000a - 0010 (usable) [0.00] No AGP bridge found [0.00] last_pfn = 0x3fffd max_arch_pfn = 0x4 [0.00] MTRR default type: write-back [0.00] MTRR fixed ranges enabled: [0.00] 0-9 write-back [0.00] A-B uncachable [0.00] C-F write-protect [0.00] MTRR variable ranges enabled: [0.00] 0 base 00E000 mask FFE000 uncachable [0.00] 1 disabled [0.00] 2 disabled [0.00] 3 disabled [0.00] 4 disabled [0.00] 5 disabled [0.00] 6 disabled [0.00] 7 disabled [0.00] PAT not supported by CPU. [0.00] found SMP MP-table at [880fd790] fd790 [0.00] initial memory mapped : 0 - 2000 [0.00] init_memory_mapping: -3fffd000 [0.00] 00 - 003fe0 page 2M [0.00] 003fe0 - 003fffd000 page 4k [0.00] kernel direct mapping tables up to 3fffd000 @ 1fffd000-2000 [0.00] RAMDISK: 37bd3000 - 37ff [0.00] ACPI: RSDP 000fd740 00014 (v00 BOCHS ) [0.00] ACPI: RSDT 3fffdc40 00034 (v01 BOCHS BXPCRSDT 0001 BXPC 0001) [0.00] ACPI: FACP 3e70 00074 (v01 BOCHS BXPCFACP 0001 BXPC 0001) [0.00] ACPI: DSDT 3fffde40 01FB7 (v01 BXPC BXDSDT 0001 INTL 20090123) [0.00] ACPI: FACS 3e00 00040 [0.00] ACPI: SSDT 3fffdda0 0009E (v01 BOCHS BXPCSSDT 0001 BXPC 0001) [0.00] ACPI: APIC 3fffdcc0 00072 (v01 BOCHS BXPCAPIC 0001 BXPC 0001) [0.00] ACPI: HPET 3fffdc80 00038 (v01 BOCHS BXPCHPET 0001 BXPC 0001) [0.00] ACPI: Local APIC address 0xfee0 [0.00] No NUMA config
[Kernel-packages] [Bug 769527] Re: Missing rpcsec_gss_krb5 module
The problem here is that rpcsec_gss_krb5 is in linux-image-extra, not in linux-image. As a result, if one installs the -virtual kernel flavour, that module is missing. The problem still exists in trusty (linux-image-{,extra-}3.13.0-24-generic).

It's annoying since Kerberized NFS is a perfectly normal thing to run in a virtual machine. Also, the related module auth_rpcgss is in the main package, so I fail to see why rpcsec_gss_krb5 should be an extra.

Please move rpcsec_gss_krb5 to the main linux-image package.

** Changed in: linux (Ubuntu)
   Status: Expired => Confirmed

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/769527

Title: Missing rpcsec_gss_krb5 module

Status in “linux” package in Ubuntu: Confirmed
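Since the complaint in this thread is that rpcsec_gss_krb5.ko is absent from the module tree of some kernel flavours, a small check for the file can be sketched as follows. This is an illustration only: the directory layout is taken from the listings quoted in the thread, and the helper names krb5_module_path() and krb5_module_installed() are hypothetical. The sketch also assumes an uncompressed .ko file, as in those listings.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: writes the expected path of rpcsec_gss_krb5.ko for
 * the given kernel release into buf. Returns 0 on success, -1 if the
 * buffer is too small or formatting fails. */
int krb5_module_path(const char *release, char *buf, size_t buflen)
{
    int n = snprintf(buf, buflen,
                     "/lib/modules/%s/kernel/net/sunrpc/auth_gss/rpcsec_gss_krb5.ko",
                     release);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}

/* Hypothetical helper: 1 if the module file exists for that release,
 * 0 if it does not, -1 if the path could not be built. */
int krb5_module_installed(const char *release)
{
    char path[256];

    if (krb5_module_path(release, path, sizeof path) != 0)
        return -1;
    return access(path, F_OK) == 0;
}
```

A caller would typically pass the release string reported by uname -r, for example "2.6.38-8-virtual" for the guest described in the bug.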