Re: [PATCH] torture: Correctly fetch CPUs for kvm-build.sh with all native language
On Thu, 1 Apr 2021, Paul E. McKenney wrote:
> +# This script knows only English.
> +LANG=en_US.UTF-8; export LANG

This, too, will only work if en_US.UTF-8 is installed. Check with "locale -a" if it is. Also, Perl will complain loudly if the language is not installed (try: "LANG=en_US.UTF-9 perl"), a nice way to test if LANG works as expected.

So, wouldn't LANG=C be a more conservative fallback here?

Christian.
-- 
BOFH excuse #58: high pressure system failure
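The "is that locale actually installed?" check can also be scripted. The following is only a minimal sketch, not part of the patch under discussion: the helper name first_usable_locale is made up for illustration, and it assumes a libc where setlocale() rejects locales that are not installed.

```python
import locale


def first_usable_locale(candidates):
    """Return the first locale name that setlocale() accepts, or None.

    Hypothetical helper for illustration only.
    """
    saved = locale.setlocale(locale.LC_ALL)  # remember the current setting
    try:
        for name in candidates:
            try:
                locale.setlocale(locale.LC_ALL, name)
                return name
            except locale.Error:
                continue  # locale not available on this system
        return None
    finally:
        locale.setlocale(locale.LC_ALL, saved)  # restore the previous setting


if __name__ == "__main__":
    # Prefer the UTF-8 locale, fall back to the always-present "C" locale.
    print(first_usable_locale(["en_US.UTF-8", "C"]))
```

Since "C" is always available, putting it last in the list mirrors the conservative fallback suggested above.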
acpi PNP0C14:02: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910
Hi, while looking through boot messages I came across the following on a Lenovo T470 laptop with Linux 5.8: acpi PNP0C14:02: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:01) acpi PNP0C14:03: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:01) Searching the interwebs brought me to an old patch proposal: > https://lkml.org/lkml/2017/12/8/914 > Fri, 8 Dec 2017 20:34:21 -0600 > [PATCH 2/2] platform/x86: wmi: Allow creating WMI devices with the same GUID The patch was proposed, but never made it into mainline. It's not really a big deal, booting continues and all devices appear to work, only these two messages get logged during boot. I'm just wondering if this needs to be fixed or if it's really just a cosmetic issue. Full dmesg: https://pastebin.ubuntu.com/p/2pPv3hywPF/ Thanks, Christian. -- BOFH excuse #451: astropneumatic oscillations in the water-cooling
Re: [PATCH] CREDITS: remove link http://www.dementia.org/~shadow
On Tue, 14 Jul 2020, Jonathan Corbet wrote: > > N: Derrick J. Brashear > > E: sha...@dementia.org > > -W: http://www.dementia.org/~shadow That particular entry moved to: W: http://www.contrib.andrew.cmu.edu/~shadow/ (The https version only supports TLSv1, and Firefox balks) Otherwise, what Jon said: > So thanks for addressing these. That said, I do wonder if this is quite > the right thing to do. I'm assuming that the old sites still exist in the > wayback machine somewhere, and somebody might actually want to find them. > Pity the poor anthropologist researching the origins of the the > billion-line, free-software kernels widely used in the 2500's... > > So maybe we should either mark it as "[BROKEN]" or make a direct link into > the wayback machine instead? That would enable the suitably motivated to > go after the content that once existed. As an innocent bystander, I'd opt for [BROKEN] tags, or Wayback machine substitutes, instead of just removing those entries. My 2 cents, Christian. -- BOFH excuse #128: Power Company having EMP problems with their reactor
Re: process '/usr/bin/rsync' started with executable stack
On Tue, 23 Jun 2020, Kees Cook wrote: > > If you run something with exec stack after the message > > you shouldn't get it second time. > > If you want to reset this flag, you can do: > # echo 1 > /sys/kernel/debug/clear_warn_once Thanks. Although, I tend to not mount /sys/kernel/{config,debug,tracing} and other things, I always thought they are not needed and could maybe lower the attack surface if not mounted. Or maybe my tinfoil hat needs some adjustment... Christian. -- BOFH excuse #279: The static electricity routing is acting up...
Re: [PATCH] Re: filesystem being remounted supports timestamps until 2038
On Sat, 4 Jan 2020, Christian Kujau wrote:
> On Sun, 29 Dec 2019, Linus Torvalds wrote:
> > > When file systems are remounted a couple of times per day (e.g. rw/ro
> > > for backup purposes), dmesg gets flooded with these messages. Change
> > > pr_warn into pr_debug to make it stop.
> >
> > How about just doing it once per mount?
>
> Yes, once per mount would work, and maybe not print a warning on remounts
> at all.

Is there any chance that this can be revisited perhaps? This is still flooding my dmesg just because I have that (crude?) mechanism in place to remount the backup device read-only after the hourly backup run. Sure, I could omit that ("Doc, it hurts when I do that", as Al would comment), but that's really the only repeating message that gets triggered because of this. 1067 messages in ~60 days of uptime :-|

Does the patch below make any sense, would that work?

Please reconsider,
Christian.

> Commit f8b92ba67c5d ("mount: Add mount warning for impending timestamp
> expiry") introduced:
>
>    Mounted %s file system at %s supports timestamps until [...]
>
> in mnt_warn_timestamp_expiry(), but then 0ecee6699064 ("fs/namespace.c:
> fix use-after-free of mount in mnt_warn_timestamp_expiry") changed this to
>
>    %s filesystem being %s at %s supports timestamps until [...]
>
> in order to fix a use-after-free.
>
> > Of course, if you actually unmount and completely re-mount a
> > filesystem, then that would still warn multiple times, but at that
> > point I think it's reasonable to do.
>
> Yes, of course. Umount/remount cycles should still issue a warning, but
> "-o remount" should not, IMHO.
>
> Thanks,
> Christian.

commit c9a5338b4930cdf99073042de0717db43d7b75be
Author: Christian Kujau
Date:   Thu Dec 26 17:39:57 2019 -0800

    Commit f8b92ba67c5d ("mount: Add mount warning for impending timestamp expiry") resp. 0ecee6699064 ("fix use-after-free of mount in mnt_warn_timestamp_expiry()") introduced a pr_warn message and the following gets sent to dmesg on every remount:

     [...] filesystem being remounted at /mnt supports timestamps until 2038 (0x7fff)

    When file systems are remounted a couple of times per day (e.g. rw/ro for backup purposes), dmesg gets flooded with these messages. Change pr_warn into pr_debug to make it stop.

    Signed-off-by: Christian Kujau

diff --git a/fs/namespace.c b/fs/namespace.c
index be601d3a8008..afc6a13e7316 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -2478,7 +2478,7 @@ static void mnt_warn_timestamp_expiry(struct path *mountpoint, struct vfsmount *
 		time64_to_tm(sb->s_time_max, 0, &tm);
-		pr_warn("%s filesystem being %s at %s supports timestamps until %04ld (0x%llx)\n",
+		pr_debug("%s filesystem being %s at %s supports timestamps until %04ld (0x%llx)\n",
 			sb->s_type->name,
 			is_mounted(mnt) ? "remounted" : "mounted",
 			mntpath,
-- 
BOFH excuse #66: bit bucket overflow
Re: process '/usr/bin/rsync' started with executable stack
On Wed, 24 Jun 2020, Alexey Dobriyan wrote:
> BTW this bug was exactly the one described in the changelog:
> compiling assembly brings executable stack by default:

Great, thanks for the pointer, will wait until this lands in Arch. My search engine brought up the lkml discussion though, not the thread[0] on rsync-cvs ;-)

Christian.

[0] https://lists.samba.org/archive/rsync-cvs/2020-June/007661.html
-- 
BOFH excuse #211: Lightning strikes.
Re: process '/usr/bin/rsync' started with executable stack
On Wed, 24 Jun 2020, Alexey Dobriyan wrote: > > > process '/usr/bin/rsync' started with executable stack > > > But I can't reproduce this message, > > This message is once-per-reboot. Interesting, thanks. Now I know why I cannot reproduce this. I still wonder what made rsync trigger this message today. The machine is running for some weeks, rsync is run a few times an hour the whole day, regularly and automatically, with always the same parameters. But oh, now I see, rsync had been upgraded (automatically) over night: > [ALPM] upgraded rsync (3.1.3-3 -> 3.2.0-1) And indeed, the _older_ version had NX enabled: $ wget https://archive.archlinux.org/packages/.all/rsync-3.1.3-3-x86_64.pkg.tar.zst $ zstd -dc rsync-3.1.3-3-x86_64.pkg.tar.zst | tar -xf - usr/bin/rsync $ checksec --format=json --extended --file=usr/bin/rsync | jq { "usr/bin/rsync": { "relro": "full", "canary": "yes", "nx": "yes", "pie": "yes", "clangcfi": "no", "safestack": "no", "rpath": "no", "runpath": "no", "symbols": "no", "fortify_source": "yes", "fortified": "10", "fortify-able": "19" } } So, while I still think a PID would have been nice, now I know that it's pr_warn_once and won't be printed again until after the next reboot. Going to ask the Arch folks why NX has been disabled... Thanks, Christian. -- BOFH excuse #211: Lightning strikes.
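The "nx" field reported by checksec reflects the flags of the binary's PT_GNU_STACK program header. As a rough illustration of that check (a sketch only, not how checksec itself is implemented), the following reads the header directly. It assumes a 64-bit little-endian ELF file, and the function name gnu_stack_executable is invented for this example.

```python
import struct

PT_GNU_STACK = 0x6474E551  # program header type that carries stack permissions
PF_X = 0x1                 # "executable" bit in p_flags


def gnu_stack_executable(elf_bytes):
    """Return True/False from PT_GNU_STACK, or None if the header is absent.

    Sketch only: handles 64-bit little-endian ELF files and nothing else.
    """
    if elf_bytes[:4] != b"\x7fELF" or elf_bytes[4] != 2 or elf_bytes[5] != 1:
        raise ValueError("not a 64-bit little-endian ELF file")
    # ELF64 header: e_phoff at 0x20, e_phentsize at 0x36, e_phnum at 0x38.
    (e_phoff,) = struct.unpack_from("<Q", elf_bytes, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf_bytes, 0x36)
    for i in range(e_phnum):
        # ELF64 program header starts with p_type (4 bytes), p_flags (4 bytes).
        p_type, p_flags = struct.unpack_from(
            "<II", elf_bytes, e_phoff + i * e_phentsize
        )
        if p_type == PT_GNU_STACK:
            return bool(p_flags & PF_X)
    return None  # no PT_GNU_STACK; the effective default depends on arch/kernel


# Example (path taken from the thread above):
#   with open("usr/bin/rsync", "rb") as f:
#       print(gnu_stack_executable(f.read()))
```

A result of True would correspond to the "nx": "no" case that triggers the kernel message.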
Re: process '/usr/bin/rsync' started with executable stack
On Tue, 23 Jun 2020, Kees Cook wrote: > > $ checksec --format=json --extended --file=`which rsync` | jq > > { > > "/usr/bin/rsync": { > > "relro": "full", > > "canary": "yes", > > "nx": "no", > ^^ > > It is, indeed, marked executable, it seems. What distro is this? Arch Linux (x86-64) with 5.6.5.a-1-hardened[0], running in a Xen DomU. Christian. [0] https://git.archlinux.org/svntogit/packages.git/tree/trunk?h=packages/linux-hardened -- BOFH excuse #211: Lightning strikes.
process '/usr/bin/rsync' started with executable stack
Hi,

exactly this[0] happened today, on a 5.6.5 kernel:

   process '/usr/bin/rsync' started with executable stack

But I can't reproduce this message, and rsync (v3.2.0, not exactly abandonware) runs several times a day, so to repeat Andrew's questions[0] from last year:

> What are poor users supposed to do if this message comes out?
> Hopefully google the message and end up at this thread. What do you
> want to tell them?

Also, the PID is missing from that message. I had some long running rsync processes running earlier, maybe the RWE status would have been visible in /proc/$PID/maps, or somewhere else maybe?

Please advise? :-)

Thanks,
Christian.

[0] https://lore.kernel.org/patchwork/patch/1164047/#1362722

$ checksec --format=json --extended --file=`which rsync` | jq
{
  "/usr/bin/rsync": {
    "relro": "full",
    "canary": "yes",
    "nx": "no",
    "pie": "yes",
    "clangcfi": "no",
    "safestack": "no",
    "rpath": "no",
    "runpath": "no",
    "symbols": "no",
    "fortify_source": "yes",
    "fortified": "10",
    "fortify-able": "19"
  }
}
-- 
BOFH excuse #244: Your cat tried to eat the mouse.
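For a process that is still running, the permissions of its stack mapping can be read from the per-process memory map in /proc. The sketch below is only an illustration of that idea: the function name stack_is_executable is hypothetical, and it assumes the usual "address perms offset dev inode pathname" layout of /proc/PID/maps with a "[stack]" entry.

```python
def stack_is_executable(maps_text):
    """Return True/False for the [stack] mapping, or None if none is listed.

    maps_text is the content of /proc/<pid>/maps as a string.
    """
    for line in maps_text.splitlines():
        fields = line.split()
        # Fields: address range, perms, offset, device, inode, pathname
        if len(fields) >= 6 and fields[5] == "[stack]":
            return "x" in fields[1]
    return None


# Example use against a live process (pid is a placeholder):
#   with open(f"/proc/{pid}/maps") as f:
#       print(stack_is_executable(f.read()))
```

This only helps while the process exists, which is why a PID in the kernel message would have been useful.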
Re: 5.7.0 / BUG: kernel NULL pointer dereference / setup_cpu_watcher
On Fri, 5 Jun 2020, Andrew Cooper wrote: > PVH domains don't have the emulated platform device, so Linux will be > finding ~0 when it goes looking in config space. > > The diagnostic should be skipped in that case, to avoid giving the false > impression that something is wrong. Understood, thanks for explaining that. I won't be able to edit arch/x86/xen/platform-pci-unplug.c to fix that though :-\ Christian. -- BOFH excuse #134: because of network lag due to too many people playing deathmatch
Re: 5.7.0 / BUG: kernel NULL pointer dereference / setup_cpu_watcher
On Fri, 5 Jun 2020, Jürgen Groß wrote: > Do you happen to start the guest with vcpus < maxvcpus? Indeed, I was booting with vcpus=2, maxvcpus=4. Setting both to the same value made the domU boot. > If yes there is already a patch pending for 5.8: > https://git.kernel.org/pub/scm/linux/kernel/git/xen/tip.git/commit/?h=for-linus-5.8=c54b071c192dfe8061336f650ceaf358e6386e0b Applied that manually and now the system boots even with vcpus < maxvcpus. So, if this still matters: Tested-by: Christian Kujau Thank you for your response, and the fix! Christian. -- BOFH excuse #146: Communications satellite used by the military for star wars.
5.7.0 / BUG: kernel NULL pointer dereference / setup_cpu_watcher
Hi, I'm running a small Xen PVH domain and upgrading from vanilla 5.6.0 to 5.7.0 caused the splat below, really early during boot. The configuration has not changed, all new "make oldconfig" prompts have been answered with "N". Old and new config, dmesg are here: http://nerdbynature.de/bits/5.7.0/ Searching the interwebs for similar reports didn't return much: * drm_sched_get_cleanup_job: BUG: kernel NULL pointer dereference https://bugzilla.redhat.com/show_bug.cgi?id=1822984 -- but this appears to be really DRM related. - https://lkml.org/lkml/2020/4/10/545 * A recent mm/vmstat patch, mentioning "device_offline" in its output https://patchwork.kernel.org/patch/11563009/ But other than a few overlapping strings, I guess all of that is totally unrelated :( Thanks, Christian. Note: that "Xen Platform PCI: unrecognised magic value" on the top appears in 5.6 kernels as well, but no ill effects so far. --- Xen Platform PCI: unrecognised magic value ACPI: No IOAPIC entries present BUG: kernel NULL pointer dereference, address: 02d0 #PF: supervisor read access in kernel mode #PF: error_code(0x) - not-present page PGD 0 P4D 0 Oops: [#1] SMP PTI CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0 #2 RIP: 0010:device_offline+0x8/0xf0 Code: 48 89 e7 e8 3a ee f3 ff 4c 89 e0 48 83 c4 10 5b 41 5c c3 45 31 e4 48 83 c4 10 4c 89 e0 5b 41 5c c3 90 41 54 55 53 48 83 ec 10 87 d0 02 00 00 01 0f 85 ca 00 00 00 48 89 fb 48 8b 7f 48 48 85 RSP: :bd9100013e78 EFLAGS: 00010286 RAX: RBX: RCX: 820001fa RDX: 9c9c3dd0 RSI: 820001fa RDI: RBP: 0002 R08: 0001 R09: R10: 9c9c3d5072a8 R11: R12: 9c9c3d594720 R13: 8a57e5a8 R14: R15: FS: () GS:9c9c3dc0() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 02d0 CR3: 6b00a001 CR4: 001606b0 Call Trace: setup_cpu_watcher+0x44/0x60 ? plt_clk_driver_init+0xe/0xe setup_vcpu_hotplug_event+0x23/0x26 do_one_initcall+0x47/0x180 kernel_init_freeable+0x13b/0x19d ? 
rest_init+0x95/0x95 kernel_init+0x5/0xeb ret_from_fork+0x35/0x40 Modules linked in: CR2: 02d0 ---[ end trace b0cc587db609787f ]--- -- BOFH excuse #440: Cache miss - please take better aim next time
Re: file system permissions regression affecting root
On Wed, 13 May 2020, Patrick Donnelly wrote: > However, it seems odd that this depends on the owner of the directory. > i.e. this protection only seems to be enforced if the sticky directory > is owned by root. That's expected? According to the documentation[0] this appears to be intentional: protected_regular: [...] When set to "1" don't allow O_CREAT open on regular files that we don't own in world writable sticky directories, unless they are owned by the owner of the directory. C. [0] https://www.kernel.org/doc/Documentation/sysctl/fs.txt -- BOFH excuse #263: It's stuck in the Web.
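To make the quoted wording easier to reason about, here is a small model of the rule as documented for protected_regular=1. It is a sketch of the documentation text only, not of the kernel implementation; it ignores FIFOs and the stricter setting "2", and the function name protected_regular_blocks is made up.

```python
import stat


def protected_regular_blocks(dir_mode, dir_uid, file_uid, opener_uid, level=1):
    """Return True if an O_CREAT open of an existing regular file is refused.

    Models only the documented wording for protected_regular=1.
    """
    if level < 1:
        return False  # protection disabled
    sticky = bool(dir_mode & stat.S_ISVTX)
    world_writable = bool(dir_mode & stat.S_IWOTH)
    if not (sticky and world_writable):
        return False  # rule applies to world writable sticky directories only
    if file_uid == opener_uid:
        return False  # "files that we don't own": our own files are fine
    if file_uid == dir_uid:
        return False  # "unless they are owned by the owner of the directory"
    return True


# root (uid 0) opening a file owned by uid 1000:
print(protected_regular_blocks(0o1777, 0, 1000, 0))     # directory owned by root
print(protected_regular_blocks(0o1777, 1000, 1000, 0))  # directory owned by uid 1000
```

Under this reading the outcome depends on whether the file's owner is also the directory's owner, which would explain why the behaviour seems tied to who owns the sticky directory.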
Re: [Jfs-discussion] [fs] 05c5a0273b: netperf.Throughput_total_tps -71.8% regression
On Tue, 12 May 2020, kernel test robot wrote: > FYI, we noticed a -71.8% regression of netperf.Throughput_total_tps due to > commit: As noted in this report, netperf is used to "measure various aspect of networking performance". Are we sure the bisect is correct? JFS is a filesystem and is not touching net/ in any way. So, having not attempted to reproduce this, maybe the JFS commit is a red herring? C. -- BOFH excuse #50: Change in Earth's rotational speed
ptrace.c:202:6: warning: this statement may fall through
While compiling mainline with gcc-9.1.1 the following warning is emitted: === ../arch/x86/kernel/ptrace.c: In function ‘set_segment_reg’: ../arch/x86/kernel/ptrace.c:202:6: warning: this statement may fall through [-Wimplicit-fallthrough=] 202 | if (unlikely(value == 0)) | ^ ../arch/x86/kernel/ptrace.c:205:2: note: here 205 | default: | ^~~ === The patch below silences the warning, but I don't know if this is actual intended behaviour. Christian. diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c index 0fdbe89d0754..0030456d6e5c 100644 --- a/arch/x86/kernel/ptrace.c +++ b/arch/x86/kernel/ptrace.c @@ -201,6 +201,7 @@ static int set_segment_reg(struct task_struct *task, case offsetof(struct user_regs_struct, ss): if (unlikely(value == 0)) return -EIO; + /* fall through */ default: *pt_regs_access(task_pt_regs(task), offset) = value; -- BOFH excuse #326: We need a licensed electrician to replace the light bulbs in the computer room.
Re: FS-Cache: Duplicate cookie detected
Hi David, On Tue, 12 Mar 2019, David Howells wrote: > > My /usr/local/src mount was mounted with vers=4.2 (default), while > > nfstest_cache was mounting its test-mount with vers=4.1! Apart from the > > different rsize/wsize values, the version number stood out. And indeed, > > when I mount my regular NFS mount /usr/local/src with vers=4.1, the > > "duplicate cookie" is no longer printed. > > Yeah - NFS superblocks are differentiated by a whole host of parameters, > including protocol version number, and caches aren't shared between > superblocks because this introduces a tricky coherency problem. > > The issue is that NFS superblocks to the same place do not currently manage > coherency (inode attributes, data) between themselves, except via the server. > > However, if "fsc" isn't given on the mount commandline, the superblock > probably shouldn't get a server-level cookie if we can avoid it. Just checking - are you waiting for new results from me, should I test something that I missed? Or are new patches in the works? :-D Thanks, Christian. -- BOFH excuse #139: UBNC (user brain not connected)
Re: FS-Cache: Duplicate cookie detected
On Mon, 11 Mar 2019, David Howells wrote:
> I've a couple more patches for you - one a bugfix and one that will print more
> information. They don't actually affect the problem you're seeing. I'll post
> them as replies to this message.

Thanks for the patches. I've applied all three to v5.0 and ran "nfstest_cache" and was able to reproduce the messages. Please note that I'm only running "nfstest_cache" because it's somehow able to reproduce the message reliably - otherwise the message just shows up once or twice in syslog, but I didn't know how to reproduce it.

But I noticed something else this time, and I did not notice that before: while running nfstest_cache, the "duplicate cookie" messages were only triggered when my other, non-test mount was also mounted during the test. Let me describe my F29 test VM again:

* VM boots, and /usr/local/src gets mounted via NFS, read-only, and w/o fsc options. cachefilesd isn't even installed here.
* I run nfstest_cache and apparently it's mounting the same NFS export from the server to /mnt/t, as a readonly mount.

So two mounts, one in /usr/local/src, the other in /mnt/t, both readonly and both w/o "fsc", but the "duplicate cookie" message is only printed when /usr/local/src was mounted. If /usr/local/src wasn't mounted, the test would complete[0] and no "duplicate cookie" message was printed.

And then I noticed:

--
$ mount | tail -2 | fold
horus:/usr/local/src on /usr/local/src type nfs4
(ro,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,
hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.56.139,
local_lock=none,addr=192.168.0.115)
horus:/ on /mnt/t type nfs4
(rw,relatime,vers=4.1,rsize=4096,wsize=4096,namlen=255,
hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.56.139,
local_lock=none,addr=192.168.0.115)
--

My /usr/local/src mount was mounted with vers=4.2 (default), while nfstest_cache was mounting its test-mount with vers=4.1! Apart from the different rsize/wsize values, the version number stood out. And indeed, when I mount my regular NFS mount /usr/local/src with vers=4.1, the "duplicate cookie" is no longer printed.

For simplicity, I've attached two logs to this email:

* nfs_no-mount.txt.xz - showing /proc/fs/nfsfs/volumes and /proc/fs/fscache/stats every 0.01 seconds, while running nfstest_cache in another terminal. Note that no "duplicate cookie" messages were triggered, as /usr/local/src was not mounted.
* nfs_with-mount.txt.xz - same, but here /usr/local/src was mounted (and defaulted to vers=4.2), and thus "duplicate cookie" messages were printed.

I fear that all this may complicate this strange behaviour, and now we're examining NFS mount versions, but I only noticed that now, not earlier :-\

I can't comment on the patches much, as you mentioned they won't make the message go away, but I hope it printed more details now.

Thanks,
Christian.

[0] Again, I'm using nfstest_cache only to trigger the message. Every time I execute it, the test fails, because I think it expects a rw-mount:

$ nfstest_cache --server horus --client fedora0 --runtest=acregmin_attr
*** Verify consistency of attribute caching with NFSv4.1 on a file
    acregmin = 10
    TEST: Running test 'acregmin_attr'
    FAIL: Traceback (most recent call last):
          File "/usr/bin/nfstest_cache", line 199, in do_file_test
            fdw = open(self.absfile, "w")
          IOError: [Errno 30] Read-only file system: '/mnt/t/nfstest_cache_20190311223404_f_1'
    TIME: 4.497078s
1 tests (0 passed, 1 failed)
Total time: 5.529826s
-- 
BOFH excuse #209: Only people with names beginning with 'A' are getting mail this week (a la Microsoft)

nfs_with-mount.txt.xz
Description: application/xz
nfs_no-mount.txt.xz
Description: application/xz
Re: FS-Cache: Duplicate cookie detected
On Fri, 8 Mar 2019, Christian Kujau wrote: > Running Linux v5.0 with this patch applied does indeed still produce the > "Duplicate cookie detected" messages, but I only ever see wrq=0 when > running nfstest_cache: > >https://paste.fedoraproject.org/paste/dkav0FQzYZxE9-V7GphjAQ And again with the whole /proc/fs/fscache/stats output and better time stamps: https://paste.fedoraproject.org/paste/hZtCPStJlqB1d9JXnTFndQ C -- BOFH excuse #5: static from plastic slide rules
Re: FS-Cache: Duplicate cookie detected
On Fri, 8 Mar 2019, David Howells wrote: > See the attached for a patch that helps with certain kinds of collision, > though I can't see that it should help with what you're seeing since the > RELINQUISHED flag isn't set on the old cookie (fl=222, but 0x10 isn't in > there). You can monitor the number of waits by looking in > /proc/fs/fscache/stats for the: > > Acquire: n=289166 nul=0 noc=0 ok=286331 nbf=2 oom=0 wrq=23748 Running Linux v5.0 with this patch applied does indeed still produce the "Duplicate cookie detected" messages, but I only ever see wrq=0 when running nfstest_cache: https://paste.fedoraproject.org/paste/dkav0FQzYZxE9-V7GphjAQ (Scroll down until the messages start to appear again) Only the n= field seems to change during that test: fedora0# grep wrq n2.log | sort | uniq -c | sort -n 28 Acquire: n=8 nul=0 noc=0 ok=1 nbf=0 oom=0 wrq=0 29 Acquire: n=7 nul=0 noc=0 ok=1 nbf=0 oom=0 wrq=0 34 Acquire: n=6 nul=0 noc=0 ok=1 nbf=0 oom=0 wrq=0 82 Acquire: n=9 nul=0 noc=0 ok=1 nbf=0 oom=0 wrq=0 93 Acquire: n=5 nul=0 noc=0 ok=1 nbf=0 oom=0 wrq=0 HTH, Christian. -- BOFH excuse #5: static from plastic slide rules
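When watching counters such as wrq= over time, it can be handier to parse the stats line than to grep for it. A minimal sketch, assuming the "Label: key=value key=value ..." shape seen in the Acquire line quoted above; the function name parse_stats_line is invented here.

```python
import re


def parse_stats_line(line):
    """Split 'Label: k=v k=v ...' into (label, {k: int(v)})."""
    label, _, rest = line.partition(":")
    counters = {key: int(value) for key, value in re.findall(r"(\w+)=(\d+)", rest)}
    return label.strip(), counters


label, counters = parse_stats_line(
    "Acquire: n=289166 nul=0 noc=0 ok=286331 nbf=2 oom=0 wrq=23748"
)
print(label, counters["wrq"])
```

Sampling /proc/fs/fscache/stats in a loop and feeding each "Acquire" line through this would show directly whether wrq ever moves away from 0.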
Re: FS-Cache: Duplicate cookie detected
On Fri, 8 Mar 2019, David Howells wrote: > > $ mount | grep nfs4 > > nfs:/usr/local/src on /usr/local/src type nfs4 > > (ro,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.56.139,local_lock=none,addr=192.168.0.115) > > > > ...so FS-Cache ("fsc") isn't even used here. > > Interesting. Can you do: > > cat /proc/fs/nfsfs/volumes That seems to confirm the mount options, fsc is disabled: # cat /proc/fs/nfsfs/volumes NV SERVER PORT DEV FSID FSC v4 c0a80073 801 0:46 1cfd45bf1921474d:a795870ea80f5ff7 no > See the attached for a patch that helps with certain kinds of collision, > though I can't see that it should help with what you're seeing since the > RELINQUISHED flag isn't set on the old cookie (fl=222, but 0x10 isn't in > there). You can monitor the number of waits by looking in > /proc/fs/fscache/stats for the: > > Acquire: n=289166 nul=0 noc=0 ok=286331 nbf=2 oom=0 wrq=23748 Ah, the wrq= field gets only introduced by this patch. OK, I'll see if I can build a test kernel with that and will report back. Thanks for looking in to this, Christian. -- BOFH excuse #290: The CPU has shifted, and become decentralized.
Re: FS-Cache: Duplicate cookie detected
On Wed, 6 Mar 2019, David Howells wrote:
> I can reproduce a slightly different problem by setting off ~6000 parallel
> processes, each reading its own individual directory of files.

Usually I only see it shortly after mount, and only once, but I too can reproduce it with NFStest ([0], and there's a Fedora package too) via "nfstest_cache --server $SERVER --client `hostname`", which then produces a couple of these messages:

 FS-Cache: Duplicate cookie detected
 FS-Cache: O-cookie c=2fcc866b [p=c10c6e18 fl=222 nc=0 na=1]
 FS-Cache: O-cookie d=d5ed73bb n=076c9150
 FS-Cache: O-key=[10] '040002000801c0a80073'
 FS-Cache: N-cookie c=e8d5dcd4 [p=c10c6e18 fl=2 nc=0 na=1]
 FS-Cache: N-cookie d=d5ed73bb n=a54e9705
 FS-Cache: N-key=[10] '040002000801c0a80073'

...and the O-key does indeed seem to resemble the server address, somewhat:

>>> s = "040002000801c0a80073";
>>> bytes = ["".join(x) for x in zip(*[iter(s)]*2)]; bytes = [int(x, 16) for x in bytes]; print ".".join(str(x) for x in reversed(bytes))
115.0.168.192.1.8.0.2.0.4
^

Mount options on that client are:

$ mount | grep nfs4
nfs:/usr/local/src on /usr/local/src type nfs4 (ro,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.56.139,local_lock=none,addr=192.168.0.115)

...so FS-Cache ("fsc") isn't even used here. The server in that scenario is a current Fedora 29 installation.

HTH,
Christian.

[0] http://wiki.linux-nfs.org/wiki/index.php/NFStest

> I also see reports like this:
>
>    FS-Cache: Duplicate cookie detected
>    FS-Cache: O-cookie c=db33ad59 [p=4bc53500 fl=218 nc=0 na=0]
>    FS-Cache: O-cookie d= (null) n= (null)
>    FS-Cache: O-cookie o=6cf6db4f
>    FS-Cache: O-key=[16] '010001010100e51fc6000323ae02'
>    FS-Cache: N-cookie c=791c49d0 [p=4bc53500 fl=2 nc=0 na=1]
>    FS-Cache: N-cookie d=e220fe14 n=d4484489
>    FS-Cache: N-key=[16] '010001010100e51fc6000323ae02'
>
> with no cookie def or netfs data and flags ACQUIRED, RELINQUISHED and
> INVALIDATING - which I can insert a wait for.
>
> David
-- 
BOFH excuse #420: Feature was not beta tested
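The key can be decoded a little more directly. The sketch below treats the last four bytes of the hex key as an IPv4 address and the two bytes before them as a big-endian port number. That layout is only a guess based on the values observed in this thread (the server address 192.168.0.115 from the mount options), not something taken from the FS-Cache source, and the function name decode_key_tail is hypothetical.

```python
import ipaddress


def decode_key_tail(hex_key):
    """Interpret the tail of a cookie key as (IPv4 address, port).

    Assumed layout (a guess): ... | 2-byte port, big-endian | 4-byte IPv4.
    """
    raw = bytes.fromhex(hex_key)
    port = int.from_bytes(raw[-6:-4], "big")
    addr = ipaddress.IPv4Address(raw[-4:])
    return str(addr), port


print(decode_key_tail("040002000801c0a80073"))
```

If the guess holds, the key identifies the server endpoint rather than anything mount-specific, which would fit two mounts of the same server colliding on one key.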
FS-Cache: Duplicate cookie detected
Hi,

ever since ec0328e46d6e ("fscache: Maintain a catalogue of allocated cookies") was committed, people are seeing[0] those "Duplicate cookie detected" messages in syslog, see below. NFS and CIFS mounts appear to continue to work, but these messages are new and I too am wondering if this is something to worry about. They _are_ logged with pr_err in fs/fscache/cookie.c, but maybe this needs to be changed to a different loglevel?

Thanks,
Christian.

 FS-Cache: Duplicate cookie detected
 FS-Cache: O-cookie c=9da9dbf0 [p=1593f904 fl=222 nc=0 na=1]
 FS-Cache: O-cookie d=287febd9 n=980c9e8a
 FS-Cache: O-key=[8] '020001bdc0a80064'
 FS-Cache: N-cookie c=bfe3f869 [p=1593f904 fl=2 nc=0 na=1]
 FS-Cache: N-cookie d=287febd9 n=e153f178
 FS-Cache: N-key=[8] '020001bdc0a80064'

[0] https://bugzilla.kernel.org/show_bug.cgi?id=200145
-- 
BOFH excuse #318: Your EMAIL is now being delivered by the USPS.
Re: [PATCH] x86/uaccess: Remove unused __addr_ok() macro
On Mon, 25 Feb 2019, Joe Perches wrote:
> Looks like it's not used in several arches
>
> $ git grep -w __addr_ok
> arch/arm/include/asm/uaccess.h:#define __addr_ok(addr) ((void)(addr), 1)
> arch/csky/include/asm/uaccess.h:#define __addr_ok(addr) (access_ok(addr, 0))
> arch/openrisc/include/asm/uaccess.h:#define __addr_ok(addr) ((unsigned long) addr < get_fs())
> arch/sh/include/asm/uaccess.h:#define __addr_ok(addr) \
> arch/sh/include/asm/uaccess.h: __ao_end >= __ao_a && __addr_ok(__ao_end); })
> arch/x86/include/asm/uaccess.h:#define __addr_ok(addr) \

If so, would simply removing it do the trick or is there more magic involved? I don't have that many cross-compilers though and it's not even build-tested:

commit f899653c64cce05fde426d0298cd67670f8ab8e2
Author: Christian Kujau
Date:   Sun Mar 3 22:43:09 2019 -0800

    Remove unused __addr_ok() macro.

 arch/arm/include/asm/uaccess.h      | 1 -
 arch/csky/include/asm/uaccess.h     | 2 --
 arch/openrisc/include/asm/uaccess.h | 3 ---
 arch/sh/include/asm/uaccess.h       | 5 +
 arch/x86/include/asm/uaccess.h      | 2 --
 5 files changed, 1 insertion(+), 12 deletions(-)

    Signed-off-by: Christian Kujau

diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
index 42aa4a22803c..16411c76076d 100644
--- a/arch/arm/include/asm/uaccess.h
+++ b/arch/arm/include/asm/uaccess.h
@@ -266,7 +266,6 @@ extern int __put_user_8(void *, unsigned long long);
 #define USER_DS KERNEL_DS
 #define segment_eq(a, b) (1)
-#define __addr_ok(addr) ((void)(addr), 1)
 #define __range_ok(addr, size) ((void)(addr), 0)
 #define get_fs() (KERNEL_DS)
diff --git a/arch/csky/include/asm/uaccess.h b/arch/csky/include/asm/uaccess.h
index eaa1c3403a42..c02b243fecaa 100644
--- a/arch/csky/include/asm/uaccess.h
+++ b/arch/csky/include/asm/uaccess.h
@@ -24,8 +24,6 @@ static inline int access_ok(const void *addr, unsigned long size)
 		((unsigned long)(addr + size) < limit));
 }
-#define __addr_ok(addr) (access_ok(addr, 0))
-
 extern int __put_user_bad(void);
 /*
diff --git a/arch/openrisc/include/asm/uaccess.h b/arch/openrisc/include/asm/uaccess.h
index a44682c8adc3..9198371e30c2 100644
--- a/arch/openrisc/include/asm/uaccess.h
+++ b/arch/openrisc/include/asm/uaccess.h
@@ -55,9 +55,6 @@
  */
 #define __range_ok(addr, size) (size <= get_fs() && addr <= (get_fs()-size))
-/* Ensure that addr is below task's addr_limit */
-#define __addr_ok(addr) ((unsigned long) addr < get_fs())
-
 #define access_ok(addr, size) \
 ({ \
 	unsigned long __ao_addr = (unsigned long)(addr);\
diff --git a/arch/sh/include/asm/uaccess.h b/arch/sh/include/asm/uaccess.h
index 5fe751ad7582..b41f6a011474 100644
--- a/arch/sh/include/asm/uaccess.h
+++ b/arch/sh/include/asm/uaccess.h
@@ -5,9 +5,6 @@
 #include
 #include
-#define __addr_ok(addr) \
-	((unsigned long __force)(addr) < current_thread_info()->addr_limit.seg)
-
 /*
  * __access_ok: Check if address with size is OK or not.
  *
@@ -19,7 +16,7 @@
 #define __access_ok(addr, size)({ \
 	unsigned long __ao_a = (addr), __ao_b = (size); \
 	unsigned long __ao_end = __ao_a + __ao_b - !!__ao_b;\
-	__ao_end >= __ao_a && __addr_ok(__ao_end); })
+	__ao_end >= __ao_a; })
 #define access_ok(addr, size) \
 	(__chk_user_ptr(addr), \
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index c133478d..d630978738dc 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -37,8 +37,6 @@ static inline void set_fs(mm_segment_t fs)
 #define segment_eq(a, b) ((a).seg == (b).seg)
 #define user_addr_max() (current->thread.addr_limit.seg)
-#define __addr_ok(addr)\
-	((unsigned long __force)(addr) < user_addr_max())
 /*
  * Test whether a block of memory is a valid user space address.
-- 
BOFH excuse #123: user to computer ratio too high.
RIP: e030:move_page_tables+0xaa3/0xb80
Hi,

I'm running an Ubuntu "mainline" kernel[0] as a Xen 4.11.1 DomU (PV) and ever since upgrading to Linux 5.0-rcX I get these WARNING messages shown below. Going back in my logs[1] I can see that I got similar messages for v4.20 too, but with v5.0 they appear more often, and upgrading from v5.0-rc3 to -rc4 made it even worse: now the messages show up quickly after boot and some commands (w, ps, top) become stuck and shutdown would hang too.

I found an email thread[2] from earlier this month (hence the CC list) about this, but could not find out if this issue has been concluded or even fixed. I've gone back to v4.20 now and the message hasn't appeared yet, but it probably will in a few days again.

Let me know if you need more details, v5.0-rc4 should make it easier for me to reproduce.

Thanks,
Christian.

[0] https://kernel.ubuntu.com/~kernel-ppa/mainline
    https://wiki.ubuntu.com/Kernel/MainlineBuilds
[1] http://nerdbynature.de/bits/5.0.0-rc4/kern_msg.txt
[2] https://www.spinics.net/lists/stable/msg279001.html

WARNING: CPU: 1 PID: 386 at arch/x86/xen/multicalls.c:102 xen_mc_flush+0x196/0x1f0
Modules linked in: rpcsec_gss_krb5 auth_rpcgss lz4 lz4_compress crct10dif_pclmul xen_kbdfront(-) ghash_clmulni_intel xen_fbfront fb_sys_fops syscopyarea sysfillrect sysimgblt aesni_intel aes_x86_64 crypto_simd cryptd glue_helper intel_rapl_perf sch_fq_codel nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nf_conntrack zram nf_defrag_ipv6 nf_defrag_ipv4 reiserfs nfsv4 nfs nf_tables_set lockd grace fscache dm_crypt sunrpc btrfs nf_tables nfnetlink zstd_compress ip_tables x_tables autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear [last unloaded: mac_hid]
CPU: 1 PID: 386 Comm: systemd-udevd Not tainted 5.0.0-05rc4-generic #201901272036
RIP: e030:xen_mc_flush+0x196/0x1f0
Code: 48 c1 e0 05 4d 8b 55 38 4d 8b 45 40 48 05 00 10 00 81 e8 5d ae dd 00 49 89 45 18 48 c1 e8 3f 48 89 c6 85 f6 0f 84 05 ff ff ff <0f> 0b 65 8b 0d b1 92 fe 7e 41 8b 55 00 48 c7 c7 30 b0 2d 82 e8 f4
RSP: e02b:c900410dbd00 EFLAGS: 00010002
RAX: 888175c946d8 RBX: 777f8000 RCX: 0001
RDX: 888175c946d8 RSI: 0001 RDI: 888175c94310
RBP: c900410dbd30 R08: 0001 R09: 88816fe5b000
R10: 7ff0 R11: 0011 R12: 000f
R13: 888175c94300 R14: 0200 R15:
FS: 7fca75421680() GS:888175c8() knlGS:
CS: 1e030 DS: ES: CR0: 80050033
CR2: 5606a0d8bfb8 CR3: 00016d51a000 CR4: 00042660
Call Trace:
 __xen_pgd_pin+0x10c/0x300
 xen_activate_mm+0x28/0x40
 xen_dup_mmap+0xe/0x10
 copy_process.part.37+0x1e7a/0x1f70
 _do_fork+0xe8/0x3a0
 __x64_sys_clone+0x27/0x30
 do_syscall_64+0x5a/0x110
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
-- 
BOFH excuse #160: non-redundant fan failure
Re: [PATCH] Revert "scripts/setlocalversion: git: Make -dirty check more robust"
On Tue, 6 Nov 2018, Brian Norris wrote: > > Perhaps both scenarios could be satisfied by having > > scripts/setlocalversion first check if .git has write permissions, and > > acting accordingly. Looking into history, this actually used to be > > done, but cdf2bc632ebc ("scripts/setlocalversion on write-protected > > source tree", 2013-06-14) removed the updating of the index. > > A "writeable" check (e.g., [ -w . ]) would be sufficient for our case. > But I'm not so sure about that older NFS report, and I'm also not sure > that we should be writing to the source tree at all in this case. Maybe > we can also check whether there's a build output directory specified? FWIW, the issue I reported back in 2013[0] was not an ill-configured NFS export, but a read-only NFS export (and then a read-write NFS export where the user compiling the kernel did not have write permission), and so "test -w .git" did not help in determining whether the source tree can actually be written to. And depending on the user's shell[1], this may or may not still be the case. So I'm all for the $(touch .git/some-file-here) test to decide if the kernel has to be modified during build. Christian. [0] https://lkml.org/lkml/2013/6/14/574 [1] https://manpages.debian.org/unstable/dash/dash.1.en.html > > However, I admit I don't understand the justification in that commit > > from 2013. I'm no NFS expert, but perhaps the real problem there is an > > incorrectly configured NFS setup (uid/gid mismatch between NFS > > client/server, or permissions mismatch between mount options and NFS > > server?). Christian Kujau: can you speak to that? > > > > Well, we could also make our check $(touch .git/some-file-here > > 2>/dev/null && ...) instead of $(test -w .git) to handle misconfigured > > NFS setups. But not sure if that has its own problems. > > Trying to 'touch' the source tree will also break us. 
No matter whether > you redirect stderr, our sandbox will still notice the build is doing > something fishy and complain. -- BOFH excuse #192: runaway cat on system.
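The difference between the two checks discussed in this message can be sketched outside of shell as well. The following is only an illustration in Python, not what scripts/setlocalversion does; the function names `permission_check` and `write_probe` and the probe file prefix are made up for this sketch. The first function asks the kernel whether writing is permitted (similar in spirit to `test -w`), the second actually tries to create a file (similar in spirit to the `touch` test).

```python
import os
import tempfile


def permission_check(directory):
    """Ask whether writing is permitted, without writing anything.

    This is similar in spirit to `test -w DIR`.
    """
    return os.access(directory, os.W_OK)


def write_probe(directory):
    """Try to create (and then remove) a file inside the directory.

    This is similar in spirit to `touch DIR/some-file-here`. The file
    name prefix is hypothetical and only used for this sketch.
    """
    try:
        fd, path = tempfile.mkstemp(prefix=".write-probe-", dir=directory)
    except OSError:
        # Read-only mount, missing permission, missing directory, ...
        return False
    os.close(fd)
    os.unlink(path)
    return True
```

Calling both on `.git` and comparing the results would show whether the two checks disagree on a given setup. Note that the probe really does write to the tree for a moment, which is exactly what the sandbox mentioned in the quoted text would object to.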
Re: [Jfs-discussion] [PATCH] jfs: Expand usercopy whitelist for inline inode data
On Fri, 17 Aug 2018, Kees Cook wrote: > On Thu, Aug 16, 2018 at 11:56 PM, Christian Kujau > wrote: > > On Fri, 3 Aug 2018, Kees Cook via Jfs-discussion wrote: > >> Bart Massey reported what turned out to be a usercopy whitelist false > >> positive in JFS when symlink contents exceeded 128 bytes. The inline > >> inode data (i_inline) is actually designed to overflow into the "extended > > > > So, this may be a stupid question, but: is there a way to disable this > > hardened usercopy thing with a boot option maybe? > > > > Apparently, CONFIG_HARDENED_USERCOPY_FALLBACK was disabled in Debian's > > 4.16.0-0.bpo.2-amd64 (4.16.16) kernels[0] and I have a VMware guest here > > that prints a BUG message (below) whenever a certain directory is being > > accesses. ls(1) is fine, but "ls -l" (i.e. with stat()) produces the splat > > below. And indeed, the target of one of the symlinks inside is 129 > > characters long, and every attempt to stat it prints the splat below. > > > > Going back to 4.16.0-0.bpo.1-amd64 (4.16.5) helps, but I was wondering if > > there was a magic boot option to disable it while I wait for 4.18 to land > > in Debian? I booted with hardened_usercopy=off, but it doesn't seem to > > have an effect and the directory is still inaccessible. > > Precisely this was just added upstream[1] for 4.19 but isn't available > in 4.16. It should be trivial to backport it, though, if Ben wants to > do that? (The JFS fix is in the 4.17 and 4.18 -stable trees now, too, > BTW.) Ah, OK. While the patch does apply (almost) cleanly to 4.16, I think I'll just wait until it makes its way into the Debian (backports) kernel, as nobody else seems to be annoyed by this :-) Thanks! Christian. -- BOFH excuse #53: Little hamster in running wheel had coronary; waiting for replacement to be Fedexed from Wyoming
Re: [Jfs-discussion] [PATCH] jfs: Expand usercopy whitelist for inline inode data
On Fri, 3 Aug 2018, Kees Cook via Jfs-discussion wrote: > Bart Massey reported what turned out to be a usercopy whitelist false > positive in JFS when symlink contents exceeded 128 bytes. The inline > inode data (i_inline) is actually designed to overflow into the "extended So, this may be a stupid question, but: is there a way to disable this hardened usercopy thing with a boot option maybe? Apparently, CONFIG_HARDENED_USERCOPY_FALLBACK was disabled in Debian's 4.16.0-0.bpo.2-amd64 (4.16.16) kernels[0] and I have a VMware guest here that prints a BUG message (below) whenever a certain directory is being accessed. ls(1) is fine, but "ls -l" (i.e. with stat()) produces the splat below. And indeed, the target of one of the symlinks inside is 129 characters long, and every attempt to stat it prints the splat below. Going back to 4.16.0-0.bpo.1-amd64 (4.16.5) helps, but I was wondering if there was a magic boot option to disable it while I wait for 4.18 to land in Debian? I booted with hardened_usercopy=off, but it doesn't seem to have an effect and the directory is still inaccessible. Thanks, Christian. [0] https://salsa.debian.org/kernel-team/linux/tree/stretch-backports/debian/config/ ---[ end trace dbb1a6dfa1411526 ]--- usercopy: Kernel memory exposure attempt detected from SLUB object 'jfs_ip' (offset 288, size 129)! [ cut here ] kernel BUG at /build/linux-hvYKKE/linux-4.17.8/mm/usercopy.c:100! invalid opcode: [#2] SMP PTI Modules linked in: xt_tcpudp iptable_filter binfmt_misc zram zsmalloc vmw_vsock_vmci_transport vsock ip_tables x_tables xts twofish_x86_64_3way twofish_x86_64 twofish_common lrw jfs glue_helper gf128mul dm_crypt dm_mod sd_mod evdev vmxnet3 mptsas scsi_transport_sas mptscsih mptbase vmw_vmci ata_piix libata scsi_mod button CPU: 0 PID: 1349 Comm: ls Tainted: G D 4.17.0-0.bpo.1-amd64 #1 Debian 4.17.8-1~bpo9+1 Hardware name: VMware, Inc. 
VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/21/2015 RIP: 0010:usercopy_abort+0x69/0x80 RSP: 0018:b84e40e2fe18 EFLAGS: 00010286 RAX: 0063 RBX: 0081 RCX: RDX: RSI: 9786ffc16738 RDI: 9786ffc16738 RBP: 0081 R08: R09: 042e R10: 9c68af71 R11: 323120657a697320 R12: 0001 R13: 9786f93146a1 R14: 0082 R15: 559dd2edb170 FS: 7fe8f13733c0() GS:9786ffc0() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 559dd2edb088 CR3: 3d104002 CR4: 003606f0 DR0: DR1: DR2: DR3: DR6: fffe0ff0 DR7: 0400 Call Trace: __check_heap_object+0xeb/0x120 __check_object_size+0xb8/0x1a0 readlink_copy+0x3e/0x60 vfs_readlink+0x60/0x120 do_readlinkat+0xf9/0x120 __x64_sys_readlink+0x1b/0x20 do_syscall_64+0x55/0x110 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7fe8f0c6fe47 RSP: 002b:7ffe94d04528 EFLAGS: 0202 ORIG_RAX: 0059 RAX: ffda RBX: 0082 RCX: 7fe8f0c6fe47 RDX: 0082 RSI: 559dd2edb170 RDI: 7ffe94d04570 RBP: 559dd2edb170 R08: 0003 R09: 0090 R10: R11: 0202 R12: 7ffe94d04570 R13: 7ffe94d04570 R14: 3fff R15: 7ffe Code: 0f 44 d0 53 48 c7 c0 58 05 65 9c 51 48 c7 c6 12 f9 63 9c 41 53 48 89 f9 48 0f 45 f0 4c 89 d2 48 c7 c7 40 06 65 9c e8 05 97 e9 ff <0f> 0b 49 c7 c1 03 09 66 9c 4d 89 cb 4d 89 c8 eb a5 66 0f 1f 44 RIP: usercopy_abort+0x69/0x80 RSP: b84e40e2fe18 ---[ end trace dbb1a6dfa1411527 ]--- -- BOFH excuse #404: Sysadmin accidentally destroyed pager with a large hammer.
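To find out which symlinks in a tree have targets longer than the 128 bytes mentioned in the quoted patch description, something like the following could be used. This is a rough sketch in Python, and the function name `long_symlinks` is made up here. It relies on lstat() reporting the length of the target in st_size for symbolic links, which POSIX specifies and common Linux file systems follow. The trace above has readlink in the call chain, so the sketch deliberately avoids readlink(); whether that is enough to stay clear of the BUG on an affected kernel is an assumption and has not been tested.

```python
import os
import stat


def long_symlinks(root, limit=128):
    """Return (path, target_length) for symlinks whose target exceeds `limit` bytes.

    Only lstat() is used; the link target itself is never read.
    """
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Symlinks to directories are listed in dirnames, all others in filenames.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if stat.S_ISLNK(st.st_mode) and st.st_size > limit:
                hits.append((path, st.st_size))
    return sorted(hits)
```

Running `long_symlinks("/some/dir")` on the directory in question (the path is a placeholder) would then list the candidates that could be shortened or replaced until a fixed kernel is installed.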
Re: [PATCH 00/25] staging: erofs: introduce erofs file system
On Thu, 26 Jul 2018, Gao Xiang wrote: > EROFS file system is a read-only file system with compression > support designed for certain devices (especially embeded > devices) with very limited physical memory and lots of memory Out of curiosity, and as Richard already asked[0]: what about the existing file systems, why can't they be used or extended instead of introducing yet another file system into the kernel? JFFS2? UBIFS? CramFs? SquashFS? ROMFS? F2FS? YAFFS? Christian. [0] https://marc.info/?l=linux-kernel&m=152783930418348&w=2 -- BOFH excuse #247: Due to Federal Budget problems we have been forced to cut back on the number of users able to access the system at one time. (namely none allowed)
Re: 4.15-rc6+ hang
On Thu, 4 Jan 2018, Tom Hromatka wrote: > > > [0.00] [ cut here ] > > > [0.00] XSAVE consistency problem, dumping leaves > > I think this is a vbox issue, with virtualbox not exposing all the > > xsave state, so that when the kernel adds up the xsave areas, the end > > result doesn't match what the total size is reported to be. > > It seems probable that this is a VirtualBox issue. I was > able to boot my exact 4.15-rc6+ kernel in qemu-kvm v1.5.3 > just fine. This was discussed on vbox-dev back in May 2017 (see the whole thread for more details): https://www.virtualbox.org/pipermail/vbox-dev/2017-May/014466.html Does that help? Christian. -- BOFH excuse #9: doppler effect
WARNING: CPU: 1 PID: 1384 at lib/iov_iter.c:695 copy_page_to_iter+0x240/0x3b0
Hi, this just happened on an i686 machine of mine: [ cut here ] WARNING: CPU: 1 PID: 1384 at lib/iov_iter.c:695 copy_page_to_iter+0x240/0x3b0 Modules linked in: xfs algif_skcipher af_alg uas nfsv4 dns_resolver nfs nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_meta nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4 lz4 nf_defrag_ipv4 lz4_compress nft_ct nf_conntrack libcrc32c crc32c_generic nft_set_hash nf_tables_inet nf_tables_ipv6 nf_tables_ipv4 nf_tables nfnetlink cpufreq_conservative sch_fq_codel zram evdev tg3 ptp pps_core lpc_ich libphy input_leds ideapad_laptop sparse_keymap wmi serpent_sse2_i586 thermal serpent_generic lrw glue_helper ablk_helper cryptd video xts dm_crypt acpi_cpufreq arc4 iTCO_wdt iTCO_vendor_support i2c_i801 fscache loop coretemp battery b43 bcma mac80211 cfg80211 dm_mod dax ssb mmc_core rfkill led_class rng_core pcmcia pcmcia_core nfsd auth_rpcgss oid_registry ac nfs_acl lockd grace sunrpc usb_storage sd_mod atkbd libps2 uhci_hcd ata_piix libata scsi_mod ehci_pci ehci_hcd usbcore usb_common i8042 serio jfs [last unloaded: soundcore] CPU: 1 PID: 1384 Comm: java Not tainted 4.14.4-1.0-ARCH #1 Hardware name: LENOVO Lenovo /Mariana , BIOS 14CN94WW 06/29/2009 task: f27c1380 task.stack: f1c64000 EIP: copy_page_to_iter+0x240/0x3b0 EFLAGS: 00010286 CPU: 1 EAX: 1000 EBX: ffb48000 ECX: 02c0 EDX: 8001006c ESI: f67ecb60 EDI: 0d40 EBP: f1c65e30 ESP: f1c65e08 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 CR0: 80050033 CR2: 09f060cc CR3: 3260f000 CR4: 06d0 Call Trace: ? touch_atime+0x2b/0xb0 generic_file_read_iter+0x458/0x8c0 ? xfs_ilock+0x10d/0x150 [xfs] ? xfs_file_buffered_aio_read+0xed/0x100 [xfs] xfs_file_buffered_aio_read+0x4e/0x100 [xfs] ? 
set_next_entity+0x13f/0x8b0 xfs_file_read_iter+0x54/0xc0 [xfs] __vfs_read+0xe7/0x140 vfs_read+0x7b/0x130 SyS_pread64+0x81/0xb0 do_fast_syscall_32+0x71/0x1d0 entry_SYSENTER_32+0x4e/0x7c EIP: 0xb7f69cd9 EFLAGS: 0293 CPU: 1 EAX: ffda EBX: 009b ECX: 23c77f10 EDX: 001a ESI: 156172c0 EDI: EBP: b7f50e70 ESP: 1f32ea60 DS: 007b ES: 007b FS: GS: 0033 SS: 007b Code: 75 ec e9 ff fe ff ff 8d 74 26 00 8b 55 ec 8b 45 08 85 d2 8b 58 0c 74 09 e8 6e d7 ff ff 84 c0 75 1a 31 f6 e9 13 ff ff ff 8d 76 00 <0f> ff 31 f6 83 c4 1c 5b 89 f0 5e 5f 5d c3 66 90 8b 7d 08 8b 45 ---[ end trace 0002deba6d00a28c ]--- This i686 laptop is running 4.14.4-1.0-ARCH [0] and usually runs just fine, although memory pressure is usually quite high due to some Java program running on that machine. For some reason the system was even busier today, commands would take a long time to complete, and I rebooted the machine. Shortly after boot (and after starting this Java program again), the warning above happened. I couldn't find this exact message in the archives, the closest thing I found was (mentioning that "EIP:copy_page_to_iter" message): > 4879b7ae05 ("Merge tag 'dmaengine-4.12-rc1' of .."): WARNING: kernel > stack regs at bd92bc2e in 01-cpu-hotplug:3811 has bad 'bp' value 01be > https://patchwork.kernel.org/patch/9981273/ The XFS file system is mounted with: > XFS (dm-2): EXPERIMENTAL reverse mapping btree feature enabled. Use at your > own risk! > XFS (dm-2): EXPERIMENTAL reflink feature enabled. Use at your own risk! But I did not experience any problems with that, yet :) Full dmesg & .config: http://nerdbynature.de/bits/4.14/ Any pointers? Thanks, Christian. 
$ mount | grep xfs
/dev/mapper/opt on /opt type xfs (rw,nosuid,nodev,relatime,attr2,inode64,noquota)
$ xfs_info /opt/
meta-data=/dev/mapper/opt        isize=512    agcount=4, agsize=9079797 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1 spinodes=0 rmapbt=1
         =                       reflink=1
data     =                       bsize=4096   blocks=36319185, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal               bsize=4096   blocks=17733, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
[0] https://mirror.archlinux32.org/i686/core/linux-4.14.4-1.0-i686.pkg.tar.xz
-- BOFH excuse #413: Cow-tippers tipped a cow onto the server.
Re: swap_info_get: Bad swap offset entry 0200f8a7
On Fri, 20 Oct 2017, huang ying wrote: > > 4 May < Linux version 4.11.2-1-ARCH > > 4 Jun < Linux version 4.11.3-1-ARCH > > 7 Jul < Linux version 4.11.9-1-ARCH > > 4 Aug < Linux version 4.12.8-2-ARCH > > 24 Sep < Linux version 4.12.13-1-ARCH > > 158 Oct < Linux version 4.13.5-1-ARCH > > So you have never seen this before 4.11 like 4.10? Unfortunately the kernel logs for that machine only go back to May 2017, so I cannot tell whether it happened before that. I've seen these messages appear since then but didn't bother much. But as it now happens more frequently, I thought I should mention this to the list. > Which operations will trigger this error messages? I'm not able to reproduce it at will, but I suspect that memory pressure triggers these messages. The machine in question is a Lenovo Ideapad S10 notebook running 24x7 and is equipped with 1 GB of RAM. Two Java processes are basically using up all the memory, so usually it looks like this: $ free -m total used free shared buff/cache available Mem: 994 866 67 1 60 20 Swap: 760 437 322 $ zramctl NAME ALGORITHM DISKSIZE DATA COMPR TOTAL STREAMS MOUNTPOINT /dev/zram0 lz4 248.7M 247M 92.3M 97.4M 2 [SWAP] I just assumed the message is triggered when the system is really low on memory and maybe zram is too slow to provide the memory requested. 
But that's just my layman's assumption :-) For example, today's message was emitted during the night: Oct 20 01:26:18 len kernel: [638973.207849] swap_info_get: Bad swap offset entry 0200f8a7 And here are the sysstat numbers for that time frame: $ sar -r -s 00:00 -e 02:00 Linux 4.13.5-1-ARCH (len) 10/20/2017 _i686_ (2 CPU) 12:00:01 AM kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit kbactive kbinact kbdirty 12:10:06 AM 70076 948404 93.12 4 19004 1556176 86.58 376608 379408 220 12:20:02 AM 80488 937992 92.10 4 180404 1563952 87.01 380184 327736 5568 12:30:03 AM 83296 935184 91.82 4 137260 1569776 87.34 32951233 280 12:40:03 AM 65188 953292 93.60 4 21156 1571048 87.41 386644 389820 1144 12:50:03 AM 67512 950968 93.37 4 33452 1570628 87.38 378936 381580 1304 01:00:07 AM 65520 952960 93.57 4 24996 1573180 87.53 385396 386152 904 01:10:03 AM 66956 951524 93.43 4 35520 1572696 87.50 379548 379364 172 01:20:02 AM 67440 951040 93.38 4 88736 1569864 87.34 381764 370472 7080 01:30:03 AM 70048 948432 93.12 4 29212 1572504 87.49 383516 381900 1832 01:40:04 AM 71532 946948 92.98 4 29220 1570096 87.35 380120 380284 1000 01:50:03 AM 65828 952652 93.54 4 34408 1570604 87.38 381040 381028 1604 Average: 70353 948127 93.09 4 57579 1569139 87.30 376661 371613 1919 == If that is unreadable, here it is again: https://paste.debian.net/991927/ > Is it possible for you to check > whether the error exists for normal swap device (not ZRAM)? I have "normal" (but encrypted) swap configured, but with a lower priority: cat /proc/swaps Filename Type Size Used Priority /dev/dm-0 partition 524284 194348 0 /dev/zram0 partition 254616 253536 32767 I shall disable the zram device and disable encryption too and will report back if the message appears again. > 32bit or 64bit kernel do you use? I'm using an i686 kernel for this Atom N270 processor (with HT enabled). Thanks for your response, Christian. -- BOFH excuse #403: Sysadmin didn't hear pager go off due to loud music from bar-room speakers.
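Since the kernel fills swap areas with a higher priority before touching lower-priority ones, the /proc/swaps listing above can be read as "zram first, the dm-0 partition as overflow". A small sketch in Python that parses such a listing and orders it by priority is shown below; the function names `parse_swaps` and `fill_percent` are made up for this illustration, and the sample text is the listing from this message with the column spacing restored.

```python
def parse_swaps(text):
    """Parse /proc/swaps-style text into dicts, highest priority first."""
    entries = []
    for line in text.strip().splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) != 5:
            continue  # ignore anything that does not look like a swap entry
        filename, kind, size, used, priority = fields
        entries.append({
            "filename": filename,
            "type": kind,
            "size_kb": int(size),
            "used_kb": int(used),
            "priority": int(priority),
        })
    return sorted(entries, key=lambda e: e["priority"], reverse=True)


def fill_percent(entry):
    """How full a swap area is, in percent."""
    return 100.0 * entry["used_kb"] / entry["size_kb"]


SAMPLE = """\
Filename Type Size Used Priority
/dev/dm-0 partition 524284 194348 0
/dev/zram0 partition 254616 253536 32767
"""

if __name__ == "__main__":
    for e in parse_swaps(SAMPLE):
        print(e["filename"], e["priority"], round(fill_percent(e), 1))
```

On a live system, `parse_swaps(open("/proc/swaps").read())` would give the current state. In the sample, the zram device is almost completely full while the partition is well under half used, which matches the priorities.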
swap_info_get: Bad swap offset entry 0200f8a7
Hi,

every now and then (and more frequently now) I receive the following message on this Atom N270 netbook:

swap_info_get: Bad swap offset entry 0200f8a7

This started to show up a few months ago but appears to happen more frequently now:

   4 May < Linux version 4.11.2-1-ARCH
   4 Jun < Linux version 4.11.3-1-ARCH
   7 Jul < Linux version 4.11.9-1-ARCH
   4 Aug < Linux version 4.12.8-2-ARCH
  24 Sep < Linux version 4.12.13-1-ARCH
 158 Oct < Linux version 4.13.5-1-ARCH

I've only found (very) old reports for this[0][2] with either no solution[1] or some hinting that this may be caused by hardware errors. In my case, however, no kernel BUG messages or oopses are involved and no PTE errors are logged. The machine appears to be very stable, although memory usage is quite high on that machine (but no OOM situations so far either).

As the machine is only equipped with 1GB of RAM, I'm using ZRAM on this system, which usually looks something like this:

$ zramctl
NAME       ALGORITHM DISKSIZE DATA   COMPR TOTAL STREAMS MOUNTPOINT
/dev/zram0 lz4         248.7M 195.7M 74M   78.7M       2 [SWAP]

I suspect that, when memory pressure is high, zram may not be quick enough to decompress a page, leading to these messages, but then I'd have expected a zram error message too.

Can anybody comment on these messages? If they're really indicating a hardware error, shouldn't there be other messages too? So far, rasdaemon has not logged any errors.

Thanks,
Christian.

[0] http://lkml.iu.edu/hypermail/linux/kernel/0204.3/0165.html
[1] https://bugzilla.redhat.com/show_bug.cgi?id=432337
[2] https://access.redhat.com/solutions/218733
--
BOFH excuse #323: Your processor has processed too many instructions. Turn it off immediately, do not type any commands!!
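A per-month tally like the one above can be produced with a few lines of scripting. This is a minimal sketch, assuming syslog-style lines that begin with a month abbreviation. The sample lines, the host name and the function name are made up for illustration and are not taken from the real log.

```python
from collections import Counter

MESSAGE = "swap_info_get: Bad swap offset entry"


def count_per_month(lines, needle=MESSAGE):
    """Count log lines containing `needle`, keyed by the leading month name."""
    counts = Counter()
    for line in lines:
        if needle in line:
            fields = line.split()
            if fields:
                counts[fields[0]] += 1
    return counts


# Hypothetical sample input (NOT the real log of the machine).
sample = [
    "Sep 30 11:02:13 host kernel: [  100.000000] swap_info_get: Bad swap offset entry 0200f8a7",
    "Oct 19 23:10:45 host kernel: [  200.000000] swap_info_get: Bad swap offset entry 0200f8a7",
    "Oct 20 01:26:18 host kernel: [  300.000000] swap_info_get: Bad swap offset entry 0200f8a7",
    "Oct 20 01:27:00 host kernel: [  301.000000] usb 1-1: new high-speed USB device",
]

for month, n in count_per_month(sample).items():
    print(n, month)
```

Keying on the first field keeps the sketch independent of the year, which classic syslog timestamps do not carry anyway.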
Re: [Kernel.org Helpdesk #40777] Re: Linux 4.12-rc1 (file locations)
On Mon, 15 May 2017, Konstantin Ryabitsev via RT wrote:
> On 2017-05-15 14:34:56, francoisvalen...@gmail.com wrote:
> > It doesn't work with Firefox-53.0. After quite a long time while firefox
> > uses 100% of CPU, I finally get a text file and not a gzip file of the
> > patch for 4.12-rc1. It was almost instantaneous previously. I don't see
> > this as a progress.
>
> Firefox will request a gzip version of the patch, download it and then ungzip
> it for you and display it in the browser. If you'd rather not display
> that, please use a commandline tool like wget or curl to get the patch.

Yeah, same here: clicking on 4.12-rc2/patch on the kernel.org main page makes Firefox 53 freeze for a few minutes, and then display (!) the text file (85 MB!) in full. Wow.

> We are trying to identify who are the people who still need to download
> patches as opposed to using git directly, and what their use-case

I never use the links on the kernel.org main page to download patches, but still: can those be changed to say ".gz" or something, so that $browser won't _display_ it by default but _download_ it instead?

Thanks,
Christian.
--
BOFH excuse #435: Internet shut down due to maintenance
Re: [PATCH v4 2/2] procfs/tasks: add a simple per-task procfs hidepid= field
On Mon, 13 Feb 2017, Kees Cook wrote:
> Okay, cool. Thanks. (Also, where does "setpriv" live? I must need a
> new set of util-linux or something?)

Indeed, a newer version of util-linux[0] should do, although Debian/testing appears to have an extra package just for "setpriv":

https://packages.debian.org/stretch/setpriv

C.

[0] https://git.kernel.org/cgit/utils/util-linux/util-linux.git/commit/?id=5600c40
--
BOFH excuse #65: system needs to be rebooted
alg: comp: Compression test 1 failed for lz4
Hi,

while the LZ4 and LZ4HC modules appear to be available on PowerPC 32-bit (it's a PowerBook G4), I get the warnings below during module load. The lzo module is working just fine and I'm using it extensively for ZRAM on that machine.

Is this a configuration[0] error or is LZ4 just not supported for this architecture?

Thanks,
Christian.

[0] http://nerdbynature.de/bits/4.10-rc7/config_4.10-rc7.txt - but I get the same messages with a stock Debian/4.9 configuration too.

$ modprobe lz4
alg: comp: Compression test 1 failed for lz4-generic
: f0 10 4a 6f 69 6e 20 75 73 20 6e 6f 77 20 61 6e
0010: 64 20 73 68 61 72 65 20 74 68 65 20 73 6f 66 74
0020: 77 00 0d 0f 00 23 0b 50 77 61 72 65 20
alg: acomp: Compression test 1 failed for lz4-scomp
: f0 10 4a 6f 69 6e 20 75 73 20 6e 6f 77 20 61 6e
0010: 64 20 73 68 61 72 65 20 74 68 65 20 73 6f 66 74
0020: 77 00 0d 0f 00 23 0b 50 77 61 72 65 20

$ modprobe lz4hc
alg: comp: Compression test 1 failed for lz4hc-generic
: f0 10 4a 6f 69 6e 20 75 73 20 6e 6f 77 20 61 6e
0010: 64 20 73 68 61 72 65 20 74 68 65 20 73 6f 66 74
0020: 77 00 0d 0f 00 23 0b 50 77 61 72 65 20
alg: acomp: Compression test 1 failed for lz4hc-scomp
: f0 10 4a 6f 69 6e 20 75 73 20 6e 6f 77 20 61 6e
0010: 64 20 73 68 61 72 65 20 74 68 65 20 73 6f 66 74
0020: 77 00 0d 0f 00 23 0b 50 77 61 72 65 20
--
BOFH excuse #174: Backbone adjustment
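For anyone who wants to look at the dumped bytes more closely, here is a minimal sketch of an LZ4 block decoder. The LZ4 block format stores each match offset as a 16-bit little-endian value. The `byteorder` parameter and the function name are additions of this sketch so that both readings can be tried. It is a reading aid under those assumptions, not the kernel's implementation.

```python
def lz4_block_decode(src, byteorder="little"):
    """Decode one raw LZ4 block; `byteorder` selects how match offsets are read."""
    out = bytearray()
    i = 0
    while i < len(src):
        token = src[i]
        i += 1

        # Literal run: high nibble, extended by extra bytes while they are 255.
        lit_len = token >> 4
        if lit_len == 15:
            while True:
                extra = src[i]
                i += 1
                lit_len += extra
                if extra != 255:
                    break
        out += src[i:i + lit_len]
        i += lit_len
        if i >= len(src):
            break  # the last sequence carries literals only

        # Match: 2-byte offset, then low nibble (+4) as the match length.
        offset = int.from_bytes(src[i:i + 2], byteorder)
        i += 2
        match_len = token & 0x0F
        if match_len == 15:
            while True:
                extra = src[i]
                i += 1
                match_len += extra
                if extra != 255:
                    break
        match_len += 4
        if offset == 0 or offset > len(out):
            raise ValueError("match offset %d outside decoded data" % offset)
        for _ in range(match_len):
            out.append(out[-offset])  # byte-wise copy, overlapping matches allowed
    return bytes(out)


# Bytes exactly as dumped for lz4-generic in the message above.
dump = bytes.fromhex(
    "f0 10 4a 6f 69 6e 20 75 73 20 6e 6f 77 20 61 6e "
    "64 20 73 68 61 72 65 20 74 68 65 20 73 6f 66 74 "
    "77 00 0d 0f 00 23 0b 50 77 61 72 65 20"
)

print(lz4_block_decode(dump, "big"))
```

With this dump, reading the offsets as big-endian yields the sentence "Join us now and share the software " twice, while the little-endian reading that the format calls for fails on the first match (offset 0x0d00 points before the start of the data). That suggests, but does not prove, that the compressor on this big-endian machine wrote the offsets in native byte order.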
Re: [PATCH RFC] powerpc/32: fix handling of stack protector with recent GCC
On Mon, 16 Jan 2017, Christophe Leroy wrote:
> Since 2005, powerpc GCC doesn't manage anymore __stack_chk_guard as
> a global variable but as some value located at -0x7008(r2)

Is this still an "RFC" or is there a chance that this will land in 4.10?

Thanks,
Christian.

> In the Linux kernel, r2 is used as a pointer to current task struct.
>
> This patch changes the meaning of r2 when stack protector
> is activated:
> - current is taken from thread_info and not kept in r2 anymore
> - r2 is set to current + offset of stack canary + 0x7008 so
> that -0x7008(r2) directly points to current->stack_canary
>
> current could have been more efficiently calculated from r2
> but some circular inclusion prevent inserting struct task_struct
> into arch/powerpc/include/asm/current.h so it is not possible
> to get offset of stack_canary within current task_struct from there.
>
> fixes: 6533b7c16ee57 ("powerpc: Initial stack protector
> (-fstack-protector) support")
> Reported-by: Christian Kujau <li...@nerdbynature.de>
>
> Signed-off-by: Christophe Leroy <christophe.le...@c-s.fr>
> ---
> Christian, can you test it ?
>
> arch/powerpc/include/asm/current.h| 12 +++-
> arch/powerpc/include/asm/stackprotector.h | 13 +
> arch/powerpc/kernel/entry_32.S| 19 +++
> arch/powerpc/kernel/head_32.S | 7 +++
> arch/powerpc/kernel/head_40x.S| 4
> arch/powerpc/kernel/head_44x.S| 4
> arch/powerpc/kernel/head_8xx.S| 4
> arch/powerpc/kernel/head_fsl_booke.S | 7 +++
> arch/powerpc/kernel/process.c | 6 --
> 9 files changed, 61 insertions(+), 15 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/current.h
> b/arch/powerpc/include/asm/current.h
> index e2c7f06..2f67f02 100644
> --- a/arch/powerpc/include/asm/current.h
> +++ b/arch/powerpc/include/asm/current.h
> @@ -27,8 +27,16 @@ static inline struct task_struct *get_current(void)
> }
> #define current get_current()
>
> -#else
> +#else /* __powerpc64__ */
> +#if defined(CONFIG_CC_STACKPROTECTOR)
> +#include
>
> +static inline struct task_struct *get_current(void)
> +{
> + return current_thread_info()->task;
> +}
> +#define current get_current()
> +#else
> /*
> * We keep `current' in r2 for speed.
> */
> @@ -36,5 +44,7 @@ register struct task_struct *current asm ("r2");
>
> #endif
>
> +#endif /* __powerpc64__ */
> +
> #endif /* __KERNEL__ */
> #endif /* _ASM_POWERPC_CURRENT_H */
> diff --git a/arch/powerpc/include/asm/stackprotector.h
> b/arch/powerpc/include/asm/stackprotector.h
> index 6720190..bf30509 100644
> --- a/arch/powerpc/include/asm/stackprotector.h
> +++ b/arch/powerpc/include/asm/stackprotector.h
> @@ -12,12 +12,18 @@
> #ifndef _ASM_STACKPROTECTOR_H
> #define _ASM_STACKPROTECTOR_H
>
> +#ifdef CONFIG_PPC64
> +#define SSP_OFFSET 0x7010
> +#else
> +#define SSP_OFFSET 0x7008
> +#endif
> +
> +#ifndef __ASSEMBLY__
> +
> #include
> #include
> #include
>
> -extern unsigned long __stack_chk_guard;
> -
> /*
> * Initialize the stackprotector canary value.
> *
> @@ -34,7 +40,6 @@ static __always_inline void boot_init_stack_canary(void)
> canary ^= LINUX_VERSION_CODE;
>
> current->stack_canary = canary;
> - __stack_chk_guard = current->stack_canary;
> }
> -
> +#endif /* __ASSEMBLY__ */
> #endif /* _ASM_STACKPROTECTOR_H */
> diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
> index 5742dbd..b3a363c 100644
> --- a/arch/powerpc/kernel/entry_32.S
> +++ b/arch/powerpc/kernel/entry_32.S
> @@ -34,6 +34,7 @@
> #include
> #include
> #include
> +#include
>
> /*
> * MSR_KERNEL is > 0x1 on 4xx/Book-E since it include MSR_CE.
> @@ -149,6 +150,9 @@ transfer_to_handler:
> mfspr r12,SPRN_SPRG_THREAD
> addi r2,r12,-THREAD
> tovirt(r2,r2) /* set r2 to current */
> +#if defined(CONFIG_CC_STACKPROTECTOR)
> + addi r2,r2,TSK_STACK_CANARY+SSP_OFFSET
> +#endif
> beq 2f /* if from user, fix up THREAD.regs */
> addi r11,r1,STACK_FRAME_OVERHEAD
> stw r11,PT_REGS(r12)
> @@ -385,6 +389,9 @@ syscall_exit_cont:
> lwz r3,GPR3(r1)
> 1:
> #endif /* CONFIG_TRACE_IRQFLAGS */
> +#if defined(CONFIG_CC_STACKPROTECTOR)
> + subi r2,r2,TSK_STACK_CANARY+SSP_OFFSET
> +#endif
> #if defined(CONFIG_4xx) || defined(CONFIG_BOOKE)
> /* If the process has its own DBCR0 value, load it up. The internal
> debug mode bit tells us t
btrfs_alloc_tree_block: Faulting instruction address: 0xc02d4584
Hi,

after upgrading this powerpc32 box from 4.10-rc2 to -rc4, the message below occurred a few hours after boot. Full dmesg and .config: http://nerdbynature.de/bits/4.10-rc4/

Any ideas?

Thanks,
Christian.

Faulting instruction address: 0xc02d4584
Oops: Kernel access of bad area, sig: 11 [#1]
PowerMac
Modules linked in: ecb xt_tcpudp iptable_filter ip_tables x_tables nfnetlink_log nfnetlink sha256_generic twofish_generic twofish_common usb_storage therm_adt746x loop i2c_powermac arc4 firewire_sbp2 b43 rng_core ssb bcma mac80211 cfg80211 ecryptfs [last unloaded: nbd]
CPU: 0 PID: 1395 Comm: btrfs-transacti Tainted: GW 4.10.0-rc4-1-gab8184b #1
task: ee7162e0 task.stack: ee9cc000
NIP: c02d4584 LR: c02d4574 CTR: c00d0df0
REGS: ee9cdaa0 TRAP: 0300 Tainted: GW (4.10.0-rc4-1-gab8184b)
MSR: 9032 CR: 24422248 XER:
DAR: 10581054 DSISR: 4200
GPR00: c02d4574 ee9cdb50 ee71d4b8 10581050 0001 dbc88118 0020
GPR08: 1205 0004 24422444 93c3
GPR16: f000 0001 ee260800
GPR24: 0001 ee9cdc1b eef5c1a0 1000 ee47 dbc88118 ee470170
NIP [c02d4584] btrfs_alloc_tree_block+0x18c/0x5c4
LR [c02d4574] btrfs_alloc_tree_block+0x17c/0x5c4
Call Trace:
[ee9cdb50] [c02d4574] btrfs_alloc_tree_block+0x17c/0x5c4 (unreliable)
[ee9cdbf0] [c02b86d4] __btrfs_cow_block+0x110/0x638
[ee9cdc70] [c02b8d74] btrfs_cow_block+0xdc/0x1b0
[ee9cdca0] [c02bc48c] btrfs_search_slot+0x1c0/0x904
[ee9cdd10] [c02dc680] btrfs_lookup_inode+0x3c/0x124
[ee9cdd50] [c02ec204] btrfs_update_inode_item+0x4c/0x10c
[ee9cdd80] [c02d05e4] cache_save_setup+0xc0/0x400
[ee9cdde0] [c02d4d54] btrfs_start_dirty_block_groups+0x184/0x47c
[ee9cde50] [c02e7e84] btrfs_commit_transaction+0x148/0xac4
[ee9cdeb0] [c02e313c] transaction_kthread+0x1d0/0x1ec
[ee9cdf00] [c004f1fc] kthread+0xf8/0x124
[ee9cdf40] [c0011480] ret_from_kernel_thread+0x5c/0x64
--- interrupt: 0 at (null)
LR = (null)
Instruction dump:
4800b3ed 7f838040 7c7e1b78 419d0430 806300d4 81db 81fb0004 4bdfe2b9
3924 7ee6bb78 38630050 7fc5f378 <7dc91d2c> 7de01d2c 809501cf 807501cb
---[ end trace 937683537ecd986b ]---
--
BOFH excuse #342: HTTPD Error 4004 : very old Intel cpu - insufficient processing power
Re: [PATCH RFC] powerpc/32: fix handling of stack protector with recent GCC
On Mon, 16 Jan 2017, Christophe Leroy wrote:
> Christian, can you test it ?

OK, so with that applied to v4.10-rc4, compilation still fails with GCC 4.9.2 and CC_STACKPROTECTOR_STRONG=y, see below. But it compiles just fine with CC_STACKPROTECTOR_REGULAR=y and boots too!

Cross-compiling the same with GCC 5.2.0 works, even for CC_STACKPROTECTOR_STRONG=y, and the system boots just fine.

So, with that limitation, feel free to add:

Tested-by: Christian Kujau <li...@nerdbynature.de>

Thanks for the fix!
Christian.

$ gcc --version | head -1
gcc-4.9.real (Debian 4.9.2-10) 4.9.2

$ grep CC_STACKPROTECTOR_STRONG $DIR/.config
CONFIG_CC_STACKPROTECTOR_STRONG=y

$ make O=$DIR V=1 bindeb-pkg
[...]
+ ld -EB -m elf32ppc -Bstatic --build-id -X -o .tmp_vmlinux1 -T ./arch/powerpc/kernel/vmlinux.lds arch/powerpc/kernel/head_32.o arch/powerpc/kernel/fpu.o arch/powerpc/kernel/vector.o arch/powerpc/kernel/prom_init.o init/built-in.o --start-group usr/built-in.o arch/powerpc/kernel/built-in.o arch/powerpc/mm/built-in.o arch/powerpc/lib/built-in.o arch/powerpc/sysdev/built-in.o arch/powerpc/platforms/built-in.o arch/powerpc/math-emu/built-in.o arch/powerpc/crypto/built-in.o arch/powerpc/net/built-in.o kernel/built-in.o certs/built-in.o mm/built-in.o fs/built-in.o ipc/built-in.o security/built-in.o crypto/built-in.o block/built-in.o lib/lib.a lib/built-in.o drivers/built-in.o sound/built-in.o firmware/built-in.o net/built-in.o virt/built-in.o --end-group
arch/powerpc/platforms/built-in.o: In function `bootx_printf':
/usr/local/src/linux-git/arch/powerpc/platforms/powermac/bootx_init.c:88: undefined reference to `__stack_chk_fail_local'
arch/powerpc/platforms/built-in.o: In function `bootx_add_display_props':
/usr/local/src/linux-git/arch/powerpc/platforms/powermac/bootx_init.c:211: undefined reference to `__stack_chk_fail_local'
arch/powerpc/platforms/built-in.o: In function `bootx_scan_dt_build_struct':
/usr/local/src/linux-git/arch/powerpc/platforms/powermac/bootx_init.c:350: undefined reference to `__stack_chk_fail_local'
arch/powerpc/platforms/built-in.o: In function `bootx_init':
/usr/local/src/linux-git/arch/powerpc/platforms/powermac/bootx_init.c:596: undefined reference to `__stack_chk_fail_local'
/usr/bin/ld.bfd.real: .tmp_vmlinux1: hidden symbol `__stack_chk_fail_local' isn't defined
/usr/bin/ld.bfd.real: final link failed: Bad value
--
BOFH excuse #66: bit bucket overflow
DEBUG_LOCKS_WARN_ON(1) / lockdep.c:3134 lockdep_init_map+0x1e8/0x1f0
Hi,

booting v4.10-rc2 on this PowerPC G4 machine prints the following early on, but then continues to boot and the machine is running fine so far:

BUG: key ef0ba7d0 not in .data!
DEBUG_LOCKS_WARN_ON(1)
[ cut here ]
WARNING: CPU: 0 PID: 1 at /usr/local/src/linux-git/kernel/locking/lockdep.c:3134 lockdep_init_map+0x1e8/0x1f0
Modules linked in:
CPU: 0 PID: 1 Comm: swapper Not tainted 4.10.0-rc2 #4
task: ef04aa60 task.stack: ef042000
NIP: c005eb78 LR: c005eb78 CTR:
REGS: ef043d70 TRAP: 0700 Not tainted (4.10.0-rc2)
MSR: 02029032 CR: 4822 XER: 2000
GPR00: c005eb78 ef043e20 ef04aa60 0016 0001 c0068b24 0001
GPR08: 4ead 00d6 2824 c00047f0
GPR16: c08b9280 effedfa0 c078d6ac c107 c078eddc c078ee14
GPR24: c078ed24 c078ee24 0002 ef085a00 c08b ef0ba7d0 ef0ba7b4
NIP [c005eb78] lockdep_init_map+0x1e8/0x1f0
LR [c005eb78] lockdep_init_map+0x1e8/0x1f0
Call Trace:
[ef043e20] [c005eb78] lockdep_init_map+0x1e8/0x1f0 (unreliable)
[ef043e40] [c083adb4] kw_i2c_add+0xc0/0x134
[ef043e60] [c083b29c] pmac_i2c_init+0x3b8/0x518
[ef043ea0] [c00040c0] do_one_initcall+0x40/0x174
[ef043f00] [c0834064] kernel_init_freeable+0x134/0x1cc
[ef043f30] [c0004808] kernel_init+0x18/0x110
[ef043f40] [c0010ad8] ret_from_kernel_thread+0x5c/0x64
Instruction dump:
4837259d 2f83 41befec0 3d20c08b 812953a0 2f89 409efeb0 3c60c079
3c80c07b 3884b48c 3863f9e4 4860d52d <0fe0> 4bfffe94 9421ff70 7c0802a6
---[ end trace 8a79d8041d87d000 ]---

Full dmesg and .config: http://nerdbynature.de/bits/4.10-rc2/

Thanks for listening,
Christian.
--
BOFH excuse #409: The vulcan-death-grip ping has been applied.
Re: [PATCH v3 1/3] siphash: add cryptographically secure hashtable function
On Thu, 15 Dec 2016, Jason A. Donenfeld wrote: > > I'd still drop the "24" unless you really think we're going to have > > multiple variants coming into the kernel. > > Okay. I don't have a problem with this, unless anybody has some reason > to the contrary. What if the 2/4-round version falls and we need more rounds to withstand future cryptanalysis? We'd then have siphash_ and siphash48_ functions, no? My amateurish bike-shedding argument would be "let's keep the 24 then" :-) C. -- BOFH excuse #354: Chewing gum on /dev/sd3c
Re: Locking API testsuite output mangled
On Wed, 23 Nov 2016, Michael Ellerman wrote: > That's nothing powerpc specific AFAICS, does this fix it? Hm, so s/printk/pr_cont/ - but not in all places? But yeah, this fixes it for me, at least on x86. Tested-by: Christian Kujau <li...@nerdbynature.de> Thank you! Christian. > > cheers > > diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c > index 872a15a2a637..f3a217ea0388 100644 > --- a/lib/locking-selftest.c > +++ b/lib/locking-selftest.c > @@ -980,23 +980,23 @@ static void dotest(void (*testcase_fn)(void), int > expected, int lockclass_mask) > #ifndef CONFIG_PROVE_LOCKING > if (expected == FAILURE && debug_locks) { > expected_testcase_failures++; > - printk("failed|"); > + pr_cont("failed|"); > } > else > #endif > if (debug_locks != expected) { > unexpected_testcase_failures++; > - printk("FAILED|"); > + pr_cont("FAILED|"); > > dump_stack(); > } else { > testcase_successes++; > - printk(" ok |"); > + pr_cont(" ok |"); > } > testcase_total++; > > if (debug_locks_verbose) > - printk(" lockclass mask: %x, debug_locks: %d, expected: %d\n", > + pr_cont(" lockclass mask: %x, debug_locks: %d, expected: %d\n", > lockclass_mask, debug_locks, expected); > /* >* Some tests (e.g. 
double-unlock) might corrupt the preemption > @@ -1021,26 +1021,26 @@ static inline void print_testname(const char > *testname) > #define DO_TESTCASE_1(desc, name, nr)\ > print_testname(desc"/"#nr); \ > dotest(name##_##nr, SUCCESS, LOCKTYPE_RWLOCK); \ > - printk("\n"); > + pr_cont("\n"); > > #define DO_TESTCASE_1B(desc, name, nr) \ > print_testname(desc"/"#nr); \ > dotest(name##_##nr, FAILURE, LOCKTYPE_RWLOCK); \ > - printk("\n"); > + pr_cont("\n"); > > #define DO_TESTCASE_3(desc, name, nr)\ > print_testname(desc"/"#nr); \ > dotest(name##_spin_##nr, FAILURE, LOCKTYPE_SPIN); \ > dotest(name##_wlock_##nr, FAILURE, LOCKTYPE_RWLOCK);\ > dotest(name##_rlock_##nr, SUCCESS, LOCKTYPE_RWLOCK);\ > - printk("\n"); > + pr_cont("\n"); > > #define DO_TESTCASE_3RW(desc, name, nr) \ > print_testname(desc"/"#nr); \ > dotest(name##_spin_##nr, FAILURE, LOCKTYPE_SPIN|LOCKTYPE_RWLOCK);\ > dotest(name##_wlock_##nr, FAILURE, LOCKTYPE_RWLOCK);\ > dotest(name##_rlock_##nr, SUCCESS, LOCKTYPE_RWLOCK);\ > - printk("\n"); > + pr_cont("\n"); > > #define DO_TESTCASE_6(desc, name)\ > print_testname(desc); \ > @@ -1050,7 +1050,7 @@ static inline void print_testname(const char *testname) > dotest(name##_mutex, FAILURE, LOCKTYPE_MUTEX); \ > dotest(name##_wsem, FAILURE, LOCKTYPE_RWSEM); \ > dotest(name##_rsem, FAILURE, LOCKTYPE_RWSEM); \ > - printk("\n"); > + pr_cont("\n"); > > #define DO_TESTCASE_6_SUCCESS(desc, name)\ > print_testname(desc); \ > @@ -1060,7 +1060,7 @@ static inline void print_testname(const char *testname) > dotest(name##_mutex, SUCCESS, LOCKTYPE_MUTEX); \ > dotest(name##_wsem, SUCCESS, LOCKTYPE_RWSEM); \ > dotest(name##_rsem, SUCCESS, LOCKTYPE_RWSEM); \ > - printk("\n"); > + pr_cont("\n"); > > /* > * 'read' variant: rlocks must not trigger. 
> @@ -1073,7 +1073,7 @@ static inline void print_testname(const char *testname) > dotest(name##_mutex, FAILURE, LOCKTYPE_MUTEX); \ > dotest(name##_wsem, FAILURE, LOCKTYPE_RWSEM); \ > dotest(name##_rsem, FAILURE, LOCKTYPE_RWSEM); \ > - printk("\n"); > + pr_cont("\n"); > > #define DO_TESTCASE_2I(desc, name, nr) \ > DO_TESTCASE_1("hard-"desc, name##_hard, nr);\ > @@ -1726,25 +1726,25 @@ static void ww_tests(void) > dotest(ww_test_fail_acquire, S
Locking API testsuite output mangled
The "Locking API testsuite" output during bootup (with CONFIG_DEBUG_LOCKING_API_SELFTESTS=y) on this PowerPC system looks mangled, possibly related to the recent printk changes (4bcc595ccd80, "printk: reinstate KERN_CONT for printing continuation lines"). Before (e.g. with v4.6) it looked like this: http://nerdbynature.de/bits/4.6.0-rc7/dmesg.txt See below for the current output. Christian. [0.001417] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar [0.001439] ... MAX_LOCKDEP_SUBCLASSES: 8 [0.001453] ... MAX_LOCK_DEPTH: 48 [0.001467] ... MAX_LOCKDEP_KEYS:8191 [0.001482] ... CLASSHASH_SIZE: 4096 [0.001497] ... MAX_LOCKDEP_ENTRIES: 32768 [0.001511] ... MAX_LOCKDEP_CHAINS: 65536 [0.001526] ... CHAINHASH_SIZE: 32768 [0.001541] memory used by lock dependency info: 5167 kB [0.001557] per task-struct memory footprint: 1536 bytes [0.001574] [0.001587] | Locking API testsuite: [0.001600] [0.001622] | spin |wlock |rlock |mutex | wsem | rsem | [0.001644] -- [0.001681] A-A deadlock: [0.001705] ok | [0.003198] ok | [0.004555] ok | [0.005962] ok | [0.007307] ok | [0.008647] ok | [0.010015] A-B-B-A deadlock: [0.010045] ok | [0.011401] ok | [0.012736] ok | [0.014116] ok | [0.015458] ok | [0.016812] ok | [0.018175] A-B-B-C-C-A deadlock: [0.018212] ok | [0.019575] ok | [0.020916] ok | [0.022304] ok | [0.023654] ok | [0.025017] ok | [0.026382] A-B-C-A-B-C deadlock: [0.026419] ok | [0.027781] ok | [0.029122] ok | [0.030510] ok | [0.031860] ok | [0.033223] ok | [0.034587] A-B-B-C-C-D-D-A deadlock: [0.034633] ok | [0.036007] ok | [0.037356] ok | [0.038757] ok | [0.040118] ok | [0.041492] ok | [0.042859] A-B-C-D-B-D-D-A deadlock: [0.042905] ok | [0.044278] ok | [0.045628] ok | [0.047029] ok | [0.048388] ok | [0.049761] ok | [0.051130] A-B-C-D-B-C-D-A deadlock: [0.051176] ok | [0.052551] ok | [0.053901] ok | [0.055303] ok | [0.056665] ok | [0.058040] ok | [0.059408] double unlock: [0.059429] ok | [0.060774] ok | [0.062103] ok | [0.063469] ok | [0.064800] ok | 
[0.066145] ok | [0.067508] initialize held: [0.067527] ok | [0.068870] ok | [0.070198] ok | [0.071561] ok | [0.072892] ok | [0.074235] ok | [0.075596] bad unlock order: [0.075623] ok | [0.076979] ok | [0.078316] ok | [0.079691] ok | [0.081031] ok | [0.082387] ok | [0.083753] -- [0.083791] recursive read-lock: [0.083804] | [0.083830] ok | [0.085157] | [0.085183] ok | [0.086526]recursive read-lock #2: [0.086539] | [0.086564] ok | [0.087908] | [0.087936] ok | [0.089280] mixed read-write-lock: [0.089293] | [0.089320] ok | [0.090643] | [0.090672] ok | [0.092035] mixed write-read-lock: [0.092048] | [0.092075] ok | [0.093399] | [0.093428] ok | [0.094771] -- [0.094809] hard-irqs-on + irq-safe-A/12: [0.094829] ok | [0.096192] ok | [0.097523] ok | [0.098882] soft-irqs-on + irq-safe-A/12: [0.098904] ok | [0.100270] ok | [0.101602] ok | [0.102962] hard-irqs-on + irq-safe-A/21: [0.102982] ok | [0.104345] ok | [0.105678] ok | [0.107037] soft-irqs-on + irq-safe-A/21: [0.107058] ok | [0.108422] ok | [0.109754] ok | [0.12]sirq-safe-A => hirqs-on/12: [0.33] ok | [0.112498] ok | [0.113830] ok | [0.115189]sirq-safe-A => hirqs-on/21: [0.115209] ok | [0.116574] ok | [0.117907] ok | [0.119266] hard-safe-A + irqs-on/12: [0.119286] ok | [0.120649] ok | [0.121981] ok | [0.123341] soft-safe-A + irqs-on/12: [0.123362] ok | [0.124727] ok | [0.126061] ok | [0.127420]
jfs: mangled lockdep splat
For some time now, I always[0] receive a lockdep warning when there's some disk I/O on the system. But recently the warning looks kinda mangled, I suspect the recent printk change (4bcc595ccd80, "printk: reinstate KERN_CONT for printing continuation lines") to be the reason for that. In previous versions, the warning looked like this: http://nerdbynature.de/bits/4.6.0-rc7/dmesg.txt Below is the new warning, which is barely readable anymore. Of course, best would be for the warning to vanish (hehe) but maybe the printout could be fixed too? Thanks, Christian. [ 2401.254353] = [ 2401.254410] [ INFO: possible irq lock inversion dependency detected ] [ 2401.254469] 4.9.0-rc6 #1 Not tainted [ 2401.254506] - [ 2401.254560] kswapd0/282 just changed the state of lock: [ 2401.254620] ( [ 2401.254647] _ip->rdwrlock [ 2401.254685] #2 [ 2401.254698] ){-.} [ 2401.254730] , at: [ 2401.254764] [] jfs_get_block+0x50/0x370 [ 2401.254812] but this lock took another, RECLAIM_FS-unsafe lock in the past: [ 2401.254868] ( [ 2401.254890] _ip->commit_mutex [ 2401.254927] ){+.+.+.} [ 2401.254945] and interrupts could create inverse lock ordering between them. [ 2401.255041] other info that might help us debug this: [ 2401.255097] Possible interrupt unsafe locking scenario: [ 2401.255160]CPU0CPU1 [ 2401.255203] [ 2401.255243] lock( [ 2401.255273] _ip->commit_mutex [ 2401.255310] ); [ 2401.255334]local_irq_disable(); [ 2401.255381]lock( [ 2401.255420] _ip->rdwrlock [ 2401.255454] #2 [ 2401.255467] ); [ 2401.255494]lock( [ 2401.255536] _ip->commit_mutex [ 2401.255573] ); [ 2401.255596] [ 2401.255623] lock( [ 2401.255648] _ip->rdwrlock [ 2401.256059] #2 [ 2401.256071] ); [ 2401.256446] *** DEADLOCK *** [ 2401.257522] no locks held by kswapd0/282. 
[ 2401.257888] the shortest dependencies between 2nd lock and 1st lock: [ 2401.258622] -> [ 2401.258645] ( [ 2401.259014] _ip->commit_mutex [ 2401.259047] ){+.+.+.} [ 2401.259418] ops: 31698 [ 2401.259435] { [ 2401.259800] HARDIRQ-ON-W [ 2401.259829] at: [ 2401.260192] [ 2401.260236] [] lock_acquire+0x4c/0x68 [ 2401.260619] [ 2401.260657] [] mutex_lock_nested+0x38/0x2f8 [ 2401.261048] [ 2401.261108] [] jfs_create+0x88/0x2c4 [ 2401.261839] [ 2401.261996] [] path_openat+0xc1c/0x100c [ 2401.262689] [ 2401.262860] [] do_filp_open+0xb0/0x100 [ 2401.263639] [ 2401.263678] [] do_sys_open+0x154/0x21c [ 2401.264368] [ 2401.264411] [] ret_from_syscall+0x0/0x38 [ 2401.265070] SOFTIRQ-ON-W [ 2401.265099] at: [ 2401.265579] [ 2401.265751] [] lock_acquire+0x4c/0x68 [ 2401.266268] [ 2401.266437] [] mutex_lock_nested+0x38/0x2f8 [ 2401.266975] [ 2401.267135] [] jfs_create+0x88/0x2c4 [ 2401.267679] [ 2401.267837] [] path_openat+0xc1c/0x100c [ 2401.268394] [ 2401.268560] [] do_filp_open+0xb0/0x100 [ 2401.269104] [ 2401.269273] [] do_sys_open+0x154/0x21c [ 2401.269969] [ 2401.270040] [] ret_from_syscall+0x0/0x38 [ 2401.270727] RECLAIM_FS-ON-W [ 2401.270761] at: [ 2401.271279] [ 2401.271418] [] lockdep_trace_alloc+0x8c/0xe4 [ 2401.272017] [ 2401.272122] [] __kmalloc+0x40/0x14c [ 2401.272706] [ 2401.272839] [] __jfs_set_acl+0xa0/0x1a4 [ 2401.273428] [ 2401.273541] [] jfs_set_acl+0x50/0x9c [ 2401.274135] [ 2401.274267] [] posix_acl_chmod+0xf0/0x130 [ 2401.274824] [ 2401.274991] [] notify_change+0x1c4/0x42c [ 2401.275690] [ 2401.275727] [] chmod_common+0x74/0x10c [ 2401.276382] [ 2401.276419] [] SyS_fchmod+0x30/0x64 [ 2401.277090] [ 2401.277129] [] ret_from_syscall+0x0/0x38 [ 2401.277805] INITIAL USE [ 2401.277834] at: [ 2401.278352] [ 2401.278525] [] lock_acquire+0x4c/0x68 [ 2401.279054] [ 2401.279224] [] mutex_lock_nested+0x38/0x2f8 [ 2401.279783] [ 2401.279944] [] jfs_create+0x88/0x2c4 [ 2401.280528] [ 2401.280645] [] path_openat+0xc1c/0x100c [ 2401.281287] [ 2401.281358] [] 
do_filp_open+0xb0/0x100 [ 2401.282048] [ 2401.282084] [] do_sys_open+0x154/0x21c [ 2401.282840] [ 2401.282880] [] ret_from_syscall+0x0/0x38 [ 2401.283590] } [ 2401.284175] ... key at: [ 2401.284247] []
Re: [4.8-rc1] make bindeb-pkg O= fails
[re-send] On Mon, 8 Aug 2016, frank paulsen wrote: > in 4.8-rc1 "make bindeb-pkg O=../debian" fails: > | find: `scripts/gcc-plugins': No such file or directory > | /usr/src/linus/scripts/package/Makefile:97: recipe for target > 'bindeb-pkg' failed > > this is due to a missing directory scripts/gcc-plugins if using O= > > removing line 335 of scripts/package/builddeb helps: > | (cd $objtree; find scripts/gcc-plugins -name \*.so -o -name > gcc-common.h) >> "$objtree/debian/hdrobjfiles" > > this clearly isn't the right fix, but i checked it anyway and the > paket gets built. This was introduced in 6b90bd4ba40b38dc13c2782469c1c77e4ed79915 ("GCC plugin infrastructure"). Not failing hard when scripts/gcc-plugins cannot be found does the trick as well. But that too just papers over the issue. Hopefully Emese has a better idea on how to solve this :-) diff --git a/scripts/package/builddeb b/scripts/package/builddeb index e1c09e2..89757f6 100755 --- a/scripts/package/builddeb +++ b/scripts/package/builddeb @@ -332,7 +332,7 @@ if grep -q '^CONFIG_STACK_VALIDATION=y' $KCONFIG_CONFIG ; then (cd $objtree; find tools/objtool -type f -executable) >> "$objtree/debian/hdrobjfiles" fi (cd $objtree; find arch/$SRCARCH/include Module.symvers include scripts -type f) >> "$objtree/debian/hdrobjfiles" -(cd $objtree; find scripts/gcc-plugins -name \*.so -o -name gcc-common.h) >> "$objtree/debian/hdrobjfiles" +(cd $objtree; find scripts/gcc-plugins -name \*.so -o -name gcc-common.h) >> "$objtree/debian/hdrobjfiles" || true destdir=$kernel_headers_dir/usr/src/linux-headers-$version mkdir -p "$destdir" (cd $srctree; tar -c -f - -T -) < "$objtree/debian/hdrsrcfiles" | (cd $destdir; tar -xf -) Thanks, Christian. -- BOFH excuse #269: Melting hard drives -- make bzImage, not war
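A variant of the same idea, sketched here only as an illustration and not as a proposed patch: instead of swallowing every find failure with "|| true", test for the directory first, so that other errors still surface. The helper name list_gcc_plugin_files is hypothetical and does not exist in builddeb; the sketch assumes the directory is simply absent when the plugins were not built.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of scripts/package/builddeb.
# Prints the GCC plugin objects and gcc-common.h found below the object
# tree given as $1. If scripts/gcc-plugins does not exist there, it prints
# nothing and still returns success.
list_gcc_plugin_files() {
	if [ -d "$1/scripts/gcc-plugins" ]; then
		(cd "$1" && find scripts/gcc-plugins -name '*.so' -o -name gcc-common.h)
	fi
}
```

In builddeb this could be called as list_gcc_plugin_files "$objtree" >> "$objtree/debian/hdrobjfiles", again assuming the surrounding script stays as quoted above.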
Makefile.sphinx:17: The 'sphinx-build' command was not found
Hi, since 22cba31bae ("Documentation/sphinx: add basic working Sphinx configuration and build") the following warning is emitted when running "make help": $ make help > /dev/null Documentation/Makefile.sphinx:17: The 'sphinx-build' command was not found. Make sure you have Sphinx installed and in PATH, or set the SPHINXBUILD make variable to point to the full path of the 'sphinx-build' executable. Indeed, I don't have "sphinx-build" installed (nor do I want to build documentation); running "make SPHINXBUILD=/bin/true help" makes the warning go away. Is there a way to omit the warning when running "make help"? E.g. by not including Documentation/Makefile.sphinx for that target? Thanks, Christian. -- BOFH excuse #296: The hardware bus needs a new token.
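Until the Makefile handles this itself, the workaround above can be applied only when it is needed. A minimal sketch, assuming a POSIX shell and that /bin/true exists; the function name sphinx_make_args is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: prints the extra make argument from the workaround
# above when no sphinx-build is found in PATH, and prints nothing otherwise.
sphinx_make_args() {
	if ! command -v sphinx-build >/dev/null 2>&1; then
		printf '%s\n' 'SPHINXBUILD=/bin/true'
	fi
}
```

It would be used as: make $(sphinx_make_args) help > /dev/null. The substitution is left unquoted on purpose so that it disappears entirely when Sphinx is installed.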
Re: [PATCH] KERNEL: resource: Fix bug on leakage in /proc/iomem file
On Wed, 6 Apr 2016, e...@abdsec.com wrote: > First, I wrote your attached patch, but then I thought zeroing other > /proc/iomem values would be better. So I changed it. On my systems, /proc/iomem, /proc/ioports and others get their world-readable bits removed during bootup - I guess that would mitigate this issue too? Christian. -- BOFH excuse #184: loop found in loop in redundant loopback
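How the bits get removed is not spelled out above. One way it might be done from a boot script is sketched below; this is an assumption, it needs root, and the change would have to be repeated on every boot since /proc is not persistent. The helper name drop_world_read is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: clear the read bit for "others" on each existing
# file passed as an argument; paths that do not exist are skipped.
drop_world_read() {
	for f in "$@"; do
		if [ -e "$f" ]; then
			chmod o-r "$f"
		fi
	done
}

# As root, e.g. from a boot script (assumed usage):
# drop_world_read /proc/iomem /proc/ioports
```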
iwlwifi: Error sending REPLY_ADD_STA
Hello, sometimes the Wifi adapter (Wireless-N 2230) in this Lenovo Thinkpad E431 "disappears" and cannot be fixed by re-loading the iwlwifi kernel module either. Only a reboot will do. When I was running 3.16.0-4-amd64 from Debian/stable, I noticed the following message, but only _once_ and Wifi worked fine even with that message: iwlwifi :04:00.0: Error sending REPLY_TX_LINK_QUALITY_CMD: time out after 2000ms. iwlwifi :04:00.0: Current CMD queue read_ptr 25 write_ptr 26 iwlwifi :04:00.0: Loaded firmware version: 18.168.6.1 iwlwifi :04:00.0: Microcode SW error detected. Restarting 0x200. Now with Linux 4.2 it happens more often, I'll attach the error below, the full kernel logs and .config can be found here: http://nerdbynature.de/bits/v4.2/ The initial error messages so far: Linux version 3.16.0-4-amd64, Aug 17 iwlwifi :04:00.0: Error sending REPLY_TX_LINK_QUALITY_CMD: time out after 2000ms. Linux version 4.2.0-rc7, Aug 30 iwlwifi :04:00.0: Error sending REPLY_ADD_STA: time out after 2000ms. Linux version 4.2.0, Sep 19 iwlwifi :04:00.0: Error sending REPLY_ADD_STA: time out after 2000ms. Linux version 4.2.0, Sep 20 iwlwifi :04:00.0: Error sending REPLY_ADD_STA: time out after 2000ms. Linux version 4.2.0, Sep 25 iwlwifi :04:00.0: Error sending REPLY_ADD_STA: time out after 2000ms. I found old reports of these messages on the net, but either they were marked fixed[0] or years old[1][2] or not applicable to my card[3]. Does anybody have an idea what's going on here? Thanks, Christian. [0] https://bugzilla.redhat.com/show_bug.cgi?id=1029785 [1] https://answers.launchpad.net/ubuntu/+source/gnome-nettool/+question/221076 [2] https://lkml.org/lkml/2012/2/8/247 [3] https://lists.fedoraproject.org/pipermail/users/2011-June/400906.html === dmesg usb 1-3: USB disconnect, device number 2 iwlwifi :04:00.0: Error sending REPLY_ADD_STA: time out after 2000ms. 
iwlwifi :04:00.0: Current CMD queue read_ptr 192 write_ptr 193 [ cut here ] WARNING: CPU: 2 PID: 25891 at /usr/local/src/linux-git/drivers/net/wireless/iwlwifi/pcie/trans.c:1444 iwl_trans_pcie_grab_nic_access+0x100/0x110 [iwlwifi]() Timeout waiting for hardware access (CSR_GP_CNTRL 0x) Modules linked in: md4 nls_iso8859_15 cifs auth_rpcgss oid_registry nfsv4 dns_resolver xfs libcrc32c ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter xt_conntrack nf_conntrack ip_tables x_tables ctr ccm pci_stub vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) nfs lockd grace fscache sunrpc uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core v4l2_common videodev sha256_ssse3 sha256_generic hmac x86_pkg_temp_thermal drbg snd_hda_codec_hdmi intel_powerclamp coretemp aesni_intel aes_x86_64 glue_helper arc4 lrw gf128mul ablk_helper cryptd snd_hda_codec_conexant snd_hda_codec_generic snd_pcsp psmouse iwldvm snd_hda_intel snd_hda_codec mac80211 snd_hwdep iwlwifi snd_hda_core cfg80211 lpc_ich i2c_i801 snd_pcm snd_timer snd shpchp soundcore battery ac processor loop fuse autofs4 mmc_block hid_logitech_hidpp hid_logitech_dj hid_generic usbhid hid btrfs xor raid6_pq rtsx_pci_sdmmc mmc_core xhci_pci ehci_pci xhci_hcd ehci_hcd sr_mod cdrom sg rtsx_pci usbcore mfd_core usb_common thermal CPU: 2 PID: 25891 Comm: kworker/u16:1 Tainted: G O 4.2.0 #2 Hardware name: LENOVO 6277CTO/6277CTO, BIOS HEET48WW (1.29 ) 03/13/2015 Workqueue: phy0 ieee80211_ba_session_work [mac80211] c0327bd8 baacb049 c0327bd8 8158173a 8801cffb7aa0 8104ccb7 8800c27f4000 8800c27f7ca8 8801cffb7b38 8104cd48 Call Trace: [] ? dump_stack+0x40/0x50 [] ? warn_slowpath_common+0x77/0xb0 [] ? warn_slowpath_fmt+0x58/0x80 [] ? iwl_trans_pcie_grab_nic_access+0x100/0x110 [iwlwifi] [] ? iwl_write_prph+0x2e/0x70 [iwlwifi] [] ? iwl_force_nmi+0x1d/0x60 [iwlwifi] [] ? iwl_trans_pcie_send_hcmd+0x3c0/0x420 [iwlwifi] [] ? wait_woken+0x80/0x80 [] ? 
iwl_send_add_sta+0x7f/0xd0 [iwldvm] [] ? iwl_sta_rx_agg_stop+0xfb/0x150 [iwldvm] [] ? iwlagn_mac_ampdu_action+0x103/0x1e0 [iwldvm] [] ? ___ieee80211_stop_rx_ba_session+0xaf/0x1b0 [mac80211] [] ? ieee80211_ba_session_work+0x100/0x170 [mac80211] [] ? process_one_work+0x137/0x360 [] ? pwq_activate_delayed_work+0x27/0x40 [] ? worker_thread+0x5d/0x450 [] ? perf_cgroup_switch+0x1a0/0x1a0 [] ? rescuer_thread+0x310/0x310 [] ? kthread+0xda/0xf0 [] ? kthread_create_on_node+0x1b0/0x1b0 [] ? ret_from_fork+0x3f/0x70 [] ? kthread_create_on_node+0x1b0/0x1b0 ---[ end trace 6435c974dd1d2317 ]--- iwlwifi :04:00.0: Loaded firmware version: 18.168.6.1 iwlwifi :04:00.0: Start IWL Error Log Dump: iwlwifi :04:00.0: Status: 0x004C, count: -30719 iwlwifi :04:00.0: 0xBAACB049 | ADVANCED_SYSASSERT iwlwifi :04:00.0: 0x | uPc
Re: [PATCH] drivers/base: fix typo
On Thu, 20 Aug 2015, Junesung Lee wrote:
> The word "filesystem" is being used without the space.

I think both versions are acceptable:
https://en.wiktionary.org/wiki/file_system

Even though the version w/o the space appears to be more common in the source:

$ git grep file\ system | wc -l
1473
$ git grep filesystem | wc -l
4321

Christian.

>
> Signed-off-by: Junesung Lee
> ---
> drivers/base/Kconfig | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
> index 98504ec..9140666 100644
> --- a/drivers/base/Kconfig
> +++ b/drivers/base/Kconfig
> @@ -42,7 +42,7 @@ config DEVTMPFS
>  rescue systems, and reliably handles dynamic major/minor numbers.
>
>  Notice: if CONFIG_TMPFS isn't enabled, the simpler ramfs
> - file system will be used instead.
> + filesystem will be used instead.
>
>  config DEVTMPFS_MOUNT
>  bool "Automount devtmpfs at /dev, after the kernel mounted the rootfs"
> @@ -100,7 +100,7 @@ config FIRMWARE_IN_KERNEL
>  Enabling this option will build each required firmware blob
>  into the kernel directly, where request_firmware() will find
>  them without having to call out to userspace. This may be
> - useful if your root file system requires a device that uses
> + useful if your root filesystem requires a device that uses
>  such firmware and do not wish to use an initrd.
>
>  This single option controls the inclusion of firmware for
> --
> 2.1.4

-- 
BOFH excuse #416:

We're out of slots on the server
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
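Note that "git grep ... | wc -l" counts matching lines, not individual occurrences, and that the match is case-sensitive. A minimal sketch of the same comparison, assuming plain grep over a throwaway directory rather than git grep in a kernel tree; count_lines and the sample Kconfig file are made up for illustration:

```shell
#!/bin/sh
# count_lines PATTERN DIR
# Print the number of lines below DIR that contain the fixed string PATTERN.
# (Hypothetical helper; the message above uses "git grep" inside a git tree.)
count_lines() {
    grep -rF -- "$1" "$2" | wc -l | tr -d ' '
}

# Throwaway sample data standing in for the kernel sources.
demo=$(mktemp -d)
printf '%s\n' 'the root file system' 'a filesystem driver' 'another filesystem' > "$demo/Kconfig"

echo "file system: $(count_lines 'file system' "$demo")"
echo "filesystem:  $(count_lines 'filesystem' "$demo")"

rm -rf "$demo"
```

Inside a kernel checkout the helper body could be swapped for the git grep pipeline from the message; the counts would then only cover tracked files.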
Re: 4.1-rc6: ATA link is slow to respond, please be patient
On August 8, 2015 1:57:05 AM PDT, Denis Kirjanov wrote:
>On 8/7/15, Christian Kujau wrote:
>> Hi,
>>
>> this PowerBook G4 was running 3.16 for a while but now I wanted to upgrade
>> to latest mainline. However, during bootup the following happens:
>>
>> ===
>> [2.237102] ata1: PATA max UDMA/100 irq 39
>> [2.401708] ata1.00: ATA-8: SAMSUNG HM061GC, LR100-10, max UDMA/100
>> [2.401764] ata1.00: 117231408 sectors, multi 16: LBA48
>> [2.417633] ata1.00: configured for UDMA/100
>> [ 44.918102] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
>> [ 44.920452] ata1.00: failed command: READ DMA
>> [ 44.922725] ata1.00: cmd c8/00:88:64:c2:12/00:00:00:00:00/e0 tag 0 dma 69632 in
>> [ 44.927257] ata1.00: status: { DRDY }
>> [ 49.971784] ata1.00: qc timeout (cmd 0xec)
>> [ 49.976529] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
>> [ 49.978908] ata1.00: revalidation failed (errno=-5)
>> [ 55.019662] ata1: link is slow to respond, please be patient (ready=0)
>> [ 60.007677] ata1: device not ready (errno=-16), forcing hardreset
>> [ 60.012670] ata1: soft resetting link
>> [ 60.193638] ata1.00: configured for UDMA/100
>> [ 60.196158] ata1.00: device reported invalid CHS sector 0
>> [ 60.198610] ata1: EH complete
>> ===
>
>Just tried 4.2.0-rc5+ and haven't hit the issue.
>
>[ 17.180034] pata-pci-macio 0002:20:0d.0: enabling device ( -> 0002)
>[ 17.185862] adb: starting probe task...
>[ 17.196011] pata-pci-macio 0002:20:0d.0: Activating pata-macio chipset UniNorth ATA-6, Apple bus ID 3
>[ 17.202312] scsi host0: pata_macio
>[ 17.203698] ata1: PATA max UDMA/100 irq 39
>[ 17.219397] adb devices: [2]: 2 c4 [7]: 7 1f
>[ 17.225400] ADB keyboard at 2, handler 1
>[ 17.225560] Detected ADB keyboard, type ISO, swapping keys.
>[ 17.226642] input: ADB keyboard as /devices/virtual/input/input0
>[ 17.227590] input: ADB Powerbook buttons as /devices/virtual/input/input1
>[ 17.227795] adb: finished probe task...
>[ 17.368537] ata1.00: ATA-6: TOSHIBA MK8026GAX, PA005B, max UDMA/100
>[ 17.368717] ata1.00: 156301488 sectors, multi 16: LBA48
>[ 17.376346] ata1.00: configured for UDMA/100
>[ 17.377544] scsi 0:0:0:0: Direct-Access ATA TOSHIBA MK8026GA 5B PQ: 0 ANSI: 5
>[ 17.386989] sd 0:0:0:0: [sda] 156301488 512-byte logical blocks: (80.0 GB/74.5 GiB)
>[ 17.393144] sd 0:0:0:0: [sda] Write Protect is off
>[ 17.397579] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
>[ 17.398215] sd 0:0:0:0: Attached scsi generic sg0 type 0
>[ 17.404124] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>[ 17.661225] sda: [mac] sda1 sda2 sda3 sda4
>[ 17.672937] sd 0:0:0:0: [sda] Attached SCSI disk
>[ 18.223985] pata-macio 0.0002:ata-3: Activating pata-macio chipset KeyLargo ATA-3, Apple bus ID 0
>[ 18.233397] scsi host1: pata_macio
>[ 18.239172] ata2: PATA max MWDMA2 irq 24
>
>>
>> This happens only once, but systemd thinks there's a hard problem and will
>> drop to a recovery shell. I can start sshd and login remotely and then the
>> system appears to be running just fine.
>>
>> This happened in 4.2.0-rc5 so I went back a few versions and found that
>> 4.1-rc5 was OK (the error does not show up and the system boots just fine)
>> and 4.1-rc6 is not.
>>
>> Unfortunately a git-bisect between these two versions went completely off
>> the charts, I don't know what happened here:
>>
>> ==
>> first bad commit:
>>
>> 0fa372b6c95013af1334b3d5c9b5f03a70ecedab is the first bad commit
>> commit 0fa372b6c95013af1334b3d5c9b5f03a70ecedab
>> Author: Takashi Iwai
>> Date: Wed May 27 16:17:19 2015 +0200
>>
>> ALSA: hda - Fix noise on AMD radeon 290x controller
>> ==
>>
>> I don't have this driver (or ALSA) even selected. I can reproduce this
>> error pretty reliably and I'd like to attempt another git-bisect
>> run when I'm more awake. But maybe somebody recognizes this error and
>> has a hint where this could come from?
>>
>> dmesg & .config: http://nerdbynature.de/bits/v4.1-rc6/
>>
>> Thanks,
>> Christian.
>> --
>> BOFH excuse #225:
>>
>> It's those computer people in X {city of world}. They keep stuffing things
>> up.

Can you send me your .config or did you use my .config, verbatim? I'll try another git-bisect later today.

Thanks,
Christian.
-- 
make bzImage, not war
4.1-rc6: ATA link is slow to respond, please be patient
Hi,

this PowerBook G4 was running 3.16 for a while but now I wanted to upgrade to latest mainline. However, during bootup the following happens:

===
[2.237102] ata1: PATA max UDMA/100 irq 39
[2.401708] ata1.00: ATA-8: SAMSUNG HM061GC, LR100-10, max UDMA/100
[2.401764] ata1.00: 117231408 sectors, multi 16: LBA48
[2.417633] ata1.00: configured for UDMA/100
[ 44.918102] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[ 44.920452] ata1.00: failed command: READ DMA
[ 44.922725] ata1.00: cmd c8/00:88:64:c2:12/00:00:00:00:00/e0 tag 0 dma 69632 in
[ 44.927257] ata1.00: status: { DRDY }
[ 49.971784] ata1.00: qc timeout (cmd 0xec)
[ 49.976529] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 49.978908] ata1.00: revalidation failed (errno=-5)
[ 55.019662] ata1: link is slow to respond, please be patient (ready=0)
[ 60.007677] ata1: device not ready (errno=-16), forcing hardreset
[ 60.012670] ata1: soft resetting link
[ 60.193638] ata1.00: configured for UDMA/100
[ 60.196158] ata1.00: device reported invalid CHS sector 0
[ 60.198610] ata1: EH complete
===

This happens only once, but systemd thinks there's a hard problem and will drop to a recovery shell. I can start sshd and log in remotely and then the system appears to be running just fine.

This happened in 4.2.0-rc5 so I went back a few versions and found that 4.1-rc5 was OK (the error does not show up and the system boots just fine) and 4.1-rc6 is not.

Unfortunately a git-bisect between these two versions went completely off the charts, I don't know what happened here:

==
first bad commit:

0fa372b6c95013af1334b3d5c9b5f03a70ecedab is the first bad commit
commit 0fa372b6c95013af1334b3d5c9b5f03a70ecedab
Author: Takashi Iwai
Date: Wed May 27 16:17:19 2015 +0200

    ALSA: hda - Fix noise on AMD radeon 290x controller
==

I don't have this driver (or ALSA) even selected. I can reproduce this error pretty reliably and I'd like to attempt another git-bisect run when I'm more awake. But maybe somebody recognizes this error and has a hint where this could come from?

dmesg & .config: http://nerdbynature.de/bits/v4.1-rc6/

Thanks,
Christian.
-- 
BOFH excuse #225:

It's those computer people in X {city of world}. They keep stuffing things up.
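A bisect that lands on a commit for a driver that is not even built can mean that at least one step was marked good or bad by mistake, which is easy to do by hand. The sketch below only illustrates the mechanics of an automated rerun with "git bisect run" on a throwaway repository; the repository, the "broken" marker file and the one-line check are all made up for illustration, and a real kernel bisect would need a boot test in place of that check.

```shell
#!/bin/sh
# Build a throwaway repository with ten commits; commit 7 introduces the
# (made-up) regression by adding a marker file named "broken".
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.email demo@example.invalid
git config user.name demo

i=1
while [ "$i" -le 10 ]; do
    echo "$i" > version
    if [ "$i" -eq 7 ]; then
        touch broken
    fi
    git add -A
    git commit -q -m "commit $i"
    i=$((i + 1))
done

# Known-good is the first commit, known-bad is HEAD.
good=$(git rev-list --max-parents=0 HEAD)
git bisect start HEAD "$good" > /dev/null

# "git bisect run" treats exit status 0 as good and 1 as bad.
git bisect run sh -c '! test -e broken' > /dev/null

# After the run, refs/bisect/bad names the first bad commit.
first_bad_subject=$(git log -1 --format=%s refs/bisect/bad)
echo "first bad: $first_bad_subject"

git bisect reset > /dev/null 2>&1
cd / && rm -rf "$repo"
```

For an intermittent failure the test command would have to retry a few times before reporting "good", otherwise a single lucky boot sends the bisect down the wrong half.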
[PATCH] fallback to hostname in scripts/package/builddeb
Hi,

I happened to build a kernel with "make deb-pkg" on a machine with no network connectivity, but this failed with:

  [...]
  INSTALL debian/headertmp/usr/include/asm/ (65 files)
  hostname: Name or service not known
  ../scripts/package/Makefile:90: recipe for target 'deb-pkg' failed
  make[2]: *** [deb-pkg] Error 1

In scripts/package/builddeb it tries to construct an email address (that can be queried in /proc/version later on) but with no network, the "hostname -f" fails. The following patch falls back to just use the shortname if we cannot determine our FQDN.

Signed-off-by: Christian Kujau

diff --git a/scripts/package/builddeb b/scripts/package/builddeb
index 88dbf23..7de1d1c 100755
--- a/scripts/package/builddeb
+++ b/scripts/package/builddeb
@@ -206,7 +206,7 @@ if [ -n "$DEBEMAIL" ]; then
 elif [ -n "$EMAIL" ]; then
 	email=$EMAIL
 else
-	email=$(id -nu)@$(hostname -f)
+	email=$(id -nu)@$(hostname -f 2>/dev/null || hostname)
 fi
 if [ -n "$DEBFULLNAME" ]; then
 	name=$DEBFULLNAME

-- 
BOFH excuse #334:

50% of the manual is in .pdf readme files
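The fallback in the patch above can be tried outside of builddeb. This is a minimal sketch, not the script itself: build_email is a hypothetical wrapper name, and only the DEBEMAIL/EMAIL checks and the "hostname -f ... || hostname" expression are taken from the diff.

```shell
#!/bin/sh
# build_email: print the address that would be chosen, in the same order
# as the hunk above: $DEBEMAIL, then $EMAIL, then user@host.
# (Hypothetical helper, for illustration only.)
build_email() {
    if [ -n "$DEBEMAIL" ]; then
        printf '%s\n' "$DEBEMAIL"
    elif [ -n "$EMAIL" ]; then
        printf '%s\n' "$EMAIL"
    else
        # "hostname -f" needs a resolvable name. Discard its error message
        # and fall back to the short hostname when it fails.
        printf '%s@%s\n' "$(id -nu)" "$(hostname -f 2>/dev/null || hostname)"
    fi
}

build_email
```

On a machine without working name resolution the last branch still yields user@shortname instead of aborting the build.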
Re: WARNING: CPU: 6 PID: 79 at fs/proc/generic.c:521 remove_proc_entry+0x170/0x180()
On Tue, 19 Aug 2014 at 20:13, Cong Wang wrote:
> On Tue, Aug 19, 2014 at 7:50 PM, Jiang Liu wrote:
> > Hi Kujau,
> > It seems like a different issue, something wrong with
> > void nfs_fs_proc_net_exit(struct net *net)
> > http://marc.info/?l=linux-nfs&m=140821782107427&w=2

Thanks, that helped!

Christian.
-- 
BOFH excuse #182:

endothermal recalibration
WARNING: CPU: 6 PID: 79 at fs/proc/generic.c:521 remove_proc_entry+0x170/0x180()
Hi,

the warning below appeared while booting 3.17.0-rc1. I haven't seen the warning before, but found a recent report on oops.kernel.org:

  http://oops.kernel.org/oops/warning-at-fs-proc-generic-c521-remove_proc_entry0x18f-0x1a0/

and also reports from July 2014, where the issue was reported to be fixed:

  https://lkml.org/lkml/2014/7/16/9
  https://lkml.org/lkml/2014/7/18/116

And the patch really made it into 3.17.0-rc1, so maybe it's something else this time.

Details and .config: http://nerdbynature.de/bits/3.17-rc1/

Thanks,
Christian.

[ cut here ]
WARNING: CPU: 6 PID: 79 at /usr/local/src/linux-git/fs/proc/generic.c:521 remove_proc_entry+0x170/0x180()
remove_proc_entry: removing non-empty directory 'fs/nfsfs', leaking at least 'volumes'
Modules linked in: uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core v4l2_common btusb videodev bluetooth hid_logitech_dj sha256_ssse3 sha256_generic twofish_generic twofish_avx_x86_64 twofish_x86_64_3way twofish_x86_64 twofish_common nfs xts lockd sunrpc arc4 coretemp x86_pkg_temp_thermal usbhid intel_powerclamp hid iwldvm kvm_intel mac80211 kvm snd_hda_codec_hdmi i2c_i801 iwlwifi cfg80211 thinkpad_acpi snd_hda_codec_conexant snd_hda_codec_generic nvram hwmon led_class wmi rtc_cmos i915 snd_hda_intel i2c_algo_bit snd_hda_controller drm_kms_helper snd_hda_codec drm snd_hwdep snd_pcm i2ccore snd_timer snd soundcore fuse autofs4 btrfs xor raid6_pq aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd sr_mod cdrom sg ehci_pci ehci_hcd xhci_hcd
CPU: 6 PID: 79 Comm: kworker/u16:6 Not tainted 3.17.0-rc1 #1
Hardware name: LENOVO 6277CTO/6277CTO, BIOS HEET42WW (1.23 ) 01/27/2014
Workqueue: netns cleanup_net
0009 8149c1e2 880406c8fd18 8104ee6d 880406701580 880406c8fd68 0005 c0883bae c0883bb1 8104eed7 815b1578 88040030
Call Trace:
[8149c1e2] ? dump_stack+0x41/0x51
[8104ee6d] ? warn_slowpath_common+0x6d/0x90
[8104eed7] ? warn_slowpath_fmt+0x47/0x50
[811955d1] ? proc_entry_rundown+0x41/0x80
[81199b50] ? remove_proc_entry+0x170/0x180
[c0873a79] ? nfs_net_exit+0x9/0x20 [nfs]
[813e5951] ? ops_exit_list.isra.2+0x31/0x60
[813e6150] ? cleanup_net+0x100/0x1e0
[8106316b] ? process_one_work+0x16b/0x3b0
[81063ed3] ? worker_thread+0x63/0x490
[81063e70] ? rescuer_thread+0x280/0x280
[8106848a] ? kthread+0xca/0xe0
[810683c0] ? kthread_create_on_node+0x170/0x170
[814a1b7c] ? ret_from_fork+0x7c/0xb0
[810683c0] ? kthread_create_on_node+0x170/0x170
---[ end trace c92165dd3f372cf6 ]---

-- 
BOFH excuse #285:

Telecommunications is upgrading.
WARNING: CPU: 6 PID: 79 at fs/proc/generic.c:521 remove_proc_entry+0x170/0x180()
Hi, the warning below appeared while booting 3.17.0-rc1. I haven't seen the warning before, but found a recent report on oops.kernel.org: http://oops.kernel.org/oops/warning-at-fs-proc-generic-c521-remove_proc_entry0x18f-0x1a0/ and also reports from July 2014, where the issue was reported to be fixed: https://lkml.org/lkml/2014/7/16/9 https://lkml.org/lkml/2014/7/18/116 And the patch really made it into 3.17.0-rc1, so maybe it's something else this time. Details and .config: http://nerdbynature.de/bits/3.17-rc1/ Thanks, Christian. [ cut here ] WARNING: CPU: 6 PID: 79 at /usr/local/src/linux-git/fs/proc/generic.c:521 remove_proc_entry+0x170/0x180() remove_proc_entry: removing non-empty directory 'fs/nfsfs', leaking at least 'volumes' Modules linked in: uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core v4l2_common btusb videodev bluetooth hid_logitech_dj sha256_ssse3 sha256_generic twofish_generic twofish_avx_x86_64 twofish_x86_64_3way twofish_x86_64 twofish_common nfs xts lockd sunrpc arc4 coretemp x86_pkg_temp_thermal usbhid intel_powerclamp hid iwldvm kvm_intel mac80211 kvm snd_hda_codec_hdmi i2c_i801 iwlwifi cfg80211 thinkpad_acpi snd_hda_codec_conexant snd_hda_codec_generic nvram hwmon led_class wmi rtc_cmos i915 snd_hda_intel i2c_algo_bit snd_hda_controller drm_kms_helper snd_hda_codec drm snd_hwdep snd_pcm i2ccore snd_timer snd soundcore fuse autofs4 btrfs xor raid6_pq aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd sr_mod cdrom sg ehci_pci ehci_hcd xhci_hcd CPU: 6 PID: 79 Comm: kworker/u16:6 Not tainted 3.17.0-rc1 #1 Hardware name: LENOVO 6277CTO/6277CTO, BIOS HEET42WW (1.23 ) 01/27/2014 Workqueue: netns cleanup_net 0009 8149c1e2 880406c8fd18 8104ee6d 880406701580 880406c8fd68 0005 c0883bae c0883bb1 8104eed7 815b1578 88040030 Call Trace: [8149c1e2] ? dump_stack+0x41/0x51 [8104ee6d] ? warn_slowpath_common+0x6d/0x90 [8104eed7] ? warn_slowpath_fmt+0x47/0x50 [811955d1] ? proc_entry_rundown+0x41/0x80 [81199b50] ? 
remove_proc_entry+0x170/0x180 [c0873a79] ? nfs_net_exit+0x9/0x20 [nfs] [813e5951] ? ops_exit_list.isra.2+0x31/0x60 [813e6150] ? cleanup_net+0x100/0x1e0 [8106316b] ? process_one_work+0x16b/0x3b0 [81063ed3] ? worker_thread+0x63/0x490 [81063e70] ? rescuer_thread+0x280/0x280 [8106848a] ? kthread+0xca/0xe0 [810683c0] ? kthread_create_on_node+0x170/0x170 [814a1b7c] ? ret_from_fork+0x7c/0xb0 [810683c0] ? kthread_create_on_node+0x170/0x170 ---[ end trace c92165dd3f372cf6 ]--- -- BOFH excuse #285: Telecommunications is upgrading. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: WARNING: CPU: 6 PID: 79 at fs/proc/generic.c:521 remove_proc_entry+0x170/0x180()
On Tue, 19 Aug 2014 at 20:13, Cong Wang wrote: On Tue, Aug 19, 2014 at 7:50 PM, Jiang Liu jiang@linux.intel.com wrote: Hi Kujau, It seems like a different issue, something wrong with void nfs_fs_proc_net_exit(struct net *net) http://marc.info/?l=linux-nfs&m=140821782107427&w=2 Thanks, that helped! Christian. -- BOFH excuse #182: endothermal recalibration
[PATCH] remove non-existent files from MAINTAINERS
Inspired by some recent cleanups in MAINTAINERS, I noticed that the following files (F:) can no longer be found in the tree: * arch/arm/mach-s5pv210/mach-aquila.c * arch/arm/mach-s5pv210/mach-goni.c Those two got removed in 28c8331 ("ARM: S5PV210: Remove support for board files"). Cc: Tomasz Figa <t.f...@samsung.com> Cc: Kyungmin Park <kyungmin.p...@samsung.com> * arch/arm/configs/genmai_defconfig This one got removed in 3ed27bd9 ("ARM: shmobile: genmai: remove defconfig"). Cc: Simon Horman <ho...@verge.net.au> Cc: Magnus Damm <magnus.d...@gmail.com> * drivers/mmc/host/sdhci-st.c This one was sent to be included in June 2014 but got dropped shortly after: "mmc: sdhci-st: Intial support for ST SDHCI controller" https://lkml.org/lkml/2014/6/4/446 https://lkml.org/lkml/2014/7/9/340 Cc: Peter Griffin <peter.grif...@linaro.org> Cc: Ulf Hansson <ulf.hans...@linaro.org> * drivers/rtc/rtc-sec.c A MAINTAINERS fix was attempted in November 2012, but dismissed as rtc-sec.c was still being worked on. Alas, it's still not there. "MAINTAINERS: fix drivers/rtc/rtc-sec.c" http://lkml.iu.edu/hypermail/linux/kernel/1211.2/04820.html Cc: Sangbeom Kim <sbki...@samsung.com> Cc: Cesar Eduardo Barros <ces...@cesarb.eti.br> Signed-off-by: Christian Kujau <li...@nerdbynature.de> diff --git a/MAINTAINERS b/MAINTAINERS index 7e2eb4c..7831e8d 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1303,8 +1303,7 @@ ARM/SAMSUNG MOBILE MACHINE SUPPORT M: Kyungmin Park <kyungmin.p...@samsung.com> L: linux-arm-ker...@lists.infradead.org (moderated for non-subscribers) S: Maintained -F: arch/arm/mach-s5pv210/mach-aquila.c -F: arch/arm/mach-s5pv210/mach-goni.c +F: arch/arm/mach-s5pv210/ ARM/SAMSUNG S5P SERIES 2D GRAPHICS ACCELERATION (G2D) SUPPORT M: Kyungmin Park <kyungmin.p...@samsung.com> @@ -1347,7 +1346,6 @@ F:arch/arm/boot/dts/sh* F: arch/arm/configs/ape6evm_defconfig F: arch/arm/configs/armadillo800eva_defconfig F: arch/arm/configs/bockw_defconfig -F: arch/arm/configs/genmai_defconfig F: arch/arm/configs/koelsch_defconfig F: arch/arm/configs/kzm9g_defconfig F: arch/arm/configs/lager_defconfig @@ -1383,7 +1381,6 @@ F:drivers/pinctrl/pinctrl-st.c F: drivers/media/rc/st_rc.c F: drivers/i2c/busses/i2c-st.c F: drivers/tty/serial/st-asc.c -F: drivers/mmc/host/sdhci-st.c ARM/TECHNOLOGIC SYSTEMS TS7250 MACHINE SUPPORT M: Lennert Buytenhek <ker...@wantstofly.org> @@ -7809,7 +7806,6 @@ S:Supported F: drivers/mfd/sec*.c F: drivers/regulator/s2m*.c F: drivers/regulator/s5m*.c -F: drivers/rtc/rtc-sec.c F: include/linux/mfd/samsung/ SAMSUNG S5P/EXYNOS4 SOC SERIES CAMERA SUBSYSTEM DRIVERS -- BOFH excuse #419: Repeated reboots of the system failed to solve problem
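A check like the one behind this patch can be scripted. The sketch below is not the tooling used for the patch. It assumes MAINTAINERS entries of the form "F: <pattern>" relative to the tree root and treats each pattern as a plain shell glob, which only approximates the MAINTAINERS pattern rules. The function name missing_file_patterns is made up for illustration.

```python
import glob
import os


def missing_file_patterns(maintainers_text, tree_root):
    """Return the F: patterns that match nothing below tree_root.

    Assumption: every pattern is a plain shell glob relative to tree_root.
    A pattern with a trailing slash only matches an existing directory.
    """
    missing = []
    for line in maintainers_text.splitlines():
        if not line.startswith("F:"):
            continue
        pattern = line[2:].strip()
        if not pattern:
            continue
        # Escape the root so that only the pattern itself is globbed.
        full_pattern = os.path.join(glob.escape(tree_root), pattern)
        if not glob.glob(full_pattern):
            missing.append(pattern)
    return missing
```

Run from the top of a kernel checkout, this could be called as missing_file_patterns(open("MAINTAINERS").read(), "."); each returned pattern is a candidate for a cleanup like the one above.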
Re: 3.14.0-rc2: WARNING: at mm/slub.c:1007
On Fri, 14 Feb 2014 at 12:14, Dave Chinner wrote: > > OK, so the "possible irq lock inversion dependency detected" is a lockdep > > regression, as you explained in the xfs-list thread. What about the > > "RECLAIM_FS-safe -> RECLAIM_FS-unsafe lock order detected" warning - I > > haven't seen it again though, only once with 3.14.0-rc2. > > That was also an i_lock/mmapsem issue, so it's likely to be the same > root cause. I'm testing a fix for it at the moment. Understood. Thanks for looking into this. Christian. -- BOFH excuse #129: The ring needs another token
Re: 3.14.0-rc2: WARNING: at mm/slub.c:1007
On Fri, 14 Feb 2014 at 09:26, Dave Chinner wrote: > > after upgrading from 3.13-rc8 to 3.14.0-rc2 on this PowerPC G4 machine, > > the WARNING below was printed. > > > > Shortly after, a lockdep warning appeared (possibly related to my > > post to the XFS list yesterday[0]). > > Unlikely. OK, so the "possible irq lock inversion dependency detected" is a lockdep regression, as you explained in the xfs-list thread. What about the "RECLAIM_FS-safe -> RECLAIM_FS-unsafe lock order detected" warning - I haven't seen it again though, only once with 3.14.0-rc2. Christian. -- BOFH excuse #108: The air conditioning water supply pipe ruptured over the machine room
Re: 3.14.0-rc2: WARNING: at mm/slub.c:1007
On Thu, 13 Feb 2014 at 11:53, Christian Kujau wrote: > after upgrading from 3.13-rc8 to 3.14.0-rc2 on this PowerPC G4 machine, > the WARNING below was printed. > > Shortly after, a lockdep warning appeared (possibly related to my > post to the XFS list yesterday[0]). Sigh, only _after_ sending the email, I came across an earlier posting on lkml: http://marc.info/?l=linux-mm&m=139145788623391 Sorry for the noise. The out-of-memory messages below, though, appeared without the WARNING and started somewhere in 3.13; they are impossible to bisect, as they happen only every few days or weeks. Christian. > Even later in the log an out-of-memory error appeared, that may or may not > be related to that WARNING at all but which I'm trying to chase down ever > since 3.13, but which tends to appear more often lately. > > Can anyone take a look if this is something to worry about? > > Full dmesg & .config: http://nerdbynature.de/bits/3.14-rc2/mm/ > > Thanks, > Christian. > > [0] http://oss.sgi.com/pipermail/xfs/2014-February/034054.html > > [ cut here ] > WARNING: at /usr/local/src/linux-git/mm/slub.c:1007 > Modules linked in: md5 ecb nfs i2c_powermac therm_adt746x ecryptfs arc4 > firewire_sbp2 b43 usb_storage mac80211 cfg80211 > CPU: 0 PID: 9025 Comm: nfsd Not tainted 3.14.0-rc2 #1 > task: efbf8000 ti: ed2a task.ti: ed2a > NIP: c00ccc28 LR: c00ccc20 CTR: > REGS: ed2a1980 TRAP: 0700 Not tainted (3.14.0-rc2) > MSR: 00021032 <ME,IR,DR,RI> CR: 22f82b82 XER: 2000 > > GPR00: c00ccc20 ed2a1a30 efbf8000 ef96e550 > 2ce0 > GPR08: 0001 efbf86f8 05e7 82fc2b88 0001 > 00080011 > GPR16: c076 ef96e564 00100100 00200200 > c1203914 > GPR24: ef96e540 0002 ef96fa80 ed2a > c1203900 > NIP [c00ccc28] deactivate_slab+0x4c0/0x538 > LR [c00ccc20] deactivate_slab+0x4b8/0x538 > Call Trace: > [ed2a1a30] [c00ccc20] deactivate_slab+0x4b8/0x538 (unreliable) > [ed2a1ae0] [c055d5f0] __slab_alloc.constprop.77+0x260/0x38c > [ed2a1b50] [c00cd524] kmem_cache_alloc+0x118/0x140 > [ed2a1b70] [c01de4bc] kmem_zone_alloc+0x94/0x108 > [ed2a1ba0] [c01cccd4] xfs_inode_alloc+0x2c/0xd4 > [ed2a1bc0] [c01cd7a4] xfs_iget+0x2e4/0x584 > [ed2a1c30] [c020e664] xfs_lookup+0xc8/0xe4 > [ed2a1c70] [c01d3c28] xfs_vn_lookup+0x64/0xbc > [ed2a1c90] [c00db3ac] lookup_real+0x30/0x70 > [ed2a1ca0] [c00dc384] __lookup_hash+0x3c/0x58 > [ed2a1cc0] [c00e1438] lookup_one_len+0x10c/0x15c > [ed2a1ce0] [c01a170c] nfsd4_encode_dirent+0xb4/0x328 > [ed2a1d10] [c018f580] nfsd_readdir+0x1d4/0x288 > [ed2a1d90] [c019d648] nfsd4_encode_readdir+0x138/0x1f4 > [ed2a1dd0] [c01a1b18] nfsd4_encode_operation+0x8c/0xf0 > [ed2a1df0] [c019aa4c] nfsd4_proc_compound+0x1b8/0x4f8 > [ed2a1e30] [c0189d20] nfsd_dispatch+0x90/0x1a0 > [ed2a1e50] [c0536b04] svc_process+0x3d0/0x698 > [ed2a1e90] [c01895bc] nfsd+0xc0/0x120 > [ed2a1eb0] [c004f8fc] kthread+0xbc/0xd0 > [ed2a1f40] [c0010ae4] ret_from_kernel_thread+0x5c/0x64 > Instruction dump: > 7fe4fb78 800100b4 b9c10068 7d810120 7d808120 7c0803a6 382100b0 4bfffb00 > 80610048 4bf95dc5 2f83 40beff4c <0fe0> 4b44 815e000c 394a0001 > ---[ end trace 1f5ed3ea8b3e4403 ]--- > > > -- > BOFH excuse #65: > > system needs to be rebooted > -- BOFH excuse #65: system needs to be rebooted
3.14.0-rc2: WARNING: at mm/slub.c:1007
Hi, after upgrading from 3.13-rc8 to 3.14.0-rc2 on this PowerPC G4 machine, the WARNING below was printed. Shortly after, a lockdep warning appeared (possibly related to my post to the XFS list yesterday[0]). Even later in the log an out-of-memory error appeared, that may or may not be related to that WARNING at all but which I'm trying to chase down ever since 3.13, but which tends to appear more often lately. Can anyone take a look if this is something to worry about? Full dmesg & .config: http://nerdbynature.de/bits/3.14-rc2/mm/ Thanks, Christian. [0] http://oss.sgi.com/pipermail/xfs/2014-February/034054.html [ cut here ] WARNING: at /usr/local/src/linux-git/mm/slub.c:1007 Modules linked in: md5 ecb nfs i2c_powermac therm_adt746x ecryptfs arc4 firewire_sbp2 b43 usb_storage mac80211 cfg80211 CPU: 0 PID: 9025 Comm: nfsd Not tainted 3.14.0-rc2 #1 task: efbf8000 ti: ed2a task.ti: ed2a NIP: c00ccc28 LR: c00ccc20 CTR: REGS: ed2a1980 TRAP: 0700 Not tainted (3.14.0-rc2) MSR: 00021032 <ME,IR,DR,RI> CR: 22f82b82 XER: 2000 GPR00: c00ccc20 ed2a1a30 efbf8000 ef96e550 2ce0 GPR08: 0001 efbf86f8 05e7 82fc2b88 0001 00080011 GPR16: c076 ef96e564 00100100 00200200 c1203914 GPR24: ef96e540 0002 ef96fa80 ed2a c1203900 NIP [c00ccc28] deactivate_slab+0x4c0/0x538 LR [c00ccc20] deactivate_slab+0x4b8/0x538 Call Trace: [ed2a1a30] [c00ccc20] deactivate_slab+0x4b8/0x538 (unreliable) [ed2a1ae0] [c055d5f0] __slab_alloc.constprop.77+0x260/0x38c [ed2a1b50] [c00cd524] kmem_cache_alloc+0x118/0x140 [ed2a1b70] [c01de4bc] kmem_zone_alloc+0x94/0x108 [ed2a1ba0] [c01cccd4] xfs_inode_alloc+0x2c/0xd4 [ed2a1bc0] [c01cd7a4] xfs_iget+0x2e4/0x584 [ed2a1c30] [c020e664] xfs_lookup+0xc8/0xe4 [ed2a1c70] [c01d3c28] xfs_vn_lookup+0x64/0xbc [ed2a1c90] [c00db3ac] lookup_real+0x30/0x70 [ed2a1ca0] [c00dc384] __lookup_hash+0x3c/0x58 [ed2a1cc0] [c00e1438] lookup_one_len+0x10c/0x15c [ed2a1ce0] [c01a170c] nfsd4_encode_dirent+0xb4/0x328 [ed2a1d10] [c018f580] nfsd_readdir+0x1d4/0x288 [ed2a1d90] [c019d648] nfsd4_encode_readdir+0x138/0x1f4 [ed2a1dd0] [c01a1b18] nfsd4_encode_operation+0x8c/0xf0 [ed2a1df0] [c019aa4c] nfsd4_proc_compound+0x1b8/0x4f8 [ed2a1e30] [c0189d20] nfsd_dispatch+0x90/0x1a0 [ed2a1e50] [c0536b04] svc_process+0x3d0/0x698 [ed2a1e90] [c01895bc] nfsd+0xc0/0x120 [ed2a1eb0] [c004f8fc] kthread+0xbc/0xd0 [ed2a1f40] [c0010ae4] ret_from_kernel_thread+0x5c/0x64 Instruction dump: 7fe4fb78 800100b4 b9c10068 7d810120 7d808120 7c0803a6 382100b0 4bfffb00 80610048 4bf95dc5 2f83 40beff4c <0fe0> 4b44 815e000c 394a0001 ---[ end trace 1f5ed3ea8b3e4403 ]--- -- BOFH excuse #65: system needs to be rebooted
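For anyone comparing traces like the one in this report, the call chain can be pulled out mechanically. The sketch below is only an illustration. It assumes the 32-bit PowerPC frame format seen here ("[stack pointer] [address] symbol+0xoffset/0xsize"), and the names FRAME_RE and call_chain are made up for this example.

```python
import re

# One stack frame as printed in the report above:
#   [ed2a1a30] [c00ccc20] deactivate_slab+0x4b8/0x538
FRAME_RE = re.compile(
    r"\[(?P<sp>[0-9a-f]+)\] \[(?P<addr>[0-9a-f]+)\] "
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def call_chain(trace_text):
    """Return the symbol names of all frames in trace_text, innermost first."""
    return [match.group("symbol") for match in FRAME_RE.finditer(trace_text)]
```

The "NIP [...]" and "LR [...]" lines carry only one bracketed value, so they are not picked up as frames. Two reports showing the same chain are likely to describe the same code path.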
3.13-rc3: BUG: soft lockup - CPU#0 stuck for 23s!
I noticed that my machine locks up quite often with 3.13-rc3. PowerPC G4 again, but this machine was pretty much rock solid until now. When there's lots of disk I/O going on, the system locks up, but not entirely: the call trace is still written to netconsole (but not to the local disk) and the machine still answers ping requests, but SSH login is impossible and a reset is needed. The workload of the machine has not changed; when there's disk I/O, it means that either rsync is running or some crazy remote Java application is scanning over this machine's NFS shares. "xfs" is sometimes mentioned in the call trace and the disk I/O is all happening on the xfs mounts, which is why I Cc'ed the xfs mailing list. More details on: http://nerdbynature.de/bits/3.13-rc3/ Any ideas? The most recent lockup, from today, is below. This time it wasn't rsync or NFS; I was experimenting with xfs on a loop device, backed by a 1GB file, like this: $ dd if=/dev/zero of=/usr/local/test.img bs=1M count=1k $ losetup -f /usr/local/test.img && mkfs.xfs /dev/loop0 $ mount -t xfs /dev/loop0 /mnt/disk $ cd /mnt/disk $ cp -ax / /mnt/disk - which filled the disk $ rm -rf lib/ - make some room $ i=1; while true; do printf "$i "; dd if=/dev/zero of=f$i \ count=100 bs=100k; i=$(($i+1)); done - filling the disk again => and then the machine locked up. [308783.613600] BUG: soft lockup - CPU#0 stuck for 23s! [kworker/u2:1:14542] [308783.613703] Modules linked in: md5 ecb nfs i2c_powermac therm_adt746x ecryptfs arc4 b43 firewire_sbp2 usb_storage mac80211 cfg80211 [308783.613944] irq event stamp: 37770086 [308783.613980] hardirqs last enabled at (37770085): [<c0546ff0>] _raw_spin_unlock_irq+0x30/0x60 [308783.614075] hardirqs last disabled at (37770086): [<c0010700>] reenable_mmu+0x30/0x88 [308783.614156] softirqs last enabled at (37764418): [<c00354d4>] __do_softirq+0x168/0x1e8 [308783.614236] softirqs last disabled at (37764411): [<c0035990>] irq_exit+0xa4/0xc8 [308783.614312] CPU: 0 PID: 14542 Comm: kworker/u2:1 Not tainted 3.13.0-rc3-00365-gc48b660 #1 [308783.614384] Workqueue: writeback bdi_writeback_workfn (flush-7:0) [308783.614454] task: e8d20bb0 ti: e0c5a000 task.ti: e0c5a000 [308783.614499] NIP: c0546ffc LR: c0546ff0 CTR: [308783.614543] REGS: e0c5ba80 TRAP: 0901 Not tainted (3.13.0-rc3-00365-gc48b660) [308783.614596] MSR: 9032 <EE,ME,IR,DR,RI> CR: 444c2224 XER: 2000 [308783.614739] #012GPR00: #012GPR08: [308783.614998] NIP [c0546ffc] _raw_spin_unlock_irq+0x3c/0x60 [308783.615047] LR [c0546ff0] _raw_spin_unlock_irq+0x30/0x60 [308783.615089] Call Trace: [308783.615121] [e0c5bb30] [c0546ff0] _raw_spin_unlock_irq+0x30/0x60 (unreliable) [308783.615202] [e0c5bb40] [c009f0e4] __set_page_dirty_nobuffers+0xc8/0x144 [308783.615264] [e0c5bb60] [c01bec28] xfs_vm_writepage+0x90/0x57c [308783.615322] [e0c5bbf0] [c009e618] __writepage+0x24/0x7c [308783.615376] [e0c5bc00] [c009ec38] write_cache_pages+0x1d0/0x380 [308783.615433] [e0c5bca0] [c009ee34] generic_writepages+0x4c/0x70 [308783.615493] [e0c5bce0] [c00f9af8] __writeback_single_inode+0x34/0x12c [308783.615968] [e0c5bd00] [c00f9e74] writeback_sb_inodes+0x1f4/0x344 [308783.616418] [e0c5bd70] [c00fa050] __writeback_inodes_wb+0x8c/0xd0 [308783.616864] [e0c5bda0] [c00fa258] wb_writeback+0x1c4/0x1cc [308783.617306] [e0c5bdd0] [c00fae14] bdi_writeback_workfn+0x158/0x33c [308783.617751] [e0c5be50] [c004906c] process_one_work+0x19c/0x3f0 [308783.618193] [e0c5be80] [c0049a0c] worker_thread+0x128/0x3c0 [308783.618630] [e0c5beb0] [c004fa8c] kthread+0xbc/0xd0 [308783.619071] [e0c5bf40] [c001099c] ret_from_kernel_thread+0x5c/0x64 [308783.619501] Instruction dump: [308783.619915] 7ca802a6 [308783.620437] 4bb1c681 -- BOFH excuse #446: Mailer-daemon is busy burning your message in hell.
Re: fs: proc: lockdep spew and questions
On Sun, 8 Dec 2013 at 22:14, Sasha Levin wrote: > So how would you suggest to deal with the execution issue in procfs? Files will not be executable by themselves if /proc is mounted with noexec, as some distributions now do by default. C. -- BOFH excuse #14: sounds like a Windows problem, try calling Microsoft support
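Whether /proc is mounted noexec on a given system can be checked from user space. This is a minimal sketch, assuming a Unix system where Python exposes os.statvfs and os.ST_NOEXEC. The function names are made up for illustration and nothing here comes from the thread itself.

```python
import os


def has_noexec(mount_flags):
    """Return True if the ST_NOEXEC bit is set in the given mount flags."""
    return bool(mount_flags & os.ST_NOEXEC)


def is_mounted_noexec(path):
    """Return True if the filesystem holding path is mounted noexec."""
    return has_noexec(os.statvfs(path).f_flag)
```

Calling is_mounted_noexec("/proc") on the machine in question tells whether the distribution default mentioned above applies there.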
Re: [PATCH] scripts/kconfig/menu.c: warning: jump may be used uninitialized in this function
On Sun, 27 Oct 2013 at 18:28, Christian Kujau wrote: > While doing "make oldconfig" on 3.12-rc7 with gcc-4.7.2 (Debian), the > following warning is printed: > > HOSTCC scripts/kconfig/zconf.tab.o > In file included from scripts/kconfig/zconf.tab.c:2537:0: > /usr/local/src/linux-git/scripts/kconfig/menu.c: In function ‘get_symbol_str’: > /usr/local/src/linux-git/scripts/kconfig/menu.c:586:18: warning: ‘jump’ may > be used uninitialized in this function [-Wmaybe-uninitialized] > /usr/local/src/linux-git/scripts/kconfig/menu.c:547:19: note: ‘jump’ was > declared here Grrr, only after I sent this message I found this was reported in September already by Madhavan Srinivasan: https://lkml.org/lkml/2013/9/19/24 Does anybody know the state of this fix? Thanks, Christian. > The following patch seems to fix that: > > Signed-off-by: Christian Kujau > > diff --git a/scripts/kconfig/menu.c b/scripts/kconfig/menu.c > index c1d5320..23b1827 100644 > --- a/scripts/kconfig/menu.c > +++ b/scripts/kconfig/menu.c > @@ -544,7 +544,7 @@ static void get_prompt_str(struct gstr *r, struct > property *prop, > { > int i, j; > struct menu *submenu[8], *menu, *location = NULL; > - struct jump_key *jump; > + struct jump_key *jump = NULL; > > str_printf(r, _("Prompt: %s\n"), _(prop->text)); > menu = prop->menu->parent; > > > Christian. > -- > BOFH excuse #177: > > sticktion > -- BOFH excuse #449: greenpeace free'd the mallocs
[PATCH] scripts/kconfig/menu.c: warning: jump may be used uninitialized in this function
While doing "make oldconfig" on 3.12-rc7 with gcc-4.7.2 (Debian), the following warning is printed: HOSTCC scripts/kconfig/zconf.tab.o In file included from scripts/kconfig/zconf.tab.c:2537:0: /usr/local/src/linux-git/scripts/kconfig/menu.c: In function ‘get_symbol_str’: /usr/local/src/linux-git/scripts/kconfig/menu.c:586:18: warning: ‘jump’ may be used uninitialized in this function [-Wmaybe-uninitialized] /usr/local/src/linux-git/scripts/kconfig/menu.c:547:19: note: ‘jump’ was declared here The following patch seems to fix that: Signed-off-by: Christian Kujau diff --git a/scripts/kconfig/menu.c b/scripts/kconfig/menu.c index c1d5320..23b1827 100644 --- a/scripts/kconfig/menu.c +++ b/scripts/kconfig/menu.c @@ -544,7 +544,7 @@ static void get_prompt_str(struct gstr *r, struct property *prop, { int i, j; struct menu *submenu[8], *menu, *location = NULL; - struct jump_key *jump; + struct jump_key *jump = NULL; str_printf(r, _("Prompt: %s\n"), _(prop->text)); menu = prop->menu->parent; Christian. -- BOFH excuse #177: sticktion