On Sat, Oct 22, 2016 at 5:02 AM, Thomas Gleixner <t...@linutronix.de> wrote: > > please pull the latest timers-urgent-for-linus git tree from: > > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git > timers-urgent-for-linus
Hmm. I just hit something that looks very much like a timer bug. I'm working on cleaning up the KERN_CONT handling of printk.c, and in the process I'm adding a timer that flushes incomplete lines to the logs. The code is pretty simple, and it basically just does static void deferred_cont_flush(void) { static DEFINE_TIMER(timer, flush_timer, 0, 0); if (!cont.len) return; mod_timer(&timer, jiffies + HZ/10); } which is then called whenever we have a incomplete line (the "flush_timer()" function will just flush the pending line to the console and to the logs. It worked fine in testing, but at shutdown I just got this: do_trap: 165 callbacks suppressed traps: gnome-shell[1339] trap int3 ip:7f465aace15b sp:7fffc54f1670 error:0 followed by general protection fault: 0000 [#1] SMP Modules linked in: fuse xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_broute bridge stp llc ebtable_nat ip6tabl$ wmi pinctrl_intel acpi_als kfifo_buf tpm_tis tpm_tis_core industrialio acpi_pad tpm nfsd auth_rpcgss nfs_acl lockd grace sunrpc dm_crypt i915 crct10dif_pclmul crc32_pclmul crc32c_intel i2c_algo_bit drm_kms_helper syscopyarea se$ CPU: 7 PID: 1339 Comm: gnome-shell Tainted: G O 4.9.0-rc1-00244-g0c2b6dc4fd4f-dirty #6 Hardware name: System manufacturer System Product Name/Z170-K, BIOS 1803 05/06/2016 task: ffff95c443bcd940 task.stack: ffffa6ce82d6c000 RIP: 0010:[<ffffffffb10f9df1>] [<ffffffffb10f9df1>] detach_if_pending+0x41/0xd0 RSP: 0000:ffffa6ce82d6fd08 EFLAGS: 00010086 RAX: dead000000000200 RBX: 0000000100089285 RCX: 0000000000000000 RDX: ffffffffb1c99580 RSI: ffff95c476ccf140 RDI: ffffffffb1c50920 RBP: ffffa6ce82d6fd20 R08: 000000000000004a R09: 0000000000000000 R10: 000000000000000f R11: ffffffffb1f0912c R12: ffffffffb1c50920 R13: 0000000000000000 R14: ffff95c476ccf140 R15: ffffffffb1c50920 FS: 00007f46639adac0(0000) GS:ffff95c476dc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fea54725010 CR3: 0000000410fa3000 CR4: 00000000003406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Stack: 0000000100089285 0000000000000052 0000000000000051 ffffa6ce82d6fd90 ffffffffb10fc4e0 ffffffffb1c50948 0000000100089222 00000000cbeae39b ffffffffb1f0912b 0000000000000246 00000000cbeae39b ffffffffb1a074af Call Trace: [<ffffffffb10fc4e0>] mod_timer+0xd0/0x3b0 [<ffffffffb10e6670>] deferred_cont_flush.part.12+0x20/0x30 [<ffffffffb10e7139>] console_unlock+0x489/0x4a0 [<ffffffffb10e092e>] ? down_trylock+0x2e/0x40 [<ffffffffb10e7443>] vprintk_emit+0x2f3/0x480 [<ffffffffb10e77a9>] vprintk_default+0x29/0x40 [<ffffffffb11a1a3b>] printk+0x57/0x73 [<ffffffffb13c7521>] ? ___ratelimit+0x91/0xd0 [<ffffffffb101ed76>] do_trap+0x116/0x140 [<ffffffffb101f2bf>] do_int3+0xdf/0x110 [<ffffffffb17a17cb>] int3+0x5b/0x60 Code: 41 55 41 54 53 48 83 3f 00 0f 84 81 00 00 00 41 89 d5 49 89 fc 0f 1f 44 00 00 49 8b 04 24 49 8b 54 24 08 48 85 c0 48 89 02 74 04 <48> 89 50 08 45 84 ed 74 09 49 c7 44 24 08 00 00 00 00 48 b8 00 RIP [<ffffffffb10f9df1>] detach_if_pending+0x41/0xd0 RSP <ffffa6ce82d6fd08> ---[ end trace e4a7055c685d5e1e ]--- (there was another GP fault at the same address immediately afterwards, and then the machine was dead). The code disassembly at the faulting instruction is 0: 48 89 50 08 mov %rdx,0x8(%rax) 4: 45 84 ed test %r13b,%r13b 7: 74 09 je 0x12 and obviously %rax is bogus. In fact, %rax has the value "dead000000000200", which is just LIST_POISON2. Now, normally, I'd just say "my code is obviously buggy crap", and I did spend a bit of time on that theory. But quite frankly, that timer really _is_ local to that one function, and the absolutely only thing that is ever done on that timer is that "mod_timer()" call. So I don't think it's my new code. What triggers this is do_trap() doing two pr_info("%s[%d] trap %s ip:%lx sp:%lx error:%lx", tsk->comm, tsk->pid, str, regs->ip, regs->sp, error_code); print_vma_addr(" in ", regs->ip); pr_cont("\n"); which does two printk's without '\n' in close succession, so the first one will turn on the timer, and then the second one will update it again very soon afterwards. So I started looking at mod_timer() and detach_if_pending(), but I don't see anything obvious. I can certainly find the code: the oops clearly happens in detach_timer(): __hlist_del(entry); if (clear_pending) entry->pprev = NULL; entry->next = LIST_POISON2; and the "test %r13b,%r13b" instruction that follows the GP insutrction is the test of "clear_pending", and the access that faults is the if (next) next->pprev = pprev; in __hlist_del(), where "next" is that %rax LIST_POISON2 value. So it smells like "entry" was simply detached _twice_. The second time it hit that LIST_POISON2 poisoning and died. Some locking problem in the timer handling? Without knowing anythign at all, I'd be inclined to blame f00c0afdfa62 ("timers: Implement optimization for same expiry time in mod_timer()") which seems to do unlocked games with mod_timer(). But I may be full of crap. I literally just looked at "what has changed recently in this area", on the assumption that timers have historically been robust. The fact that this happened not during my testing (I had test cases that did similar things), but at shutdown, makes me also wonder if maybe some CPU hot-unplug thing might have been involved. Concurrent timer migration from a CPU that was going down? Although honestly, it doesn't seem like that, this happens when gnome-shell is exiting, and it's simply not that late in the shutdown process yet. I don't know. It still bothers me that this happens with new code that I was just testing out, and my gut feel is that "timers are so core and tested from other code that it can't *possibly* be bug in mod_timer()". But the code really is *so* simple: literally a single static timer that is only _ever_ touched by "mod_timer()" and absolutely nothing else. So even if everything else in my KERN_CONT cleanup was completely buggy crap, I'm not seeing how it could screw up the timer. Ideas? Linus