On Mon, Feb 27, 2017 at 01:39:25PM -0500, Tejun Heo wrote:
> On Mon, Feb 27, 2017 at 12:14:39PM -0500, Dave Jones wrote:
> > Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> > CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.10.0-think+ #9
> > task: ffff88017f105440 task.stack: ffffc90000094000
> > RIP: 0010:__queue_work+0x2d/0x700
> > RSP: 0018:ffff880507c03df8 EFLAGS: 00010046
> > RAX: 0000000000000082 RBX: 0000000000000101 RCX: 0000000000000002
> > RDX: ffff88047bf07c98 RSI: 0000000000000000 RDI: 0000000000000000
> > RBP: ffff880507c03e30 R08: 0000000000000001 R09: ffffffff8294bf68
> > R10: ffff880507c03e58 R11: 0000000000000000 R12: ffff88047bf07ce8
> > R13: 0000000000000000 R14: 0000000000000000 R15: ffff88047bf07c98
> > FS:  0000000000000000(0000) GS:ffff880507c00000(0000)
> > knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00000000000001c2 CR3: 0000000004e11000 CR4: 00000000001406e0
> > Call Trace:
> >  <IRQ>
> >  ? work_on_cpu+0xb0/0xb0
> >  delayed_work_timer_fn+0x1e/0x20
> >  call_timer_fn+0xbd/0x480
> ...
> > Code starting with the faulting instruction
> > ===========================================
> >    0:   41 f6 85 c2 01 00 00    testb  $0x1,0x1c2(%r13)
> >    7:   01
> >    8:   0f 85 22 04 00 00       jne    0x430
> >    e:   49                      rex.WB
> >    f:   bc eb 83 b5 80          mov    $0x80b583eb,%esp
> >   14:   46                      rex.RX
> >
> > 0000000000003cf0 <__queue_work>:
> > {
> >     3cf0:       e8 00 00 00 00          callq  3cf5 <__queue_work+0x5>
> >     3cf5:       55                      push   %rbp
> >     3cf6:       48 89 e5                mov    %rsp,%rbp
> >     3cf9:       41 57                   push   %r15
> >     3cfb:       49 89 d7                mov    %rdx,%r15
> >     3cfe:       41 56                   push   %r14
> >         unsigned int req_cpu = cpu;
> >     3d00:       41 89 fe                mov    %edi,%r14d
> > {
> >     3d03:       41 55                   push   %r13
> >     3d05:       49 89 f5                mov    %rsi,%r13
> >     3d08:       41 54                   push   %r12
> >     3d0a:       53                      push   %rbx
> >     3d0b:       48 83 ec 10             sub    $0x10,%rsp
> >     3d0f:       89 7d d4                mov    %edi,-0x2c(%rbp)
> >         asm volatile("# __raw_save_flags\n\t"
> >     3d12:       9c                      pushfq
> >     3d13:       58                      pop    %rax
> >         WARN_ON_ONCE(!irqs_disabled());
> >     3d14:       f6 c4 02                test   $0x2,%ah
> >     3d17:       0f 85 06 04 00 00       jne    4123 <__queue_work+0x433>
> >         if (unlikely(wq->flags & __WQ_DRAINING) &&
> >     3d1d:       41 f6 85 c2 01 00 00    testb  $0x1,0x1c2(%r13)
> >
> >
> > So we called __queue_work with a null wq.
>
> So, that's somebody calling queue_delayed_work[_on]() with a NULL wq
> and when the timeout expires the timer callback trying to queue
> against NULL.  Hmm... the work function would be able to tell us who
> queued it but it isn't part of the information dumped here (would be
> 0x18(%rdx)).
>
> I'll add a sanity check on queue_delayed_work_on() so that we can
> catch it synchronously when it happens.
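
For illustration only, the idea of catching the NULL wq at queueing time rather than when the timer fires can be sketched in user-space C. Everything below (the `*_sketch` types and function) is a hypothetical stand-in, not the kernel's workqueue code and not the actual patch; it only shows where such a check would sit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical user-space stand-ins; NOT the kernel's definitions. */
struct wq_sketch {
	const char *name;
};

struct delayed_work_sketch {
	struct wq_sketch *wq;	/* queue remembered until the timer fires */
	int cpu;
	unsigned long delay;
	bool pending;
};

/*
 * Refuse a NULL queue when the delayed work is submitted, so the report
 * points at the caller instead of at a later timer callback.
 * Returns true if the work was newly queued, false otherwise.
 */
bool queue_delayed_work_on_sketch(int cpu, struct wq_sketch *wq,
				  struct delayed_work_sketch *dwork,
				  unsigned long delay)
{
	assert(dwork != NULL);

	if (wq == NULL) {
		/* Synchronous check: complain now, while the caller is on the stack. */
		fprintf(stderr, "WARN: delayed work %p queued with NULL wq\n",
			(void *)dwork);
		return false;
	}

	if (dwork->pending)
		return false;	/* already queued, nothing to do */

	dwork->wq = wq;
	dwork->cpu = cpu;
	dwork->delay = delay;
	dwork->pending = true;
	return true;
}
```

With a check in that position, the warning's backtrace would include whoever passed the NULL queue, which the oops above cannot show.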
I dumped work->func, and found it pointed to smc_close_sock_put_work

<smc maintainer cc'd>

	Dave
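
For anyone wondering why work->func sits at 0x18 from the work pointer (%rdx here): a rough user-space sketch of the layout, assuming a 64-bit build where long and pointers are 8 bytes. The `*_sketch` names and `sample_put_work` are made up for the example; only the field order (data, entry, func) is meant to mirror struct work_struct.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirrors of the kernel types, for offset arithmetic only. */
struct list_head_sketch {
	struct list_head_sketch *next, *prev;
};

struct work_struct_sketch;
typedef void (*work_func_sketch_t)(struct work_struct_sketch *work);

struct work_struct_sketch {
	long data;			/* stands in for atomic_long_t: 8 bytes */
	struct list_head_sketch entry;	/* two pointers: 16 bytes */
	work_func_sketch_t func;	/* 8 + 16 = 0x18 on a 64-bit build */
};

/* Made-up handler so there is something to point func at. */
void sample_put_work(struct work_struct_sketch *work)
{
	(void)work;
}

/* Offset of func within the work item. */
size_t work_func_offset(void)
{
	return offsetof(struct work_struct_sketch, func);
}

/* Read the handler through a raw base pointer plus offset, like 0x18(%rdx). */
work_func_sketch_t read_work_func(const void *work_ptr)
{
	work_func_sketch_t fn;

	memcpy(&fn, (const char *)work_ptr + work_func_offset(), sizeof(fn));
	return fn;
}
```

Resolving the pointer read that way against the symbol table is what names the function that queued the work.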