Hi all,

I'm seeing an interesting hang when trinity is trying to load a large module, 
where
the size passed by userspace is larger than the amount of memory actually 
allocated
in userspace and passed in the 'from' parameter:

[ 1549.080032] NMI watchdog: BUG: soft lockup - CPU#11 stuck for 22s! 
[trinity-c11:12699]
[ 1549.080032] Modules linked in:
[ 1549.080032] hardirqs last enabled at (4202427): restore_args 
(arch/x86/kernel/entry_64.S:750)
[ 1549.080032] hardirqs last disabled at (4202428): apic_timer_interrupt 
(arch/x86/kernel/entry_64.S:890)
[ 1549.080032] softirqs last enabled at (4202426): __do_softirq 
(./arch/x86/include/asm/preempt.h:22 kernel/softirq.c:300)
[ 1549.080032] softirqs last disabled at (4202421): irq_exit 
(kernel/softirq.c:350 kernel/softirq.c:391)
[ 1549.080032] CPU: 11 PID: 12699 Comm: trinity-c11 Not tainted 
4.0.0-rc6-next-20150402-sasha-00039-ge0bdae3-dirty #2125
[ 1549.080032] task: ffff880260f30000 ti: ffff88025ebe8000 task.ti: 
ffff88025ebe8000
[ 1549.080032] RIP: copy_user_handle_tail (arch/x86/lib/usercopy_64.c:85)
[ 1549.080032] RSP: 0000:ffff88025ebefe38  EFLAGS: 00010202
[ 1549.080032] RAX: 00000000356cb494 RBX: 1ffff1004bd7dfcb RCX: 0000000000000000
[ 1549.080032] RDX: 000000009277d652 RSI: ffffc90078bba001 RDI: ffffc90078bba000
[ 1549.080032] RBP: ffff88025ebefe38 R08: 0000000000000000 R09: 0000000002f75001
[ 1549.080032] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88025ebefe78
[ 1549.080032] R13: ffff88025ebeff18 R14: 0000000094949494 R15: 0000000000da91be
[ 1549.080032] FS:  00007f1f669df700(0000) GS:ffff8803f2800000(0000) 
knlGS:0000000000000000
[ 1549.080032] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1549.080032] CR2: 0000000002f75000 CR3: 000000025d031000 CR4: 00000000000007a0
[ 1549.080032] Stack:
[ 1549.080032]  ffff88025ebeff48 ffffffff963c5ef0 ffffc9001993c000 
0000000000da91be
[ 1549.080032]  0000000041b58ab3 ffffffffa32918f3 ffffffff963c5da0 
0000000000000286
[ 1549.080032]  ffffc9001993c000 0000000094949494 0000000000000000 
0000000000000000
[ 1549.080032] Call Trace:
[ 1549.080032] SyS_init_module (kernel/module.c:2505 kernel/module.c:3401 
kernel/module.c:3388)
[ 1549.080032] ? load_module (kernel/module.c:3388)
[ 1549.080032] ia32_do_call (arch/x86/ia32/ia32entry.S:501)
[ 1549.080032] Code: d0 66 66 90 8a 06 66 66 90 45 85 c0 74 d3 85 c9 74 23 89 
d0 45 31 c0 eb 08 83 e8 01 48 89 f7 74 14 48 8d 77 01 44 89 c1 66 66 90 <c6> 07 
00 66 66 90 85 c9 74 e4 89 d0 66 66 90 5d c3 0f 1f 44 00
All code
========
   0:   d0 66 66                shlb   0x66(%rsi)
   3:   90                      nop
   4:   8a 06                   mov    (%rsi),%al
   6:   66 66 90                data32 xchg %ax,%ax
   9:   45 85 c0                test   %r8d,%r8d
   c:   74 d3                   je     0xffffffffffffffe1
   e:   85 c9                   test   %ecx,%ecx
  10:   74 23                   je     0x35
  12:   89 d0                   mov    %edx,%eax
  14:   45 31 c0                xor    %r8d,%r8d
  17:   eb 08                   jmp    0x21
  19:   83 e8 01                sub    $0x1,%eax
  1c:   48 89 f7                mov    %rsi,%rdi
  1f:   74 14                   je     0x35
  21:   48 8d 77 01             lea    0x1(%rdi),%rsi
  25:   44 89 c1                mov    %r8d,%ecx
  28:   66 66 90                data32 xchg %ax,%ax
  2b:*  c6 07 00                movb   $0x0,(%rdi)              <-- trapping 
instruction
  2e:   66 66 90                data32 xchg %ax,%ax
  31:   85 c9                   test   %ecx,%ecx
  33:   74 e4                   je     0x19
  35:   89 d0                   mov    %edx,%eax
  37:   66 66 90                data32 xchg %ax,%ax
  3a:   5d                      pop    %rbp
  3b:   c3                      retq
  3c:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)

Code starting with the faulting instruction
===========================================
   0:   c6 07 00                movb   $0x0,(%rdi)
   3:   66 66 90                data32 xchg %ax,%ax
   6:   85 c9                   test   %ecx,%ecx
   8:   74 e4                   je     0xffffffffffffffee
   a:   89 d0                   mov    %edx,%eax
   c:   66 66 90                data32 xchg %ax,%ax
   f:   5d                      pop    %rbp
  10:   c3                      retq
  11:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)


This is the result of getting copy_user_handle_tail to zero out a large block of
kernel memory very inefficiently:

        for (c = 0, zero_len = len; zerorest && zero_len; --zero_len)
                if (__put_user_nocheck(c, to++, sizeof(char)))
                        break;


Thanks,
Sasha
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to