(2014/06/05 0:23), Peter Moody wrote:
> 
> On Wed, Jun 04 2014 at 07:07, Masami Hiramatsu wrote:
> 
>>> Thank you for reporting that. I've tried to reproduce it with your code, but
>>> have not succeeded yet. Could you share your kernel config too?
>>
>> Hmm, it seems that on my environment (Fedora20, gcc version 4.8.2 20131212),
>> do_execve() in sys_execve has been optimized out (and do_execve_common() is
>> also renamed). I'll try to rebuild it. However, since such optimization sometimes
>> depends on kernel config, I'd like to do it with your config.
>>
>> Thank you,
> 
> Sure thing, sorry for not attaching it to begin with.
> 
> One other thing is that, at least on the systems I've been able to repro on, the more processes,
> the more likely I was to not emit a splat before just deadlocking the machine. eg. on a 12 core
> machine, I got the splat with 32 processes and a deadlock with 50. On a 2 core qemu virtual
> machine I got a deadlock with 32 and a splat with something like 12 or 16.
> 
> And FWIW, I'm running ubuntu precise, with gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)


Thank you for sharing the kconfig. I saw that CONFIG_DEBUG_ATOMIC_SLEEP was not
set in your kconfig. When I set it and ran your test, I got (a lot of) the
warnings below instead of a deadlock.

[  342.072132] BUG: sleeping function called from invalid context at /home/fedora/ksrc/linux-3/kernel/fork.c:615
[  342.080684] in_atomic(): 1, irqs_disabled(): 1, pid: 5017, name: execve
[  342.080684] INFO: lockdep is turned off.
[  342.080684] irq event stamp: 0
[  342.080684] hardirqs last  enabled at (0): [<          (null)>]           (null)
[  342.080684] hardirqs last disabled at (0): [<ffffffff81045468>] copy_process.part.31+0x5ba/0x183d
[  342.080684] softirqs last  enabled at (0): [<ffffffff81045468>] copy_process.part.31+0x5ba/0x183d
[  342.080684] softirqs last disabled at (0): [<          (null)>]           (null)
[  342.080684] CPU: 5 PID: 5017 Comm: execve Not tainted 3.15.0-rc8+ #7
[  342.080684] Hardware name: Red Hat Inc. OpenStack Nova, BIOS 0.5.1 01/01/2007
[  342.080684]  0000000000000000 ffff8803ff81bdf8 ffffffff81554140 ffff88040a9df500
[  342.080684]  ffff8803ff81be08 ffffffff8106d17c ffff8803ff81be20 ffffffff81044bd8
[  342.080684]  ffffffff8114ad8f ffff8803ff81be30 ffffffffa015802d ffff8803ff81be88
[  342.080684] Call Trace:
[  342.080684]  [<ffffffff81554140>] dump_stack+0x4d/0x66
[  342.080684]  [<ffffffff8106d17c>] __might_sleep+0x118/0x11a
[  342.080684]  [<ffffffff81044bd8>] mmput+0x20/0xd9
[  342.080684]  [<ffffffff8114ad8f>] ? SyS_execve+0x2a/0x2e
[  342.080684]  [<ffffffffa015802d>] exec_handler+0x2d/0x34 [exec_mm_probe]
[  342.080684]  [<ffffffff81032a2c>] trampoline_handler+0x11b/0x1ac
[  342.080684]  [<ffffffff8103265a>] kretprobe_trampoline+0x25/0x4c
[  342.080684]  [<ffffffff81032635>] ? kretprobe_trampoline_holder+0x9/0x9
[  342.080684]  [<ffffffff8155ca99>] stub_execve+0x69/0xa0

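As the call trace shows, the root cause of this problem is the mmput() call in
the kretprobe handler: exec_handler() runs in atomic context with IRQs disabled
(in_atomic(): 1, irqs_disabled(): 1), but mmput() may sleep.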
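
For anyone who wants to check the same thing on their own splat, here is a
minimal sketch in Python of how the trace can be read mechanically. It is only
an illustration, not a tool from this thread: the names parse_splat() and
sleeping_callee_and_caller() are made up for this example, the regular
expressions assume the x86_64 trace format shown above, and frames marked with
"?" are skipped because the unwinder reports them as unreliable.

```python
import re

# Excerpt of the warning quoted above (unwrapped, addresses as reported).
SPLAT = """\
[  342.072132] BUG: sleeping function called from invalid context at /home/fedora/ksrc/linux-3/kernel/fork.c:615
[  342.080684] in_atomic(): 1, irqs_disabled(): 1, pid: 5017, name: execve
[  342.080684] Call Trace:
[  342.080684]  [<ffffffff81554140>] dump_stack+0x4d/0x66
[  342.080684]  [<ffffffff8106d17c>] __might_sleep+0x118/0x11a
[  342.080684]  [<ffffffff81044bd8>] mmput+0x20/0xd9
[  342.080684]  [<ffffffff8114ad8f>] ? SyS_execve+0x2a/0x2e
[  342.080684]  [<ffffffffa015802d>] exec_handler+0x2d/0x34 [exec_mm_probe]
[  342.080684]  [<ffffffff81032a2c>] trampoline_handler+0x11b/0x1ac
[  342.080684]  [<ffffffff8103265a>] kretprobe_trampoline+0x25/0x4c
"""

# Matches the context flags printed right after the BUG line.
CONTEXT_RE = re.compile(r"in_atomic\(\): (\d+), irqs_disabled\(\): (\d+)")

# Matches one stack frame: address, optional "? " marker, symbol+off/size,
# and an optional "[module]" suffix for symbols that live in a module.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\] (\? )?(\w[\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?"
)


def parse_splat(text):
    """Return (in_atomic, irqs_disabled, frames) parsed from a splat."""
    ctx = CONTEXT_RE.search(text)
    if ctx:
        in_atomic, irqs_disabled = int(ctx.group(1)), int(ctx.group(2))
    else:
        in_atomic = irqs_disabled = None
    frames = []
    for m in FRAME_RE.finditer(text):
        unreliable, symbol, module = m.group(1), m.group(2), m.group(3)
        if unreliable:
            continue  # "?" frames are stale stack entries, not real callers
        frames.append((symbol, module))
    return in_atomic, irqs_disabled, frames


def sleeping_callee_and_caller(frames):
    """Return the function that hit __might_sleep() and the one that called it."""
    symbols = [symbol for symbol, _ in frames]
    idx = symbols.index("__might_sleep")
    return frames[idx + 1], frames[idx + 2]


if __name__ == "__main__":
    in_atomic, irqs_disabled, frames = parse_splat(SPLAT)
    callee, caller = sleeping_callee_and_caller(frames)
    print("in_atomic:", in_atomic, "irqs_disabled:", irqs_disabled)
    print("may-sleep function:", callee[0])
    print("called from:", caller[0], "in module", caller[1])
```

On the excerpt above, the frame right below __might_sleep() is mmput(), and its
reliable caller is exec_handler() from the exec_mm_probe module, which matches
the reading of the trace by hand.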

Thank you,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com

