On 27.3.2015 22:36, Sasha Levin wrote:
> On 03/27/2015 06:07 AM, Vlastimil Babka wrote:
>>>> [ 3614.918852] trinity-c7      D ffff8802f4487b58 26976 16252   9410 0x10000000
>>>> [ 3614.919580]  ffff8802f4487b58 ffff8802f6b98ca8 0000000000000000 0000000000000000
>>>> [ 3614.920435]  ffff88017d3e0558 ffff88017d3e0530 ffff8802f6b98008 ffff88016bad0000
>>>> [ 3614.921219]  ffff8802f6b98000 ffff8802f4487b38 ffff8802f4480000 ffffed005e890002
>>>> [ 3614.922069] Call Trace:
>>>> [ 3614.922346] schedule (./arch/x86/include/asm/bitops.h:311 (discriminator 1) kernel/sched/core.c:2827 (discriminator 1))
>>>> [ 3614.923023] schedule_preempt_disabled (kernel/sched/core.c:2859)
>>>> [ 3614.923707] mutex_lock_nested (kernel/locking/mutex.c:585 kernel/locking/mutex.c:623)
>>>> [ 3614.924486] ? lru_add_drain_all (mm/swap.c:867)
>>>> [ 3614.925211] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2580 kernel/locking/lockdep.c:2622)
>>>> [ 3614.925970] ? lru_add_drain_all (mm/swap.c:867)
>>>> [ 3614.926692] ? mutex_trylock (kernel/locking/mutex.c:621)
>>>> [ 3614.927464] ? mpol_new (mm/mempolicy.c:285)
>>>> [ 3614.928044] lru_add_drain_all (mm/swap.c:867)
>>>> [ 3614.928608] migrate_prep (mm/migrate.c:64)
>>>> [ 3614.929092] SYSC_mbind (mm/mempolicy.c:1188 mm/mempolicy.c:1319)
>>>> [ 3614.929619] ? rcu_eqs_exit_common (kernel/rcu/tree.c:735 (discriminator 8))
>>>> [ 3614.930318] ? __mpol_equal (mm/mempolicy.c:1304)
>>>> [ 3614.930877] ? trace_hardirqs_on (kernel/locking/lockdep.c:2630)
>>>> [ 3614.931485] ? syscall_trace_enter_phase2 (arch/x86/kernel/ptrace.c:1592)
>>>> [ 3614.932184] SyS_mbind (mm/mempolicy.c:1301)
>> That looks like trinity-c7 is waiting on it too, but later on (after some
>> more listings like this for trinity-c7, probably threads?) we have:
>>
> 
> It keeps changing constantly; even in this trace the process is blocking on
> the mutex

I think these are multiple threads of a process with the same name trinity-c7,
and thread 16935 of trinity-c7 does have the mutex locked and is waiting on
something else.

> rather than doing something useful, and in the next trace it's a different 
> process.

And the next trace is from the same run, just later? I.e. it doesn't hang
completely, but makes such slow progress that the 20-minute hang timer catches
it? I'm not sure here.

If it's too slow, I can imagine it could be simply optimized: if one thread
manages to lock the mutex, it can tell all threads waiting *at that moment*
that they can just return once the first thread is done - it has already done
the necessary work for all of them. But I wonder if this contention happens in
practice. And that certainly doesn't explain any regression that apparently
occurred.

> 
> Thanks,
> Sasha
> 
