Re: memory-hotplug : possible circular locking dependency detected

2012-09-13 Thread Yasuaki Ishimatsu
Hi Wen,

2012/09/14 15:02, Wen Congyang wrote:
> At 09/13/2012 02:19 PM, Yasuaki Ishimatsu Wrote:
>> When I offline memory on linux-3.6-rc5, "possible circular
>> locking dependency detected" messages are shown.
>> Is this a known problem?
> 
> It is a known problem, but it doesn't cause a deadlock.
> There are 3 locks involved: the memory hotplug lock
> (mem_hotplug_mutex), the memory hotplug notifier's lock
> ((memory_chain).rwsem in the log), and ksm_thread_mutex.
> 
> ksm_thread_mutex is locked when the memory is going offline
> and is unlocked when the memory has been offlined or the
> offlining is cancelled, so it is still held when the notifier
> chain is called again. That is why we see the warning
> messages. But it doesn't cause a deadlock, because we always
> lock mem_hotplug_mutex first.
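
A rough way to see why lockdep still reports a cycle here is to model
the lock-order graph it builds. The sketch below is a toy Python model,
not the kernel's lockdep: LockGraph and simulate_offline are
hypothetical names, and the acquisition order is an assumption taken
from the explanation above and from the stack traces in the original
report (mem_hotplug_mutex first, then the notifier rwsem, then
ksm_thread_mutex, which stays held until the next notifier call).

```python
from collections import defaultdict


class LockGraph:
    """Toy lock-order tracker (illustrative only, not kernel lockdep)."""

    def __init__(self):
        # edges[a] holds every lock that was acquired while a was held
        self.edges = defaultdict(set)
        self.held = []

    def _reachable(self, src, dst):
        # Depth-first search over the recorded "held -> acquired" edges.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, lock):
        # A cycle exists if a lock we already hold was previously
        # acquired (directly or indirectly) after the new lock.
        cycle = any(self._reachable(lock, h) for h in self.held)
        for h in self.held:
            self.edges[h].add(lock)
        self.held.append(lock)
        return cycle

    def release(self, lock):
        self.held.remove(lock)


def simulate_offline():
    """Replay the assumed lock sequence of one memory offline operation."""
    g = LockGraph()
    reports = []
    reports.append(g.acquire("mem_hotplug_mutex"))   # outermost lock
    reports.append(g.acquire("memory_chain.rwsem"))  # going-offline notify
    reports.append(g.acquire("ksm_thread_mutex"))    # taken in ksm callback
    g.release("memory_chain.rwsem")                  # notify returns, ksm lock kept
    reports.append(g.acquire("memory_chain.rwsem"))  # offline notify
    return reports
```

In this model only the fourth acquisition is flagged: it closes the
cycle memory_chain.rwsem -> ksm_thread_mutex -> memory_chain.rwsem,
which matches the two chains (#1 and #0) in the report. Every step in
the model runs under mem_hotplug_mutex, which is the reason given above
for why the inversion does not become a real deadlock.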

Thank you for the clarification.
I understand now, so I will ignore the messages.

Thanks,
Yasuaki Ishimatsu

> 
> Thanks
> Wen Congyang
> 

memory-hotplug : possible circular locking dependency detected

2012-09-12 Thread Yasuaki Ishimatsu
When I offline memory on linux-3.6-rc5, "possible circular
locking dependency detected" messages are shown.
Is this a known problem?

[  201.596363] Offlined Pages 32768
[  201.596373] remove from free list 140000 1024 148000
[  201.596493] remove from free list 140400 1024 148000
[  201.596612] remove from free list 140800 1024 148000
[  201.596730] remove from free list 140c00 1024 148000
[  201.596849] remove from free list 141000 1024 148000
[  201.596968] remove from free list 141400 1024 148000
[  201.597049] remove from free list 141800 1024 148000
[  201.597049] remove from free list 141c00 1024 148000
[  201.597049] remove from free list 142000 1024 148000
[  201.597049] remove from free list 142400 1024 148000
[  201.597049] remove from free list 142800 1024 148000
[  201.597049] remove from free list 142c00 1024 148000
[  201.597049] remove from free list 143000 1024 148000
[  201.597049] remove from free list 143400 1024 148000
[  201.597049] remove from free list 143800 1024 148000
[  201.597049] remove from free list 143c00 1024 148000
[  201.597049] remove from free list 144000 1024 148000
[  201.597049] remove from free list 144400 1024 148000
[  201.597049] remove from free list 144800 1024 148000
[  201.597049] remove from free list 144c00 1024 148000
[  201.597049] remove from free list 145000 1024 148000
[  201.597049] remove from free list 145400 1024 148000
[  201.597049] remove from free list 145800 1024 148000
[  201.597049] remove from free list 145c00 1024 148000
[  201.597049] remove from free list 146000 1024 148000
[  201.597049] remove from free list 146400 1024 148000
[  201.597049] remove from free list 146800 1024 148000
[  201.597049] remove from free list 146c00 1024 148000
[  201.597049] remove from free list 147000 1024 148000
[  201.597049] remove from free list 147400 1024 148000
[  201.597049] remove from free list 147800 1024 148000
[  201.597049] remove from free list 147c00 1024 148000
[  201.602143] 
[  201.602150] ==
[  201.602153] [ INFO: possible circular locking dependency detected ]
[  201.602157] 3.6.0-rc5 #1 Not tainted
[  201.602159] ---
[  201.602162] bash/2789 is trying to acquire lock:
[  201.602164]  ((memory_chain).rwsem){.+.+.+}, at: [] __blocking_notifier_call_chain+0x66/0xd0
[  201.602180] 
[  201.602180] but task is already holding lock:
[  201.602182]  (ksm_thread_mutex/1){+.+.+.}, at: [] ksm_memory_callback+0x3a/0xc0
[  201.602194] 
[  201.602194] which lock already depends on the new lock.
[  201.602194] 
[  201.602197] 
[  201.602197] the existing dependency chain (in reverse order) is:
[  201.602200] 
[  201.602200] -> #1 (ksm_thread_mutex/1){+.+.+.}:
[  201.602208][] validate_chain+0x6d9/0x7e0
[  201.602214][] __lock_acquire+0x2f6/0x4f0
[  201.602219][] lock_acquire+0x9d/0x190
[  201.602223][] __mutex_lock_common+0x5c/0x420
[  201.602229][] mutex_lock_nested+0x4a/0x60
[  201.602234][] ksm_memory_callback+0x3a/0xc0
[  201.602239][] notifier_call_chain+0x67/0x150
[  201.602244][] __blocking_notifier_call_chain+0x7b/0xd0
[  201.602250][] blocking_notifier_call_chain+0x16/0x20
[  201.602255][] memory_notify+0x1b/0x20
[  201.602261][] offline_pages+0x1b1/0x470
[  201.602267][] remove_memory+0x1e/0x20
[  201.602273][] memory_block_action+0xa1/0x190
[  201.602278][] memory_block_change_state+0x79/0xe0
[  201.602282][] store_mem_state+0xc2/0xd0
[  201.602287][] dev_attr_store+0x20/0x30
[  201.602293][] sysfs_write_file+0xa3/0x100
[  201.602299][] vfs_write+0xd0/0x1a0
[  201.602304][] sys_write+0x54/0xa0
[  201.602309][] system_call_fastpath+0x16/0x1b
[  201.602315] 
[  201.602315] -> #0 ((memory_chain).rwsem){.+.+.+}:
[  201.602322][] check_prev_add+0x527/0x550
[  201.602326][] validate_chain+0x6d9/0x7e0
[  201.602331][] __lock_acquire+0x2f6/0x4f0
[  201.602335][] lock_acquire+0x9d/0x190
[  201.602340][] down_read+0x51/0xa0
[  201.602345][] __blocking_notifier_call_chain+0x66/0xd0
[  201.602350][] blocking_notifier_call_chain+0x16/0x20
[  201.602355][] memory_notify+0x1b/0x20
[  201.602360][] offline_pages+0x3c7/0x470
[  201.602365][] remove_memory+0x1e/0x20
[  201.602370][] memory_block_action+0xa1/0x190
[  201.602375][] memory_block_change_state+0x79/0xe0
[  201.602379][] store_mem_state+0xc2/0xd0
[  201.602385][] dev_attr_store+0x20/0x30
[  201.602389][] sysfs_write_file+0xa3/0x100
[  201.602394][] vfs_write+0xd0/0x1a0
[  201.602398][] sys_write+0x54/0xa0
[  201.602403][] system_call_fastpath+0x16/0x1b
[  201.602408] 
[  201.602408] other info that might help us debug this:
[  201.602408] 
[  201.602412]  Possible unsafe locking s