On 03/21/2014 02:35 PM, Benjamin LaHaise wrote:
> Hi all,
>
> Based on the issues reported by Tang and Gu, I've come up with an
> alternative fix that avoids adding additional locking in the event read
> code path.  The fix is to take the ring_lock mutex during page migration,
> which is already used to synchronize event readers and thus does not add
> any new locking requirements in aio_read_events_ring().  I've dropped
> the patches from Tang and Gu as a result.  This patch is now in my
> git://git.kvack.org/~bcrl/aio-next.git tree and will be sent to Linus
> once a few other people chime in with their reviews of this change.
> Please review, Tang and Gu.  Thanks!

Hi Benjamin,

This patch seems to trigger:

[  433.476216] ======================================================
[  433.478468] [ INFO: possible circular locking dependency detected ]
[  433.480900] 3.14.0-rc7-next-20140324-sasha-00015-g1fb7de8 #267 Tainted: G        W
[  433.480900] -------------------------------------------------------
[  433.480900] trinity-c57/13776 is trying to acquire lock:
[  433.480900]  (&ctx->ring_lock){+.+.+.}, at: aio_migratepage 
(include/linux/spinlock.h:303 fs/aio.c:306)
[  433.480900]
[  433.480900] but task is already holding lock:
[  433.480900]  (&mm->mmap_sem){++++++}, at: SYSC_move_pages (mm/migrate.c:1215 
mm/migrate.c:1353 mm/migrate.c:1508)
[  433.480900]
[  433.480900] which lock already depends on the new lock.
[  433.480900]
[  433.480900]
[  433.480900] the existing dependency chain (in reverse order) is:
[  433.480900]
-> #1 (&mm->mmap_sem){++++++}:
[  433.480900]        lock_acquire (arch/x86/include/asm/current.h:14 
kernel/locking/lockdep.c:3602)
[  433.480900]        down_write (arch/x86/include/asm/rwsem.h:130 
kernel/locking/rwsem.c:50)
[  433.480900]        SyS_io_setup (fs/aio.c:442 fs/aio.c:689 fs/aio.c:1201 
fs/aio.c:1184)
[  433.480900]        tracesys (arch/x86/kernel/entry_64.S:749)
[  433.480900]
-> #0 (&ctx->ring_lock){+.+.+.}:
[  433.480900]        __lock_acquire (kernel/locking/lockdep.c:1840 
kernel/locking/lockdep.c:1945 kernel/locking/lockdep.c:2131 
kernel/locking/lockdep.c:3182)
[  433.480900]        lock_acquire (arch/x86/include/asm/current.h:14 
kernel/locking/lockdep.c:3602)
[  433.480900]        mutex_lock_nested (kernel/locking/mutex.c:486 
kernel/locking/mutex.c:587)
[  433.480900]        aio_migratepage (include/linux/spinlock.h:303 
fs/aio.c:306)
[  433.480900]        move_to_new_page (mm/migrate.c:777)
[  433.480900]        migrate_pages (mm/migrate.c:921 mm/migrate.c:960 
mm/migrate.c:1126)
[  433.480900]        SYSC_move_pages (mm/migrate.c:1278 mm/migrate.c:1353 
mm/migrate.c:1508)
[  433.480900]        SyS_move_pages (mm/migrate.c:1456)
[  433.480900]        tracesys (arch/x86/kernel/entry_64.S:749)
[  433.480900]
[  433.480900] other info that might help us debug this:
[  433.480900]
[  433.480900]  Possible unsafe locking scenario:
[  433.480900]
[  433.480900]        CPU0                    CPU1
[  433.480900]        ----                    ----
[  433.480900]   lock(&mm->mmap_sem);
[  433.480900]                                lock(&ctx->ring_lock);
[  433.480900]                                lock(&mm->mmap_sem);
[  433.480900]   lock(&ctx->ring_lock);
[  433.480900]
[  433.480900]  *** DEADLOCK ***
[  433.480900]
[  433.480900] 1 lock held by trinity-c57/13776:
[  433.480900]  #0:  (&mm->mmap_sem){++++++}, at: SYSC_move_pages 
(mm/migrate.c:1215 mm/migrate.c:1353 mm/migrate.c:1508)
[  433.480900]
[  433.480900] stack backtrace:
[  433.480900] CPU: 4 PID: 13776 Comm: trinity-c57 Tainted: G        W     3.14.0-rc7-next-20140324-sasha-00015-g1fb7de8 #267
[  433.480900]  ffffffff87a80790 ffff8806abbbb9a8 ffffffff844bae02 
0000000000000000
[  433.480900]  ffffffff87a80790 ffff8806abbbb9f8 ffffffff844ad86d 
0000000000000001
[  433.480900]  ffff8806abbbba88 ffff8806abbbb9f8 ffff8806ab8fbcf0 
ffff8806ab8fbd28
[  433.480900] Call Trace:
[  433.480900]  dump_stack (lib/dump_stack.c:52)
[  433.480900]  print_circular_bug (kernel/locking/lockdep.c:1216)
[  433.480900]  __lock_acquire (kernel/locking/lockdep.c:1840 
kernel/locking/lockdep.c:1945 kernel/locking/lockdep.c:2131 
kernel/locking/lockdep.c:3182)
[  433.480900]  ? sched_clock (arch/x86/include/asm/paravirt.h:192 
arch/x86/kernel/tsc.c:305)
[  433.480900]  ? sched_clock_local (kernel/sched/clock.c:214)
[  433.480900]  ? sched_clock_cpu (kernel/sched/clock.c:311)
[  433.480900]  ? __lock_acquire (kernel/locking/lockdep.c:3189)
[  433.480900]  lock_acquire (arch/x86/include/asm/current.h:14 
kernel/locking/lockdep.c:3602)
[  433.480900]  ? aio_migratepage (include/linux/spinlock.h:303 fs/aio.c:306)
[  433.480900]  mutex_lock_nested (kernel/locking/mutex.c:486 
kernel/locking/mutex.c:587)
[  433.480900]  ? aio_migratepage (include/linux/spinlock.h:303 fs/aio.c:306)
[  433.480900]  ? aio_migratepage (fs/aio.c:303)
[  433.480900]  ? aio_migratepage (include/linux/spinlock.h:303 fs/aio.c:306)
[  433.480900]  ? aio_migratepage (include/linux/rcupdate.h:324 
include/linux/rcupdate.h:909 include/linux/percpu-refcount.h:117 fs/aio.c:297)
[  433.480900]  ? preempt_count_sub (kernel/sched/core.c:2527)
[  433.480900]  aio_migratepage (include/linux/spinlock.h:303 fs/aio.c:306)
[  433.480900]  ? aio_migratepage (include/linux/rcupdate.h:886 
include/linux/percpu-refcount.h:108 fs/aio.c:297)
[  433.480900]  ? mutex_unlock (kernel/locking/mutex.c:220)
[  433.480900]  move_to_new_page (mm/migrate.c:777)
[  433.480900]  ? try_to_unmap (mm/rmap.c:1516)
[  433.480900]  ? try_to_unmap_nonlinear (mm/rmap.c:1113)
[  433.480900]  ? invalid_migration_vma (mm/rmap.c:1472)
[  433.480900]  ? page_remove_rmap (mm/rmap.c:1380)
[  433.480900]  ? anon_vma_fork (mm/rmap.c:446)
[  433.480900]  migrate_pages (mm/migrate.c:921 mm/migrate.c:960 
mm/migrate.c:1126)
[  433.480900]  ? follow_page_mask (mm/memory.c:1544)
[  433.480900]  ? alloc_misplaced_dst_page (mm/migrate.c:1177)
[  433.480900]  SYSC_move_pages (mm/migrate.c:1278 mm/migrate.c:1353 
mm/migrate.c:1508)
[  433.480900]  ? SYSC_move_pages (include/linux/rcupdate.h:800 
mm/migrate.c:1472)
[  433.480900]  ? sched_clock (arch/x86/include/asm/paravirt.h:192 
arch/x86/kernel/tsc.c:305)
[  433.480900]  SyS_move_pages (mm/migrate.c:1456)
[  433.480900]  tracesys (arch/x86/kernel/entry_64.S:749)
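
For anyone following along, the ordering problem in the scenario table above
can be modelled in userspace. The sketch below is not kernel code and is not
how lockdep is implemented; it only records "B was taken while A was held"
edges and refuses an acquisition that would close a cycle. The names
MMAP_SEM, RING_LOCK, acquire_while_holding() and reset_deps() are made up
for illustration, and the io_setup ordering (ring_lock held, then mmap_sem)
is taken from the dependency chain in the report, not from reading the patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical lock classes standing in for the two kernel locks. */
enum { MMAP_SEM, RING_LOCK, NR_LOCKS };

/* dep[a][b] != 0 means: lock b was acquired while lock a was held. */
static int dep[NR_LOCKS][NR_LOCKS];

/* Forget all recorded orderings. */
static void reset_deps(void)
{
    memset(dep, 0, sizeof(dep));
}

/* Return 1 if 'to' can be reached from 'from' along recorded edges. */
static int reachable(int from, int to)
{
    int seen[NR_LOCKS] = {0};
    int stack[NR_LOCKS];
    int top = 0;

    assert(from >= 0 && from < NR_LOCKS && to >= 0 && to < NR_LOCKS);

    /* Each class is pushed at most once, so NR_LOCKS slots suffice. */
    seen[from] = 1;
    stack[top++] = from;
    while (top > 0) {
        int cur = stack[--top];

        if (cur == to)
            return 1;
        for (int next = 0; next < NR_LOCKS; next++) {
            if (dep[cur][next] && !seen[next]) {
                seen[next] = 1;
                stack[top++] = next;
            }
        }
    }
    return 0;
}

/*
 * Record the ordering "held -> wanted". Return 1 (and record nothing)
 * if 'held' already depends on 'wanted', i.e. the new edge would
 * create a circular dependency; return 0 otherwise.
 */
static int acquire_while_holding(int held, int wanted)
{
    if (reachable(wanted, held))
        return 1;
    dep[held][wanted] = 1;
    return 0;
}
```

Feeding it the two paths from the report, acquire_while_holding(RING_LOCK,
MMAP_SEM) for the io_setup side is accepted, and the later
acquire_while_holding(MMAP_SEM, RING_LOCK) for the move_pages side is
flagged, which matches the inversion shown in the CPU0/CPU1 table.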


Thanks,
Sasha