On Thu, May 04, 2017 at 07:56:57AM -0600, Jens Axboe wrote:
> On 05/04/2017 07:05 AM, Peter Zijlstra wrote:
> > 
> > By poking at /debug/sched_features I triggered the following splat:
> > 
> >  [] ======================================================
> >  [] WARNING: possible circular locking dependency detected
> >  [] 4.11.0-00873-g964c8b7-dirty #694 Not tainted
> >  [] ------------------------------------------------------
> >  [] bash/2109 is trying to acquire lock:
> >  []  (cpu_hotplug_lock.rw_sem){++++++}, at: [<ffffffff8120cb8b>] static_key_slow_dec+0x1b/0x50
> >  [] 
> >  [] but task is already holding lock:
> >  []  (&sb->s_type->i_mutex_key#4){+++++.}, at: [<ffffffff81140216>] sched_feat_write+0x86/0x170
> >  [] 
> >  [] which lock already depends on the new lock.
> >  [] 
> >  [] 
> >  [] the existing dependency chain (in reverse order) is:
> >  [] 
> >  [] -> #2 (&sb->s_type->i_mutex_key#4){+++++.}:
> >  []        lock_acquire+0x100/0x210
> >  []        down_write+0x28/0x60
> >  []        start_creating+0x5e/0xf0
> >  []        debugfs_create_dir+0x13/0x110
> >  []        blk_mq_debugfs_register+0x21/0x70
> >  []        blk_mq_register_dev+0x64/0xd0
> >  []        blk_register_queue+0x6a/0x170
> >  []        device_add_disk+0x22d/0x440
> >  []        loop_add+0x1f3/0x280
> >  []        loop_init+0x104/0x142
> >  []        do_one_initcall+0x43/0x180
> >  []        kernel_init_freeable+0x1de/0x266
> >  []        kernel_init+0xe/0x100
> >  []        ret_from_fork+0x31/0x40
> >  [] 
> >  [] -> #1 (all_q_mutex){+.+.+.}:
> >  []        lock_acquire+0x100/0x210
> >  []        __mutex_lock+0x6c/0x960
> >  []        mutex_lock_nested+0x1b/0x20
> >  []        blk_mq_init_allocated_queue+0x37c/0x4e0
> >  []        blk_mq_init_queue+0x3a/0x60
> >  []        loop_add+0xe5/0x280
> >  []        loop_init+0x104/0x142
> >  []        do_one_initcall+0x43/0x180
> >  []        kernel_init_freeable+0x1de/0x266
> >  []        kernel_init+0xe/0x100
> >  []        ret_from_fork+0x31/0x40
> > 
> >  []  *** DEADLOCK ***
> >  [] 
> >  [] 3 locks held by bash/2109:
> >  []  #0:  (sb_writers#11){.+.+.+}, at: [<ffffffff81292bcd>] vfs_write+0x17d/0x1a0
> >  []  #1:  (debugfs_srcu){......}, at: [<ffffffff8155a90d>] full_proxy_write+0x5d/0xd0
> >  []  #2:  (&sb->s_type->i_mutex_key#4){+++++.}, at: [<ffffffff81140216>] sched_feat_write+0x86/0x170
> >  [] 
> >  [] stack backtrace:
> >  [] CPU: 9 PID: 2109 Comm: bash Not tainted 4.11.0-00873-g964c8b7-dirty #694
> >  [] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
> >  [] Call Trace:
> > 
> >  []  lock_acquire+0x100/0x210
> >  []  get_online_cpus+0x2a/0x90
> >  []  static_key_slow_dec+0x1b/0x50
> >  []  static_key_disable+0x20/0x30
> >  []  sched_feat_write+0x131/0x170
> >  []  full_proxy_write+0x97/0xd0
> >  []  __vfs_write+0x28/0x120
> >  []  vfs_write+0xb5/0x1a0
> >  []  SyS_write+0x49/0xa0
> >  []  entry_SYSCALL_64_fastpath+0x23/0xc2
> > 
> > This is because of the cpu hotplug lock rework. Break the chain at #1
> > by reversing the lock acquisition order. This way i_mutex_key#4 no
> > longer depends on cpu_hotplug_lock and things are good.
> 
> Thanks Peter, applied.

Note that the hotplug rework is still work-in-progress and lives in a
-tip branch.

That said, the patch is harmless outside of that, so yes it can travel
upstream independently. But note that mainline cannot yet trigger that
splat.
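
For anyone following along who has not chased a lock inversion before,
the general idea behind "reversing the lock acquisition order" can be
sketched in userspace C. This is not the patch itself and does not use
kernel APIs: outer_lock, inner_lock, the helper functions and the
nr_registered counter are all hypothetical names, and the per-thread
flag is only a toy stand-in for the kind of ordering check lockdep does.
The point is that every path takes the two locks in the same order, so
no cycle can form between them.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins; these are not the kernel's locks. */
static pthread_mutex_t outer_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t inner_lock = PTHREAD_MUTEX_INITIALIZER;

/* Per-thread flag: nonzero while this thread holds outer_lock. */
static _Thread_local int holds_outer;

/* Shared state, protected by inner_lock. */
static int nr_registered;

static void lock_outer(void)
{
	pthread_mutex_lock(&outer_lock);
	holds_outer = 1;
}

static void unlock_outer(void)
{
	holds_outer = 0;
	pthread_mutex_unlock(&outer_lock);
}

/* Ordering rule: inner_lock is only ever taken under outer_lock. */
static void lock_inner(void)
{
	assert(holds_outer);
	pthread_mutex_lock(&inner_lock);
}

static void unlock_inner(void)
{
	pthread_mutex_unlock(&inner_lock);
}

/* Two unrelated paths; both follow outer -> inner, never the reverse. */
static void register_path(void)
{
	lock_outer();
	lock_inner();
	nr_registered++;
	unlock_inner();
	unlock_outer();
}

static void unregister_path(void)
{
	lock_outer();
	lock_inner();
	nr_registered--;
	unlock_inner();
	unlock_outer();
}
```

A path that called lock_inner() without first calling lock_outer() would
trip the assert, which is roughly the class of mistake the splat above
reports, only caught here by a much cruder check.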
