On Thu, Mar 05, 2015 at 10:17:14AM +0000, Filipe David Manana wrote:
> On Thu, Mar 5, 2015 at 2:56 AM, Zygo Blaxell
> <ce3g8...@umail.furryterror.org> wrote:
> > rsync seems to get stuck just by reading the same file that extent-same is
> > acting upon.
> >
> > Mar  4 21:35:08 sneezy kernel: [89798.758960] INFO: task rsync:7425 blocked 
> > for more than 1800 seconds.
> > Mar  4 21:35:08 sneezy kernel: [89798.759007]       Tainted: G        W     
> >  3.18.8-zb64+ #1
> > Mar  4 21:35:08 sneezy kernel: [89798.759048] "echo 0 > 
> > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > Mar  4 21:35:08 sneezy kernel: [89798.759121] rsync           D 
> > 0000000000000000     0  7425   7423 0x00000000
> > Mar  4 21:35:08 sneezy kernel: [89798.759129]  ffff88005afaf988 
> > 0000000000000096 0000000000000286 ffff8800036c1000
> > Mar  4 21:35:08 sneezy kernel: [89798.759135]  00000000001e0880 
> > ffffffff81e452d0 ffff88005afaffd8 00000000001e0880
> > Mar  4 21:35:08 sneezy kernel: [89798.759141]  ffff88024dac9000 
> > ffff8800036c1000 ffff88005afaf8b8 0000000000000000
> > Mar  4 21:35:08 sneezy kernel: [89798.759147] Call Trace:
> > Mar  4 21:35:08 sneezy kernel: [89798.759158]  [<ffffffff810211fd>] ? 
> > native_sched_clock+0x2d/0xb0
> > Mar  4 21:35:08 sneezy kernel: [89798.759164]  [<ffffffff810c61fe>] ? 
> > put_lock_stats.isra.18+0xe/0x30
> > Mar  4 21:35:08 sneezy kernel: [89798.759170]  [<ffffffff810c638d>] ? 
> > lock_release_holdtime.part.19+0x16d/0x220
> > Mar  4 21:35:08 sneezy kernel: [89798.759176]  [<ffffffff813bb208>] ? 
> > lock_extent_bits+0x1a8/0x200
> > Mar  4 21:35:08 sneezy kernel: [89798.759182]  [<ffffffff818b1989>] 
> > schedule+0x29/0x70
> > Mar  4 21:35:08 sneezy kernel: [89798.759187]  [<ffffffff813bb20d>] 
> > lock_extent_bits+0x1ad/0x200
> > Mar  4 21:35:08 sneezy kernel: [89798.759192]  [<ffffffff810bd940>] ? 
> > __wake_up_sync+0x20/0x20
> > Mar  4 21:35:08 sneezy kernel: [89798.759197]  [<ffffffff813bb273>] 
> > lock_extent+0x13/0x20
> > Mar  4 21:35:08 sneezy kernel: [89798.759202]  [<ffffffff813be1ec>] 
> > __extent_readpages.constprop.37+0x21c/0x2c0
> > Mar  4 21:35:08 sneezy kernel: [89798.759208]  [<ffffffff813a0fe0>] ? 
> > btrfs_direct_IO+0x360/0x360
> > Mar  4 21:35:08 sneezy kernel: [89798.759213]  [<ffffffff813bf619>] 
> > extent_readpages+0x179/0x1c0
> > Mar  4 21:35:08 sneezy kernel: [89798.759217]  [<ffffffff813a0fe0>] ? 
> > btrfs_direct_IO+0x360/0x360
> > Mar  4 21:35:08 sneezy kernel: [89798.759224]  [<ffffffff8139f57f>] 
> > btrfs_readpages+0x1f/0x30
> > Mar  4 21:35:08 sneezy kernel: [89798.759229]  [<ffffffff811aa0e7>] 
> > __do_page_cache_readahead+0x207/0x2a0
> > Mar  4 21:35:08 sneezy kernel: [89798.759233]  [<ffffffff811a9fb0>] ? 
> > __do_page_cache_readahead+0xd0/0x2a0
> > Mar  4 21:35:08 sneezy kernel: [89798.759239]  [<ffffffff811aa262>] 
> > ondemand_readahead+0xe2/0x320
> > Mar  4 21:35:08 sneezy kernel: [89798.759243]  [<ffffffff811aa300>] ? 
> > ondemand_readahead+0x180/0x320
> > Mar  4 21:35:08 sneezy kernel: [89798.759248]  [<ffffffff811aa5d1>] 
> > page_cache_sync_readahead+0x31/0x50
> > Mar  4 21:35:08 sneezy kernel: [89798.759253]  [<ffffffff8119d3fc>] 
> > generic_file_read_iter+0x50c/0x620
> > Mar  4 21:35:08 sneezy kernel: [89798.759259]  [<ffffffff818b7d71>] ? 
> > _raw_spin_unlock_irq+0x41/0x70
> > Mar  4 21:35:08 sneezy kernel: [89798.759264]  [<ffffffff81219c1e>] 
> > new_sync_read+0x7e/0xb0
> > Mar  4 21:35:08 sneezy kernel: [89798.759270]  [<ffffffff8121a3d7>] 
> > vfs_read+0x97/0x180
> > Mar  4 21:35:08 sneezy kernel: [89798.759275]  [<ffffffff8121afbd>] 
> > SyS_read+0x4d/0xc0
> > Mar  4 21:35:08 sneezy kernel: [89798.759280]  [<ffffffff818b8c2d>] 
> > system_call_fastpath+0x16/0x1b
> > Mar  4 21:35:08 sneezy kernel: [89798.759284] no locks held by rsync/7425.
> > Mar  4 21:35:08 sneezy kernel: [89798.759290] INFO: task btrsame:23612 
> > blocked for more than 1800 seconds.
> > Mar  4 21:35:08 sneezy kernel: [89798.759329]       Tainted: G        W     
> >  3.18.8-zb64+ #1
> > Mar  4 21:35:08 sneezy kernel: [89798.759367] "echo 0 > 
> > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > Mar  4 21:35:08 sneezy kernel: [89798.759437] btrsame         D 
> > 0000000000000000     0 23612  27339 0x00000000
> > Mar  4 21:35:08 sneezy kernel: [89798.759443]  ffff880003c2bc18 
> > 0000000000000096 ffffffff82927070 ffff88000a881000
> > Mar  4 21:35:08 sneezy kernel: [89798.759449]  00000000001e0880 
> > 00000000ffffffff ffff880003c2bfd8 00000000001e0880
> > Mar  4 21:35:08 sneezy kernel: [89798.759455]  ffff880014765000 
> > ffff88000a881000 ffff880003c2bb58 ffffffff810a6621
> > Mar  4 21:35:08 sneezy kernel: [89798.759461] Call Trace:
> > Mar  4 21:35:08 sneezy kernel: [89798.759467]  [<ffffffff810a6621>] ? 
> > preempt_count_sub+0x51/0x60
> > Mar  4 21:35:08 sneezy kernel: [89798.759471]  [<ffffffff818b2390>] ? 
> > bit_wait+0x60/0x60
> > Mar  4 21:35:08 sneezy kernel: [89798.759475]  [<ffffffff810c90a9>] ? 
> > mark_held_locks+0x79/0xb0
> > Mar  4 21:35:08 sneezy kernel: [89798.759481]  [<ffffffff81102c1d>] ? 
> > ktime_get+0x10d/0x150
> > Mar  4 21:35:08 sneezy kernel: [89798.759486]  [<ffffffff81020bf9>] ? 
> > read_tsc+0x9/0x10
> > Mar  4 21:35:08 sneezy kernel: [89798.759490]  [<ffffffff81102bbc>] ? 
> > ktime_get+0xac/0x150
> > Mar  4 21:35:08 sneezy kernel: [89798.759496]  [<ffffffff8114e713>] ? 
> > __delayacct_blkio_start+0x23/0x30
> > Mar  4 21:35:08 sneezy kernel: [89798.759500]  [<ffffffff818b2390>] ? 
> > bit_wait+0x60/0x60
> > Mar  4 21:35:08 sneezy kernel: [89798.759504]  [<ffffffff818b1989>] 
> > schedule+0x29/0x70
> > Mar  4 21:35:08 sneezy kernel: [89798.759508]  [<ffffffff818b1cb8>] 
> > io_schedule+0x98/0x100
> > Mar  4 21:35:08 sneezy kernel: [89798.759512]  [<ffffffff818b23c4>] 
> > bit_wait_io+0x34/0x60
> > Mar  4 21:35:08 sneezy kernel: [89798.759517]  [<ffffffff818b224b>] 
> > __wait_on_bit_lock+0x4b/0xb0
> > Mar  4 21:35:08 sneezy kernel: [89798.759521]  [<ffffffff8119b791>] ? 
> > find_get_entry+0x81/0xe0
> > Mar  4 21:35:08 sneezy kernel: [89798.759526]  [<ffffffff8119b586>] 
> > __lock_page+0xa6/0xb0
> > Mar  4 21:35:08 sneezy kernel: [89798.759530]  [<ffffffff810bd980>] ? 
> > autoremove_wake_function+0x40/0x40
> > Mar  4 21:35:08 sneezy kernel: [89798.759535]  [<ffffffff8119c0e8>] 
> > pagecache_get_page+0x168/0x1a0
> > Mar  4 21:35:08 sneezy kernel: [89798.759540]  [<ffffffff813d0d4d>] 
> > extent_same_get_page+0x2d/0xc0
> > Mar  4 21:35:08 sneezy kernel: [89798.759545]  [<ffffffff813d7e79>] 
> > btrfs_ioctl+0x2729/0x2930
> > Mar  4 21:35:08 sneezy kernel: [89798.759550]  [<ffffffff810c61fe>] ? 
> > put_lock_stats.isra.18+0xe/0x30
> > Mar  4 21:35:08 sneezy kernel: [89798.759556]  [<ffffffff811c9d2e>] ? 
> > might_fault+0x5e/0xc0
> > Mar  4 21:35:08 sneezy kernel: [89798.759562]  [<ffffffff8122e940>] 
> > do_vfs_ioctl+0x2f0/0x510
> > Mar  4 21:35:08 sneezy kernel: [89798.759567]  [<ffffffff818b8c59>] ? 
> > sysret_check+0x22/0x5d
> > Mar  4 21:35:08 sneezy kernel: [89798.759571]  [<ffffffff8122ebe1>] 
> > SyS_ioctl+0x81/0xa0
> > Mar  4 21:35:08 sneezy kernel: [89798.759576]  [<ffffffff818b8c2d>] 
> > system_call_fastpath+0x16/0x1b
> > Mar  4 21:35:08 sneezy kernel: [89798.759580] 3 locks held by btrsame/23612:
> > Mar  4 21:35:08 sneezy kernel: [89798.759582]  #0:  (sb_writers#8){.+.+.+}, 
> > at: [<ffffffff8123d0a8>] mnt_want_write_file+0x28/0x60
> > Mar  4 21:35:08 sneezy kernel: [89798.759593]  #1:  
> > (&sb->s_type->i_mutex_key#12/1){+.+.+.}, at: [<ffffffff813d7aa5>] 
> > btrfs_ioctl+0x2355/0x2930
> > Mar  4 21:35:08 sneezy kernel: [89798.759604]  #2:  
> > (&sb->s_type->i_mutex_key#12/2){+.+.+.}, at: [<ffffffff813d7bf1>] 
> > btrfs_ioctl+0x24a1/0x2930
> >
> 
> Normal, readpages() is called with the pages locked and then it tries
> to lock the extent range in the inode's io tree, while extent-same
> does the opposite, it locks the range first and then tries to lock the
> pages.

Just FYI, this definitely looks like a deadlock. I'm putting together a
small program to (hopefully) reproduce trivially on my end. Reversing the
order in extent_same should work but I have to read around the code to be sure
it's done correctly.
        --Mark

--
Mark Fasheh
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to