Re: [PATCH] Btrfs: fix log replay failure after unlink and link combination
On Fri, Mar 2, 2018 at 7:00 PM, Liu Bowrote: > On Fri, Mar 02, 2018 at 11:29:33AM -0700, Liu Bo wrote: >> On Wed, Feb 28, 2018 at 03:56:10PM +, fdman...@kernel.org wrote: >> > From: Filipe Manana >> > >> > If we have a file with 2 (or more) hard links in the same directory, >> > remove one of the hard links, create a new file (or link an existing file) >> > in the same directory with the name of the removed hard link, and then >> > finally fsync the new file, we end up with a log that fails to replay, >> > causing a mount failure. >> > >> > Example: >> > >> > $ mkfs.btrfs -f /dev/sdb >> > $ mount /dev/sdb /mnt >> > >> > $ mkdir /mnt/testdir >> > $ touch /mnt/testdir/foo >> > $ ln /mnt/testdir/foo /mnt/testdir/bar >> > >> > $ sync >> > >> > $ unlink /mnt/testdir/bar >> > $ touch /mnt/testdir/bar > > So here, "unlink & touch" similar to rename, therefore we could also > be conservative and set /mnt/testdir 's last_unlink_trans to force to > commit transaction. That would cause any fsync on any file inside the directory to result in a full transaction commit. We had in the past more "exotic" scenarios with renames and creating new files with the same old name where the initial solution was just that, forcing a full commit [1], just to find out later that it caused big performance drop on real workloads [2]. That's why I'm avoiding it, even because fsync'ing any file in the directory (existing or new) isn't uncommon at all. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=56f23fdbb600e6087db7b009775b95ce07cc3195 [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=44f714dae50a2e795d3268a6831762aa6fa54f55 thanks > > Thanks, > > -liubo > >> > $ xfs_io -c "fsync" /mnt/testdir/bar >> > >> > >> > >> > $ mount /dev/sdb /mnt >> > mount: mount(2) failed: /mnt: No such file or directory >> > >> > When replaying the log, for that example, we also see the following in >> > dmesg/syslog: >> > >> > [71813.671307] BTRFS info (device dm-0): failed to delete reference to >> > bar, inode 258 parent 257 >> > [71813.674204] [ cut here ] >> > [71813.675694] BTRFS: Transaction aborted (error -2) >> > [71813.677236] WARNING: CPU: 1 PID: 13231 at fs/btrfs/inode.c:4128 >> > __btrfs_unlink_inode+0x17b/0x355 [btrfs] >> > [71813.679669] Modules linked in: btrfs xfs f2fs dm_flakey dm_mod dax >> > ghash_clmulni_intel ppdev pcbc aesni_intel aes_x86_64 crypto_simd cryptd >> > glue_helper evdev psmouse i2c_piix4 parport_pc i2c_core pcspkr sg >> > serio_raw parport button sunrpc loop autofs4 ext4 crc16 mbcache jbd2 >> > zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov >> > async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c >> > crc32c_generic raid1 raid0 multipath linear md_mod ata_generic sd_mod >> > virtio_scsi ata_piix libata virtio_pci virtio_ring crc32c_intel floppy >> > virtio e1000 scsi_mod [last unloaded: btrfs] >> > [71813.679669] CPU: 1 PID: 13231 Comm: mount Tainted: GW >> > 4.15.0-rc9-btrfs-next-56+ #1 >> > [71813.679669] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), >> > BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014 >> > [71813.679669] RIP: 0010:__btrfs_unlink_inode+0x17b/0x355 [btrfs] >> > [71813.679669] RSP: 0018:c90001cef738 EFLAGS: 00010286 >> > [71813.679669] RAX: 0025 RBX: 880217ce4708 RCX: >> > 0001 >> > [71813.679669] RDX: RSI: 81c14bae RDI: >> > >> > [71813.679669] RBP: c90001cef7c0 R08: 0001 R09: >> > 0001 >> > [71813.679669] R10: c90001cef5e0 R11: 8343f007 R12: >> > 880217d474c8 >> > [71813.679669] R13: fffe R14: 88021ccf1548 R15: >> > 0101 >> > [71813.679669] FS: 7f7cee84c480() GS:88023fc8() >> > knlGS: >> > [71813.679669] CS: 0010 DS: ES: CR0: 80050033 >> > [71813.679669] CR2: 7f7cedc1abf9 CR3: 0002354b4003 CR4: >> > 001606e0 >> > [71813.679669] Call Trace: >> > [71813.679669] btrfs_unlink_inode+0x17/0x41 [btrfs] >> > [71813.679669] drop_one_dir_item+0xfa/0x131 [btrfs] >> > [71813.679669] add_inode_ref+0x71e/0x851 [btrfs] >> > [71813.679669] ? __lock_is_held+0x39/0x71 >> > [71813.679669] ? replay_one_buffer+0x53/0x53a [btrfs] >> > [71813.679669] replay_one_buffer+0x4a4/0x53a [btrfs] >> > [71813.679669] ? rcu_read_unlock+0x3a/0x57 >> > [71813.679669] ? __lock_is_held+0x39/0x71 >> > [71813.679669] walk_up_log_tree+0x101/0x1d2 [btrfs] >> > [71813.679669] walk_log_tree+0xad/0x188 [btrfs] >> > [71813.679669] btrfs_recover_log_trees+0x1fa/0x31e [btrfs] >> > [71813.679669] ? replay_one_extent+0x544/0x544 [btrfs] >> > [71813.679669] open_ctree+0x1cf6/0x2209 [btrfs] >> >
Re: [PATCH] Btrfs: fix log replay failure after unlink and link combination
On Fri, Mar 2, 2018 at 6:29 PM, Liu Bowrote: > On Wed, Feb 28, 2018 at 03:56:10PM +, fdman...@kernel.org wrote: >> From: Filipe Manana >> >> If we have a file with 2 (or more) hard links in the same directory, >> remove one of the hard links, create a new file (or link an existing file) >> in the same directory with the name of the removed hard link, and then >> finally fsync the new file, we end up with a log that fails to replay, >> causing a mount failure. >> >> Example: >> >> $ mkfs.btrfs -f /dev/sdb >> $ mount /dev/sdb /mnt >> >> $ mkdir /mnt/testdir >> $ touch /mnt/testdir/foo >> $ ln /mnt/testdir/foo /mnt/testdir/bar >> >> $ sync >> >> $ unlink /mnt/testdir/bar >> $ touch /mnt/testdir/bar >> $ xfs_io -c "fsync" /mnt/testdir/bar >> >> >> >> $ mount /dev/sdb /mnt >> mount: mount(2) failed: /mnt: No such file or directory >> >> When replaying the log, for that example, we also see the following in >> dmesg/syslog: >> >> [71813.671307] BTRFS info (device dm-0): failed to delete reference to >> bar, inode 258 parent 257 >> [71813.674204] [ cut here ] >> [71813.675694] BTRFS: Transaction aborted (error -2) >> [71813.677236] WARNING: CPU: 1 PID: 13231 at fs/btrfs/inode.c:4128 >> __btrfs_unlink_inode+0x17b/0x355 [btrfs] >> [71813.679669] Modules linked in: btrfs xfs f2fs dm_flakey dm_mod dax >> ghash_clmulni_intel ppdev pcbc aesni_intel aes_x86_64 crypto_simd cryptd >> glue_helper evdev psmouse i2c_piix4 parport_pc i2c_core pcspkr sg serio_raw >> parport button sunrpc loop autofs4 ext4 crc16 mbcache jbd2 zstd_decompress >> zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq >> async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 >> multipath linear md_mod ata_generic sd_mod virtio_scsi ata_piix libata >> virtio_pci virtio_ring crc32c_intel floppy virtio e1000 scsi_mod [last >> unloaded: btrfs] >> [71813.679669] CPU: 1 PID: 13231 Comm: mount Tainted: GW >> 4.15.0-rc9-btrfs-next-56+ #1 >> [71813.679669] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS >> rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014 >> [71813.679669] RIP: 0010:__btrfs_unlink_inode+0x17b/0x355 [btrfs] >> [71813.679669] RSP: 0018:c90001cef738 EFLAGS: 00010286 >> [71813.679669] RAX: 0025 RBX: 880217ce4708 RCX: >> 0001 >> [71813.679669] RDX: RSI: 81c14bae RDI: >> >> [71813.679669] RBP: c90001cef7c0 R08: 0001 R09: >> 0001 >> [71813.679669] R10: c90001cef5e0 R11: 8343f007 R12: >> 880217d474c8 >> [71813.679669] R13: fffe R14: 88021ccf1548 R15: >> 0101 >> [71813.679669] FS: 7f7cee84c480() GS:88023fc8() >> knlGS: >> [71813.679669] CS: 0010 DS: ES: CR0: 80050033 >> [71813.679669] CR2: 7f7cedc1abf9 CR3: 0002354b4003 CR4: >> 001606e0 >> [71813.679669] Call Trace: >> [71813.679669] btrfs_unlink_inode+0x17/0x41 [btrfs] >> [71813.679669] drop_one_dir_item+0xfa/0x131 [btrfs] >> [71813.679669] add_inode_ref+0x71e/0x851 [btrfs] >> [71813.679669] ? __lock_is_held+0x39/0x71 >> [71813.679669] ? replay_one_buffer+0x53/0x53a [btrfs] >> [71813.679669] replay_one_buffer+0x4a4/0x53a [btrfs] >> [71813.679669] ? rcu_read_unlock+0x3a/0x57 >> [71813.679669] ? __lock_is_held+0x39/0x71 >> [71813.679669] walk_up_log_tree+0x101/0x1d2 [btrfs] >> [71813.679669] walk_log_tree+0xad/0x188 [btrfs] >> [71813.679669] btrfs_recover_log_trees+0x1fa/0x31e [btrfs] >> [71813.679669] ? replay_one_extent+0x544/0x544 [btrfs] >> [71813.679669] open_ctree+0x1cf6/0x2209 [btrfs] >> [71813.679669] btrfs_mount_root+0x368/0x482 [btrfs] >> [71813.679669] ? trace_hardirqs_on_caller+0x14c/0x1a6 >> [71813.679669] ? __lockdep_init_map+0x176/0x1c2 >> [71813.679669] ? mount_fs+0x64/0x10b >> [71813.679669] mount_fs+0x64/0x10b >> [71813.679669] vfs_kern_mount+0x68/0xce >> [71813.679669] btrfs_mount+0x13e/0x772 [btrfs] >> [71813.679669] ? trace_hardirqs_on_caller+0x14c/0x1a6 >> [71813.679669] ? __lockdep_init_map+0x176/0x1c2 >> [71813.679669] ? mount_fs+0x64/0x10b >> [71813.679669] mount_fs+0x64/0x10b >> [71813.679669] vfs_kern_mount+0x68/0xce >> [71813.679669] do_mount+0x6e5/0x973 >> [71813.679669] ? memdup_user+0x3e/0x5c >> [71813.679669] SyS_mount+0x72/0x98 >> [71813.679669] entry_SYSCALL_64_fastpath+0x1e/0x8b >> [71813.679669] RIP: 0033:0x7f7cedf150ba >> [71813.679669] RSP: 002b:7ffca71da688 EFLAGS: 0206 >> [71813.679669] Code: 7f a0 e8 51 0c fd ff 48 8b 43 50 f0 0f ba a8 30 2c 00 >> 00 02 72 17 41 83 fd fb 74 11 44 89 ee 48 c7 c7 7d 11 7f a0 e8 38 f5 8d e0 >> <0f> ff 44 89 e9 ba 20 10 00 00 eb 4d 48 8b 4d b0 48 8b 75 88 4c >> [71813.679669] ---[ end trace
Re: [PATCH] Btrfs: fix log replay failure after unlink and link combination
On Fri, Mar 02, 2018 at 11:29:33AM -0700, Liu Bo wrote: > On Wed, Feb 28, 2018 at 03:56:10PM +, fdman...@kernel.org wrote: > > From: Filipe Manana> > > > If we have a file with 2 (or more) hard links in the same directory, > > remove one of the hard links, create a new file (or link an existing file) > > in the same directory with the name of the removed hard link, and then > > finally fsync the new file, we end up with a log that fails to replay, > > causing a mount failure. > > > > Example: > > > > $ mkfs.btrfs -f /dev/sdb > > $ mount /dev/sdb /mnt > > > > $ mkdir /mnt/testdir > > $ touch /mnt/testdir/foo > > $ ln /mnt/testdir/foo /mnt/testdir/bar > > > > $ sync > > > > $ unlink /mnt/testdir/bar > > $ touch /mnt/testdir/bar So here, "unlink & touch" similar to rename, therefore we could also be conservative and set /mnt/testdir 's last_unlink_trans to force to commit transaction. Thanks, -liubo > > $ xfs_io -c "fsync" /mnt/testdir/bar > > > > > > > > $ mount /dev/sdb /mnt > > mount: mount(2) failed: /mnt: No such file or directory > > > > When replaying the log, for that example, we also see the following in > > dmesg/syslog: > > > > [71813.671307] BTRFS info (device dm-0): failed to delete reference to > > bar, inode 258 parent 257 > > [71813.674204] [ cut here ] > > [71813.675694] BTRFS: Transaction aborted (error -2) > > [71813.677236] WARNING: CPU: 1 PID: 13231 at fs/btrfs/inode.c:4128 > > __btrfs_unlink_inode+0x17b/0x355 [btrfs] > > [71813.679669] Modules linked in: btrfs xfs f2fs dm_flakey dm_mod dax > > ghash_clmulni_intel ppdev pcbc aesni_intel aes_x86_64 crypto_simd cryptd > > glue_helper evdev psmouse i2c_piix4 parport_pc i2c_core pcspkr sg serio_raw > > parport button sunrpc loop autofs4 ext4 crc16 mbcache jbd2 zstd_decompress > > zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq > > async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 > > multipath linear md_mod ata_generic sd_mod virtio_scsi ata_piix libata > > virtio_pci virtio_ring crc32c_intel floppy virtio e1000 scsi_mod [last > > unloaded: btrfs] > > [71813.679669] CPU: 1 PID: 13231 Comm: mount Tainted: GW > > 4.15.0-rc9-btrfs-next-56+ #1 > > [71813.679669] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), > > BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014 > > [71813.679669] RIP: 0010:__btrfs_unlink_inode+0x17b/0x355 [btrfs] > > [71813.679669] RSP: 0018:c90001cef738 EFLAGS: 00010286 > > [71813.679669] RAX: 0025 RBX: 880217ce4708 RCX: > > 0001 > > [71813.679669] RDX: RSI: 81c14bae RDI: > > > > [71813.679669] RBP: c90001cef7c0 R08: 0001 R09: > > 0001 > > [71813.679669] R10: c90001cef5e0 R11: 8343f007 R12: > > 880217d474c8 > > [71813.679669] R13: fffe R14: 88021ccf1548 R15: > > 0101 > > [71813.679669] FS: 7f7cee84c480() GS:88023fc8() > > knlGS: > > [71813.679669] CS: 0010 DS: ES: CR0: 80050033 > > [71813.679669] CR2: 7f7cedc1abf9 CR3: 0002354b4003 CR4: > > 001606e0 > > [71813.679669] Call Trace: > > [71813.679669] btrfs_unlink_inode+0x17/0x41 [btrfs] > > [71813.679669] drop_one_dir_item+0xfa/0x131 [btrfs] > > [71813.679669] add_inode_ref+0x71e/0x851 [btrfs] > > [71813.679669] ? __lock_is_held+0x39/0x71 > > [71813.679669] ? replay_one_buffer+0x53/0x53a [btrfs] > > [71813.679669] replay_one_buffer+0x4a4/0x53a [btrfs] > > [71813.679669] ? rcu_read_unlock+0x3a/0x57 > > [71813.679669] ? __lock_is_held+0x39/0x71 > > [71813.679669] walk_up_log_tree+0x101/0x1d2 [btrfs] > > [71813.679669] walk_log_tree+0xad/0x188 [btrfs] > > [71813.679669] btrfs_recover_log_trees+0x1fa/0x31e [btrfs] > > [71813.679669] ? replay_one_extent+0x544/0x544 [btrfs] > > [71813.679669] open_ctree+0x1cf6/0x2209 [btrfs] > > [71813.679669] btrfs_mount_root+0x368/0x482 [btrfs] > > [71813.679669] ? trace_hardirqs_on_caller+0x14c/0x1a6 > > [71813.679669] ? __lockdep_init_map+0x176/0x1c2 > > [71813.679669] ? mount_fs+0x64/0x10b > > [71813.679669] mount_fs+0x64/0x10b > > [71813.679669] vfs_kern_mount+0x68/0xce > > [71813.679669] btrfs_mount+0x13e/0x772 [btrfs] > > [71813.679669] ? trace_hardirqs_on_caller+0x14c/0x1a6 > > [71813.679669] ? __lockdep_init_map+0x176/0x1c2 > > [71813.679669] ? mount_fs+0x64/0x10b > > [71813.679669] mount_fs+0x64/0x10b > > [71813.679669] vfs_kern_mount+0x68/0xce > > [71813.679669] do_mount+0x6e5/0x973 > > [71813.679669] ? memdup_user+0x3e/0x5c > > [71813.679669] SyS_mount+0x72/0x98 > > [71813.679669] entry_SYSCALL_64_fastpath+0x1e/0x8b > > [71813.679669] RIP: 0033:0x7f7cedf150ba > > [71813.679669] RSP: 002b:7ffca71da688 EFLAGS:
Re: [PATCH] Btrfs: fix log replay failure after unlink and link combination
On Wed, Feb 28, 2018 at 03:56:10PM +, fdman...@kernel.org wrote: > From: Filipe Manana> > If we have a file with 2 (or more) hard links in the same directory, > remove one of the hard links, create a new file (or link an existing file) > in the same directory with the name of the removed hard link, and then > finally fsync the new file, we end up with a log that fails to replay, > causing a mount failure. > > Example: > > $ mkfs.btrfs -f /dev/sdb > $ mount /dev/sdb /mnt > > $ mkdir /mnt/testdir > $ touch /mnt/testdir/foo > $ ln /mnt/testdir/foo /mnt/testdir/bar > > $ sync > > $ unlink /mnt/testdir/bar > $ touch /mnt/testdir/bar > $ xfs_io -c "fsync" /mnt/testdir/bar > > > > $ mount /dev/sdb /mnt > mount: mount(2) failed: /mnt: No such file or directory > > When replaying the log, for that example, we also see the following in > dmesg/syslog: > > [71813.671307] BTRFS info (device dm-0): failed to delete reference to bar, > inode 258 parent 257 > [71813.674204] [ cut here ] > [71813.675694] BTRFS: Transaction aborted (error -2) > [71813.677236] WARNING: CPU: 1 PID: 13231 at fs/btrfs/inode.c:4128 > __btrfs_unlink_inode+0x17b/0x355 [btrfs] > [71813.679669] Modules linked in: btrfs xfs f2fs dm_flakey dm_mod dax > ghash_clmulni_intel ppdev pcbc aesni_intel aes_x86_64 crypto_simd cryptd > glue_helper evdev psmouse i2c_piix4 parport_pc i2c_core pcspkr sg serio_raw > parport button sunrpc loop autofs4 ext4 crc16 mbcache jbd2 zstd_decompress > zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq > async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 > multipath linear md_mod ata_generic sd_mod virtio_scsi ata_piix libata > virtio_pci virtio_ring crc32c_intel floppy virtio e1000 scsi_mod [last > unloaded: btrfs] > [71813.679669] CPU: 1 PID: 13231 Comm: mount Tainted: GW > 4.15.0-rc9-btrfs-next-56+ #1 > [71813.679669] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014 > [71813.679669] RIP: 0010:__btrfs_unlink_inode+0x17b/0x355 [btrfs] > [71813.679669] RSP: 0018:c90001cef738 EFLAGS: 00010286 > [71813.679669] RAX: 0025 RBX: 880217ce4708 RCX: > 0001 > [71813.679669] RDX: RSI: 81c14bae RDI: > > [71813.679669] RBP: c90001cef7c0 R08: 0001 R09: > 0001 > [71813.679669] R10: c90001cef5e0 R11: 8343f007 R12: > 880217d474c8 > [71813.679669] R13: fffe R14: 88021ccf1548 R15: > 0101 > [71813.679669] FS: 7f7cee84c480() GS:88023fc8() > knlGS: > [71813.679669] CS: 0010 DS: ES: CR0: 80050033 > [71813.679669] CR2: 7f7cedc1abf9 CR3: 0002354b4003 CR4: > 001606e0 > [71813.679669] Call Trace: > [71813.679669] btrfs_unlink_inode+0x17/0x41 [btrfs] > [71813.679669] drop_one_dir_item+0xfa/0x131 [btrfs] > [71813.679669] add_inode_ref+0x71e/0x851 [btrfs] > [71813.679669] ? __lock_is_held+0x39/0x71 > [71813.679669] ? replay_one_buffer+0x53/0x53a [btrfs] > [71813.679669] replay_one_buffer+0x4a4/0x53a [btrfs] > [71813.679669] ? rcu_read_unlock+0x3a/0x57 > [71813.679669] ? __lock_is_held+0x39/0x71 > [71813.679669] walk_up_log_tree+0x101/0x1d2 [btrfs] > [71813.679669] walk_log_tree+0xad/0x188 [btrfs] > [71813.679669] btrfs_recover_log_trees+0x1fa/0x31e [btrfs] > [71813.679669] ? replay_one_extent+0x544/0x544 [btrfs] > [71813.679669] open_ctree+0x1cf6/0x2209 [btrfs] > [71813.679669] btrfs_mount_root+0x368/0x482 [btrfs] > [71813.679669] ? trace_hardirqs_on_caller+0x14c/0x1a6 > [71813.679669] ? __lockdep_init_map+0x176/0x1c2 > [71813.679669] ? mount_fs+0x64/0x10b > [71813.679669] mount_fs+0x64/0x10b > [71813.679669] vfs_kern_mount+0x68/0xce > [71813.679669] btrfs_mount+0x13e/0x772 [btrfs] > [71813.679669] ? trace_hardirqs_on_caller+0x14c/0x1a6 > [71813.679669] ? __lockdep_init_map+0x176/0x1c2 > [71813.679669] ? mount_fs+0x64/0x10b > [71813.679669] mount_fs+0x64/0x10b > [71813.679669] vfs_kern_mount+0x68/0xce > [71813.679669] do_mount+0x6e5/0x973 > [71813.679669] ? memdup_user+0x3e/0x5c > [71813.679669] SyS_mount+0x72/0x98 > [71813.679669] entry_SYSCALL_64_fastpath+0x1e/0x8b > [71813.679669] RIP: 0033:0x7f7cedf150ba > [71813.679669] RSP: 002b:7ffca71da688 EFLAGS: 0206 > [71813.679669] Code: 7f a0 e8 51 0c fd ff 48 8b 43 50 f0 0f ba a8 30 2c 00 > 00 02 72 17 41 83 fd fb 74 11 44 89 ee 48 c7 c7 7d 11 7f a0 e8 38 f5 8d e0 > <0f> ff 44 89 e9 ba 20 10 00 00 eb 4d 48 8b 4d b0 48 8b 75 88 4c > [71813.679669] ---[ end trace 83bd473fc5b4663b ]--- > [71813.854764] BTRFS: error (device dm-0) in __btrfs_unlink_inode:4128: > errno=-2 No such entry > [71813.886994] BTRFS: error (device
[PATCH] Btrfs: fix log replay failure after unlink and link combination
From: Filipe MananaIf we have a file with 2 (or more) hard links in the same directory, remove one of the hard links, create a new file (or link an existing file) in the same directory with the name of the removed hard link, and then finally fsync the new file, we end up with a log that fails to replay, causing a mount failure. Example: $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt $ mkdir /mnt/testdir $ touch /mnt/testdir/foo $ ln /mnt/testdir/foo /mnt/testdir/bar $ sync $ unlink /mnt/testdir/bar $ touch /mnt/testdir/bar $ xfs_io -c "fsync" /mnt/testdir/bar $ mount /dev/sdb /mnt mount: mount(2) failed: /mnt: No such file or directory When replaying the log, for that example, we also see the following in dmesg/syslog: [71813.671307] BTRFS info (device dm-0): failed to delete reference to bar, inode 258 parent 257 [71813.674204] [ cut here ] [71813.675694] BTRFS: Transaction aborted (error -2) [71813.677236] WARNING: CPU: 1 PID: 13231 at fs/btrfs/inode.c:4128 __btrfs_unlink_inode+0x17b/0x355 [btrfs] [71813.679669] Modules linked in: btrfs xfs f2fs dm_flakey dm_mod dax ghash_clmulni_intel ppdev pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper evdev psmouse i2c_piix4 parport_pc i2c_core pcspkr sg serio_raw parport button sunrpc loop autofs4 ext4 crc16 mbcache jbd2 zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod ata_generic sd_mod virtio_scsi ata_piix libata virtio_pci virtio_ring crc32c_intel floppy virtio e1000 scsi_mod [last unloaded: btrfs] [71813.679669] CPU: 1 PID: 13231 Comm: mount Tainted: GW 4.15.0-rc9-btrfs-next-56+ #1 [71813.679669] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014 [71813.679669] RIP: 0010:__btrfs_unlink_inode+0x17b/0x355 [btrfs] [71813.679669] RSP: 0018:c90001cef738 EFLAGS: 00010286 [71813.679669] RAX: 0025 RBX: 880217ce4708 RCX: 0001 [71813.679669] RDX: RSI: 81c14bae RDI: [71813.679669] RBP: c90001cef7c0 R08: 0001 R09: 0001 [71813.679669] R10: c90001cef5e0 R11: 8343f007 R12: 880217d474c8 [71813.679669] R13: fffe R14: 88021ccf1548 R15: 0101 [71813.679669] FS: 7f7cee84c480() GS:88023fc8() knlGS: [71813.679669] CS: 0010 DS: ES: CR0: 80050033 [71813.679669] CR2: 7f7cedc1abf9 CR3: 0002354b4003 CR4: 001606e0 [71813.679669] Call Trace: [71813.679669] btrfs_unlink_inode+0x17/0x41 [btrfs] [71813.679669] drop_one_dir_item+0xfa/0x131 [btrfs] [71813.679669] add_inode_ref+0x71e/0x851 [btrfs] [71813.679669] ? __lock_is_held+0x39/0x71 [71813.679669] ? replay_one_buffer+0x53/0x53a [btrfs] [71813.679669] replay_one_buffer+0x4a4/0x53a [btrfs] [71813.679669] ? rcu_read_unlock+0x3a/0x57 [71813.679669] ? __lock_is_held+0x39/0x71 [71813.679669] walk_up_log_tree+0x101/0x1d2 [btrfs] [71813.679669] walk_log_tree+0xad/0x188 [btrfs] [71813.679669] btrfs_recover_log_trees+0x1fa/0x31e [btrfs] [71813.679669] ? replay_one_extent+0x544/0x544 [btrfs] [71813.679669] open_ctree+0x1cf6/0x2209 [btrfs] [71813.679669] btrfs_mount_root+0x368/0x482 [btrfs] [71813.679669] ? trace_hardirqs_on_caller+0x14c/0x1a6 [71813.679669] ? __lockdep_init_map+0x176/0x1c2 [71813.679669] ? mount_fs+0x64/0x10b [71813.679669] mount_fs+0x64/0x10b [71813.679669] vfs_kern_mount+0x68/0xce [71813.679669] btrfs_mount+0x13e/0x772 [btrfs] [71813.679669] ? trace_hardirqs_on_caller+0x14c/0x1a6 [71813.679669] ? __lockdep_init_map+0x176/0x1c2 [71813.679669] ? mount_fs+0x64/0x10b [71813.679669] mount_fs+0x64/0x10b [71813.679669] vfs_kern_mount+0x68/0xce [71813.679669] do_mount+0x6e5/0x973 [71813.679669] ? memdup_user+0x3e/0x5c [71813.679669] SyS_mount+0x72/0x98 [71813.679669] entry_SYSCALL_64_fastpath+0x1e/0x8b [71813.679669] RIP: 0033:0x7f7cedf150ba [71813.679669] RSP: 002b:7ffca71da688 EFLAGS: 0206 [71813.679669] Code: 7f a0 e8 51 0c fd ff 48 8b 43 50 f0 0f ba a8 30 2c 00 00 02 72 17 41 83 fd fb 74 11 44 89 ee 48 c7 c7 7d 11 7f a0 e8 38 f5 8d e0 <0f> ff 44 89 e9 ba 20 10 00 00 eb 4d 48 8b 4d b0 48 8b 75 88 4c [71813.679669] ---[ end trace 83bd473fc5b4663b ]--- [71813.854764] BTRFS: error (device dm-0) in __btrfs_unlink_inode:4128: errno=-2 No such entry [71813.886994] BTRFS: error (device dm-0) in btrfs_replay_log:2307: errno=-2 No such entry (Failed to recover log tree) [71813.903357] BTRFS error (device dm-0): cleaner transaction attach returned -30 [71814.128078] BTRFS error (device dm-0): open_ctree failed This happens because the log has inode reference