[PATCH 3/6] Btrfs: fix freeing used extents after removing empty block group
There's a race between adding a block group to the list of unused block
groups and removing an unused block group (cleaner kthread) that leads to
freeing extents that are in use, or to a crash during transaction commit.
Basically the cleaner kthread, when executing btrfs_delete_unused_bgs(),
might catch the block group newly added to the list fs_info->unused_bgs
and clear the range representing the whole group from
fs_info->freed_extents[] before the task that added the block group to
the list (running update_block_group()) has marked the last freed extent
as dirty in fs_info->freed_extents[] (pinned_extents).

That is:

     CPU 1                                CPU 2

                                  btrfs_delete_unused_bgs()
update_block_group()
   add block group to
   fs_info->unused_bgs
                                    got block group from the list
                                    clear_extent_bits for the whole
                                    block group range in freed_extents[]
   set_extent_dirty for the
   range covering the freed
   extent in freed_extents[]
   (fs_info->pinned_extents)

                                  block group deleted, and a new block
                                  group with the same logical address is
                                  created

                                  reserve space from the new block group
                                  for new data or metadata - the reserved
                                  space overlaps the range specified by
                                  CPU 1 for set_extent_dirty()

                                  commit transaction
                                    find all ranges marked as dirty in
                                    fs_info->pinned_extents, clear them
                                    and add them to the free space cache

Alternatively, if CPU 2 doesn't create a new block group with the same
logical address, we get a crash/BUG_ON at transaction commit when
unpinning extent ranges, because we can't find a block group for the
range marked as dirty by CPU 1.
Sample trace:

[ 2163.426462] invalid opcode: [#1] SMP DEBUG_PAGEALLOC
[ 2163.426640] Modules linked in: btrfs xor raid6_pq dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio crc32c_generic libcrc32c dm_mod nfsd auth_rpc gss oid_registry nfs_acl nfs lockd fscache sunrpc loop psmouse parport_pc parport i2c_piix4 processor thermal_sys i2ccore evdev button pcspkr microcode serio_raw ext4 crc16 jbd2 mbcache sg sr_mod cdrom sd_mod crc_t10dif crct10dif_generic crct10dif_common ata_generic virtio_scsi floppy ata_piix libata e1000 scsi_mod virtio_pci virtio_ring virtio
[ 2163.428209] CPU: 0 PID: 11858 Comm: btrfs-transacti Tainted: GW 3.17.0-rc5-btrfs-next-1+ #1
[ 2163.428519] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[ 2163.428875] task: 88009f2c0650 ti: 8801356bc000 task.ti: 8801356bc000
[ 2163.429157] RIP: 0010:[a037728e] [a037728e] unpin_extent_range.isra.58+0x62/0x192 [btrfs]
[ 2163.429562] RSP: 0018:8801356bfda8 EFLAGS: 00010246
[ 2163.429802] RAX: RBX: RCX:
[ 2163.429990] RDX: 41bf RSI: 01c0 RDI: 880024307080
[ 2163.430042] RBP: 8801356bfde8 R08: 0068 R09: 88003734f118
[ 2163.430042] R10: 8801356bfcb8 R11: fb69 R12: 8800243070d0
[ 2163.430042] R13: 83c04000 R14: 8800751b0f00 R15: 880024307000
[ 2163.430042] FS: () GS:88013f40() knlGS:
[ 2163.430042] CS: 0010 DS: ES: CR0: 8005003b
[ 2163.430042] CR2: 7ff10eb43fc0 CR3: 04cb8000 CR4: 06f0
[ 2163.430042] Stack:
[ 2163.430042] 8800243070d0 83c08000 83c07fff 88012d6bc800
[ 2163.430042] 8800243070d0 8800751b0f18 8800751b0f00
[ 2163.430042] 8801356bfe18 a037a481 83c04000 83c07fff
[ 2163.430042] Call Trace:
[ 2163.430042] [a037a481] btrfs_finish_extent_commit+0xac/0xbf [btrfs]
[ 2163.430042] [a038c06d] btrfs_commit_transaction+0x6ee/0x882 [btrfs]
[ 2163.430042] [a03881f1] transaction_kthread+0xf2/0x1a4 [btrfs]
[ 2163.430042] [a03880ff] ? btrfs_cleanup_transaction+0x3d8/0x3d8 [btrfs]
[ 2163.430042] [8105966b] kthread+0xb7/0xbf
[ 2163.430042] [810595b4] ? __kthread_parkme+0x67/0x67
[ 2163.430042] [813ebeac] ret_from_fork+0x7c/0xb0
[ 2163.430042] [810595b4] ? __kthread_parkme+0x67/0x67

So fix this by making update_block_group() first set the range as dirty
in pinned_extents before adding the block group to the unused_bgs list.

Signed-off-by: Filipe Manana fdman...@suse.com
---
 fs/btrfs/extent-tree.c | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)
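
For readers who want to see why the ordering matters, here is a rough
userspace sketch of the interleaving described in the changelog. It is
not the kernel code and not the diff from this patch. Every name in it
(toy_fs, toy_pin_range, toy_cleaner_run, toy_stale_pins, toy_replay) is
made up for illustration. One block group is modelled as 16 blocks, a
bool array stands in for the dirty bits in fs_info->pinned_extents, and
locking and transactions are ignored entirely. The cleaner is replayed
at a chosen point relative to the two steps of the updater.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model, not kernel code: one block group of 16 blocks. */
#define TOY_BG_BLOCKS 16

struct toy_fs {
	bool pinned[TOY_BG_BLOCKS]; /* stand-in for fs_info->pinned_extents */
	bool on_unused_list;        /* stand-in for being on fs_info->unused_bgs */
	bool bg_deleted;            /* the cleaner removed the block group */
};

/* Updater step: mark the freed extent as dirty (pinned). */
static void toy_pin_range(struct toy_fs *fs, size_t start, size_t len)
{
	assert(start + len <= TOY_BG_BLOCKS);
	for (size_t i = start; i < start + len; i++)
		fs->pinned[i] = true;
}

/*
 * Cleaner step: if the group is on the unused list, clear the pinned
 * bits for the whole group and delete it. Otherwise do nothing.
 */
static void toy_cleaner_run(struct toy_fs *fs)
{
	if (!fs->on_unused_list)
		return;
	memset(fs->pinned, 0, sizeof(fs->pinned));
	fs->on_unused_list = false;
	fs->bg_deleted = true;
}

/* Count pinned blocks that refer to a block group that no longer exists. */
static size_t toy_stale_pins(const struct toy_fs *fs)
{
	size_t n = 0;

	if (!fs->bg_deleted)
		return 0;
	for (size_t i = 0; i < TOY_BG_BLOCKS; i++)
		n += fs->pinned[i] ? 1 : 0;
	return n;
}

/*
 * Replay the updater's two steps in the given order, running the cleaner
 * before step 0 (slot 0), between the steps (slot 1) or after both
 * (slot 2). Returns the number of stale pinned blocks left behind.
 */
static size_t toy_replay(bool pin_before_listing, int cleaner_slot,
			 size_t start, size_t len)
{
	struct toy_fs fs;

	memset(&fs, 0, sizeof(fs));
	for (int step = 0; step <= 2; step++) {
		if (step == cleaner_slot)
			toy_cleaner_run(&fs);
		if (step == 2)
			break;
		/* Old order pins in step 1, the proposed order in step 0. */
		bool do_pin = pin_before_listing ? (step == 0) : (step == 1);
		if (do_pin)
			toy_pin_range(&fs, start, len);
		else
			fs.on_unused_list = true;
	}
	return toy_stale_pins(&fs);
}
```

In this model, toy_replay(false, 1, 12, 4) returns 4: with the old order
the cleaner runs between the two steps, deletes the group, and the four
blocks pinned afterwards point at a group that is gone. With
pin_before_listing set to true, all three cleaner slots return 0, because
the cleaner cannot see the group until its last freed extent is already
pinned.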
Re: [PATCH 3/6] Btrfs: fix freeing used extents after removing empty block group
On 11/26/2014 10:28 AM, Filipe Manana wrote:
[...]