[PATCH 3/6] Btrfs: fix freeing used extents after removing empty block group

2014-11-26 Thread Filipe Manana
There's a race between adding a block group to the list of unused
block groups and removing an unused block group (cleaner kthread) that
leads to freeing extents that are in use or to a crash during transaction
commit. Basically the cleaner kthread, when executing
btrfs_delete_unused_bgs(), might pick up the block group newly added to
the list fs_info->unused_bgs and clear the range representing the whole
group from fs_info->freed_extents[] before the task that added the block
group to the list (running update_block_group()) has marked the last freed
extent as dirty in fs_info->freed_extents[] (pinned_extents).

That is:

        CPU 1                                  CPU 2

                                     btrfs_delete_unused_bgs()
update_block_group()
   add block group to
   fs_info->unused_bgs
                                       got block group from the list
                                       clear_extent_bits for the whole
                                       block group range in freed_extents[]
   set_extent_dirty for the
   range covering the freed
   extent in freed_extents[]
   (fs_info->pinned_extents)

                                     block group deleted, and a new block
                                     group with the same logical address is
                                     created

                                     reserve space from the new block group
                                     for new data or metadata - the reserved
                                     space overlaps the range specified by
                                     CPU 1 for set_extent_dirty()

                                     commit transaction
                                       find all ranges marked as dirty in
                                       fs_info->pinned_extents, clear them
                                       and add them to the free space cache
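
The interleaving in the diagram can be replayed deterministically in a
small userspace model. This is only a sketch under stated assumptions: the
bitmaps, the 8-block group size, the block offsets and every name below
(struct model, set_range(), replay_race()) are hypothetical stand-ins
chosen for illustration and are not the btrfs data structures or API.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model: one block group covering NBLOCKS logical blocks. */
#define NBLOCKS 8

struct model {
	bool pinned[NBLOCKS];  /* stands in for dirty bits in pinned_extents */
	bool in_use[NBLOCKS];  /* blocks allocated to data or metadata */
	bool free_sp[NBLOCKS]; /* stands in for the free space cache */
};

static void set_range(bool *map, int start, int len, bool val)
{
	for (int i = start; i < start + len; i++)
		map[i] = val;
}

/*
 * Replays the steps of the diagram in order and returns the number of
 * blocks that end up both in use and listed as free space.
 */
static int replay_race(void)
{
	struct model m;
	int corrupted = 0;

	memset(&m, 0, sizeof(m));

	/* CPU 1: last extent [2, 4) freed, group added to the unused list. */
	/* CPU 2: cleaner got the group, clears the whole range. */
	set_range(m.pinned, 0, NBLOCKS, false);
	/* CPU 1: only now marks the freed extent as dirty (pinned). */
	set_range(m.pinned, 2, 2, true);
	/* CPU 2: group deleted, new group created at the same address. */
	set_range(m.free_sp, 0, NBLOCKS, true);
	/* CPU 2: reserves [1, 5) from the new group; overlaps [2, 4). */
	set_range(m.in_use, 1, 4, true);
	set_range(m.free_sp, 1, 4, false);
	/* Commit: every pinned block is cleared and returned as free space. */
	for (int i = 0; i < NBLOCKS; i++) {
		if (m.pinned[i]) {
			m.pinned[i] = false;
			m.free_sp[i] = true;
		}
	}
	for (int i = 0; i < NBLOCKS; i++) {
		if (m.in_use[i] && m.free_sp[i])
			corrupted++;
	}
	return corrupted;
}
```

In this model replay_race() reports two blocks that are allocated and at
the same time advertised as free, which corresponds to the "freeing
extents that are in use" outcome described above.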

Alternatively, if CPU 2 doesn't create a new block group with the same
logical address, we get a crash/BUG_ON at transaction commit when unpinning
extent ranges, because we can't find a block group for the range marked as
dirty by CPU 1. Sample trace:

[ 2163.426462] invalid opcode:  [#1] SMP DEBUG_PAGEALLOC
[ 2163.426640] Modules linked in: btrfs xor raid6_pq dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio crc32c_generic libcrc32c dm_mod nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc loop psmouse parport_pc parport i2c_piix4 processor thermal_sys i2ccore evdev button pcspkr microcode serio_raw ext4 crc16 jbd2 mbcache sg sr_mod cdrom sd_mod crc_t10dif crct10dif_generic crct10dif_common ata_generic virtio_scsi floppy ata_piix libata e1000 scsi_mod virtio_pci virtio_ring virtio
[ 2163.428209] CPU: 0 PID: 11858 Comm: btrfs-transacti Tainted: GW  3.17.0-rc5-btrfs-next-1+ #1
[ 2163.428519] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[ 2163.428875] task: 88009f2c0650 ti: 8801356bc000 task.ti: 8801356bc000
[ 2163.429157] RIP: 0010:[a037728e]  [a037728e] unpin_extent_range.isra.58+0x62/0x192 [btrfs]
[ 2163.429562] RSP: 0018:8801356bfda8  EFLAGS: 00010246
[ 2163.429802] RAX:  RBX:  RCX: 
[ 2163.429990] RDX: 41bf RSI: 01c0 RDI: 880024307080
[ 2163.430042] RBP: 8801356bfde8 R08: 0068 R09: 88003734f118
[ 2163.430042] R10: 8801356bfcb8 R11: fb69 R12: 8800243070d0
[ 2163.430042] R13: 83c04000 R14: 8800751b0f00 R15: 880024307000
[ 2163.430042] FS:  () GS:88013f40() knlGS:
[ 2163.430042] CS:  0010 DS:  ES:  CR0: 8005003b
[ 2163.430042] CR2: 7ff10eb43fc0 CR3: 04cb8000 CR4: 06f0
[ 2163.430042] Stack:
[ 2163.430042]  8800243070d0 83c08000 83c07fff 88012d6bc800
[ 2163.430042]  8800243070d0 8800751b0f18 8800751b0f00
[ 2163.430042]  8801356bfe18 a037a481 83c04000 83c07fff
[ 2163.430042] Call Trace:
[ 2163.430042]  [a037a481] btrfs_finish_extent_commit+0xac/0xbf [btrfs]
[ 2163.430042]  [a038c06d] btrfs_commit_transaction+0x6ee/0x882 [btrfs]
[ 2163.430042]  [a03881f1] transaction_kthread+0xf2/0x1a4 [btrfs]
[ 2163.430042]  [a03880ff] ? btrfs_cleanup_transaction+0x3d8/0x3d8 [btrfs]
[ 2163.430042]  [8105966b] kthread+0xb7/0xbf
[ 2163.430042]  [810595b4] ? __kthread_parkme+0x67/0x67
[ 2163.430042]  [813ebeac] ret_from_fork+0x7c/0xb0
[ 2163.430042]  [810595b4] ? __kthread_parkme+0x67/0x67

So fix this by making update_block_group() first set the range as dirty
in pinned_extents before adding the block group to the unused_bgs list.
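
Why the ordering matters can be illustrated with a toy model in which the
cleaner only acts on a block group once it is visible on the unused list.
This is a sketch, not the patch itself: struct bg_model, cleaner_pass(),
pin_freed_extent() and stale_pins_after() are hypothetical names, and the
model assumes the cleaner may run at any point between the two steps of
the updating task.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NBLOCKS 8 /* hypothetical block group size, in blocks */

struct bg_model {
	bool pinned[NBLOCKS]; /* stands in for dirty bits in pinned_extents */
	bool on_unused_list;  /* visible to the cleaner only when true */
	bool removed;         /* set once the cleaner has deleted the group */
};

/* Updating task: mark the freed extent as dirty (pinned). */
static void pin_freed_extent(struct bg_model *bg, int start, int len)
{
	for (int i = start; i < start + len; i++)
		bg->pinned[i] = true;
}

/* Cleaner: acts only on a group it can find on the unused list. */
static void cleaner_pass(struct bg_model *bg)
{
	if (!bg->on_unused_list)
		return;
	memset(bg->pinned, 0, sizeof(bg->pinned)); /* clear whole range */
	bg->on_unused_list = false;
	bg->removed = true;
}

/*
 * Runs the updating task with the cleaner squeezed in between its two
 * steps, then lets the cleaner run once more. Returns how many pinned
 * blocks still refer to the removed block group.
 */
static int stale_pins_after(bool pin_first)
{
	struct bg_model bg;
	int stale = 0;

	memset(&bg, 0, sizeof(bg));

	if (pin_first) {
		pin_freed_extent(&bg, 2, 2); /* order described by the fix */
		cleaner_pass(&bg);           /* group not listed yet: no-op */
		bg.on_unused_list = true;
	} else {
		bg.on_unused_list = true;    /* racy order */
		cleaner_pass(&bg);           /* clears range, removes group */
		pin_freed_extent(&bg, 2, 2); /* pin lands after the clear */
	}
	cleaner_pass(&bg);

	assert(bg.removed);
	for (int i = 0; i < NBLOCKS; i++) {
		if (bg.pinned[i])
			stale++;
	}
	return stale;
}
```

In this model stale_pins_after(true) leaves no pinned block behind, while
stale_pins_after(false) leaves two, matching the stale range that the
transaction commit later trips over.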

Signed-off-by: Filipe Manana fdman...@suse.com
---
 fs/btrfs/extent-tree.c | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)


Re: [PATCH 3/6] Btrfs: fix freeing used extents after removing empty block group

2014-11-26 Thread Josef Bacik

On 11/26/2014 10:28 AM, Filipe Manana wrote:
