Hello Folks,

I used to have an array of 4x4TB drives with BTRFS in raid10.
The kernel version is: 3.13-0.bpo.1-amd64
BTRFS version is: v3.14.1

When the array reached 80% full, I added another 4TB drive to it with:

> btrfs device add /dev/sdf /mnt/backup

Then I started a balance to spread the data onto the new drive:

> btrfs filesystem balance /mnt/backup

This ran for 5-6 hours before it segfaulted with a "not enough free space"
message.
Now my configuration looks like this:

> btrfs fi show /mnt/backup
Label: 'backup'  uuid: ...
        Total devices 5 FS bytes used 5.93TiB
        devid    1 size 3.64TiB used 2.82TiB path /dev/sdd
        devid    2 size 3.64TiB used 2.82TiB path /dev/sdc
        devid    3 size 3.64TiB used 2.81TiB path /dev/sdb
        devid    4 size 3.64TiB used 2.82TiB path /dev/sde
        devid    5 size 3.64TiB used 638.50GiB path /dev/sdf
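
As a rough cross-check of the numbers above, here is a minimal Python sketch that parses the devid lines and works out how much of each drive is still unallocated. It assumes that the "used" column means space allocated to chunks on that device (not file data), and the helper names (`to_tib`, `parse_devices`) are made up for this illustration.

```python
import re

# Device lines copied from the `btrfs fi show` output above.
FI_SHOW = """\
devid    1 size 3.64TiB used 2.82TiB path /dev/sdd
devid    2 size 3.64TiB used 2.82TiB path /dev/sdc
devid    3 size 3.64TiB used 2.81TiB path /dev/sdb
devid    4 size 3.64TiB used 2.82TiB path /dev/sde
devid    5 size 3.64TiB used 638.50GiB path /dev/sdf
"""

# Only the two units that appear in the output above are handled.
UNIT_IN_TIB = {"GiB": 1 / 1024, "TiB": 1.0}


def to_tib(text):
    """Convert a string such as '638.50GiB' or '3.64TiB' to TiB."""
    value, unit = re.fullmatch(r"([\d.]+)(GiB|TiB)", text).groups()
    return float(value) * UNIT_IN_TIB[unit]


def parse_devices(output):
    """Return {path: (size_tib, allocated_tib)} for each devid line."""
    pattern = re.compile(
        r"devid\s+\d+\s+size\s+(\S+)\s+used\s+(\S+)\s+path\s+(\S+)"
    )
    return {
        path: (to_tib(size), to_tib(used))
        for size, used, path in pattern.findall(output)
    }


devices = parse_devices(FI_SHOW)
for path, (size, used) in devices.items():
    print(f"{path}: allocated {used:.2f} TiB, unallocated {size - used:.2f} TiB")

# Assumption: a finished balance would spread the allocation roughly evenly.
total_allocated = sum(used for _, used in devices.values())
even_share = total_allocated / len(devices)
print(f"total allocated: {total_allocated:.2f} TiB")
print(f"even share over {len(devices)} devices: {even_share:.2f} TiB")
```

By this reading, every one of the original four drives still has unallocated space, so the "No space left" error does not look like a simple case of the drives being completely full.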

After this crash happened during the balance (logs are attached at the end),
the system remounted my /mnt/backup share read-only.
At this point I started to really worry. I unmounted and remounted it manually.
At first it ran some self-checks, which took about 5 minutes; then, as iotop
showed, it resumed the balance, which failed again in the same way. The next
time, I paused the balance immediately after mounting, which helped.

My question is: where do I go from here? What I am going to do right now is
copy the most important data to a separate XFS drive.
What I am planning to do is:

1, Upgrade the kernel
2, Upgrade BTRFS
3, Continue the balancing.


Could someone please also explain exactly how a raid10 setup works with an ODD
number of drives in btrfs?
Raid10 should be a stripe of mirrors. So is this sdf drive mirrored, striped,
or what?
Could some btrfs gurus tell me whether I should be worried about data loss
because of this?

Would I need even more free space just to add a 5th drive? If so, how much more?

Kernel logs
-----------


Oct 24 17:25:44 backup kernel: [29396.873750] btrfs: relocating block group 
5162588438528 flags 65
Oct 24 17:26:09 backup kernel: [29421.594524] btrfs: found 13126 extents
Oct 24 17:26:38 backup kernel: [29450.769228] btrfs: found 13126 extents
Oct 24 17:26:39 backup kernel: [29451.345198] btrfs: relocating block group 
5161514696704 flags 68
Oct 24 17:31:33 backup kernel: [29745.776810] BTRFS debug (device sdb): 
run_one_delayed_ref returned -28
Oct 24 17:31:33 backup kernel: [29745.776818] ------------[ cut here 
]------------
Oct 24 17:31:33 backup kernel: [29745.776847] WARNING: CPU: 1 PID: 1807 at 
/build/linux-t5aGFh/linux-3.13.10/fs/btrfs/super.c:254 
__btrfs_abort_transaction+0x5a/0x140 [btrfs]()
Oct 24 17:31:33 backup kernel: [29745.776849] btrfs: Transaction aborted (error 
-28)
Oct 24 17:31:33 backup kernel: [29745.776851] Modules linked in: xen_gntdev 
xen_evtchn xenfs xen_privcmd nfsd auth_rpcgss oid_registry nfs_acl nfs lockd 
fscache sunrpc 8021q garp mrp bridge stp llc loop iTCO_wdt iTCO_vendor_support 
lpc_ich radeon mfd_core processor evdev ttm drm_kms_helper drm i2c_algo_bit 
coretemp rng_core serio_raw pcspkr i2c_i801 i2c_core i3000_edac thermal_sys 
button shpchp edac_core ext4 crc16 mbcache jbd2 btrfs xor raid6_pq crc32c 
libcrc32c dm_mod xen_pciback sg sd_mod sr_mod crc_t10dif cdrom crct10dif_common 
ata_generic ahci ata_piix libahci 3w_9xxx libata scsi_mod ehci_pci uhci_hcd 
ehci_hcd e1000e ptp pps_core usbcore usb_common
Oct 24 17:31:33 backup kernel: [29745.776902] CPU: 1 PID: 1807 Comm: 
btrfs-transacti Not tainted 3.13-0.bpo.1-amd64 #1 Debian 3.13.10-1~bpo70+1
Oct 24 17:31:33 backup kernel: [29745.776905] Hardware name: Supermicro 
PDSM4+/PDSM4+, BIOS 6.00 02/05/2007
Oct 24 17:31:33 backup kernel: [29745.776907]  0000000000000000 
ffffffffa0257130 ffffffff814d16c9 ffff88006a7f3cc8
Oct 24 17:31:33 backup kernel: [29745.776911]  ffffffff81060967 
00000000ffffffe4 ffff880004282800 ffff88003b813ec0
Oct 24 17:31:33 backup kernel: [29745.776914]  0000000000000aaa 
ffffffffa0253b60 ffffffff81060a55 ffffffffa0257260
Oct 24 17:31:33 backup kernel: [29745.776918] Call Trace:
Oct 24 17:31:33 backup kernel: [29745.776926]  [<ffffffff814d16c9>] ? 
dump_stack+0x41/0x51
Oct 24 17:31:33 backup kernel: [29745.776931]  [<ffffffff81060967>] ? 
warn_slowpath_common+0x87/0xc0
Oct 24 17:31:33 backup kernel: [29745.776935]  [<ffffffff81060a55>] ? 
warn_slowpath_fmt+0x45/0x50
Oct 24 17:31:33 backup kernel: [29745.776946]  [<ffffffffa01b73ca>] ? 
__btrfs_abort_transaction+0x5a/0x140 [btrfs]
Oct 24 17:31:33 backup kernel: [29745.776959]  [<ffffffffa01d2e72>] ? 
btrfs_run_delayed_refs+0x372/0x530 [btrfs]
Oct 24 17:31:33 backup kernel: [29745.776974]  [<ffffffffa01fa8c3>] ? 
btrfs_run_ordered_operations+0x213/0x2b0 [btrfs]
Oct 24 17:31:33 backup kernel: [29745.776988]  [<ffffffffa01e2fea>] ? 
btrfs_commit_transaction+0x5a/0x990 [btrfs]
Oct 24 17:31:33 backup kernel: [29745.777001]  [<ffffffffa01e1345>] ? 
transaction_kthread+0x1c5/0x240 [btrfs]
Oct 24 17:31:33 backup kernel: [29745.777015]  [<ffffffffa01e1180>] ? 
open_ctree+0x1ff0/0x1ff0 [btrfs]
Oct 24 17:31:33 backup kernel: [29745.777019]  [<ffffffff8108233c>] ? 
kthread+0xbc/0xe0
Oct 24 17:31:33 backup kernel: [29745.777022]  [<ffffffff81082280>] ? 
flush_kthread_worker+0xa0/0xa0
Oct 24 17:31:33 backup kernel: [29745.777026]  [<ffffffff814dee4c>] ? 
ret_from_fork+0x7c/0xb0
Oct 24 17:31:33 backup kernel: [29745.777030]  [<ffffffff81082280>] ? 
flush_kthread_worker+0xa0/0xa0
Oct 24 17:31:33 backup kernel: [29745.777032] ---[ end trace 5de5beb31698a3c1 
]---
Oct 24 17:31:33 backup kernel: [29745.777035] BTRFS error (device sdb) in 
btrfs_run_delayed_refs:2730: errno=-28 No space left
Oct 24 17:31:33 backup kernel: [29745.777512] BTRFS info (device sdb): forced 
readonly
Oct 24 17:31:33 backup kernel: [29745.784767] BTRFS debug (device sdb): 
run_one_delayed_ref returned -28
Oct 24 17:31:33 backup kernel: [29745.784773] BTRFS error (device sdb) in 
btrfs_run_delayed_refs:2730: errno=-28 No space left
Oct 24 17:35:53 backup kernel: [30005.015967] btrfs: device label backup_fs 
devid 3 transid 86656 /dev/sdb
Oct 24 17:35:53 backup kernel: [30005.063903] btrfs: disk space caching is 
enabled
Oct 24 17:43:01 backup kernel: [30433.356660] BTRFS debug (device sdf): 
unlinked 1 orphans
Oct 24 17:43:01 backup kernel: [30433.395645] btrfs: continuing balance
Oct 24 17:43:02 backup kernel: [30434.395936] btrfs: relocating block group 
7434626138112 flags 65
Oct 24 17:43:17 backup kernel: [30449.104022] btrfs: found 8842 extents
Oct 24 17:43:24 backup kernel: [30456.043235] btrfs: found 8834 extents
Oct 24 17:43:24 backup kernel: [30456.580133] btrfs: relocating block group 
7223098998784 flags 68
Oct 24 17:48:42 backup kernel: [30774.465707] btrfs: found 37187 extents
Oct 24 17:48:43 backup kernel: [30775.058570] btrfs: relocating block group 
6782864850944 flags 68
Oct 24 17:52:16 backup kernel: [30988.070735] BTRFS debug (device sdf): 
run_one_delayed_ref returned -28
Oct 24 17:52:16 backup kernel: [30988.070742] ------------[ cut here 
]------------
Oct 24 17:52:16 backup kernel: [30988.070772] WARNING: CPU: 1 PID: 15920 at 
/build/linux-t5aGFh/linux-3.13.10/fs/btrfs/super.c:254 
__btrfs_abort_transaction+0x5a/0x140 [btrfs]()
Oct 24 17:52:16 backup kernel: [30988.070775] btrfs: Transaction aborted (error 
-28)
Oct 24 17:52:16 backup kernel: [30988.070776] Modules linked in: xen_gntdev 
xen_evtchn xenfs xen_privcmd nfsd auth_rpcgss oid_registry nfs_acl nfs lockd 
fscache sunrpc 8021q garp mrp bridge stp llc loop iTCO_wdt iTCO_vendor_support 
lpc_ich radeon mfd_core processor evdev ttm drm_kms_helper drm i2c_algo_bit 
coretemp rng_core serio_raw pcspkr i2c_i801 i2c_core i3000_edac thermal_sys 
button shpchp edac_core ext4 crc16 mbcache jbd2 btrfs xor raid6_pq crc32c 
libcrc32c dm_mod xen_pciback sg sd_mod sr_mod crc_t10dif cdrom crct10dif_common 
ata_generic ahci ata_piix libahci 3w_9xxx libata scsi_mod ehci_pci uhci_hcd 
ehci_hcd e1000e ptp pps_core usbcore usb_common
Oct 24 17:52:16 backup kernel: [30988.070828] CPU: 1 PID: 15920 Comm: 
btrfs-transacti Tainted: G        W    3.13-0.bpo.1-amd64 #1 Debian 
3.13.10-1~bpo70+1
Oct 24 17:52:16 backup kernel: [30988.070830] Hardware name: Supermicro 
PDSM4+/PDSM4+, BIOS 6.00 02/05/2007
Oct 24 17:52:16 backup kernel: [30988.070833]  0000000000000000 
ffffffffa0257130 ffffffff814d16c9 ffff880056d7bcc8
Oct 24 17:52:16 backup kernel: [30988.070838]  ffffffff81060967 
00000000ffffffe4 ffff880003c97000 ffff88006ba9abe0
Oct 24 17:52:16 backup kernel: [30988.070841]  0000000000000aaa 
ffffffffa0253b60 ffffffff81060a55 ffffffffa0257260
Oct 24 17:52:16 backup kernel: [30988.070845] Call Trace:
Oct 24 17:52:16 backup kernel: [30988.070853]  [<ffffffff814d16c9>] ? 
dump_stack+0x41/0x51
Oct 24 17:52:16 backup kernel: [30988.070858]  [<ffffffff81060967>] ? 
warn_slowpath_common+0x87/0xc0
Oct 24 17:52:16 backup kernel: [30988.070862]  [<ffffffff81060a55>] ? 
warn_slowpath_fmt+0x45/0x50
Oct 24 17:52:16 backup kernel: [30988.070873]  [<ffffffffa01b73ca>] ? 
__btrfs_abort_transaction+0x5a/0x140 [btrfs]
Oct 24 17:52:16 backup kernel: [30988.070886]  [<ffffffffa01d2e72>] ? 
btrfs_run_delayed_refs+0x372/0x530 [btrfs]
Oct 24 17:52:16 backup kernel: [30988.070901]  [<ffffffffa01fa8c3>] ? 
btrfs_run_ordered_operations+0x213/0x2b0 [btrfs]
Oct 24 17:52:16 backup kernel: [30988.070915]  [<ffffffffa01e2fea>] ? 
btrfs_commit_transaction+0x5a/0x990 [btrfs]
Oct 24 17:52:16 backup kernel: [30988.070929]  [<ffffffffa01e1345>] ? 
transaction_kthread+0x1c5/0x240 [btrfs]
Oct 24 17:52:16 backup kernel: [30988.070942]  [<ffffffffa01e1180>] ? 
open_ctree+0x1ff0/0x1ff0 [btrfs]
Oct 24 17:52:16 backup kernel: [30988.070946]  [<ffffffff8108233c>] ? 
kthread+0xbc/0xe0
Oct 24 17:52:16 backup kernel: [30988.070949]  [<ffffffff81082280>] ? 
flush_kthread_worker+0xa0/0xa0
Oct 24 17:52:16 backup kernel: [30988.070954]  [<ffffffff814dee4c>] ? 
ret_from_fork+0x7c/0xb0
Oct 24 17:52:16 backup kernel: [30988.070957]  [<ffffffff81082280>] ? 
flush_kthread_worker+0xa0/0xa0
Oct 24 17:52:16 backup kernel: [30988.070960] ---[ end trace 5de5beb31698a3c2 
]---
Oct 24 17:52:16 backup kernel: [30988.070963] BTRFS error (device sdf) in 
btrfs_run_delayed_refs:2730: errno=-28 No space left
Oct 24 17:52:16 backup kernel: [30988.071439] BTRFS info (device sdf): forced 
readonly
Oct 24 17:52:16 backup kernel: [30988.081154] BTRFS debug (device sdf): 
run_one_delayed_ref returned -28
Oct 24 17:52:16 backup kernel: [30988.081161] BTRFS error (device sdf) in 
btrfs_run_delayed_refs:2730: errno=-28 No space left
Oct 24 17:55:34 backup kernel: [31186.936384] btrfs: device label backup_fs 
devid 3 transid 86683 /dev/sdb
Oct 24 17:55:35 backup kernel: [31187.067619] btrfs: disk space caching is 
enabled
Oct 24 18:01:23 backup kernel: [31535.301582] BTRFS debug (device sdf): 
unlinked 1 orphans
Oct 24 18:01:23 backup kernel: [31535.339410] btrfs: continuing balance
Oct 24 18:01:23 backup kernel: [31535.624023] btrfs: relocating block group 
7438921105408 flags 68
Oct 24 18:02:37 backup kernel: [31609.293378] btrfs: found 26705 extents
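
For anyone reading the logs, here is a small Python sketch that decodes the two recurring numbers: the block group "flags" values (65 and 68) and the error code -28. The bit table is an assumption based on my understanding of the BTRFS_BLOCK_GROUP_* definitions in the kernel headers, and the function names are made up for this illustration.

```python
import errno

# Assumed block-group type/profile bits (BTRFS_BLOCK_GROUP_* in the kernel).
BLOCK_GROUP_BITS = {
    1 << 0: "DATA",
    1 << 1: "SYSTEM",
    1 << 2: "METADATA",
    1 << 3: "RAID0",
    1 << 4: "RAID1",
    1 << 5: "DUP",
    1 << 6: "RAID10",
}


def decode_flags(flags):
    """Return the names of all known bits set in a block-group flags value."""
    return [name for bit, name in BLOCK_GROUP_BITS.items() if flags & bit]


def decode_errno(code):
    """Map a negative kernel return code such as -28 to its symbolic name."""
    return errno.errorcode[abs(code)]


for flags in (65, 68):
    print(f"flags {flags}: {' | '.join(decode_flags(flags))}")
print(f"error -28: {decode_errno(-28)}")
```

If the bit table is right, both aborts happened while a metadata raid10 block group (flags 68) was being relocated, and -28 is the plain ENOSPC error.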


Thanks!

