Re: Still getting a lot of -28 (ENOSPC?) errors during balance
On Tue, 2 Apr 2013 14:04:52 +0600
Roman Mamedov <r...@romanrm.ru> wrote:

> With kernel 3.7.10 patched with "Btrfs: limit the global reserve to
> 512mb" (the problem was occurring also without this patch, but seemed
> to be even worse).
>
> At the start of balance:
>
>   Data: total=31.85GB, used=9.96GB
>   System: total=4.00MB, used=16.00KB
>   Metadata: total=1.01GB, used=696.17MB
>
> "btrfs balance start -musage=5 -dusage=5" has been going on for about
> 50 minutes. Current situation:
>
>   Balance on '/mnt/r1/' is running
>   1 out of about 2 chunks balanced (20 considered), 50% left
>
>   Data: total=30.85GB, used=10.04GB
>   System: total=4.00MB, used=16.00KB
>   Metadata: total=1.01GB, used=851.69MB

About 2 hours 10 minutes into the balance, it was still going, with:

  Data: total=30.85GB, used=10.06GB
  System: total=4.00MB, used=16.00KB
  Metadata: total=1.01GB, used=909.16MB

The stream of -28 errors continues non-stop in dmesg.

At ~2hr20min it looks like it decided to allocate some more space for
metadata:

  Data: total=30.85GB, used=10.01GB
  System: total=4.00MB, used=16.00KB
  Metadata: total=2.01GB, used=748.56MB

And shortly after (~2hr25min) it was done. After the balance:

  Data: total=29.85GB, used=10.01GB
  System: total=4.00MB, used=16.00KB
  Metadata: total=2.01GB, used=748.27MB

--
With respect,
Roman
Re: Still getting a lot of -28 (ENOSPC?) errors during balance
On Tue, Apr 02, 2013 at 02:04:52AM -0600, Roman Mamedov wrote:
> Hello,
>
> With kernel 3.7.10 patched with "Btrfs: limit the global reserve to
> 512mb" (the problem was occurring also without this patch, but seemed
> to be even worse).
>
> At the start of balance:
>
>   Data: total=31.85GB, used=9.96GB
>   System: total=4.00MB, used=16.00KB
>   Metadata: total=1.01GB, used=696.17MB
>
> "btrfs balance start -musage=5 -dusage=5" has been going on for about
> 50 minutes. Current situation:
>
>   Balance on '/mnt/r1/' is running
>   1 out of about 2 chunks balanced (20 considered), 50% left
>
>   Data: total=30.85GB, used=10.04GB
>   System: total=4.00MB, used=16.00KB
>   Metadata: total=1.01GB, used=851.69MB
>
> And a constant stream of these in dmesg:

Can you try this out and see if it helps?  Thanks,

Josef

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 0d89ff0..9830e86 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -2548,6 +2548,13 @@ static int do_relocation(struct btrfs_trans_handle *trans,
 	list_for_each_entry(edge, &node->upper, list[LOWER]) {
 		cond_resched();
 
+		ret = btrfs_block_rsv_refill(rc->extent_root, rc->block_rsv,
+					     rc->extent_root->leafsize,
+					     BTRFS_RESERVE_FLUSH_ALL);
+		if (ret) {
+			err = ret;
+			break;
+		}
 		upper = edge->node[UPPER];
 		root = select_reloc_root(trans, rc, upper, edges, nr);
 		BUG_ON(!root);
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Re: Still getting a lot of -28 (ENOSPC?) errors during balance
On Tue, 2 Apr 2013 09:46:26 -0400
Josef Bacik <jba...@fusionio.com> wrote:

> On Tue, Apr 02, 2013 at 02:04:52AM -0600, Roman Mamedov wrote:
> > With kernel 3.7.10 patched with "Btrfs: limit the global reserve to
> > 512mb" (the problem was occurring also without this patch, but seemed
> > to be even worse).
>
> Can you try this out and see if it helps?  Thanks,

Hello,

Well, that balance has now completed, and unfortunately I don't have a
complete image of the filesystem from before, to apply the patch and
check whether the same operation goes better this time. I'll keep it in
mind and will try to test it if I get into a similar situation again on
some filesystem.

Generally, what seems to make me run into various problems with balance
is the following usage scenario: on an active filesystem (used as /home
and the root FS), a snapshot is made every 30 minutes with a unique
(timestamped) name, and once a day snapshots from more than two days
ago are purged. And it goes on like this for months.

Another variant of this is a backup partition, where snapshots are made
every six hours, and all snapshots are kept for 1-3 months before
getting purged.

I guess this kind of usage causes a lot of internal fragmentation or
something, which makes it difficult for a balance to find enough free
space to work with.
--
With respect,
Roman