On Wed, Apr 24, 2019 at 02:57:47AM +0000, Paul Jones wrote:
> > -----Original Message-----
> > From: linux-btrfs-ow...@vger.kernel.org <linux-btrfs-ow...@vger.kernel.org>
> > On Behalf Of Zygo Blaxell
> > Sent: Wednesday, 24 April 2019 9:07 AM
> > To: linux-btrfs@vger.kernel.org
> > Subject: Global reserve and ENOSPC while deleting snapshots on 5.0.9
> >
> > I had a test filesystem that ran out of unallocated space, then ran out of
> > metadata space during a snapshot delete, and forced readonly.
> > The workload before the failure was a lot of rsync and bees dedupe
> > combined with random snapshot creates and deletes.
> >
> > I tried the usual fix strategies:
> >
> > 1. Immediately after mount, try to balance to free space for
> >    metadata
> >
> > 2. Immediately after mount, add additional disks to provide
> >    unallocated space for metadata
> >
> > 3. Mount -o nossd to increase metadata density
> >
> > #3 had no effect. #1 failed consistently.
> >
> > #2 was successful, but the additional space was not used because btrfs
> > couldn't allocate chunks for metadata because it ran out of metadata space
> > for new metadata chunks.
> >
> > When btrfs-cleaner tried to remove the first pending deleted snapshot, it
> > started a transaction that failed due to lack of metadata space.
> > Since the transaction failed, the filesystem reverts to its earlier state,
> > and exactly the same thing happens on the next mount. The 'btrfs dev add'
> > in #2 is successful only if it is executed immediately after mount, before
> > the btrfs-cleaner thread wakes up.
>
> I had a similar problem on iirc 4.20, except that I couldn't get the new
> devices to add (raid1) before the cleaner thread ran, no matter how fast I
> added them after mount.
> I ended up just commenting out the part that forces the fs to go read only.
> The cleaner thread exits gracefully (I think?) so then it was no trouble to
> add the devices.
>
> Is it still necessary to have the fs go read only like that when it's out of
> space?

It's definitely a good idea to go read only on generic transaction failures.
Maybe it's not such a good idea to lump ENOSPC in with other kinds of
transaction failure.

> Paul.
>
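
As a rough illustration of the distinction suggested above (a sketch only, not btrfs code): a transaction-failure handler could treat ENOSPC separately and fail just the current operation, while still forcing read-only on every other error. The names `classify_abort`, `FORCE_READONLY` and `FAIL_OPERATION` are hypothetical and do not exist in the kernel. Whether such a split would actually be safe for btrfs is the open question here.

```python
import errno

# Hypothetical policy labels for illustration; not btrfs identifiers.
FORCE_READONLY = "force-readonly"   # generic failure: stop all further writes
FAIL_OPERATION = "fail-operation"   # out of space: fail only this operation


def classify_abort(err):
    """Map a negative errno from a failed transaction to a policy label.

    Assumes kernel-style error codes, i.e. negative errno values.
    """
    if err == -errno.ENOSPC:
        return FAIL_OPERATION
    return FORCE_READONLY
```

With this split, `classify_abort(-errno.ENOSPC)` selects the softer path, and anything else (for example `-errno.EIO`) keeps today's force-readonly behaviour.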