On Wed, Apr 24, 2019 at 02:57:47AM +0000, Paul Jones wrote:
> > -----Original Message-----
> > From: linux-btrfs-ow...@vger.kernel.org <linux-btrfs-
> > ow...@vger.kernel.org> On Behalf Of Zygo Blaxell
> > Sent: Wednesday, 24 April 2019 9:07 AM
> > To: linux-btrfs@vger.kernel.org
> > Subject: Global reserve and ENOSPC while deleting snapshots on 5.0.9
> > 
> > I had a test filesystem that ran out of unallocated space, then ran out of
> > metadata space during a snapshot delete, and forced readonly.
> > The workload before the failure was a lot of rsync and bees dedupe
> > combined with random snapshot creates and deletes.
> > 
> > I tried the usual fix strategies:
> > 
> >     1.  Immediately after mount, try to balance to free space for
> >     metadata
> > 
> >     2.  Immediately after mount, add additional disks to provide
> >     unallocated space for metadata
> > 
> >     3.  Mount -o nossd to increase metadata density
> > 
> > #3 had no effect.  #1 failed consistently.
> > 
> > #2 was successful, but the additional space was not used because btrfs
> > couldn't allocate chunks for metadata because it ran out of metadata space
> > for new metadata chunks.
> > 
> > When btrfs-cleaner tried to remove the first pending deleted snapshot, it
> > started a transaction that failed due to lack of metadata space.
> > Since the transaction failed, the filesystem reverts to its earlier state,
> > and exactly the same thing happens on the next mount.  The 'btrfs dev add'
> > in #2 is successful only if it is executed immediately after mount, before
> > the btrfs-cleaner thread wakes up.
> 
> I had a similar problem on iirc 4.20, except that I couldn't get the new 
> devices to add (raid1) before the cleaner thread ran, no matter how fast I 
> added them after mount.
> I ended up just commenting out the part that forces the fs to go read only. 
> The cleaner thread exits gracefully (I think?) so then it was no trouble to 
> add the devices.
> 
> Is it still necessary to have the fs go read only like that when it's out of 
> space?

It's definitely a good idea to go read only on generic transaction
failures.

Maybe it's not such a good idea to lump ENOSPC in with other kinds of
transaction failure.
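
To make that distinction concrete, here is a minimal sketch in plain
userspace C.  It is not btrfs code: the enum and the function name are
hypothetical, and it only illustrates a policy in which ENOSPC is
classified separately from every other transaction failure.

```c
#include <errno.h>

/* Hypothetical policy values; these are not btrfs kernel identifiers. */
enum abort_action {
    ABORT_FORCE_READONLY,  /* unknown or serious failure: stop all writes */
    ABORT_KEEP_WRITABLE    /* out of space: roll back, stay writable */
};

/*
 * Decide what to do after a failed transaction.
 * Accepts either a positive errno value or a kernel-style negative one.
 */
enum abort_action classify_transaction_failure(int err)
{
    if (err < 0)
        err = -err;

    /* Assumption: running out of space does not imply corruption. */
    if (err == ENOSPC)
        return ABORT_KEEP_WRITABLE;

    /* Everything else keeps the conservative behaviour. */
    return ABORT_FORCE_READONLY;
}
```

Under such a policy the filesystem would stay writable after ENOSPC, so
that something like 'btrfs dev add' could still run.  Whether rolling
back is actually safe at the point where the transaction fails is a
separate question that this sketch does not address.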

> Paul.
> 
