On Thu, Apr 19, 2018 at 03:08:48PM -0700, Drew Bloechl wrote:
> I've got a btrfs filesystem that I can't seem to get back to a useful
> state. The symptom I started with is that rename() operations started
> dying with ENOSPC, and it looks like the metadata allocation on the
> filesystem is full:
>
> # btrfs fi df /broken
> Data, RAID0: total=3.63TiB, used=67.00GiB
> System, RAID1: total=8.00MiB, used=224.00KiB
> Metadata, RAID1: total=3.00GiB, used=2.50GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
>
> All of the consumable space on the backing devices also seems to be in
> use:
>
> # btrfs fi show /broken
> Label: 'mon_data'  uuid: 85e52555-7d6d-4346-8b37-8278447eb590
>         Total devices 4 FS bytes used 69.50GiB
>         devid    1 size 931.51GiB used 931.51GiB path /dev/sda1
>         devid    2 size 931.51GiB used 931.51GiB path /dev/sdb1
>         devid    3 size 931.51GiB used 931.51GiB path /dev/sdc1
>         devid    4 size 931.51GiB used 931.51GiB path /dev/sdd1
>
> Even the smallest balance operation I can start fails (this doesn't
> change even with an extra temporary device added to the filesystem):

   Given that both data and metadata levels here require paired
chunks, try adding _two_ temporary devices so that it can allocate a
new block group.

   Hugo.

> # btrfs balance start -v -dusage=1 /broken
> Dumping filters: flags 0x1, state 0x0, force is off
>   DATA (flags 0x2): balancing, usage=1
> ERROR: error during balancing '/broken': No space left on device
> There may be more info in syslog - try dmesg | tail
> # dmesg | tail -1
> [11554.296805] BTRFS info (device sdc1): 757 enospc errors during
> balance
>
> The current kernel is 4.15.0 from Debian's stretch-backports
> (specifically linux-image-4.15.0-0.bpo.2-amd64), but it was Debian's
> 4.9.30 when the filesystem got into this state. I upgraded it in the
> hopes that a newer kernel would be smarter, but no dice.
>
> btrfs-progs is currently at v4.7.3.
>
> Most of what this filesystem stores is Prometheus 1.8's TSDB for its
> metrics, which are constantly written at around 50MB/second. The
> filesystem never really gets full as far as data goes, but there's a lot
> of never-ending churn for what data is there.
>
> Question 1: Are there other steps that can be tried to rescue a
> filesystem in this state? I still have it mounted in the same state, and
> I'm willing to try other things or extract debugging info.
>
> Question 2: Is there something I could have done to prevent this from
> happening in the first place?
>
> Thanks!

--
Hugo Mills             | Always be sincere, whether you mean it or not.
hugo@... carfax.org.uk |
http://carfax.org.uk/  |                               Flanders & Swann
PGP: E2AB1DE4          |                         The Reluctant Cannibal
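
For reference, the two-temporary-device suggestion above could be sketched roughly as follows. This is only an illustration, not a tested recovery procedure: /dev/loop0 and /dev/loop1 are hypothetical spare devices (any two block devices with some free space would do), the helper names are made up for this sketch, and by default the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only. With DRY_RUN=1 (the default here) commands are printed,
# not executed. Set DRY_RUN=0 to actually run them, as root.
DRY_RUN="${DRY_RUN:-1}"

# run: hypothetical helper that either echoes or executes a command.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# rescue_with_temp_devices MOUNTPOINT DEV_A DEV_B
# DEV_A and DEV_B are assumed to be spare block devices that are not
# part of the filesystem yet (hypothetical names in the call below).
rescue_with_temp_devices() {
    mnt="$1"
    dev_a="$2"
    dev_b="$3"

    # Add two devices so that a chunk needing space on two devices
    # (RAID1 metadata, RAID0 data) can be allocated again.
    run btrfs device add "$dev_a" "$dev_b" "$mnt" || return 1

    # Same filter as in the original report: rewrite data chunks that
    # are at most 1% used, which should release mostly empty chunks.
    run btrfs balance start -dusage=1 "$mnt" || return 1

    # Remove the temporary devices again; btrfs moves any chunks that
    # landed on them back onto the remaining devices.
    run btrfs device delete "$dev_a" "$dev_b" "$mnt" || return 1
}

rescue_with_temp_devices /broken /dev/loop0 /dev/loop1
```

If the first balance does not release enough space, the final device delete may itself fail with ENOSPC; in that case it is probably worth repeating the balance with a gradually larger usage= value before removing the devices.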