On 09/28/16 13:35, Wang Xiaoguang wrote:
> hello,
>
> On 09/28/2016 07:15 PM, Stefan Priebe - Profihost AG wrote:
>> Dear list,
>>
>> is there any chance anybody wants to work with me on the following issue?
> Though I'm also somewhat new to btrfs, but I'd like to.
>
>>
>> BTRFS: space_info 4 has 18446742286429913088 free, is not full
>> BTRFS: space_info total=98247376896, used=77036814336, pinned=0,
>> reserved=0, may_use=1808490201088, readonly=0
>>
>> i get this nearly every day.
>>
>> Here are some msg collected from today and yesterday from different servers:
>> | BTRFS: space_info 4 has 18446742182612910080 free, is not full |
>> | BTRFS: space_info 4 has 18446742254739439616 free, is not full |
>> | BTRFS: space_info 4 has 18446743980225085440 free, is not full |
>> | BTRFS: space_info 4 has 18446743619906420736 free, is not full |
>> | BTRFS: space_info 4 has 18446743647369576448 free, is not full |
>> | BTRFS: space_info 4 has 18446742286429913088 free, is not full
>>
>> What i tried so far without success:
>> - use vanilla 4.8-rc8 kernel
>> - use latest vanilla 4.4 kernel
>> - use latest 4.4 kernel + patches from holger hoffstaette
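Side note, in case those absurd "free" numbers look mysterious: they are
u64 counters that have wrapped below zero and get printed unsigned. A
quick back-of-the-envelope check (plain Python, value taken from the log
above) shows how far into the red the accounting actually went:

    # reinterpret the unsigned 64-bit "free" value as signed
    free = 18446742286429913088    # as printed by the kernel
    print(free - 2**64)            # -1787279638528 bytes
    print((free - 2**64) / 2**30)  # roughly -1664 GiB, i.e. ~1.6 TiB short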
Was that 4.4.22? It contains a patch by Goldwyn Rodrigues called
"Prevent qgroup->reserved from going subzero" which should prevent this
from happening.

That patch should only matter for filesystems with quota enabled; you
said you didn't have quota enabled, yet some quota-only patches caused
problems on your system (despite being scheduled for 4.9 and apparently
working fine everywhere else, even when I specifically tested them
*with* quota enabled). So, long story short: something doesn't add up.
It means one of the following:

- you tried my patchset for 4.4.21 (i.e. *without* the above patch) and
  should bump to .22 right away
- you _do_ have qgroups enabled for some reason (systemd?)
- your fs is corrupted and needs nuking
- you did something else entirely
- unknown unknowns, aka ¯\_(ツ)_/¯

There is also the chance that your use of compress-force (or rather
compression in general) causes the leakage; compression runs
asynchronously and I wouldn't be surprised if it is still full of races.
That would be unfortunate, but you could try disabling compression for a
while and see what happens, assuming your space requirements allow that
experiment.

You have also not told us whether this happens on only one (potentially
corrupted/confused) fs or on every one - my impression was that you run
several sharded backup filesystems/machines; not sure if that is still
the case. If it happens on only one specific fs, chances are it's hosed.

> I also met enospc error in 4.8-rc6 when doing big files create and delete
> tests,
> for my cases, I have written some patches to fix it.
> Would you please apply my patches to have a try:
> btrfs: try to satisfy metadata requests when every flush_space() returns
> btrfs: try to write enough delalloc bytes when reclaiming metadata space
> btrfs: make shrink_delalloc() try harder to reclaim metadata space

These are all in my series for 4.4.22 and seem to work fine. However,
Stefan's workload has nothing directly to do with big files; instead it
is the worst-case scenario in terms of fragmentation (of huge files) and
a huge number of extents: incremental backups of VMs via rsync --inplace
with forced compression.

IMHO this way of making backups is suboptimal in basically every
possible way, despite its convenience appeal. With such huge space
requirements it would be more effective to have a "current backup" to
rsync into, then take a snapshot (for fs consistency), pack the snapshot
into a tar.gz (massively better compression than with btrfs), dump it
into your Ceph cluster as an object with expiry (preferably a separate
EC pool) and then immediately delete the snapshot from the local fs -
see the sketch in the P.S. below. That should relieve the landing fs
from being overloaded by COWing and too many snapshots (approx. #VMs *
#versions). The obvious downside is that restoring an archived snapshot
would require some creative effort. Other alternatives exist, but are
probably even more (read: too) expensive.

-h
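PS: here is a rough sketch of the rsync-then-snapshot-then-archive flow
from above, in plain Python wrapped around the usual CLI tools. All
paths and names are made up for illustration, and the "current"
directory must already be a btrfs subvolume for the snapshot step to
work; the Ceph upload is left out:

    #!/usr/bin/env python3
    # Sketch only: refresh a "current backup" in place, snapshot it
    # read-only for a consistent view, pack the snapshot, then drop it.
    import datetime
    import subprocess

    SRC     = "vmhost:/vms/vm1/"      # hypothetical rsync source
    CURRENT = "/backup/vm1/current"   # must be a btrfs subvolume
    stamp   = datetime.date.today().isoformat()
    snap    = f"/backup/vm1/snap-{stamp}"

    def run(*cmd):
        subprocess.run(cmd, check=True)

    # 1. refresh the landing area in place (lots of COW, but only here)
    run("rsync", "-a", "--inplace", SRC, CURRENT)
    # 2. read-only snapshot for fs consistency
    run("btrfs", "subvolume", "snapshot", "-r", CURRENT, snap)
    # 3. tar.gz compresses the whole tree far better than btrfs can
    run("tar", "czf", f"/archive/vm1-{stamp}.tar.gz", "-C", snap, ".")
    # 4. upload the tarball to Ceph here (rados put or similar), then
    #    delete the snapshot so they don't pile up on the local fs
    run("btrfs", "subvolume", "delete", snap)

Restoring is then just fetching and untarring the right archive,
creative effort included.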