On Thu, Jun 9, 2016 at 5:41 PM, Duncan <1i5t5.dun...@cox.net> wrote:
> Hans van Kranenburg posted on Thu, 09 Jun 2016 01:10:46 +0200 as
> excerpted:
>
>> The next question is what files these extents belong to. To find out, I
>> need to open up the extent items I get back and follow a backreference
>> to an inode object. Might do that tomorrow, fun.
>>
>> To be honest, I suspect /var/log and/or the file storage of mailman to
>> be the cause of the fragmentation, since there's logging from postfix,
>> mailman and nginx going on all day long in a slow but steady tempo.
>> While using btrfs for a number of use cases at work now, we normally
>> don't use it for the root filesystem. And the cases where it's used as
>> root filesystem don't do much logging or mail.
>
> FWIW, that's one reason I have a dedicated partition (and filesystem) for
> logs, here. (The other reason is that should something go runaway
> log-spewing, I get a warning much sooner when my log filesystem fills up,
> not much later, with much worse implications, when the main filesystem
> fills up!)
>
>> And no, autodefrag is not in the mount options currently. Would that be
>> helpful in this case?
>
> It should be helpful, yes. Be aware that autodefrag works best with
> smaller (sub-half-gig) files, however, and that it used to cause
> performance issues with larger database and VM files, in particular.
I don't know why you relate file size and autodefrag. Maybe because you
say "... used to cause ...". autodefrag detects random writes and then
tries to defragment a certain range. Its scope size is 256K as far as I
can see from the code, and over time VM images on a btrfs filesystem
(CoW, hourly ro snapshots) end up with a lot of extents sized 256K (or a
bit less), according to what filefrag reports. I once wanted to try
changing the 256K to 1M or even 4M, but I haven't gotten around to that.
A 32G VM image would consist of 131072 extents at 256K, 32768 extents at
1M, or 8192 extents at 4M.

> There used to be a warning on the wiki about that, that was recently
> removed, so apparently it's not the issue that it was, but you might wish
> to monitor any databases or VMs with gig-plus files to see if it's going
> to be a performance issue, once you turn on autodefrag.

For very active databases, I don't know what the effects are, with or
without autodefrag (on either SSD or HDD). At least on HDD-only setups,
so no persistent SSD caching and no autodefrag, VMs will soon show
unacceptable performance.

> The other issue with autodefrag is that if it hasn't been on and things
> are heavily fragmented, it can at first drive down performance as it
> rewrites all these heavily fragmented files, until it catches up and is
> mostly dealing only with the normal refragmentation load.

I assume you mean that one only gets a performance drop if you actually
do new writes to the fragmented files after turning autodefrag on. It
shouldn't start defragging by itself, AFAIK.

> Of course the best way around that is to run autodefrag from the first
> time you mount the filesystem and start writing to it, so it never gets
> overly fragmented in the first place. For a currently in-use and highly
> fragmented filesystem, you have two choices, either backup and do a fresh
> mkfs.btrfs so you can start with a clean filesystem and autodefrag from
> the beginning, or doing manual defrag.
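The extent-count arithmetic above is easy to sanity-check. A quick sketch
(the 256K target is what I read from the code, so treat it as an
assumption; the 1M and 4M sizes are the hypothetical larger targets):

```python
# Rough extent-count arithmetic for a fully CoW-fragmented VM image,
# assuming every extent ends up at exactly the autodefrag target size.

KIB = 1024
IMAGE_SIZE = 32 * KIB**3  # 32G VM image, in bytes

for extent_kib in (256, 1024, 4096):
    extents = IMAGE_SIZE // (extent_kib * KIB)
    print(f"{extent_kib:>4}K extents: {extents}")
```

In reality filefrag will show a mix of sizes (some extents come out a bit
smaller than the target), so these are upper-bound estimates.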
>
> However, be aware that if you have snapshots locking down the old extents
> in their fragmented form, a manual defrag will copy the data to new
> extents without releasing the old ones as they're locked in place by the
> snapshots, thus using additional space. Worse, if the filesystem is
> already heavily fragmented and snapshots are locking most of those
> fragments in place, defrag likely won't help a lot, because the free
> space as well will be heavily fragmented. So starting off with a clean
> and new filesystem and using autodefrag from the beginning really is your
> best bet.

If it is a multi-TB filesystem, I think the most important thing is to
have enough unfragmented free space available, preferably at the
beginning of the device if it is a plain HDD. Maybe a balance with
-ddrange=1M..<20% of device> can do that; I haven't tried.
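For the balance idea, the drange filter takes byte offsets, so the "<20%
of device>" end would have to be computed from the device size. A small
sketch of that arithmetic (the 4T device size is an example, not from
this thread, and I haven't tried this balance either, so check the
-ddrange syntax against btrfs-balance(8) before running anything):

```python
# Hypothetical helper: compute the end offset for a
# "btrfs balance start -ddrange=1M..<20% of device>" run.
# Integer math keeps the result exact.

def drange_end(device_bytes: int, percent: int = 20) -> int:
    """Byte offset at `percent` of the device, for the drange end."""
    return device_bytes * percent // 100

device_bytes = 4 * 1024**4  # example: a 4T device
print(f"btrfs balance start "
      f"-ddrange=1048576..{drange_end(device_bytes)} /mountpoint")
```

Note that balance relocates whole block groups, so this only nudges data
out of the first 20% of the logical address space; it doesn't guarantee
physically contiguous free space at the start of the disk.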