Re: autodefrag by default, was: Lots of harddrive chatter
George Mitchell posted on Sun, 21 Jul 2013 16:44:09 -0700 as excerpted:

But I think the only unanswered question for me at this point is whether complete defragmentation is even possible using auto-defrag. Unless auto-defrag can work around the in-use file issue, that could be a problem, since some heavily used system files are open virtually all the time the system is up and running. Has this issue been investigated, and if so, are there any system files that don't get defragmented that matter? Or is this a non-issue, in that any constantly in-use system files don't really matter anyway?

I believe Shridhar has it right; writes into an existing file/directory are the big fragmentation issue for btrfs. But there's one aspect he overlooked -- this is another reason I so strongly stress the autodefrag-from-newly-created-empty-filesystem-on point: for the general case, if autodefrag is on when the files are written in the first place, they won't be fragmented when they're loaded and thus in use, so there won't be any need to defrag them while in use.

There are two main forms of always-in-use files: executables/libraries etc. that may be memory-mapped, and database/VM-image files where the VM or database is always running. (And arguably, given a broad enough definition of database files, nearly anything else that would fall in this category, including VM images, is already covered by that, so...)

In the executables/libraries case, the files are generally NOT rewritten in place, and installations/updates don't tend to be a problem either. Unlike MS, where in-use files cannot (or at least could not -- I've been off MS for years, so I don't know whether this remains true on their current product) be replaced without a reboot, on Linux the kernel allows unlinking and replacement of in-use files, with the reference to the previously existing file maintained in memory only; no actual storage-location overwrite is allowed until there are no further runtime references to the old file.
Sometime after you've done some in-use library/ELF-executable package updates, try this. Look thru /proc/*/maps, where * is the PID of the process you're investigating. (You'll need to be root to look at processes running as other users.) This is a list of files that process has mapped. (It's documented in the kernel documentation; see $KERNELDIR/Documentation/filesystems/proc.txt and search for /proc/PID/maps.) On the right side is the pathname.

What we're interested in here, however, is what happens when one of those files is replaced. To the right of the pathname there will be a notation: (deleted). These are files that have been unlinked (deleted or replaced), with the kernel maintaining the reference to the old location -- even tho a file listing won't show the old file any longer -- until all existing runtime file references are terminated. There are actually helper scripts available that will look thru /proc/PID/maps and tell you which apps you need to restart to use the updated files.

Another user of this unlink-but-keep-the-reference trick is certain media apps such as flash, which will download a file to a temporary location, load it and keep the open reference, then delete the file so it no longer appears in the filesystem. Among other things, this makes it more difficult to copy files some people seem to think the user shouldn't be copying, since the only way to get to the file once it is unlinked is by somehow grabbing the open reference to it that the app still has.

Coming back to the topic at hand: as a result of the above mechanism, updates aren't normally rewritten in place, normally allowing them to be written as a single unfragmented file -- or, if fragmented, autodefrag will notice and schedule a defragment for the defrag thread.
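The unlink-but-keep-the-reference behavior is easy to see from a script. Here's a minimal sketch (Linux-only; the temp file and 15-byte payload are arbitrary): memory-map a file, unlink it, and note both that the data stays reachable through the mapping and that /proc/self/maps now flags the mapping with the same "(deleted)" notation you see for a replaced library.

```python
# Minimal Linux-only sketch of the "unlink but keep the reference" trick.
import mmap
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"still reachable")
m = mmap.mmap(fd, 0)            # memory-map the file, much as the loader does
os.unlink(path)                 # gone from any directory listing...
data = bytes(m[:15])            # ...but the mapped data is still reachable

# /proc/self/maps now shows this mapping with a "(deleted)" notation,
# exactly what you see for a replaced library in /proc/PID/maps.
with open("/proc/self/maps") as maps:
    deleted = [line for line in maps if "(deleted)" in line]

print(data, len(deleted) >= 1)
m.close()
os.close(fd)
```

This is the same mechanism, in miniature, that lets a package manager replace an in-use library: the directory entry disappears immediately, while the old extents live on until the last mapping goes away.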
With the exception of something like glibc, where the new library is put to work the next time something runs, that generally leaves time for a defragment if necessary -- and ideally it won't be necessary, since the file should have been written in one piece, without fragmentation (unless there's so little space left that the filesystem is in use-what-we-can-find mode and thus no longer worried about fragmentation).

VM images and database files are a rather different story, since they're OFTEN rewritten in place. The btrfs autodefrag option should handle reasonably small database files such as firefox's sqlite files without too much difficulty. However, there's a warning on the wiki about performance issues with larger database files and VM images (I'd guess in the range of gigabytes). The issues /may/ have been solved by now, but I'm not sure. However, it's possible to mark such files (or, more likely, the directory they're in, since the marking should be done at creation in order to be effective, and files inherit the attribute from the directory, so they will get it at creation if the directory has it) NODATACOW, so they get updated
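As a sketch of the NODATACOW marking described above (the directory path here is only an example; chattr/lsattr are from e2fsprogs, and the attribute is only effective for files created after it is set, which is why it goes on the still-empty directory):

```shell
# Hypothetical VM-image directory -- adjust the path for your system.
mkdir -p /srv/vm-images
# Set the No_COW attribute on the (still empty) directory; files
# created inside inherit it, so their data is updated in place, un-COWed.
chattr +C /srv/vm-images
# Verify: the attribute list should now include 'C'.
lsattr -d /srv/vm-images
```

Note that NODATACOW trades away btrfs data checksumming and COW snapshot-friendliness for those files, so it's worth confining to the VM-image/database directories that actually need it.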
autodefrag by default, was: Lots of harddrive chatter
On Jul 21, 2013, at 4:38 AM, Duncan 1i5t5.dun...@cox.net wrote:

What I'd suggest is to turn on the btrfs autodefrag mount option, and to do it *BEFORE* you start installing stuff on the filesystem.

Is there a good reason why autodefrag is not a default mount option?

I believe it's a known issue that a number of distro installers (what arch does I'm not sure) tend to fragment their files pretty badly right off the bat if you let them. This would happen if they write data into an existing file, perhaps because they install a package and then customize the config files, or if they don't write whole files at once. And a lot of btrfs installs don't turn on the autodefrag option when they do that first auto-mount to install stuff. Some installer teams are understandably reluctant to use non-default mount options.

Chris Murphy
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
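For concreteness, turning autodefrag on from the very first mount -- the "*BEFORE* you start installing stuff" advice above -- might look like this (device name and UUID are placeholders):

```shell
# One-off, e.g. from an installer shell before unpacking any packages:
mount -o autodefrag /dev/sdXn /mnt

# Persistently, via /etc/fstab (UUID is a placeholder):
# UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  /  btrfs  defaults,autodefrag  0 0
```

The fstab line matters as much as the installer-time mount: the option is per-mount, not a persistent filesystem property, so it must be present on every subsequent boot too.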
Re: autodefrag by default, was: Lots of harddrive chatter
Chris Murphy posted on Sun, 21 Jul 2013 10:20:48 -0600 as excerpted:

On Jul 21, 2013, at 4:38 AM, Duncan 1i5t5.dun...@cox.net wrote: What I'd suggest is to turn on the btrfs autodefrag mount option, and to do it *BEFORE* you start installing stuff on the filesystem.

Is there a good reason why autodefrag is not a default mount option?

Well, there's the obvious: btrfs is still in development, lacking such things as the ability to set such options by default using btrfs-tune, and likely with the question of what the defaults should be still unresolved for many cases. Autodefrag can also negatively affect performance, especially if it's not on from the beginning. AND, at least at one point earlier in btrfs evolution (I'm not sure whether it's fixed now or not), performance for very large and often-written-into files such as virtual-machine images and large databases was bad, since autodefrag could mean constantly rewriting entire large files instead of just the smaller changing pieces of them, thereby being a performance killer for that type of job load.

I believe it's a known issue that a number of distro installers (what arch does I'm not sure) tend to fragment their files pretty badly right off the bat if you let them. This would happen if they write data into an existing file, perhaps because they install a package and then customize the config files, or if they don't write whole files at once. And a lot of btrfs installs don't turn on the autodefrag option when they do that first auto-mount to install stuff. Some installer teams are understandably reluctant to use non-default mount options.
It's worth keeping in mind the bigger picture, tho: in the case of btrfs they're using a still-in-development filesystem (even if it's not the default, the fact that so many people come here unaware of the wiki or of btrfs' status as a development filesystem IMO indicates that installers aren't including the warnings about making such even non-default choices that they arguably should be including), where all recommendations are to be ready for loss of data should it occur, as that's definitely a more likely possibility than it would be with a stable filesystem. With that in mind, playing with non-default mount options seems rather trivial by comparison.

Still, the previously mentioned constantly-written large VM/DB file use-case is a big one these days, and the general-purpose installation often doesn't have dedicated partitions for such things (btrfs subvolumes don't yet allow per-subvolume setting of such options)... But for the generally much different use-case of a system volume, where all the system binaries and config are stored, autodefrag makes a lot of sense to enable by default. Or installers could simply be better about not writing into existing files during the installation in the first place, so people could turn it on right after installation and not have to worry about existing fragmentation. But... installing to btrfs is really a reasonably new situation, and I'd guess best practices are still evolving just as the filesystem itself is.

--
Duncan - List replies preferred. No HTML msgs.
Every nonfree program has a lord, a master -- and if you use the program, he is your master. Richard Stallman
Re: autodefrag by default, was: Lots of harddrive chatter
On 07/21/2013 03:01 PM, Duncan wrote:
But... installing to btrfs is really a reasonably new situation, and I'd guess best practices are still evolving just as the filesystem itself is.

Duncan,

First of all, thanks so much for the great explanation. It really answers a LOT of questions as to the whole fragmentation issue and covers a lot of bases. And I totally agree with some of your thoughts regarding the still-beta status of btrfs and its effect on support and documentation, etc. But I think the only unanswered question for me at this point is whether complete defragmentation is even possible using auto-defrag.
Unless auto-defrag can work around the in-use file issue, that could be a problem since some heavily used system files are open virtually all the time the system is up and running. Has this issue been investigated and if so are there any system files that don't get defragmented that matter? Or is this a non-issue in that any constantly in use system files don't really matter anyway? That is really the only question I have before moving away from my current offline approach to the auto-defrag mount option for system filesystems (/, /boot, /usr, /opt, /var, etc).
Re: autodefrag by default, was: Lots of harddrive chatter
On Sunday, July 21, 2013 04:44:09 PM George Mitchell wrote:

Unless auto-defrag can work around the in-use file issue, that could be a problem since some heavily used system files are open virtually all the time the system is up and running. Has this issue been investigated and if so are there any system files that don't get defragmented that matter? Or is this a non-issue in that any constantly in use system files don't really matter anyway? That is really the only question I have before moving away from my current offline approach to the auto-defrag mount option for system filesystems (/, /boot, /usr, /opt, /var, etc).

AFAIK, fragmentation is proportional to the amount of writes into a file or directory. System files typically are installed once and never rewritten in place, so they should not be much fragmented to begin with. Their directory objects, now, are a different story, as are things like the systemd journal, log files, and database files.

--
Regards
Shridhar
Re: autodefrag by default, was: Lots of harddrive chatter
On 07/21/2013 08:37 PM, Shridhar Daithankar wrote:

system files typically are installed once and never rewritten in place, so they should not be much fragmented to begin with.

Never rewritten in place? I wouldn't go that far. In the case of many distros there is a continual flow of updates, which results in some degree of data churn throughout the system filesystems. Just a kernel update, for example, can affect a rather large number of files and directories with new writes, while application updates (KDE or even Gnome, for example) can cause a large number of files to be rewritten in place.
Re: autodefrag by default, was: Lots of harddrive chatter
On Sunday, July 21, 2013 08:53:42 PM George Mitchell wrote:

Never rewritten in place? I wouldn't go that far. In the case of many distros there is a continual flow of updates which results in some degree of data churning throughout the system filesystems.

While it is true that a large number of files are changed with each update, most updates delete the existing files and install new ones. That does not lead to fragmentation of a file. Unless the packages are patching the files in place (AFAIK, even delta RPMs patch the RPM, not the individual files), it would not lead to the fragmentation that is problematic on btrfs.

There are two types of fragmentation. The first is where files are continually added to and deleted from the filesystem, e.g. on a mail server. IME btrfs handles that quite well. The other is where a file is constantly updated in place, e.g. a postgresql/sqlite database. In the second case, the COW nature of btrfs causes the fragmentation, which directly affects performance, at least on spinning hard disks.

--
Regards
Shridhar
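The two update patterns being contrasted here -- in-place rewrite vs. delete-and-replace -- can be sketched in a few lines (file names and contents are arbitrary; on a COW filesystem only the first pattern keeps relocating extents within one long-lived file):

```python
# Sketch of the two write patterns contrasted above.
import os
import tempfile

d = tempfile.mkdtemp()
cfg = os.path.join(d, "app.conf")
with open(cfg, "w") as f:
    f.write("version=1\n")

# Pattern 1: in-place rewrite (what databases and VM images do).
# On a COW filesystem each such overwrite lands in a freshly allocated
# extent, so a constantly-updated file fragments over time.
with open(cfg, "r+") as f:
    f.write("version=2")          # overwrite bytes within the existing file

# Pattern 2: replace-by-rename (what package managers do).
# The new file is written out whole, then atomically swapped in; the old
# file's extents are simply freed, so no fragmentation accumulates.
tmp = cfg + ".new"
with open(tmp, "w") as f:
    f.write("version=3\n")
os.replace(tmp, cfg)              # atomic rename over the old file

with open(cfg) as f:
    print(f.read().strip())       # -> version=3
```

From the application's point of view both patterns "update the file"; the filesystem-level difference is exactly why package updates are largely harmless on btrfs while databases and VM images are the problem case.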