Re: Very various speed of grep operation on btrfs partition
OK, I ran another experiment. I bought a new HDD and formatted it with the btrfs file system. I also increased the size of the data to grep and wrote a bash script which automates the testing:

#!/bin/bash

# For testing on the Windows machine
#grep_path='/cygdrive/e/Sources/inside'
# For testing on the new HDD
#grep_path='/run/media/mikhail/eaa531cd-25f4-4e00-b31f-22665faa9768/sources/inside'
# For testing in real life
grep_path='/home/mikhail/sources/inside'

command="grep -rn 'float:left;display: block;height: 24px;line-height: 1.2em;position: relative;text-align: center;white-space: nowrap;width: 80px;' '$grep_path'"
log_file='res.log'

# Keep the terminal on fd 3; append stdout and stderr to the log file.
exec 3>&1 1>>"${log_file}" 2>&1

count=0
while true
do
    (( count++ ))
    echo "PASS: $count at $(date +"%T")" | tee /dev/fd/3
    echo "$command" | tee /dev/fd/3
    # Discard grep's own output; the timing goes to both the log and the terminal.
    eval "{ time $command > /dev/null; } |& tee /dev/fd/3"
done

And I got very interesting results:

Linux, btrfs, NEW HDD: 6.441s (the same as in the synthetic tests)
Linux, btrfs, real-data HDD (94% used): 16m52.036s. Very bad, why??? The data is the same as in the first variant.
Windows, NTFS, NEW HDD: 1m27.643s

I am really disappointed that the results in real life (home folder) are so bad. Is it possible to make an HDD that is 94% full perform as fast as an empty one?

Both hard disks are the same model: ST4000NM0033-9ZM170.

--
Best Regards,
Mike Gavrilov.
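To compare the three timings above on one scale, it can help to convert bash's `time` format into plain seconds. The following is only a sketch: the function names `to_seconds` and `slowdown` are made up for illustration and are not part of the test script above.

```shell
#!/bin/bash
# Convert a bash `time` value such as "16m52.036s" or "6.441s" into seconds.
# (to_seconds and slowdown are hypothetical helper names for this sketch.)
to_seconds() {
    local t=$1 min sec
    case $t in
        *m*) min=${t%%m*}; sec=${t#*m} ;;   # "16m52.036s" -> 16 and "52.036s"
        *)   min=0; sec=$t ;;               # "6.441s" has no minutes part
    esac
    sec=${sec%s}                            # drop the trailing "s"
    LC_ALL=C awk -v m="$min" -v s="$sec" 'BEGIN { printf "%.3f\n", m * 60 + s }'
}

# Print how many times slower the first timing is than the second.
slowdown() {
    LC_ALL=C awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", a / b }'
}

to_seconds 16m52.036s          # the 94%-full btrfs disk, in seconds
slowdown 16m52.036s 6.441s     # full disk relative to the new btrfs disk
slowdown 1m27.643s 6.441s      # NTFS run relative to the new btrfs disk
```

The same helper could be pointed at the "real" lines in res.log to summarise many passes at once.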
Re: Very various speed of grep operation on btrfs partition
On 2015-12-06 22:32, Duncan wrote:
> FWIW, I build kde without the semantic-desktop stuff even enabled at build-time (gentoo offers that option) here. All the kdepim stuff (kmail, etc) uses it, so I dumped the several kdepim related apps (kmail, akregator, kaddressbook) I used here and found alternatives. I don't normally need the indexing, which only takes space for the index and lowers performance, so it's all turned off at build-time.

Personally, I just avoid both KDE and GNOME. For me efficiency is most important, and XFCE beats both at that (I've done testing between the three on my laptop with equivalent settings, and XFCE gives me about 50-70% more battery life than either KDE or GNOME, and a much smaller memory footprint), followed closely by lack of lock-in (both GNOME and KDE have a very 'all or nothing' feel to them, and in the case of GNOME, it really is all or nothing). I do have to say though, Okular is by far the best document viewer available for Linux.

>> A day later noticed that the effect of the cache is missing:
>> real 4m33.940s user 0m0.862s sys 0m1.711s
>
> That's probably due to something knocking it out of cache overnite. If you have a cronjob running nitely to update the locate-variant database, for example, as many distros do by default, that'd do it, as that scans pretty much the entire filesystem, typically many times the size of RAM, thus trashing cache.
>
> The indexer could potentially wipe out cache too, particularly on lower memory machines, if it's actively indexing files, as that would normally pull what it's indexing into cache, throwing something else that hasn't been used for awhile away, unless the indexer is smart enough to do direct access and thus not disturb cache, since it's single-time access and caching it isn't going to do anything but force stuff from cache you use more frequently.

Somehow I doubt that the indexer is using O_DIRECT. While that bypasses cache, it also makes things really slow, which is directly counter to their intent to finish indexing as fast as possible (supposedly to minimize performance impact, but that's at odds with their behavior of trashing the cache). That, and the slowest part of indexing is calling stat() on everything (stat is one of the slowest filesystem related system calls, and is in general usually near the top of the list of calls to avoid when doing real-time work, or for that matter anything that needs to be fast).

>> As I understand to solve my problem just need to do the cache is always effective, even if memory occupied by other applications.
>>
>> Is possible to specify minimal size of disk cache?
>
> AFAIK, not directly. What happens is that rather than leave the memory empty, the kernel caches stuff as it reads it. If the memory is needed for apps, it's reclaimed from cache and used for apps. So Linux systems tend to run close to zero really free memory, unless you just dropped caches or rebooted, or you just used some memory hog and it's done and just freed its memory, and you haven't read enough files since then to fill that memory back up with cache.

Yep, there's no direct way to force it to a fixed size. Personally I would love to be able to reserve some fixed amount of RAM for disk caching, but I also usually run with _a lot_ of swap (usually on the order of 4 to 16 times physical RAM, I do work sometimes on really big images).

> However, if you're running swap, there's an adjustment, file /proc/sys/vm/swappiness, but would be set on most distros using the sysctl config (/etc/sysctl.conf and/or /etc/sysctl.d/*), 0-100, that normally controls the balance preference between swapping apps out to keep cache (nearer 100) vs. dumping cache to keep more apps in RAM instead of swapped out (near 0). IIRC the default is 60.
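
The swappiness setting described above can be inspected without root. The sketch below is an illustration only: `VM_DIR`, `read_knob` and `valid_swappiness` are names invented here, and the 0-100 check simply mirrors the range quoted in this thread.

```shell
#!/bin/bash
# Directory holding the vm tunables; overridable for dry runs.
# (VM_DIR is a name chosen for this sketch, not a standard variable.)
VM_DIR=${VM_DIR:-/proc/sys/vm}

# Print the current value of a vm tunable, or "unavailable" if unreadable.
read_knob() {
    local f="$VM_DIR/$1"
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "unavailable"
    fi
}

# Accept only whole numbers in the 0-100 range quoted in the thread.
valid_swappiness() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge 0 ] 2>/dev/null && [ "$1" -le 100 ] 2>/dev/null
}

echo "vm.swappiness = $(read_knob swappiness)"

# To change it (as root), the usual routes are:
#   sysctl -w vm.swappiness=80                  # until the next reboot
#   a line "vm.swappiness = 80" in a file under /etc/sysctl.d/
# The value 80 is just an example leaning towards keeping cache.
```

Whether a higher value actually helps the grep case would need to be measured on the machine in question.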

You may also want to look into /proc/sys/vm/vfs_cache_pressure as well. The lower that is, the less likely the kernel is to reclaim cached directory and inode objects (dentries and inodes). This tends to have a bigger impact if all you care about is the filesystem cache. Don't set it below about 50 though, otherwise it becomes very easy to run the system out of memory.

> Obviously if you're not running swap, all app memory must be kept in physical RAM as it can't be swapped out, and cache simply uses what's left.
>
>> Pity that I can't do 'echo 3 > /proc/sys/vm/drop_caches' on Windows machine. It be interesting how fast grep would be work without cache.
>
> FWIW, I jumped off of MS when they started shipping malware[1] as part of the OS, with eXPrivacy. So I've no idea if they've something similar, tho I'd be somewhat surprised if they didn't, at least as some obscure and possibly undocumented system call, so you'd have to call it from a program written for that purpose, instead of having it exposed such that any admin with suitable privs can do it with a single line command using only shell builtins, as Linux does.

Windows has no way that I know of outside of kernel mode to force the filesystem cache to
Re: Very various speed of grep operation on btrfs partition
Михаил Гаврилов posted on Mon, 07 Dec 2015 02:16:08 +0500 as excerpted:

> 2015-12-04 17:59 GMT+05:00 Austin S Hemmelgarn:
>> Well, what other things are accessing the filesystem at the same time?
>> If you've got something like KDE running with the 'semantic desktop'
>> stuff turned on, then that will seriously impact the performance of
>> other things using that filesystem.
>>
>> The other thing to keep in mind, is that caching may be impacting
>> things somewhat. To really get a good idea of performance for
>> something like this,
>> you should run 'sync' followed by 'echo 3 > /proc/sys/vm/drop_caches'
>> (you'll need to be root for the second one) prior to each run, and
>> ideally have nothing else running on that filesystem.
>
> Thanks for clarifying.
>
> I was able to narrow it down further:
>
> After resetting the cache on a clean machine after a reboot, the grep
> operation took:
> real 2m54.549s user 0m0.662s sys 0m1.062s
>
> After turning off the indexing service (tracker) the result improved:
> real 2m12.182s user 0m0.657s sys 0m1.021s
>
> If the cache is not dropped:
> real 0m0.575s user 0m0.467s sys 0m0.108s
>
> And the result is stable for all subsequent runs, even when the
> indexing service is enabled.

FWIW, I build kde without the semantic-desktop stuff even enabled at build-time (gentoo offers that option) here. All the kdepim stuff (kmail, etc) uses it, so I dumped the several kdepim related apps (kmail, akregator, kaddressbook) I used here and found alternatives. I don't normally need the indexing, which only takes space for the index and lowers performance, so it's all turned off at build-time.

> A day later I noticed that the effect of the cache was gone:
> real 4m33.940s user 0m0.862s sys 0m1.711s

That's probably due to something knocking it out of cache overnite.
If you have a cronjob running nitely to update the locate-variant database, for example, as many distros do by default, that'd do it, as that scans pretty much the entire filesystem, typically many times the size of RAM, thus trashing cache.

The indexer could potentially wipe out cache too, particularly on lower memory machines, if it's actively indexing files, as that would normally pull what it's indexing into cache, throwing something else that hasn't been used for awhile away, unless the indexer is smart enough to do direct access and thus not disturb cache, since it's single-time access and caching it isn't going to do anything but force stuff from cache you use more frequently.

> As I understand it, to solve my problem I just need to make sure the
> cache always stays effective, even when memory is occupied by other
> applications.
>
> Is it possible to specify a minimum size for the disk cache?

AFAIK, not directly. What happens is that rather than leave the memory empty, the kernel caches stuff as it reads it. If the memory is needed for apps, it's reclaimed from cache and used for apps. So Linux systems tend to run close to zero really free memory, unless you just dropped caches or rebooted, or you just used some memory hog and it's done and just freed its memory, and you haven't read enough files since then to fill that memory back up with cache.

However, if you're running swap, there's an adjustment, file /proc/sys/vm/swappiness, but would be set on most distros using the sysctl config (/etc/sysctl.conf and/or /etc/sysctl.d/*), 0-100, that normally controls the balance preference between swapping apps out to keep cache (nearer 100) vs. dumping cache to keep more apps in RAM instead of swapped out (near 0). IIRC the default is 60.

Obviously if you're not running swap, all app memory must be kept in physical RAM as it can't be swapped out, and cache simply uses what's left.

> It's a pity that I can't do 'echo 3 > /proc/sys/vm/drop_caches' on the
> Windows machine.
> It would be interesting to see how fast grep would work without the cache.

FWIW, I jumped off of MS when they started shipping malware[1] as part of the OS, with eXPrivacy. So I've no idea if they've something similar, tho I'd be somewhat surprised if they didn't, at least as some obscure and possibly undocumented system call, so you'd have to call it from a program written for that purpose, instead of having it exposed such that any admin with suitable privs can do it with a single line command using only shell builtins, as Linux does.

>> Additionally, do you have some particular reason that you absolutely
>> _need_ nodatacow to be enabled for the FS? It usually has no impact on
>> performance, but it removes any kind of error correction for file data
>> (checksums can't be used safely without COW semantics). It probably
>> has no direct impact on what you're seeing here, but it is something
>> that really shouldn't be used in most cases at the filesystem level (it
>> can be done on given subvolumes or directories, and that's the
>> recommended way to do it if you don't want to go down to the per-file
>> level).
>>
>>
> I see
Re: Very various speed of grep operation on btrfs partition
2015-12-04 17:59 GMT+05:00 Austin S Hemmelgarn:
> Well, what other things are accessing the filesystem at the same time? If
> you've got something like KDE running with the 'semantic desktop' stuff
> turned on, then that will seriously impact the performance of other things
> using that filesystem.
>
> The other thing to keep in mind, is that caching may be impacting things
> somewhat. To really get a good idea of performance for something like this,
> you should run 'sync' followed by 'echo 3 > /proc/sys/vm/drop_caches'
> (you'll need to be root for the second one) prior to each run, and ideally
> have nothing else running on that filesystem.

Thanks for clarifying.

I was able to narrow it down further:

After resetting the cache on a clean machine after a reboot, the grep operation took:
real 2m54.549s user 0m0.662s sys 0m1.062s

After turning off the indexing service (tracker) the result improved:
real 2m12.182s user 0m0.657s sys 0m1.021s

If the cache is not dropped:
real 0m0.575s user 0m0.467s sys 0m0.108s

And the result is stable for all subsequent runs, even when the indexing service is enabled.

A day later I noticed that the effect of the cache was gone:
real 4m33.940s user 0m0.862s sys 0m1.711s

As I understand it, to solve my problem I just need to make sure the cache always stays effective, even when memory is occupied by other applications.

Is it possible to specify a minimum size for the disk cache?

It's a pity that I can't do 'echo 3 > /proc/sys/vm/drop_caches' on the Windows machine. It would be interesting to see how fast grep would work without the cache.

> On a separate note, if you're either running on a 64-bit system, or have
> less than about 2^31 files on the FS, inode_cache will slow things down.
> It's intended for stuff like mail spools where you have billions of files
> being created and deleted over a few weeks, and quickly use up the inode
> numbers. On almost all systems, it will make things run slower, and
> possibly result in non-deterministic filesystem performance like what you
> are seeing here.

Hmm, less than about 2^31? Maybe you mean more?

> Additionally, do you have some particular reason that you absolutely _need_
> nodatacow to be enabled for the FS? It usually has no impact on
> performance, but it removes any kind of error correction for file data
> (checksums can't be used safely without COW semantics). It probably has no
> direct impact on what you're seeing here, but it is something that really
> shouldn't be used in most cases at the filesystem level (it can be done on
> given subvolumes or directories, and that's the recommended way to do it if
> you don't want to go down to the per-file level).
>

I see that an issue with btrfs is still not closed:
https://code.google.com/p/chromium/issues/detail?id=284738
And gnome-boxes is still very slow when COW is enabled.

--
Best Regards,
Mike Gavrilov.
Re: Very various speed of grep operation on btrfs partition
On 2015-12-03 14:36, Михаил Гаврилов wrote:
> Today at work I needed to search for some strings in a repository. Only a Windows machine was available. I used grep from Cygwin for this task and was surprised by the speed of the NTFS partition. I decided to repeat the task on my home Linux workstation.
>
> [...snip...]
>
> From the results we see that the search sometimes finishes almost instantly, in less than a second, and sometimes takes 4 minutes. The /home partition is formatted with the BTRFS filesystem. I would like to investigate what affects the search speed, and to make the search always finish in less than a second.
>
> Here are my mount options:
> UUID=82df2d84-bf54-46cb-84ba-c88e93677948 /home btrfs subvolid=5,autodefrag,noatime,space_cache,inode_cache,nodatacow 0 0
>
> # uname -a
> Linux localhost.localdomain 4.2.6-301.fc23.x86_64+debug #1 SMP Fri Nov 20 22:07:57 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
>
> How do I start the investigation?

Well, what other things are accessing the filesystem at the same time? If you've got something like KDE running with the 'semantic desktop' stuff turned on, then that will seriously impact the performance of other things using that filesystem.

The other thing to keep in mind, is that caching may be impacting things somewhat. To really get a good idea of performance for something like this, you should run 'sync' followed by 'echo 3 > /proc/sys/vm/drop_caches' (you'll need to be root for the second one) prior to each run, and ideally have nothing else running on that filesystem.

On a separate note, if you're either running on a 64-bit system, or have less than about 2^31 files on the FS, inode_cache will slow things down. It's intended for stuff like mail spools where you have billions of files being created and deleted over a few weeks, and quickly use up the inode numbers. On almost all systems, it will make things run slower, and possibly result in non-deterministic filesystem performance like what you are seeing here.
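
The 'sync' plus drop_caches procedure suggested above could be wrapped roughly like this, so that every timed run starts with a cold cache. This is a sketch under assumptions: `DROP_CACHES_FILE`, `drop_caches` and `cold_grep` are names made up for the example, and the real run needs root because of the write to /proc/sys/vm/drop_caches.

```shell
#!/bin/bash
# File that triggers the cache drop; overridable so the logic can be
# dry-run without root. (DROP_CACHES_FILE is a name chosen for this sketch.)
DROP_CACHES_FILE=${DROP_CACHES_FILE:-/proc/sys/vm/drop_caches}

# Flush dirty pages to disk, then ask the kernel to drop the page cache
# plus dentries and inodes (that is what the value 3 requests).
drop_caches() {
    sync
    echo 3 > "$DROP_CACHES_FILE"
}

# Time one recursive grep that starts with a cold cache.
# Arguments: pattern, directory.
cold_grep() {
    drop_caches || return 1
    time grep -rn -- "$1" "$2" > /dev/null
}

# Example (as root), using the path from the original post:
#   cold_grep 'white-space: nowrap;width: 80px;' /home/mikhail/sources/inside
```

Running it several times in a row should give comparable numbers, as long as nothing else is hitting the same filesystem during the run.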

Additionally, do you have some particular reason that you absolutely _need_ nodatacow to be enabled for the FS? It usually has no impact on performance, but it removes any kind of error correction for file data (checksums can't be used safely without COW semantics). It probably has no direct impact on what you're seeing here, but it is something that really shouldn't be used in most cases at the filesystem level (it can be done on given subvolumes or directories, and that's the recommended way to do it if you don't want to go down to the per-file level).
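
For the per-directory route mentioned above, one common approach on btrfs is the No_COW file attribute set with chattr +C. The sketch below assumes that approach; `set_nocow_dir` and `is_nocow` are hypothetical helper names, and the attribute only affects files created in the directory after it is set.

```shell
#!/bin/bash
# Create a directory (if needed) and mark it No_COW, so that files created
# in it afterwards are written without copy-on-write.
# (set_nocow_dir and is_nocow are hypothetical names for this sketch.)
set_nocow_dir() {
    local dir=$1
    [ -n "$dir" ] || return 2
    mkdir -p "$dir" || return 1
    # Fails on filesystems that do not support the attribute.
    chattr +C "$dir"
}

# Succeed if lsattr reports the C (No_COW) flag on the directory.
is_nocow() {
    local flags
    flags=$(lsattr -d "$1" 2>/dev/null | awk '{ print $1 }')
    case $flags in
        *C*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Example, with a made-up path for virtual machine images:
#   set_nocow_dir "$HOME/vm-images" && is_nocow "$HOME/vm-images" && echo "No_COW set"
```

Existing files keep their current behaviour; they would have to be copied into the directory as new files to pick up the attribute.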
Very various speed of grep operation on btrfs partition
Today at work I needed to search for some strings in a repository. Only a Windows machine was available. I used grep from Cygwin for this task and was surprised by the speed of the NTFS partition. I decided to repeat the task on my home Linux workstation.

[mikhail@localhost ~]$ time grep -rn 'float:left;display: block;height: 24px;line-height: 1.2em;position: relative;text-align: center;white-space: nowrap;width: 80px;' "/home/mikhail/sources/inside/Модули интерфейса/"
/home/mikhail/sources/inside/Модули интерфейса/Активные продажи/Темы/Разводящая группы тем/CRMGroupTheme/CRMGroupTheme.xhtml:38:

real 3m21.262s
user 0m0.914s
sys 0m2.288s

[mikhail@localhost ~]$ time grep -rn 'float:left;display: block;height: 24px;line-height: 1.2em;position: relative;text-align: center;white-space: nowrap;width: 80px;' "/home/mikhail/sources/inside/Модули интерфейса/"
/home/mikhail/sources/inside/Модули интерфейса/Активные продажи/Темы/Разводящая группы тем/CRMGroupTheme/CRMGroupTheme.xhtml:38:

real 0m49.921s
user 0m0.736s
sys 0m0.957s

[mikhail@localhost ~]$ time grep -rn 'float:left;display: block;height: 24px;line-height: 1.2em;position: relative;text-align: center;white-space: nowrap;width: 80px;' "/home/mikhail/sources/inside/Модули интерфейса/"
/home/mikhail/sources/inside/Модули интерфейса/Активные продажи/Темы/Разводящая группы тем/CRMGroupTheme/CRMGroupTheme.xhtml:38:

real 0m1.077s
user 0m0.667s
sys 0m0.409s

[mikhail@localhost ~]$ time grep -rn 'float:left;display: block;height: 24px;line-height: 1.2em;position: relative;text-align: center;white-space: nowrap;width: 80px;' "/home/mikhail/sources/inside/Модули интерфейса/"
/home/mikhail/sources/inside/Модули интерфейса/Активные продажи/Темы/Разводящая группы тем/CRMGroupTheme/CRMGroupTheme.xhtml:38:

real 0m1.062s
user 0m0.657s
sys 0m0.386s

[mikhail@localhost ~]$ time grep -rn 'float:left;display: block;height: 24px;line-height: 1.2em;position: relative;text-align: center;white-space: nowrap;width: 80px;' "/home/mikhail/sources/inside/Модули интерфейса/"
/home/mikhail/sources/inside/Модули интерфейса/Активные продажи/Темы/Разводящая группы тем/CRMGroupTheme/CRMGroupTheme.xhtml:38:

real 0m1.020s
user 0m0.641s
sys 0m0.373s

[mikhail@localhost ~]$ time grep -rn 'float:left;display: block;height: 24px;line-height: 1.2em;position: relative;text-align: center;white-space: nowrap;width: 80px;' "/home/mikhail/sources/inside/Модули интерфейса/"
/home/mikhail/sources/inside/Модули интерфейса/Активные продажи/Темы/Разводящая группы тем/CRMGroupTheme/CRMGroupTheme.xhtml:38:

real 0m0.941s
user 0m0.593s
sys 0m0.335s

[mikhail@localhost ~]$ time grep -rn 'float:left;display: block;height: 24px;line-height: 1.2em;position: relative;text-align: center;white-space: nowrap;width: 80px;' "/home/mikhail/sources/inside/Модули интерфейса/"
/home/mikhail/sources/inside/Модули интерфейса/Активные продажи/Темы/Разводящая группы тем/CRMGroupTheme/CRMGroupTheme.xhtml:38:

real 0m0.918s
user 0m0.594s
sys 0m0.322s

[mikhail@localhost ~]$ time grep -rn 'float:left;display: block;height: 24px;line-height: 1.2em;position: relative;text-align: center;white-space: nowrap;width: 80px;' "/home/mikhail/sources/inside/Модули интерфейса/"
/home/mikhail/sources/inside/Модули интерфейса/Активные продажи/Темы/Разводящая группы тем/CRMGroupTheme/CRMGroupTheme.xhtml:38:

real 0m1.053s
user 0m0.620s
sys 0m0.401s

[mikhail@localhost ~]$ time grep -rn 'float:left;display: block;height: 24px;line-height: 1.2em;position: relative;text-align: center;white-space: nowrap;width: 80px;' "/home/mikhail/sources/inside/Модули интерфейса/"
/home/mikhail/sources/inside/Модули интерфейса/Активные продажи/Темы/Разводящая группы тем/CRMGroupTheme/CRMGroupTheme.xhtml:38:

real 0m1.049s
user 0m0.625s
sys 0m0.411s

[mikhail@localhost ~]$ time grep -rn 'float:left;display: block;height: 24px;line-height: 1.2em;position: relative;text-align: center;white-space: nowrap;width: 80px;' "/home/mikhail/sources/inside/Модули интерфейса/"
/home/mikhail/sources/inside/Модули интерфейса/Активные продажи/Темы/Разводящая группы тем/CRMGroupTheme/CRMGroupTheme.xhtml:38:

real 2m51.858s
user 0m0.885s
sys 0m1.863s

[mikhail@localhost ~]$ time grep -rn 'float:left;display: block;height: 24px;line-height: 1.2em;position: relative;text-align: center;white-space: nowrap;width: 80px;' "/home/mikhail/sources/inside/Модули интерфейса/"
/home/mikhail/sources/inside/Модули интерфейса/Активные продажи/Темы/Разводящая группы тем/CRMGroupTheme/CRMGroupTheme.xhtml:38:

real 3m22.615s
user 0m0.876s
sys 0m2.160s

[mikhail@localhost ~]$ time grep -rn 'float:left;display: block;height: 24px;line-height: 1.2em;position: relative;text-align: center;white-space: nowrap;width: 80px;' "/home/mikhail/sources/inside/Модули интерфейса/"
/home/mikhail/sources/inside/Модули интерфейса/Активные продажи/Темы/Разводящая группы тем/CRMGroupTheme/CRMGroupTheme.xhtml:38:

real 1m16.100s
user 0m0.773s
sys