Re: Give up on bcache?
On 2017-09-26 18:46, Ferry Toth wrote:
> On Tue, 26 Sep 2017 15:52:44 -0400, Austin S. Hemmelgarn wrote:
>> On 2017-09-26 12:50, Ferry Toth wrote:
>>> Looking at the Phoronix benchmark here:
>>>
>>> https://www.phoronix.com/scan.php?page=article=linux414-bcache-raid=2
>>>
>>> I think it might be idle hopes to think bcache can be used as a ssd
>>> cache for btrfs to significantly improve performance.. True, the
>>> benchmark is using ext.
>>
>> It's a benchmark. They're inherently synthetic and workload specific,
>> and therefore should not be trusted to represent things accurately for
>> arbitrary use cases.
>
> So what. A decent benchmark tries to measure a specific aspect of the fs.

Yes, and it usually measures it using a ridiculously unrealistic workload.
Some of the benchmarks in iozone are a good example of this, like the
backwards read one (there is nearly nothing that it provides any useful
data for). For a benchmark to be meaningful, you have to test what you
actually intend to use, and from a practical perspective, that article is
primarily testing throughput, which is not something you should be using
SSD caching for.

> I think you agree that applications doing lots of fsyncs (databases,
> dpkg) are slow on btrfs especially on hdd's, whatever way you measure
> that (it feels slow, it measures slow, it really is slow).

Yes, but they're also slow on _everything_. fsync() is slow. Period. It's
just more of an issue on BTRFS because it's a CoW filesystem _and_ it's
slower than ext4 even with that CoW layer bypassed.

> On a ssd the problem is less.

And most of that is a result of the significantly higher bulk throughput
on the SSD, which is not something that SSD caching replicates.

> So if you can fix that by using a ssd cache or a hybrid solution, how
> would you like to compare that? It _feels_ faster?

That depends. If it's on a desktop, then that actually is one of the best
ways to test it, since user perception is your primary quality metric (you
can make the fastest system in the world, but if the user can't tell,
you've gained nothing). If you're on anything else, you test the actual
workload if possible, and a benchmark that tries to replicate the workload
if not. Put another way, if you're building a PGSQL server, you should be
benchmarking things with a PGSQL benchmarking tool, not some arbitrary
tool that likely won't replicate a PGSQL workload.

>>> But the most important one (where btrfs always shows to be a little
>>> slow) would be the SQLite test. And with ext at least performance
>>> _degrades_ except for the Writeback mode, and even there is nowhere
>>> near what the SSD is capable of.
>>
>> And what makes you think it will be? You're using it as a hot-data
>> cache, not a dedicated write-back cache, and you have the overhead from
>> bcache itself too. Just some simple math based on examining the bcache
>> code suggests you can't get better than about 98% of the SSD's
>> performance if you're lucky, and I'd guess it's more like 80% most of
>> the time.
>>
>>> I think with btrfs it will be even worse and that it is a fundamental
>>> problem: caching is complex and the cache cannot know how the data on
>>> the fs is used.
>>
>> Actually, the improvement from using bcache with BTRFS is, relative to
>> the baseline of not using it, slightly higher than it is when used
>> with ext4. BTRFS does a lot more with the disk, so you have a lot more
>> time spent accessing the disk, and thus more time that can be reduced
>> by improving disk performance. While the CoW nature of BTRFS does
>> somewhat mitigate the performance improvement from using bcache, it
>> does not completely negate it.
>
> I would like to reverse this, how much degradation do you suffer from
> btrfs on a ssd as baseline compared to btrfs on a mixed ssd/hdd system.

Performance-wise? It's workload dependent, but in most cases it's a hit
regardless of whether you're using BTRFS or some other filesystem.

If instead you're asking what the difference in device longevity is, you
can probably expect the SSD to wear out faster in the second case. Unless
you have a reasonably big SSD and are using write-around caching, every
write will hit the SSD too, and you'll end up with lots of rewrites on the
SSD.

> IMHO you are hoping to get ssd performance at hdd cost.

Then you're looking at the wrong tool. The primary use cases for SSD
caching are smoothing latency and improving interactivity by reducing head
movement. Any other measure of performance is pretty much guaranteed to be
worse with SSD caching than just using an SSD, and bulk throughput is
often just as bad as, if not worse than, using a regular HDD by itself.

If you are that desperate for performance like an SSD, quit whining about
cost and just buy an SSD. Decent ones are down to less than 0.40 USD per
GB depending on the brand (search 'Crucial MX300' on Amazon if you want an
example), so the cost isn't nearly as bad as people make it out to be,
especially considering that most of the time a normal person who isn't
doing multimedia work or
Re: Give up on bcache?
On Tue, 26 Sep 2017 15:52:44 -0400, Austin S. Hemmelgarn wrote:
> On 2017-09-26 12:50, Ferry Toth wrote:
>> Looking at the Phoronix benchmark here:
>>
>> https://www.phoronix.com/scan.php?page=article=linux414-bcache-raid=2
>>
>> I think it might be idle hopes to think bcache can be used as a ssd
>> cache for btrfs to significantly improve performance.. True, the
>> benchmark is using ext.
>
> It's a benchmark. They're inherently synthetic and workload specific,
> and therefore should not be trusted to represent things accurately for
> arbitrary use cases.

So what. A decent benchmark tries to measure a specific aspect of the fs.

I think you agree that applications doing lots of fsyncs (databases,
dpkg) are slow on btrfs especially on hdd's, whatever way you measure
that (it feels slow, it measures slow, it really is slow).

On a ssd the problem is less.

So if you can fix that by using a ssd cache or a hybrid solution, how
would you like to compare that? It _feels_ faster?

>> But the most important one (where btrfs always shows to be a little
>> slow) would be the SQLite test. And with ext at least performance
>> _degrades_ except for the Writeback mode, and even there is nowhere
>> near what the SSD is capable of.
>
> And what makes you think it will be? You're using it as a hot-data
> cache, not a dedicated write-back cache, and you have the overhead from
> bcache itself too. Just some simple math based on examining the bcache
> code suggests you can't get better than about 98% of the SSD's
> performance if you're lucky, and I'd guess it's more like 80% most of
> the time.
>
>> I think with btrfs it will be even worse and that it is a fundamental
>> problem: caching is complex and the cache cannot know how the data on
>> the fs is used.
>
> Actually, the improvement from using bcache with BTRFS is higher
> proportionate to the baseline of not using it by a small margin than it
> is when used with ext4. BTRFS does a lot more with the disk, so you
> have a lot more time spent accessing the disk, and thus more time that
> can be reduced by improving disk performance. While the CoW nature of
> BTRFS does somewhat mitigate the performance improvement from using
> bcache, it does not completely negate it.

I would like to reverse this, how much degradation do you suffer from
btrfs on a ssd as baseline compared to btrfs on a mixed ssd/hdd system.

IMHO you are hoping to get ssd performance at hdd cost.

>> I think the original idea of hot data tracking has a much better chance
>> to significantly improve performance. This of course as the SSD's and
>> HDD's then will be equal citizens and btrfs itself gets to decide on
>> which drive the data is best stored.
>
> First, the user needs to decide, not BTRFS (at least, by default, BTRFS
> should not be involved in the decision). Second, tiered storage (that's
> what that's properly called) is mostly orthogonal to caching (though
> bcache and dm-cache behave like tiered storage once the cache is
> warmed).

So, on your desktop you really are going to search for all sqlite, mysql
and psql files, dpkg files etc. and move them to the ssd?

You can already do that. Go ahead!

The big win would be if the file system does that automatically for you.

>> With this implemented right, it would also finally silence the never
>> ending discussion why not btrfs and why zfs, ext, xfs etc. Which would
>> be a plus by its own right.
>
> Even with this, there would still be plenty of reasons to pick one of
> those filesystems over BTRFS. There would however be one more reason to
> pick BTRFS over ext or XFS (but not necessarily ZFS, it already has
> caching built in).

Exactly, one more advantage of btrfs and one less of zfs.

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Give up on bcache?
On Tue, Sep 26, 2017 at 11:33:19PM +0500, Roman Mamedov wrote:
> On Tue, 26 Sep 2017 16:50:00 +0000 (UTC)
> Ferry Toth wrote:
>
> > https://www.phoronix.com/scan.php?page=article=linux414-bcache-raid=2
> >
> > I think it might be idle hopes to think bcache can be used as a ssd
> > cache for btrfs to significantly improve performance..
>
> My personal real-world experience shows that SSD caching -- with
> lvmcache -- does indeed significantly improve performance of a large
> Btrfs filesystem with slowish base storage.
>
> And that article, sadly, only demonstrates once again the general
> mediocre quality of Phoronix content: it is an astonishing oversight to
> not check out lvmcache in the same setup, to at least try to draw some
> useful conclusion, is it Bcache that is strangely deficient, or SSD
> caching as a general concept does not work well in the hardware setup
> utilized.

Also, it looks as if Phoronix' tests don't stress metadata at all. Btrfs
is all about metadata; speeding it up greatly helps most workloads.

A pipe-dream wishlist would be:
* store and access master copy of metadata on SSD only
* pin all data blocks referenced by generations not yet mirrored
* slowly copy over metadata to HDD

--
⢀⣴⠾⠻⢶⣦⠀ We domesticated dogs 36000 years ago; together we chased
⣾⠁⢰⠒⠀⣿⡁ animals, hung out and licked or scratched our private parts.
⢿⡄⠘⠷⠚⠋⠀ Cats domesticated us 9500 years ago, and immediately we got
⠈⠳⣄ agriculture, towns then cities. -- whitroth on /.
Re: Give up on bcache?
On 2017-09-26 12:50, Ferry Toth wrote:
> Looking at the Phoronix benchmark here:
>
> https://www.phoronix.com/scan.php?page=article=linux414-bcache-raid=2
>
> I think it might be idle hopes to think bcache can be used as a ssd
> cache for btrfs to significantly improve performance.. True, the
> benchmark is using ext.

It's a benchmark. They're inherently synthetic and workload specific, and
therefore should not be trusted to represent things accurately for
arbitrary use cases.

> But the most important one (where btrfs always shows to be a little
> slow) would be the SQLite test. And with ext at least performance
> _degrades_ except for the Writeback mode, and even there is nowhere
> near what the SSD is capable of.

And what makes you think it will be? You're using it as a hot-data cache,
not a dedicated write-back cache, and you have the overhead from bcache
itself too. Just some simple math based on examining the bcache code
suggests you can't get better than about 98% of the SSD's performance if
you're lucky, and I'd guess it's more like 80% most of the time.

> I think with btrfs it will be even worse and that it is a fundamental
> problem: caching is complex and the cache cannot know how the data on
> the fs is used.

Actually, the improvement from using bcache with BTRFS is, relative to the
baseline of not using it, slightly higher than it is when used with ext4.
BTRFS does a lot more with the disk, so you have a lot more time spent
accessing the disk, and thus more time that can be reduced by improving
disk performance. While the CoW nature of BTRFS does somewhat mitigate the
performance improvement from using bcache, it does not completely negate
it.

> I think the original idea of hot data tracking has a much better chance
> to significantly improve performance. This of course as the SSD's and
> HDD's then will be equal citizens and btrfs itself gets to decide on
> which drive the data is best stored.

First, the user needs to decide, not BTRFS (at least, by default, BTRFS
should not be involved in the decision). Second, tiered storage (that's
what that's properly called) is mostly orthogonal to caching (though
bcache and dm-cache behave like tiered storage once the cache is warmed).

> With this implemented right, it would also finally silence the never
> ending discussion why not btrfs and why zfs, ext, xfs etc. Which would
> be a plus by its own right.

Even with this, there would still be plenty of reasons to pick one of
those filesystems over BTRFS. There would however be one more reason to
pick BTRFS over ext or XFS (but not necessarily ZFS, it already has
caching built in).
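
The argument in this message, that a filesystem spending more of its time on disk access has more to gain from faster disk access, can be sketched as an Amdahl-style estimate. This is only an illustration: the function name, the disk-bound fractions and the 3x disk speedup are assumptions picked for the example, not measurements from this thread.

```python
def overall_speedup(disk_fraction: float, disk_speedup: float) -> float:
    """Amdahl-style estimate: only the disk-bound share of runtime gets faster."""
    if not 0.0 <= disk_fraction <= 1.0:
        raise ValueError("disk_fraction must be between 0 and 1")
    if disk_speedup <= 0:
        raise ValueError("disk_speedup must be positive")
    return 1.0 / ((1.0 - disk_fraction) + disk_fraction / disk_speedup)


# Hypothetical workloads: the cache makes disk access 3x faster in both cases,
# but one filesystem spends 50% of its time on disk and the other 80%.
lighter = overall_speedup(disk_fraction=0.5, disk_speedup=3.0)
heavier = overall_speedup(disk_fraction=0.8, disk_speedup=3.0)
print(f"disk-light: {lighter:.2f}x, disk-heavy: {heavier:.2f}x")
```

Under these made-up numbers the disk-heavy case gains more in relative terms, which is the shape of the claim, even though its absolute runtime may still be worse.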
Re: Give up on bcache?
On Tue, 26 Sep 2017 23:33:19 +0500, Roman Mamedov wrote:
> On Tue, 26 Sep 2017 16:50:00 +0000 (UTC)
> Ferry Toth wrote:
>
> > https://www.phoronix.com/scan.php?page=article=linux414-bcache-raid=2
> >
> > I think it might be idle hopes to think bcache can be used as a ssd
> > cache for btrfs to significantly improve performance..
>
> My personal real-world experience shows that SSD caching -- with
> lvmcache -- does indeed significantly improve performance of a large
> Btrfs filesystem with slowish base storage.
>
> And that article, sadly, only demonstrates once again the general
> mediocre quality of Phoronix content: it is an astonishing oversight
> to not check out lvmcache in the same setup, to at least try to draw
> some useful conclusion, is it Bcache that is strangely deficient, or
> SSD caching as a general concept does not work well in the hardware
> setup utilized.

Bcache is actually not meant to increase benchmark performance except for
very few corner cases. It is designed to improve interactivity and
perceived performance by reducing head movements. On the bcache homepage
there are actually tips on how to benchmark bcache correctly, including a
warm-up phase and turning on sequential caching. Phoronix doesn't do
that; they test default settings, which is imho a good thing, but you
should know the consequences and research how to turn the knobs.

Depending on the caching mode and cache size, the SQLite test may not
show real-world numbers. Also, you should optimize some btrfs options to
work correctly with bcache, e.g. force it to mount "nossd", as it detects
the bcache device as an SSD, which is wrong for some workloads, I think
especially desktop workloads and most server workloads.
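
To see what the kernel currently reports for each block device, one can read the `queue/rotational` attribute under `/sys/block`. As far as I understand, this is the flag behind the SSD detection mentioned above. The following is a minimal sketch; the function name `rotational_flags` is hypothetical, and the sysfs root is a parameter so the logic can be tried against a scratch directory.

```python
from pathlib import Path


def rotational_flags(sys_block: str = "/sys/block") -> dict:
    """Map each block device name to True if it is flagged as rotational."""
    root = Path(sys_block)
    flags = {}
    if not root.is_dir():
        return flags  # not a Linux system, or sysfs is not mounted
    for dev in sorted(root.iterdir()):
        flag_file = dev / "queue" / "rotational"
        if flag_file.is_file():
            # The attribute holds "1" for rotational devices, "0" otherwise.
            flags[dev.name] = flag_file.read_text().strip() == "1"
    return flags


if __name__ == "__main__":
    for name, rotational in rotational_flags().items():
        print(f"{name}: {'rotational' if rotational else 'non-rotational'}")
```

A bcache device that shows up as non-rotational here is the case where forcing "nossd" may be worth considering.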
Also, you may want to tune udev to correct some attributes so other
applications can do their detection and behave correctly, too:

$ cat /etc/udev/rules.d/00-ssd-scheduler.rules
ACTION=="add|change", KERNEL=="bcache*", ATTR{queue/rotational}="1"
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/iosched/slice_idle}="0"
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="kyber"
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="1", ATTR{queue/scheduler}="bfq"

Take note: on a non-mq system you may want to use noop/deadline/cfq
instead of kyber/bfq.

I have been running bcache for over two years now and the performance
improvement is very high: boot times went down to 30-40s from 3+ minutes
previously, app startup is faster (almost instant, like on an SSD), and
there is less noise thanks to reduced head movements. Also, it has an
easy setup (no split metadata/data cache, you can attach more than one
device to a single cache), and it is rock-solid even when crashing the
system.

Bcache learns by using LRU for caching: what you don't need will be
pushed out of the cache over time, what you use stays. This is actually a
lot like "hot data caching". Given a big enough cache, everything you
need daily would stay in cache, easily achieving hit ratios around 90%.
Since sequential access is bypassed, you don't have to worry about
flushing the cache with large copy operations.

My system uses a 512G SSD with 400G dedicated to bcache, attached to
3x 1TB HDD draid0 mraid1 btrfs, filled with 2TB of net data and daily
backups using borgbackup. Bcache runs in writeback mode; the backup takes
around 15 minutes each night to dig through all data and stores it to an
internal intermediate backup, also on bcache (xfs, write-around mode).
Not implemented yet: this intermediate backup will later be mirrored to
an external, off-site location.
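
As a rough check on the "around 90%" hit ratio, the sketch below recomputes it from the Total Hits and Total Misses counters that appear in the bcache-status output quoted in this message, and then plugs it into a simple weighted-average latency model. The function names and both latency values are assumptions chosen only for illustration; bypassed requests are not part of these two counters and are ignored here.

```python
def hit_ratio(hits: int, misses: int) -> float:
    """Fraction of cache lookups that were served from the cache."""
    total = hits + misses
    if total == 0:
        raise ValueError("no cache lookups recorded")
    return hits / total


def expected_latency_ms(ratio: float, ssd_ms: float, hdd_ms: float) -> float:
    """Weighted average access time: hits go to the SSD, misses to the HDD."""
    return ratio * ssd_ms + (1.0 - ratio) * hdd_ms


# Counters taken from the bcache-status output in this message.
ratio = hit_ratio(hits=2364518, misses=290764)

# Assumed, purely illustrative latencies: 0.1 ms for the SSD, 10 ms for the HDD.
average = expected_latency_ms(ratio, ssd_ms=0.1, hdd_ms=10.0)
print(f"hit ratio {ratio:.1%}, expected access time {average:.2f} ms")
```

Even with a high hit ratio, the remaining misses dominate the average in this toy model, which fits the point made elsewhere in the thread that a cache smooths latency rather than matching a pure SSD.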
Some of the rest of the SSD is EFI-ESP, some swap space, and an
over-provisioned area to keep bcache performance high.

$ uptime && bcache-status
 21:28:44 up 3 days, 20:38, 3 users, load average: 1,18, 1,44, 2,14
--- bcache ---
UUID                  aacfbcd9-dae5-4377-92d1-6808831a4885
Block Size            4.00 KiB
Bucket Size           512.00 KiB
Congested?            False
Read Congestion       2.0ms
Write Congestion      20.0ms
Total Cache Size      400 GiB
Total Cache Used      400 GiB (100%)
Total Cache Unused    0 B (0%)
Evictable Cache       396 GiB (99%)
Replacement Policy    [lru] fifo random
Cache Mode            (Various)
Total Hits            2364518 (89%)
Total Misses          290764
Total Bypass Hits     4284468 (100%)
Total Bypass Misses   0
Total Bypassed        215 GiB

The bucket size and block size were chosen to best fit the Samsung TLC
arrangement. But this is pure theory, I never benchmarked the benefits. I
just feel more comfortable that way. ;-)

One should also keep in mind: the way btrfs works means it cannot use
bcache optimally, as CoW will obviously invalidate data in bcache, but
bcache doesn't have knowledge of this. Of course, such
Re: Give up on bcache?
On Tue, 26 Sep 2017 16:50:00 +0000 (UTC)
Ferry Toth wrote:

> https://www.phoronix.com/scan.php?page=article=linux414-bcache-raid=2
>
> I think it might be idle hopes to think bcache can be used as a ssd
> cache for btrfs to significantly improve performance..

My personal real-world experience shows that SSD caching -- with lvmcache
-- does indeed significantly improve performance of a large Btrfs
filesystem with slowish base storage.

And that article, sadly, only demonstrates once again the general mediocre
quality of Phoronix content: it is an astonishing oversight to not check
out lvmcache in the same setup, to at least try to draw some useful
conclusion, is it Bcache that is strangely deficient, or SSD caching as a
general concept does not work well in the hardware setup utilized.

--
With respect,
Roman
Give up on bcache?
Looking at the Phoronix benchmark here:

https://www.phoronix.com/scan.php?page=article=linux414-bcache-raid=2

I think it might be idle hopes to think bcache can be used as a ssd cache
for btrfs to significantly improve performance.. True, the benchmark is
using ext.

But the most important one (where btrfs always shows to be a little slow)
would be the SQLite test. And with ext at least performance _degrades_
except for the Writeback mode, and even there is nowhere near what the SSD
is capable of.

I think with btrfs it will be even worse and that it is a fundamental
problem: caching is complex and the cache cannot know how the data on the
fs is used.

I think the original idea of hot data tracking has a much better chance to
significantly improve performance. This of course as the SSD's and HDD's
then will be equal citizens and btrfs itself gets to decide on which drive
the data is best stored.

With this implemented right, it would also finally silence the never
ending discussion why not btrfs and why zfs, ext, xfs etc. Which would be
a plus by its own right.