Is there any fundamental reason not to support huge writeback caches?
(I mean, besides working around bugs and/or questionable design
choices which no one wishes to fix.)
The obvious drawback is the increased risk of data loss on hardware
failure or kernel panic, but why couldn't the user be allowed to draw
the line between the probability of data loss and the potential
performance gain?
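
For concreteness, the knobs I have in mind are the existing vm.dirty_*
sysctls; a rough sketch of pushing them toward a huge writeback cache
(values purely illustrative, not a recommendation) would be:

  # Let dirty pages grow to ~40% of RAM before writers are throttled,
  # start background writeback at ~20%, and let dirty data age longer.
  echo 20 > /proc/sys/vm/dirty_background_ratio
  echo 40 > /proc/sys/vm/dirty_ratio
  echo 6000 > /proc/sys/vm/dirty_expire_centisecs   # unit is centiseconds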

The last time I changed hardware, I put double the amount of RAM into
my little home server for the sole purpose of using a relatively huge
cache, especially a huge writeback cache. I soon realized, though, that
writeback ratios like 20/45 (dirty_background_ratio/dirty_ratio) make
the system unstable (OOM reaping), even though ~90% of the memory is
theoretically free, i.e. only used as some form of cache, read or
write, depending on those ratios, and I ended up below the defaults
just to get rid of The Reaper.
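
An alternative, along the lines of the dirty_bytes suggestion quoted
below, would be to cap dirty memory in absolute terms rather than as a
fraction of RAM (writing one of the *_bytes knobs zeroes the matching
*_ratio knob); the numbers here are made up purely for illustration:

  # Bound dirty memory at fixed byte counts instead of a percentage of RAM.
  echo $((512 * 1024 * 1024))      > /proc/sys/vm/dirty_background_bytes
  echo $((4 * 1024 * 1024 * 1024)) > /proc/sys/vm/dirty_bytes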

My plan was to try to reduce the fragmentation of files created by
dumping several parallel real-time video streams into separate files
(and also to minimize the HDD head seeks that causes).
(The computer in question is on a UPS.)
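
For what it's worth, preallocation might attack the same fragmentation
problem independently of the writeback settings; a rough sketch, with
made-up paths and sizes:

  # Preallocate each stream file so the filesystem can hand out large
  # contiguous extents up front.
  touch /data/streams/cam0.ts
  chattr +C /data/streams/cam0.ts   # btrfs only: disable COW while the file is empty
  fallocate -l 4G /data/streams/cam0.ts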

On Thu, Dec 1, 2016 at 4:49 PM, Michal Hocko <mho...@kernel.org> wrote:
> On Wed 30-11-16 10:16:53, Marc MERLIN wrote:
>> +folks from linux-mm thread for your suggestion
>>
>> On Wed, Nov 30, 2016 at 01:00:45PM -0500, Austin S. Hemmelgarn wrote:
>> > > swraid5 < bcache < dmcrypt < btrfs
>> > >
>> > > Copying with btrfs send/receive causes massive hangs on the system.
>> > > Please see this explanation from Linus on why the workaround was
>> > > suggested:
>> > > https://lkml.org/lkml/2016/11/29/667
>> > And Linus' assessment is absolutely correct (at least, the general
>> > assessment is, I have no idea about btrfs_start_shared_extent, but I'm more
>> > than willing to bet he's correct that that's the culprit).
>>
>> > > All of this mostly went away with Linus' suggestion:
>> > > echo 2 > /proc/sys/vm/dirty_ratio
>> > > echo 1 > /proc/sys/vm/dirty_background_ratio
>> > >
>> > > But that's hiding the symptom which I think is that btrfs is piling up
>> > > too many I/O requests during btrfs send/receive and btrfs scrub
>> > > (probably balance too) and not looking at resulting impact to system
>> > > health.
>>
>> > I see pretty much identical behavior using any number of other storage
>> > configurations on a USB 2.0 flash drive connected to a system with 16GB of
>> > RAM with the default dirty ratios because it's trying to cache up to 3.2GB
>> > of data for writeback.  While BTRFS is doing highly sub-optimal things
>> > here, the ancient default writeback ratios are just as much a culprit.  I
>> > would suggest that get changed to 200MB or 20% of RAM, whichever is
>> > smaller, which would give overall almost identical behavior to x86-32,
>> > which in turn works
>> > reasonably well for most cases.  I sadly don't have the time, patience, or
>> > expertise to write up such a patch myself though.
>>
>> Dear linux-mm folks, is that something you could consider (changing the
>> dirty_ratio defaults) given that it affects at least bcache and btrfs
>> (with or without bcache)?
>
> As much as the dirty_*ratio defaults are a major PITA, this is not something
> that would be _easy_ to change without high risks of regressions. This
> topic has been discussed many times with many good ideas, nothing really
> materialized from them though :/
>
> To be honest I really do hate dirty_*ratio and have seen many issues on
> very large machines and always suggested to use dirty_bytes instead but
> a particular value has always been a challenge to get right. It has
> always been very workload specific.
>
> That being said this is something more for IO people than MM IMHO.
>
> --
> Michal Hocko
> SUSE Labs