Christoph Hellwig wrote:
[..]
> > Otherwise, by default the kernel should default to taking as much care
> > as possible to push writes to the smallest failure domain possible.
>
> In which case we need to remove the device dax direct map and MAP_SYNC.
> Reducing the failure domain is not what fsync or REQ_OP_FLUSH are
> about, they are about making changes durable. How durable is up to
> your device implementation. But if you trust it only a little you
> should not offer that half way option to start with.

That's a good point, but (with my Linux kernel developer hat on) I would
flip it around and make this the platform vendor's problem. If the
platform vendor has validated ADR* and that the platform power supplies
maintain stable power in a power-loss scenario, then 'deepflush' is a
complete nop, so why publish a flush mechanism?

In other words, unless the platform vendor has no confidence in the
standard durability model (persistence / durability at global visibility
outside the CPU cache), it should skip publishing these flush hints in
the ACPI NFIT table.

The recourse for an end user whose vendor has published this mechanism
in error is to talk to their BIOS vendor to turn off the flush
capability, or to use the ACPI table override mechanism to edit out the
flush capability.

I will also note that CXL has done away with this software flush concept
and defines a standard Global Persistent Flush (GPF) mechanism in the
protocol that fires at impending power-loss events.

* ADR: Asynchronous DRAM Refresh, a platform signal to flush write
  buffers in the device upon detection of power loss.
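
To make the policy argued above concrete, here is a rough user-space
sketch. It is not kernel code: the struct and function names are
hypothetical, and it only models the decision "no published flush hints
means deepflush is a nop", assuming the firmware-published hint count is
the sole input.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model of a persistent-memory region. This is NOT a
 * kernel structure; it only carries what the sketch needs.
 */
struct pmem_region_model {
	size_t nr_flush_hints;      /* flush hints the firmware published (0 = none) */
	unsigned long deep_flushes; /* deep flushes issued so far, for illustration */
};

/*
 * Model of a 'deepflush' request. Writes are assumed to already be
 * globally visible outside the CPU cache when this is called.
 *
 * Returns true if a deep flush was issued, false if it was a nop.
 */
bool region_deep_flush(struct pmem_region_model *r)
{
	assert(r != NULL);

	/*
	 * No hints published: the vendor is vouching for ADR, so
	 * durability is already reached at global visibility.
	 */
	if (r->nr_flush_hints == 0)
		return false;

	/* Hints published: the vendor is asking software to flush. */
	r->deep_flushes++;
	return true;
}
```

With this shape the kernel needs no trust heuristics of its own: a
vendor that is confident in ADR publishes no hints and pays no flush
cost, and a vendor that is not confident opts in by publishing them.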
