On Tue, Jul 12, 2022 at 06:25:30PM -0700, Dan Williams wrote:
> > This goes back to my question from years ago:  why do we ever
> > do this deep flush in the Linux nvdimm stack to start with?
> 
> The rationale is to push the data to smaller failure domain. Similar to
> flushing disk write-caches.

Flushing disk caches is not about a smaller failure domain.  Flushing
disk caches is about making data durable _at all_.

> Otherwise, if you trust your memory power
> supplies like you trust your disks then just rely on them to take care
> of the data.

Well, all the benchmarketing schemes around pmem seem to trust it.
Why would kernel block I/O be different from device dax or
MAP_SYNC?

> Otherwise, by default the kernel should default to taking as much care
> as possible to push writes to the smallest failure domain possible.

In which case we need to remove the device dax direct map and MAP_SYNC.
Reducing the failure domain is not what fsync or REQ_OP_FLUSH are
about; they are about making changes durable.  How durable is up to
your device implementation.  But if you trust it only a little you
should not offer that halfway option to start with.
