Some of you may remember fsync-gate (https://danluu.com/fsyncgate/), which
showed a lot of vulnerabilities in other popular DBMSs. Some other research
has since been published on the topic as well:
 https://www.usenix.org/conference/atc20/presentation/rebello

I originally discussed this on Twitter back in 2020 but wanted to summarize
it again here.

As usual with these types of reports, there are a lot of flaws in their test
methodology, which invalidate some of their conclusions.

In particular, I question the validity of the failure scenarios their
CuttleFS simulator produces. Specifically, they claim that multiple systems
exhibit False Failures: fsync reports a failure, but the write actually
(partially) succeeded. In the case of LMDB, where a 1-page synchronous write
is involved, this is just an invalid test.

They assume that the relevant sector that LMDB cares about is successfully
written, but an I/O error occurs on some other sector in the page. And so
while LMDB invalidates the commit in memory, a cache flush and subsequent
page-in will read the updated sector. But in the real world, if there are
hard I/O errors on these other sectors, they will most likely also be
unreadable, and a subsequent page-in will also fail. So at least for LMDB,
there would be no false failure.
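
To make the difference between the two assumptions concrete, here is a toy
model in Python. It is only a sketch: it is not CuttleFS and not LMDB code.
The 512-byte sector, the 4096-byte page, the ToyDisk class, and the choice of
sector 0 as "the sector the application cares about" are all assumptions made
up for illustration.

```python
import errno

SECTOR = 512                    # assumed legacy sector size
PAGE = 4096                     # assumed page size
SECTORS_PER_PAGE = PAGE // SECTOR


class ToyDisk:
    """One page of storage, split into sectors; some sectors may be bad."""

    def __init__(self, bad_sectors=(), bad_sectors_readable=False):
        self.sectors = [bytes(SECTOR)] * SECTORS_PER_PAGE
        self.bad = set(bad_sectors)
        # True  = simulator-style assumption: the page can still be read back
        # False = hard-error assumption: a failed sector stays unreadable
        self.bad_sectors_readable = bad_sectors_readable

    def write_page(self, data):
        """Write sector by sector; return False if any sector write failed."""
        ok = True
        for i in range(SECTORS_PER_PAGE):
            if i in self.bad:
                ok = False          # this sector keeps its old contents
                continue
            self.sectors[i] = data[i * SECTOR:(i + 1) * SECTOR]
        return ok

    def read_page(self):
        """Page-in after the cached copy has been dropped."""
        if self.bad and not self.bad_sectors_readable:
            raise OSError(errno.EIO, "hard I/O error on page-in")
        return b"".join(self.sectors)


def false_failure(disk):
    """True if the write was reported as failed, yet a later page-in shows
    the new data in sector 0 (the sector of interest, by assumption)."""
    new_page = b"\x01" * PAGE
    if disk.write_page(new_page):
        return False                # write succeeded, nothing to misreport
    try:
        page = disk.read_page()
    except OSError:
        return False                # page-in fails too: no false failure
    return page[:SECTOR] == new_page[:SECTOR]
```

With bad_sectors_readable=True the model reports a false failure; with the
hard-error assumption (the default) the page-in raises EIO and it does not.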

The failure modes they're modeling don't reflect reality.

Leaving that issue aside, there's also the point that modern storage devices
now use 4KB sectors and still guarantee atomic sector writes. With a 4KB
page, a 1-page write is a single-sector write, so the partial success
scenario they describe can't even happen. This is a bunch of academic
speculation, with a total absence of real-world modeling to validate the
failure scenarios they presented.
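
If you want to check what your own devices report, a minimal Python sketch
along these lines should work on Linux, assuming the usual sysfs attributes
(/sys/block/<dev>/queue/logical_block_size and physical_block_size) are
present. The function names are made up for illustration, and the second
helper simply encodes the atomic-sector-write assumption stated above.

```python
from pathlib import Path


def sector_sizes(sys_block="/sys/block"):
    """Return {device: (logical, physical)} sector sizes in bytes."""
    sizes = {}
    for dev in sorted(Path(sys_block).iterdir()):
        queue = dev / "queue"
        try:
            logical = int((queue / "logical_block_size").read_text())
            physical = int((queue / "physical_block_size").read_text())
        except (OSError, ValueError):
            continue                # device without these attributes
        sizes[dev.name] = (logical, physical)
    return sizes


def page_spans_multiple_sectors(page_size, physical_sector_size):
    """Assuming sector writes are atomic, a page write can only be
    partially applied if the page covers more than one sector."""
    return page_size > physical_sector_size


if __name__ == "__main__" and Path("/sys/block").is_dir():
    for name, (logical, physical) in sector_sizes().items():
        print(name, logical, physical,
              page_spans_multiple_sectors(4096, physical))
```

A device reporting a 4096-byte physical sector gives False for a 4096-byte
page: there is no second sector on which the write could fail separately.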

The other failures they report, on ext4 with journaled data, are certainly
disturbing. But we always recommend turning data journaling off with LMDB;
it's redundant with LMDB's own COW strategy and harms performance for no
benefit.

Of course, you don't even need to trust the filesystem; you can just use
LMDB on a raw block device.

-- 
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
