...
LMDB's write performance is pretty mediocre, by design - we emphasized
durability/reliability over performance here. But in practice, it is
always faster than e.g. BerkeleyDB, which supports multiple concurrent
writers. With multiple writer concurrency, we found that BDB spends
much of its time in contended locks and deadlock resolution. In most
applications, lock acquisition/release and deadlock detection/resolution
consume enough CPU time to completely erase any potential
throughput gain from allowing multiple concurrent writers.

If you want to do writes through mmap() then you need to be extremely
careful, so yes, how LMDB does writes is highly specific to its use of
mmap. Transactional integrity requires that certain writes are
persisted to disk in a particular order; otherwise you get corrupted
data structures. You can use mlock() to prevent pages from being
flushed before you intend them to be, but then you're invoking several
system calls per write, so you haven't gained anything in the
performance department. Or you can do what LMDB does: write
arbitrarily to the map until the end of the transaction (using no
syscalls), then do carefully sequenced final updates and msync() calls.

Note that LMDB works perfectly well on OpenBSD even without a unified
buffer cache; it just requires you to perform writes through the mmap
as well as reads, to sidestep the cache coherency issue. (Of course,
using a writable mmap means you lose LMDB's default immunity to stray
writes through wild pointers.)


But LMDB doesn't even compile on OpenBSD in mmap mode, does it? Or did you fix that in recent months?


Having some kind of vacuum/GC feature on the database file would be nice, even if it only ran rarely and sporadically. I am opposed on principle to database files that are hardcoded to be ever-growing.

I am aware that some kind of "hotswap" logic could be implemented atop LMDB, perhaps quite easily. Such a "hotswap-vacuum" wrapper could even be provided as a standard wrapper atop LMDB, since it would be useful to nearly everyone. It would take *a lot* for me to accept an ever-growing database file.
