On 08/03/2018 12:55 PM, Simon Slavin wrote:
> On 3 Aug 2018, at 8:36pm, Shevek <sql...@anarres.org> wrote:
>> We are running a 100GB SQLite database, which we mmap entirely into RAM. We
>> are having trouble with parts of the disk file being evicted from RAM during
>> periods of low activity, causing slow responses, particularly before 9am. Has
>> anybody played with mlock and/or madvise within the SQLite mmap subsystem to
>> improve this behaviour?
> Is this a genuine Linux machine running on physical hardware, or is it a
> virtual machine?
Yes, it's a genuine physical machine; we have Xeon and Epyc CPUs available.
Sometimes we have to run in VMs (up to 50GB), but the bigger stuff is
all physical. We typically have 256GB+ of RAM, so mmapping a 100GB
database shouldn't put us under any particular memory pressure.
> Are you intentionally doing anything that would contend for this memory? In
> other words, when a memory-mapped portion gets swapped out, does it make sense
> what replaced it, or is it pointless and weird?
Sometimes Linux just seems to get unfriendly with a set of pages and
evicts them. I've been watching it all weekend: right now the system
I'm watching has 165GB free, and I watched Linux just dump 40GB out of
RAM. :-( There are other jobs running on the system and doing I/O, but
nothing that should put any real memory pressure on it beyond ordinary
disk I/O, backups, etc.
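
One way to put numbers on this kind of eviction is mincore(), which reports which pages of a mapping are currently resident. The following is a minimal sketch, assuming Linux and a page-aligned address as returned by mmap(); resident_pages is a hypothetical helper name, not part of SQLite or of anything described in this thread.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Count how many pages of [addr, addr + len) are resident in RAM.
 * addr must be page-aligned (e.g. the pointer returned by mmap).
 * Returns the number of resident pages, or -1 on error. */
long resident_pages(void *addr, size_t len)
{
    assert(addr != NULL && len > 0);

    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t npages = (len + page - 1) / page;

    /* mincore() fills one byte per page of the range. */
    unsigned char *vec = malloc(npages);
    if (vec == NULL)
        return -1;

    if (mincore(addr, len, vec) != 0) {
        free(vec);
        return -1;
    }

    long resident = 0;
    for (size_t i = 0; i < npages; i++)
        resident += vec[i] & 1;   /* lowest bit set: page is resident */

    free(vec);
    return resident;
}
```

Calling this periodically on the database mapping and logging the result would show when, and roughly how much, the kernel drops. For a 100GB mapping with 4KB pages the vector is about 25MB, so it may be worth scanning in chunks.
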
We're about to try mlockall(MCL_FUTURE) along with MAP_SHARED. It might
also be worth trying posix_fadvise(), but I think the kernel only honours
a few megabytes based on that. We did think of a page-toucher thread; that
risks thrashing as much as anything, but it might be interesting for
monitoring page-fault performance.
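
A rough sketch of the page-toucher idea, used here for monitoring rather than as a fix: hint the range with madvise(MADV_WILLNEED), read one byte per page, and report how many major faults (faults that needed disk I/O) the pass cost. This assumes Linux; touch_pages is a hypothetical name, and the fault counter is process-wide, so faults taken by other threads during the pass are included.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Touch one byte in every page of [addr, addr + len).
 * Returns the number of major page faults the process took during
 * the pass, or -1 if the counters could not be read. */
long touch_pages(void *addr, size_t len)
{
    assert(addr != NULL && len > 0);

    struct rusage before, after;
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    const volatile unsigned char *p = addr;

    /* Advisory only: the kernel is free to ignore this hint or to
     * read ahead only part of the range, so the result is not checked. */
    (void)madvise(addr, len, MADV_WILLNEED);

    if (getrusage(RUSAGE_SELF, &before) != 0)
        return -1;

    /* The volatile read forces an actual load from each page. */
    for (size_t off = 0; off < len; off += page)
        (void)p[off];

    if (getrusage(RUSAGE_SELF, &after) != 0)
        return -1;

    return after.ru_majflt - before.ru_majflt;
}
```

A result near zero suggests the range was already in RAM; a large value means the pass itself had to pull pages back from disk, which is the thrashing risk mentioned above.
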
Later note: mlockall() failed because of the JVM heap (it locks every
mapping in the process, not just the database); we're going to have to
do something much more specific, like holding a secondary map and
mlocking that.
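
The secondary-map idea could look something like the sketch below: open the database file again, map it read-only with MAP_SHARED, and mlock() only that range. This is an illustration under assumptions, not the code we run: map_and_lock is a hypothetical name, and it relies on the expectation that pages pinned through one shared mapping stay in the page cache for any other mapping of the same file, including SQLite's own. mlock() can fail with ENOMEM or EPERM if RLIMIT_MEMLOCK is too low, so the lock result is reported separately from the mapping.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map `path` read-only and shared, then try to pin it in RAM.
 * On success returns the mapping, stores its length in *len_out and
 * sets *locked_out to 1 if mlock() succeeded, 0 otherwise.
 * Returns NULL if the file could not be opened or mapped. */
void *map_and_lock(const char *path, size_t *len_out, int *locked_out)
{
    assert(path != NULL && len_out != NULL && locked_out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    size_t len = (size_t)st.st_size;
    void *addr = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);   /* the mapping remains valid after the fd is closed */
    if (addr == MAP_FAILED)
        return NULL;

    /* Pin only this range, unlike mlockall(), which covers the whole
     * process address space. */
    *locked_out = (mlock(addr, len) == 0);
    *len_out = len;
    return addr;
}
```

The caller keeps the returned pointer for the life of the process and releases it with munmap(addr, len), which also drops the lock.
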
Warren:
Is the copy-everything-into-memory strategy not prohibitively expensive
at the 100+GB scale? Is it worth sinking the time into implementing
that? Our rows are very small, only a few bytes each, so the per-row
overhead may be significant. Also, it would be nice to have a shared
mmap, rather than entirely private RAM, so we can run experiments over
the shared (read-only) store.
S.