On Sun, Sep 13, 2015 at 8:18 PM, David Barrett <dbarrett at expensify.com>
wrote:

> Hello!  If I have a database that is larger than the system's physical RAM,
> am I correct in thinking I should actually set a very *small* page cache so
> as to avoid "double caching" the same pages in both sqlite and the file
> cache?
>
> To clarify, if the database is *smaller* than system RAM, I imagine the
> best case scenario (for read performance at least) would be to set a page
> cache size equal to or larger than the database size.  This would ensure
> there is always enough RAM set aside to store up to the entire database in
> the page cache.  (Some subset of the database would also be stored in the
> file cache, but once the page cache is warmed up, you would never touch the
> file cache.)
>
> However, if the database is *larger* than system RAM, even assigning all of
> it to the page cache wouldn't enable it to store the whole database in RAM,
> no matter how "warmed up" it gets.  Furthermore, every time you "miss" the
> page cache, you would always *also* miss the file system cache (because the
> only way a page makes it into the page cache is by first reading it from
> the file system).  So every page read into the page cache requires not just
> that memory, but the same amount of memory in the file system cache.  So
> the larger the ratio between database and system size, the more likely you
> spend 2x the RAM for each page in the page cache.
>
> Accordingly, it would seem that the best configuration for a database
> significantly larger than the system RAM (assuming a perfectly uniform
> access pattern where every page is accessed with equal probability) would
> be to disable the page cache entirely -- and thereby preserve as much RAM
> as possible for the file cache.  Yes, it would be better to have it in the
> page cache than the file cache, but I imagine it's vastly better to read
> from the file cache than to go to the disk.  Is this right?
>

For a read-mostly database, enabling memory-mapped I/O would work well, if
your system supports it.  In that case, SQLite handles pages through a
small wrapper structure that references the memory-mapped data directly in
the filesystem buffers, so each page is held in memory only once.
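
A minimal sketch of turning that on, using Python's stdlib sqlite3
bindings (the pragma is SQLite's real `mmap_size`; the 256 MiB figure and
the file path are just illustrative):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "example.db")
conn = sqlite3.connect(path)

# Ask SQLite to memory-map up to 256 MiB of the database file.
# The pragma replies with the limit actually in effect -- 0 means
# this build was compiled without mmap support.
(effective,) = conn.execute("PRAGMA mmap_size = 268435456").fetchone()
print("mmap limit in effect:", effective)
conn.close()
```

If the pragma comes back 0, the build or platform doesn't support
memory-mapped I/O and you're back to tuning the page cache.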

If that isn't an option, you shouldn't disable the page cache entirely.
SQLite uses b-trees to organize data, so even if every piece of data has
equal likelihood for access, the b-tree interior node pages have _much_
greater likelihood of access than leaf-node pages.  So your cache should be
big enough to hold all/most of the interior pages, plus enough extra space
to make sure they aren't being accidentally evicted too often.  The OS may
be able to figure out the right caching pattern for these pages, but SQLite
certainly has more knowledge about how they're used.
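
To make that concrete, here's a sketch of sizing the cache from the
database's own page statistics (stdlib sqlite3 again; the "~1% of pages
plus slack" heuristic is my illustration for interior-node headroom, not
a SQLite-recommended formula):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (k INTEGER PRIMARY KEY, v TEXT)")
conn.executemany("INSERT INTO t (v) VALUES (?)",
                 (("x" * 100,) for _ in range(10_000)))

(page_size,) = conn.execute("PRAGMA page_size").fetchone()
(page_count,) = conn.execute("PRAGMA page_count").fetchone()

# Interior b-tree pages are a small fraction of the total, so reserve
# room for roughly 1% of pages, doubled for slack, with a sane floor.
target_pages = max(2000, page_count // 100 * 2)
conn.execute(f"PRAGMA cache_size = {target_pages}")
(cache_size,) = conn.execute("PRAGMA cache_size").fetchone()
print("pages:", page_count, "cache pages:", cache_size)
conn.close()
```

A positive `cache_size` is counted in pages; a negative value means
kibibytes, which is often handier when you're budgeting against RAM.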

There is the opposite option: contrive to have the database opened with
O_DIRECT-style access, so that the filesystem reads directly into SQLite's
buffers and doesn't keep its own cache.  You'd still want to leave enough
free memory for the filesystem to cache things like indirection nodes, but
usually those are packed very efficiently.  Then you can devote lots of
memory to SQLite's cache.

Of course, any of these are just fiddling while the Titanic sinks.  When
faced with a need to do literally random access to a file bigger than
memory, it almost doesn't matter how you structure things; it's going to
be painful.  You'd be much better off rearranging your access patterns to
be more streaming in nature -- then you can allocate a modest cache to
SQLite and let the filesystem do appropriate read-ahead.
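
The shape of that streaming pattern, as a small sketch (stdlib sqlite3;
the table name and row count are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO events (payload) VALUES (?)",
                 ((f"event-{i}",) for i in range(1000)))
conn.commit()

# Random point lookups touch scattered pages -- the worst case above:
#   conn.execute("SELECT payload FROM events WHERE id = ?", (some_id,))
# A scan in rowid order instead walks the leaf pages sequentially, so
# the filesystem's read-ahead does most of the caching work for you.
total = 0
for (payload,) in conn.execute("SELECT payload FROM events ORDER BY id"):
    total += 1
print("rows streamed:", total)
conn.close()
```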

-scott
