Andres Freund <and...@anarazel.de> writes:
> Indeed. The buffer mapping hashtable already is visible as a major
> bottleneck in a number of workloads. Even in readonly pgbench if s_b is
> large enough (so the hashtable is larger than the cache). Not to speak
> of things like a cached sequential scan with a cheap qual and wide rows.

To be fair, the added overhead is in buffer allocation, not buffer lookup.
So it shouldn't add cost to fully-cached cases.  As Tomas noted upthread,
the potential trouble spot is where the working set is bigger than shared
buffers but still fits in RAM (so there's no actual I/O needed, but we do
still have to shuffle buffers a lot).

> Wonder if the temporary fix is just to do explicit hashtable probes for
> all pages iff the size of the relation is < s_b / 500 or so. That'll
> address the case where small tables are frequently dropped - and
> dropping large relations is more expensive from the OS and data loading
> perspective, so it's not gonna happen as often.

Oooh, interesting idea.  We'd need a reliable idea of how long the
relation is (preferably without adding an lseek call), but maybe
that's do-able.
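
For concreteness, here is a minimal standalone sketch of that heuristic.
The names (drop_rel_buffers, probe_buffer, scan_all_buffers) are
hypothetical stand-ins for the real bufmgr routines, and the s_b / 500
cutoff is just the number floated above:

    #include <stdio.h>

    /* shared_buffers size in pages; 16384 = 128MB with 8kB pages */
    static int NBuffers = 16384;

    /* Stand-in: would build a BufferTag for this block and do one
     * hashtable lookup in the buffer mapping table. */
    static void
    probe_buffer(unsigned rel_id, unsigned block_no)
    {
        printf("probe rel %u block %u\n", rel_id, block_no);
    }

    /* Stand-in: would walk all NBuffers buffer headers, as the
     * existing DropRelFileNodeBuffers does today. */
    static void
    scan_all_buffers(unsigned rel_id)
    {
        printf("full scan of %d buffers for rel %u\n", NBuffers, rel_id);
    }

    /*
     * Drop a relation's buffers, probing per page only when the
     * relation is much smaller than shared_buffers.
     */
    static void
    drop_rel_buffers(unsigned rel_id, unsigned rel_nblocks)
    {
        if (rel_nblocks < (unsigned) (NBuffers / 500))
        {
            for (unsigned block_no = 0; block_no < rel_nblocks; block_no++)
                probe_buffer(rel_id, block_no);
        }
        else
            scan_all_buffers(rel_id);
    }

    int
    main(void)
    {
        drop_rel_buffers(42, 8);        /* small: 8 probes */
        drop_rel_buffers(43, 100000);   /* large: one full scan */
        return 0;
    }

The only load-bearing part is the branch: per-page probes make the cost
proportional to the relation's size instead of to NBuffers, which is why
the cutoff needs a cheap way to learn the relation's length.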

                        regards, tom lane

