Andres Freund <and...@anarazel.de> writes:
> Indeed. The buffer mapping hashtable already is visible as a major
> bottleneck in a number of workloads. Even in readonly pgbench if s_b is
> large enough (so the hashtable is larger than the cache). Not to speak
> of things like a cached sequential scan with a cheap qual and wide rows.
To be fair, the added overhead is in buffer allocation, not buffer lookup.
So it shouldn't add cost to fully-cached cases.  As Tomas noted upthread,
the potential trouble spot is where the working set is bigger than shared
buffers but still fits in RAM (so there's no actual I/O needed, but we do
still have to shuffle buffers a lot).

> Wonder if the temporary fix is just to do explicit hashtable probes for
> all pages iff the size of the relation is < s_b / 500 or so. That'll
> address the case where small tables are frequently dropped - and
> dropping large relations is more expensive from the OS and data loading
> perspective, so it's not gonna happen as often.

Oooh, interesting idea.  We'd need a reliable idea of how long the
relation is (preferably without adding an lseek call), but maybe
that's do-able.
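
For concreteness, here is a rough sketch (not a patch) of what such a
fast path in DropRelFileNodeBuffers() might look like.  INIT_BUFFERTAG,
BufTableHashCode()/BufTableLookup(), and InvalidateBuffer() are the
existing bufmgr internals; the /500 cutoff is just Andres' number, and
where "nblocks" comes from without an lseek is exactly the open question:

	if (nblocks < (BlockNumber) (NBuffers / 500))
	{
		BlockNumber blkno;

		for (blkno = firstDelBlock; blkno < nblocks; blkno++)
		{
			BufferTag	tag;
			uint32		hash;
			LWLock	   *partitionLock;
			int			buf_id;
			volatile BufferDesc *bufHdr;

			/* probe the buffer mapping hashtable for this one page */
			INIT_BUFFERTAG(tag, rnode.node, forkNum, blkno);
			hash = BufTableHashCode(&tag);
			partitionLock = BufMappingPartitionLock(hash);

			LWLockAcquire(partitionLock, LW_SHARED);
			buf_id = BufTableLookup(&tag, hash);
			LWLockRelease(partitionLock);

			if (buf_id < 0)
				continue;		/* page not in shared buffers */

			/* recheck the tag under the header lock, as the full scan does */
			bufHdr = &BufferDescriptors[buf_id];
			LockBufHdr(bufHdr);
			if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) &&
				bufHdr->tag.forkNum == forkNum &&
				bufHdr->tag.blockNum >= firstDelBlock)
				InvalidateBuffer(bufHdr);	/* releases the header lock */
			else
				UnlockBufHdr(bufHdr);
		}
		return;
	}
	/* ... else fall through to the existing scan over all NBuffers ... */

			regards, tom lane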