Hi,

On 2023-06-09 07:34:49 +1200, Thomas Munro wrote:
> I wasn't in Mathew Wilcox's unconference in Ottawa but I found an
> older article on LWN:
> 
> https://lwn.net/Articles/895217/
> 
> For what it's worth, FreeBSD hackers have studied this topic too (and
> it's been done in Android and no doubt other systems before):
> 
> https://www.cs.rochester.edu/u/sandhya/papers/ispass19.pdf
> 
> I've shared that paper on this list before in the context of
> super/huge pages and their benefits (to executable code, and to the
> buffer pool), but a second topic in that paper is the idea of a shared
> page table: "We find that sharing PTPs across different processes can
> reduce execution cycles by as much as 6.9%. Moreover, the combined
> effects of using superpages to map the main executable and sharing
> PTPs for the small shared libraries can reduce execution cycles up to
> 18.2%."  And that's just part of it, because those guys are more
> interested in shared code/libraries and such so that's probably not
> even getting to the stuff like buffer pool and DSMs that we might tend
> to think of first.

I've experimented with using huge pages for executable code on Linux, and the
benefits are quite noticeable:
https://www.postgresql.org/message-id/20221104212126.qfh3yzi7luvyy5d6%40awork3.anarazel.de

I'm a bit dubious that sharing the page table for executable code increases
the benefit that much further in real workloads. I suspect the reason it was
different for the authors of the paper is:

> A fixed number of back-to-back
> transactions are performed on a 5GB database, and we use the
> -C option of pgbench to toggle between reconnecting after
> each transaction (reconnect mode) and using one persistent
> connection per client (persistent connection mode). We use
> the reconnect mode by default unless stated otherwise.

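Using -C explains why you'd see a lot of benefit from sharing page tables for
executable code: every transaction then runs in a freshly started backend,
which has to fault in its mappings of the executable again. But I don't think
-C is a particularly interesting workload to optimize for.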


> I'm no expert in this stuff, but it seems to be that with shared page
> table schemes you can avoid wasting huge amounts of RAM on duplicated
> page table entries (pages * processes), and with huge/super pages you
> can reduce the number of pages, but AFAIK you still can't escape the
> TLB shootdown cost, which is all-or-nothing (PCID level at best).

Pretty much that. While you can avoid some TLB shootdowns via PCIDs, that only
avoids flushing the TLB. It doesn't help with the TLB hit rate being much
lower due to the number of "redundant" mappings with different PCIDs.
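
To put rough numbers on the "pages * processes" part quoted above, here is a
back-of-the-envelope sketch. It assumes x86-64 with 8-byte page table entries,
counts only leaf-level entries, and assumes every backend has touched every
page of the shared mapping, so it is an upper bound. The function name and the
example sizes (16GB of shared memory, 100 backends) are made up for
illustration, not taken from any measurement:

```python
KiB = 1024
MiB = 1024 ** 2
GiB = 1024 ** 3


def leaf_pte_bytes(mapped_bytes, page_size, processes, pte_size=8):
    """Upper bound on memory used by leaf page table entries.

    Assumes each of `processes` has its own page table and has faulted in
    every page of the same `mapped_bytes` region. pte_size=8 is the x86-64
    entry size; upper page table levels are ignored.
    """
    pages = -(-mapped_bytes // page_size)  # ceiling division
    return pages * pte_size * processes


shared = 16 * GiB   # hypothetical shared memory size
backends = 100      # hypothetical number of processes

# Private page tables, 4kB pages: cost grows with pages * processes.
small_private = leaf_pte_bytes(shared, 4 * KiB, backends)

# Private page tables, 2MB huge pages: far fewer pages per process.
huge_private = leaf_pte_bytes(shared, 2 * MiB, backends)

# One page table shared by all processes, 4kB pages: the processes
# factor drops out entirely.
small_shared = leaf_pte_bytes(shared, 4 * KiB, 1)

print(f"4kB pages, private page tables: {small_private / MiB:.2f} MiB")
print(f"2MB pages, private page tables: {huge_private / MiB:.2f} MiB")
print(f"4kB pages, shared page table:   {small_shared / MiB:.2f} MiB")
```

Either approach shrinks the memory spent on page tables, but neither changes
the number of distinct TLB entries needed once each process has its own
address space, which is the part described above.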


> The only way to avoid TLB shootdowns on context switches is to have *exactly
> the same memory map*.  Or, as Robert succinctly shouted, "THREADS".

+1

Greetings,

Andres Freund

