On 2014-01-13 15:15:16 -0500, Robert Haas wrote:
> On Mon, Jan 13, 2014 at 1:51 PM, Kevin Grittner <[email protected]> wrote:
> > I notice, Josh, that you didn't mention the problems many people
> > have run into with Transparent Huge Page defrag and with NUMA
> > access.
>
> Amen to that.  Actually, I think NUMA can be (mostly?) fixed by
> setting zone_reclaim_mode; is there some other problem besides that?

I think that fixes some of the worst instances, but I've seen machines
spending horrible amounts of CPU (and bus) time in page reclaim
nonetheless. If I analyzed it correctly, it happens in RAM << working set
workloads where RAM is pretty large and most of it is used as page
cache. The kernel ends up spending a huge percentage of its time finding
and potentially defragmenting pages when looking for victim buffers.

> On a related note, there's also the problem of double-buffering.  When
> we read a page into shared_buffers, we leave a copy behind in the OS
> buffers, and similarly on write-out.  It's very unclear what to do
> about this, since the kernel and PostgreSQL don't have intimate
> knowledge of what each other are doing, but it would be nice to solve
> somehow.

I've wondered before whether there might be a chance for postgres to say
"my dear OS, the file range 0-8192 of file x contains y, no need to
reread", and to do that when we evict a page from s_b, but I never dared
to actually propose that to kernel people...

Greetings,

Andres Freund

-- 
 Andres Freund	                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list ([email protected])
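
For the double-buffering point, the nearest hint that exists today runs in the opposite direction from the one wished for above: posix_fadvise() with POSIX_FADV_DONTNEED tells the kernel that the caller no longer needs a file range cached. A minimal sketch of that hint, assuming Linux and Python 3.3+. The helper names and the 8192-byte block size are illustrative assumptions, this is not how PostgreSQL itself handles reads, and the hint is purely advisory:

```python
import os
import tempfile

BLCKSZ = 8192  # assumed block size, matching the 0-8192 range in the message


def read_block_and_drop_os_copy(fd, blockno, blcksz=BLCKSZ):
    """Read one block, then hint that the kernel need not keep its cached copy.

    Hypothetical helper for illustration only.
    """
    offset = blockno * blcksz
    data = os.pread(fd, blcksz, offset)
    if hasattr(os, "posix_fadvise"):
        try:
            # Advisory: the kernel may ignore it, and dirty pages are not dropped.
            os.posix_fadvise(fd, offset, blcksz, os.POSIX_FADV_DONTNEED)
        except OSError:
            pass  # a failed hint is harmless; the data has already been read
    return data


def demo():
    """Write two blocks to a temp file and read back the second one."""
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"a" * BLCKSZ + b"b" * BLCKSZ)
        f.flush()
        os.fsync(f.fileno())  # make the pages clean so the hint can apply
        block = read_block_and_drop_os_copy(f.fileno(), 1)
        return len(block), block[:1]
```

Note what this does not do: it cannot tell the kernel what a range contains, nor hand it a clean page on eviction from s_b, which is the missing interface the message asks about.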
