On Thu, Feb 16, 2017 at 8:13 PM, Thomas Munro <thomas.mu...@enterprisedb.com> wrote: > Obviously there are vanishing returns here as we add more defences > making it increasingly unlikely that we hit "fail" mode. But it > bothers me that hash joins in general are not 100% guaranteed to be > able to complete unless you have infinite RAM.
I think in practice most people are forced to set work_mem to such a small percentage of their available RAM that actual RAM exhaustion is quite rare. The default value of 4MB is probably conservative even for a Raspberry Pi, at least until the connection count spikes unexpectedly, or until you have this problem: https://www.postgresql.org/message-id/16161.1324414...@sss.pgh.pa.us Most advice that I've seen for work_mem involves choosing values like RAM / (4 * max_connections), which tends to result in much larger values that are typically still small very small compared to the amount of RAM that's available at any given moment, because most of the time you either don't have the maximum number of connections or some of them are idle or not all of them are using plans that need any work_mem. Unfortunately, even with these very conservative settings, one sometimes sees a machine get absolutely swamped by a large activity spike at a time when all of the backends just so happen to be running a query that uses 2 or 3 (or 180) copies of work_mem.[1] If I were going to try to do something about the problem of machines running out of memory, I'd be inclined to look at the problem more broadly than "hey, hash joins can exceed work_mem if certain bad things happen" and instead think about "hey, work_mem is a stupid way of deciding on a memory budget". The intrinsic stupidity of work_mem as an allocation system means that (1) it's perfectly possible to run out of memory even if every node respects the memory budget and (2) it's perfectly possible to drastically underutilize the memory you do have even if some nodes fail to respect the memory budget. Of course, if we had a smarter system for deciding on the budget it would be more not less important for nodes to respect the budget they were given, so that's not really an argument against trying to plug the hole you're complaining about here, just a doubt about how much it will improve the user experience if that's the only thing you do. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company [1] Or all of the connections just touch each of your 100,000 relations and the backend-local caches fill up and the whole system falls over without using any work_mem at all. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers