Hi,

On June 25, 2020 3:44:22 PM PDT, Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:
>On 2020-Jun-25, Andres Freund wrote:
>
>> > What are people doing for those cases already? Do we have an
>> > real-world queries that are a problem in PG 13 for this?
>>
>> I don't know about real world, but it's pretty easy to come up with
>> examples.
>>
>> query:
>> SELECT a, array_agg(b) FROM (SELECT generate_series(1, 10000)) a(a),
>> (SELECT generate_series(1, 10000)) b(b) GROUP BY a HAVING
>> array_length(array_agg(b), 1) = 0;
>>
>> work_mem = 4MB
>>
>> 12 18470.012 ms
>> HEAD 44635.210 ms
>>
>> HEAD causes ~2.8GB of file IO, 12 doesn't cause any. If you're IO
>> bandwidth constrained, this could be quite bad.
>
>... however, you can pretty much get the previous performance back by
>increasing work_mem. I just tried your example here, and I get 32
>seconds of runtime for work_mem 4MB, and 13.5 seconds for work_mem 1GB
>(this one spills about 800 MB); if I increase that again to 1.7GB I get
>no spilling and 9 seconds of runtime. (For comparison, 12 takes 15.7
>seconds regardless of work_mem).
>
>My point here is that maybe we don't need to offer a GUC to explicitly
>turn spilling off; it seems sufficient to let users change work_mem so
>that spilling will naturally not occur. Why do we need more?

That's not really a useful escape hatch, because it'll often lead to other nodes using more memory.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.