On Wed, Nov 18, 2015 at 6:19 PM, Robert Haas <robertmh...@gmail.com> wrote: > On Wed, Nov 18, 2015 at 6:29 PM, Peter Geoghegan <p...@heroku.com> wrote: >> In principle, I have no problem with doing that. Through testing, I >> cannot see any actual upside, though. Perhaps I just missed something. >> Even 8MB is enough to avoid the multipass merge in the event of a >> surprisingly high volume of data (my work laptop is elsewhere, so I >> don't have my notes on this in front of me, but I figured out the >> crossover point for a couple of cases). > > I'd be interested in seeing this analysis in some detail.
Sure. Jeff mentioned 8MB as a work_mem setting, so let's examine a case where that's the work_mem setting, and see experimentally where the crossover point for a multi-pass sort ends up. If this table is created: postgres=# create unlogged table bar as select (random() * 1e9)::int4 idx, 'payload xyz'::text payload from generate_series(1, 10100000) i; SELECT 10100000 Then, on my system, a work_mem setting of 8MB *just about* avoids seeing the multipass_warning message with this query: postgres=# select count(distinct idx) from bar ; count ------------ 10,047,433 (1 row) A work_mem setting of 235MB is just enough to make the query's sort fully internal. Let's see how things change with a higher work_mem setting of 16MB. I mentioned quadratic growth: Having doubled work_mem, let's *quadruple* the number of tuples, to see where this leaves a 16MB setting WRT a multi-pass merge: postgres=# drop table bar ; DROP TABLE postgres=# create unlogged table bar as select (random() * 1e9)::int4 idx, 'payload xyz'::text payload from generate_series(1, 10100000 * 4) i; SELECT 40400000 Further experiments show that this is the exact point at which the 16MB work_mem setting similarly narrowly avoids a multi-pass warning. This should be the dominant consideration, because now a fully internal sort requires 4X the work_mem of my original 16MB work_mem example table/query. The quadratic growth in a simple hybrid sort-merge strategy's ability to avoid a multi-pass merge phase (growth relative to linear increases in work_mem) can be demonstrated with simple experiments. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers