On Fri, Jul 24, 2020 at 11:03:54AM -0700, Peter Geoghegan wrote:
> On Fri, Jul 24, 2020 at 8:19 AM Robert Haas <robertmh...@gmail.com> wrote:
> > This is all really good analysis, I think, but this seems like the key
> > finding. It seems like we don't really understand what's actually
> > getting written. Whether we use hash or sort doesn't seem like it
> > should have this kind of impact on how much data gets written, and
> > whether we use CP_SMALL_TLIST or project when needed doesn't seem like
> > it should matter like this either.

> Isn't this more or less the expected behavior when partitions are
> spilled recursively? The cases that Tomas tested were mostly ones
> where work_mem was tiny relative to the data being aggregated.

> The following is an extract from commit 1f39bce0215, showing comments
> and constants added at the beginning of nodeAgg.c:

> + * We also specify a min and max number of partitions per spill. Too few might
> + * mean a lot of wasted I/O from repeated spilling of the same tuples. Too
> + * many will result in lots of memory wasted buffering the spill files (which
> + * could instead be spent on a larger hash table).
> + */
> +#define HASHAGG_PARTITION_FACTOR 1.50
> +#define HASHAGG_MIN_PARTITIONS 4
> +#define HASHAGG_MAX_PARTITIONS 1024


Maybe, but we're nowhere close to these limits. See the table I posted
earlier:

    work_mem    Planned Partitions    HashAgg Batches
    --------    ------------------    ---------------
        2MB                     64               4160
        4MB                    128              16512
        8MB                    256              21488
       64MB                     32               2720
      256MB                      8                  8

This is from the non-parallel runs on the i5 machine with the 32GB data
set. We're nowhere near the 1024-partition limit, and the cardinality
estimates are pretty good.
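
To make the clamping explicit, here is a rough standalone sketch of how
a partition count could be derived from those constants. This is a
paraphrase of the idea, not the actual nodeAgg.c code; in particular
the 8kB per-spill-file buffer size and the ~25% work_mem budget for
spill buffers are my assumptions:

#include <stdio.h>

#define HASHAGG_PARTITION_FACTOR 1.50
#define HASHAGG_MIN_PARTITIONS 4
#define HASHAGG_MAX_PARTITIONS 1024

/* assumed per-spill-file buffer size; the real value may differ */
#define SPILL_BUFFER_SIZE 8192

/*
 * Sketch: use enough partitions that each one should fit in work_mem
 * on the next pass, but not so many that the spill-file buffers alone
 * would eat more than ~25% of work_mem, then clamp to the min/max.
 */
static int
choose_num_partitions(double input_groups, double entry_size,
                      long work_mem_bytes)
{
    double mem_wanted = HASHAGG_PARTITION_FACTOR * input_groups * entry_size;
    int    npartitions = (int) (1 + mem_wanted / work_mem_bytes);
    int    buffer_limit = (int) (work_mem_bytes / 4 / SPILL_BUFFER_SIZE);

    if (npartitions > buffer_limit)
        npartitions = buffer_limit;
    if (npartitions < HASHAGG_MIN_PARTITIONS)
        npartitions = HASHAGG_MIN_PARTITIONS;
    if (npartitions > HASHAGG_MAX_PARTITIONS)
        npartitions = HASHAGG_MAX_PARTITIONS;

    return npartitions;
}

int
main(void)
{
    long mb = 1024L * 1024;

    /* hypothetical: ~100M groups of ~64 bytes each */
    printf("2MB: %d\n", choose_num_partitions(100e6, 64, 2 * mb));
    printf("4MB: %d\n", choose_num_partitions(100e6, 64, 4 * mb));
    return 0;
}

With these assumed numbers the two smallest settings come out as 64 and
128 partitions, i.e. the buffer cap would be what binds, not the 1024
cap - which would be consistent with the table above.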

OTOH the number of batches is much higher, so clearly there was some
recursive spilling happening. What I find strange is that the batch
count grows with work_mem up to 8MB, and only drops once work_mem
reaches 64MB.
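
FWIW the two smallest settings are consistent with exactly one extra
level of respilling: if the first pass writes P partitions and each of
those respills into another P children, you end up with P + P^2
batches, and that matches the 2MB and 4MB rows exactly. A trivial check
(the uniform fan-out is of course an assumption):

#include <stdio.h>

/*
 * If the initial pass plans P partitions and every one of them
 * respills into another P children (one extra level of recursion),
 * the total number of batches is P + P^2.
 */
int
main(void)
{
    int planned[] = {64, 128, 256};

    for (int i = 0; i < 3; i++)
    {
        int p = planned[i];
        printf("P = %3d -> P + P^2 = %d\n", p, p + p * p);
    }

    /* Prints 4160, 16512 and 65792. The first two match the observed
     * batch counts exactly; the 8MB row (21488) falls well short of
     * 65792, so apparently only some of the 256 partitions respilled. */
    return 0;
}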

Also, how could the amount of I/O be almost constant in all these cases?
Surely more recursive spilling should do more I/O, but the Disk Usage
reported by explain analyze does not show anything like that.
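
One possible explanation: each spilled tuple gets written once per
recursion level it passes through, so the total write volume is roughly
(data spilled in the first pass) x (average recursion depth),
independent of how many batches that fans out into. Back-of-the-envelope,
with a made-up spill size:

#include <stdio.h>

/*
 * Back-of-the-envelope: total spill I/O depends on how many times
 * each tuple gets rewritten (the recursion depth), not on how many
 * batches there are. The 30GB figure is made up for illustration.
 */
int
main(void)
{
    double spilled_gb = 30.0;   /* hypothetical first-pass spill volume */

    for (int depth = 1; depth <= 3; depth++)
        printf("avg depth %d -> ~%.0f GB written\n",
               depth, spilled_gb * depth);

    /* At the same average depth, the 2MB and 4MB runs would write
     * about the same amount, despite a 4x difference in batches. */
    return 0;
}

If the average depth stays around 2 across the small work_mem runs,
Disk Usage would barely move even though the batch counts differ a lot.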


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

