On Thu, 9 Jan 2025 at 09:50, Jeff Davis <[email protected]> wrote:
> Attached POC patch, which reduces memory usage by ~15% for a simple
> distinct query on an integer key. Performance is the same or perhaps a
> hair faster.
>
> It's not many lines of code, but the surrounding code might benefit
> from some refactoring which would make it a bit simpler.

Thanks for working on this. Here's a preliminary review:

Since bump.c does not add headers to the palloc'd chunks, I think the
following code from hash_agg_entry_size() shouldn't be using
CHUNKHDRSZ anymore.

    tupleChunkSize = CHUNKHDRSZ + tupleSize;

    if (pergroupSize > 0)
        pergroupChunkSize = CHUNKHDRSZ + pergroupSize;
    else
        pergroupChunkSize = 0;

You should be able to get rid of pergroupChunkSize and just use
pergroupSize in the return.

I did some benchmarking using the attached script. There's a general
speedup, but I saw an unexpected increase in the number of batches
with the patched version on certain tests. See the attached results.
For example, the work_mem = 8MB test with 10 million rows shows
"Batches: 129" on master but "Batches: 641" with the patched version.
I didn't check why.
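
To list every test where the batch count differs, something like the following might help. It is only a rough sketch: compare_batches and the file names master.txt and patched.txt are made up for illustration, and it assumes each run's output was saved with one result per line, in the form the script prints ("... Batches: N ... <rows> <work_mem> <t1> <t2> <t3> ms").

```shell
# Hypothetical helper, not part of the attached script.
# Usage: compare_batches <results of first run> <results of second run>
# Prints "rows work_mem batches_in_first batches_in_second" for every
# test whose batch count differs between the two files.
compare_batches()
{
    awk '
        # Only look at complete result lines, which end in "ms".
        $NF == "ms" {
            for (i = 1; i < NF; i++)
                if ($i == "Batches:") {
                    # rows and work_mem are the two fields that come
                    # before the three latency figures.
                    key = $(NF - 5) " " $(NF - 4)
                    if (FNR == NR)
                        first[key] = $(i + 1)    # still in the first file
                    else if (key in first && first[key] != $(i + 1))
                        print key, first[key], $(i + 1)
                }
        }' "$1" "$2"
}
```

With the results saved that way, "compare_batches master.txt patched.txt" would print "10000000 8MB 129 641" for the case mentioned above, and nothing for tests where both runs used the same number of batches.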
David
#!/bin/bash

dbname=postgres
secs=10

# Run with parallel query and JIT disabled
psql -c "alter system set max_parallel_workers_per_gather = 0;" $dbname > /dev/null
psql -c "alter system set jit = 0;" $dbname > /dev/null
psql -c "select pg_reload_conf();" $dbname > /dev/null

psql -c "create extension if not exists pg_prewarm;" $dbname > /dev/null
psql -c "drop table if exists hashagg;" $dbname > /dev/null
psql -c "create table hashagg (a bigint);" $dbname > /dev/null

for rows in 10000 100000 1000000 10000000
#for rows in 10000000
do
    # Reload the table with $rows distinct values and prewarm it
    psql -c "truncate table hashagg;" $dbname > /dev/null
    psql -c "insert into hashagg select a from generate_series(1, $rows) a;" $dbname > /dev/null
    psql -c "vacuum freeze analyze hashagg;" $dbname > /dev/null
    psql -c "select pg_prewarm('hashagg');" $dbname > /dev/null

    for work_mem in '512kB' '1MB' '2MB' '4MB' '8MB' '16MB' '32MB' '64MB' '128MB' '256MB'
    do
        psql -c "alter system set work_mem = '$work_mem';" $dbname > /dev/null
        psql -c "select pg_reload_conf();" $dbname > /dev/null

        echo "select a,count(*) from hashagg group by a;" > bench.sql

        # Print the "Batches" line from EXPLAIN ANALYZE, then the test parameters
        psql -c "explain analyze select a,count(*) from hashagg group by a;" $dbname | grep "Batches" | tr '\n' ' '
        echo -n "$rows $work_mem "

        # Three pgbench runs, printing the reported latency of each
        for i in {1..3}
        do
            pgbench -n -f bench.sql -M prepared -T $secs $dbname | grep latency | sed 's/[^0-9.]//g' | tr '\n' ' '
        done

        echo "ms"
    done
done
master @ 231006451
$ ./hashagg_bench.sh
Batches: 5 Memory Usage: 1073kB Disk Usage: 200kB 10000 512kB 4.110 4.090 4.061 ms
Batches: 1 Memory Usage: 1041kB 10000 1MB 3.735 3.721 3.703 ms
Batches: 1 Memory Usage: 1041kB 10000 2MB 3.710 3.714 3.709 ms
Batches: 1 Memory Usage: 1169kB 10000 4MB 3.525 3.495 3.506 ms
Batches: 1 Memory Usage: 1425kB 10000 8MB 3.503 3.475 3.521 ms
Batches: 1 Memory Usage: 1425kB 10000 16MB 3.513 3.508 3.511 ms
Batches: 1 Memory Usage: 1425kB 10000 32MB 3.521 3.520 3.499 ms
Batches: 1 Memory Usage: 1425kB 10000 64MB 3.527 3.514 3.513 ms
Batches: 1 Memory Usage: 1425kB 10000 128MB 3.508 3.502 3.502 ms
Batches: 1 Memory Usage: 1425kB 10000 256MB 3.488 3.512 3.515 ms
Planned Partitions: 16 Batches: 17 Memory Usage: 1041kB Disk Usage: 3040kB 100000 512kB 46.576 46.636 46.531 ms
Planned Partitions: 8 Batches: 9 Memory Usage: 2065kB Disk Usage: 3416kB 100000 1MB 47.087 47.063 47.024 ms
Planned Partitions: 4 Batches: 5 Memory Usage: 4145kB Disk Usage: 1768kB 100000 2MB 44.445 44.596 44.609 ms
Batches: 5 Memory Usage: 8241kB Disk Usage: 728kB 100000 4MB 43.944 44.125 43.987 ms
Batches: 1 Memory Usage: 12817kB 100000 8MB 48.186 47.549 47.597 ms
Batches: 1 Memory Usage: 13329kB 100000 16MB 46.805 46.795 46.685 ms
Batches: 1 Memory Usage: 14353kB 100000 32MB 46.127 46.575 46.087 ms
Batches: 1 Memory Usage: 14353kB 100000 64MB 46.536 46.173 46.847 ms
Batches: 1 Memory Usage: 14353kB 100000 128MB 46.770 45.984 46.219 ms
Batches: 1 Memory Usage: 14353kB 100000 256MB 46.706 46.043 46.078 ms
Planned Partitions: 32 Batches: 217 Memory Usage: 1105kB Disk Usage: 30608kB 1000000 512kB 545.327 543.614 543.034 ms
Planned Partitions: 64 Batches: 65 Memory Usage: 2193kB Disk Usage: 28648kB 1000000 1MB 476.331 477.259 476.427 ms
Planned Partitions: 32 Batches: 33 Memory Usage: 4113kB Disk Usage: 30584kB 1000000 2MB 484.892 484.598 487.688 ms
Planned Partitions: 16 Batches: 17 Memory Usage: 8337kB Disk Usage: 31328kB 1000000 4MB 490.869 489.658 492.045 ms
Planned Partitions: 8 Batches: 9 Memory Usage: 16465kB Disk Usage: 23936kB 1000000 8MB 523.156 524.944 521.537 ms
Planned Partitions: 4 Batches: 5 Memory Usage: 32817kB Disk Usage: 19864kB 1000000 16MB 573.808 573.492 575.078 ms
Planned Partitions: 4 Batches: 5 Memory Usage: 65585kB Disk Usage: 11608kB 1000000 32MB 611.767 609.259 609.378 ms
Batches: 1 Memory Usage: 114705kB 1000000 64MB 609.220 605.576 608.327 ms
Batches: 1 Memory Usage: 114705kB 1000000 128MB 611.371 618.847 623.132 ms
Batches: 1 Memory Usage: 114705kB 1000000 256MB 618.622 616.254 618.368 ms
Planned Partitions: 32 Batches: 4182 Memory Usage: 1105kB Disk Usage: 292240kB 10000000 512kB 6282.199 6159.717 6166.364 ms
Planned Partitions: 64 Batches: 1001 Memory Usage: 2193kB Disk Usage: 322784kB 10000000 1MB 6024.200 6051.042 6047.530 ms
Planned Partitions: 128 Batches: 641 Memory Usage: 4241kB Disk Usage: 384136kB 10000000 2MB 5784.646 5876.862 5816.221 ms
Planned Partitions: 256 Batches: 257 Memory Usage: 8465kB Disk Usage: 506976kB 10000000 4MB 5408.663 5395.462 5404.561 ms
Planned Partitions: 128 Batches: 129 Memory Usage: 16401kB Disk Usage: 384112kB 10000000 8MB 5602.240 5570.921 5608.489 ms
Planned Partitions: 64 Batches: 65 Memory Usage: 33297kB Disk Usage: 322656kB 10000000 16MB 6551.026 6594.607 6477.449 ms
Planned Partitions: 32 Batches: 33 Memory Usage: 65809kB Disk Usage: 259968kB 10000000 32MB 7553.664 7540.932 7462.904 ms
Planned Partitions: 16 Batches: 17 Memory Usage: 131217kB Disk Usage: 244400kB 10000000 64MB 8147.225 8086.363 8083.296 ms
Planned Partitions: 8 Batches: 9 Memory Usage: 262225kB Disk Usage: 211616kB 10000000 128MB 8358.746 8312.645 8333.737 ms
Planned Partitions: 4 Batches: 5 Memory Usage: 524337kB Disk Usage: 134640kB 10000000 256MB 7950.195 7943.805 7963.620 ms
Jeff's bump alloc hashagg patch v1:
Batches: 1 Memory Usage: 921kB 10000 512kB 3.607 3.634 3.631 ms
Batches: 1 Memory Usage: 921kB 10000 1MB 3.534 3.521 3.588 ms
Batches: 1 Memory Usage: 921kB 10000 2MB 3.591 3.537 3.579 ms
Batches: 1 Memory Usage: 921kB 10000 4MB 3.589 3.598 3.589 ms
Batches: 1 Memory Usage: 921kB 10000 8MB 3.587 3.598 3.580 ms
Batches: 1 Memory Usage: 921kB 10000 16MB 3.583 3.623 3.567 ms
Batches: 1 Memory Usage: 921kB 10000 32MB 3.565 3.601 3.581 ms
Batches: 1 Memory Usage: 921kB 10000 64MB 3.573 3.573 3.586 ms
Batches: 1 Memory Usage: 921kB 10000 128MB 3.569 3.580 3.590 ms
Batches: 1 Memory Usage: 921kB 10000 256MB 3.607 3.583 3.577 ms
Planned Partitions: 16 Batches: 17 Memory Usage: 1049kB Disk Usage: 3040kB 100000 512kB 46.054 46.249 45.926 ms
Planned Partitions: 8 Batches: 9 Memory Usage: 1881kB Disk Usage: 3392kB 100000 1MB 46.470 46.493 46.507 ms
Planned Partitions: 4 Batches: 5 Memory Usage: 3641kB Disk Usage: 1680kB 100000 2MB 43.501 43.606 43.470 ms
Batches: 5 Memory Usage: 10297kB Disk Usage: 208kB 100000 4MB 47.999 47.991 48.039 ms
Batches: 1 Memory Usage: 10265kB 100000 8MB 44.940 45.104 45.018 ms
Batches: 1 Memory Usage: 10265kB 100000 16MB 44.832 44.912 44.783 ms
Batches: 1 Memory Usage: 10265kB 100000 32MB 44.550 44.923 44.819 ms
Batches: 1 Memory Usage: 10265kB 100000 64MB 44.578 44.832 44.827 ms
Batches: 1 Memory Usage: 10265kB 100000 128MB 44.792 45.085 44.813 ms
Batches: 1 Memory Usage: 10265kB 100000 256MB 44.777 44.901 44.789 ms
Planned Partitions: 32 Batches: 657 Memory Usage: 1473kB Disk Usage: 30608kB 1000000 512kB 602.477 602.585 603.783 ms
Planned Partitions: 64 Batches: 65 Memory Usage: 2329kB Disk Usage: 28648kB 1000000 1MB 470.372 472.783 473.525 ms
Planned Partitions: 32 Batches: 33 Memory Usage: 3865kB Disk Usage: 30568kB 1000000 2MB 487.759 487.002 485.999 ms
Planned Partitions: 16 Batches: 81 Memory Usage: 10393kB Disk Usage: 31296kB 1000000 4MB 547.232 549.565 550.964 ms
Planned Partitions: 8 Batches: 41 Memory Usage: 20569kB Disk Usage: 23800kB 1000000 8MB 595.448 594.752 595.890 ms
Planned Partitions: 4 Batches: 21 Memory Usage: 41017kB Disk Usage: 19256kB 1000000 16MB 678.615 674.505 674.172 ms
Planned Partitions: 4 Batches: 5 Memory Usage: 81977kB Disk Usage: 7528kB 1000000 32MB 631.274 627.113 628.018 ms
Batches: 1 Memory Usage: 90137kB 1000000 64MB 584.076 585.099 582.715 ms
Batches: 1 Memory Usage: 90137kB 1000000 128MB 584.275 579.396 581.503 ms
Batches: 1 Memory Usage: 90137kB 1000000 256MB 577.607 589.093 584.943 ms
Planned Partitions: 32 Batches: 8469 Memory Usage: 1473kB Disk Usage: 292240kB 10000000 512kB 6870.746 6877.693 6903.483 ms
Planned Partitions: 64 Batches: 4009 Memory Usage: 2881kB Disk Usage: 322784kB 10000000 1MB 6841.746 6809.542 6816.949 ms
Planned Partitions: 128 Batches: 2673 Memory Usage: 5185kB Disk Usage: 384136kB 10000000 2MB 6449.305 6446.077 6438.334 ms
Planned Partitions: 256 Batches: 257 Memory Usage: 9241kB Disk Usage: 506976kB 10000000 4MB 5359.919 5360.792 5365.025 ms
Planned Partitions: 128 Batches: 641 Memory Usage: 21529kB Disk Usage: 384104kB 10000000 8MB 6663.404 6636.793 6632.275 ms
Planned Partitions: 64 Batches: 321 Memory Usage: 41497kB Disk Usage: 322616kB 10000000 16MB 8039.401 8034.878 8046.826 ms
Planned Partitions: 32 Batches: 161 Memory Usage: 82201kB Disk Usage: 259840kB 10000000 32MB 8917.696 9073.211 8933.217 ms
Planned Partitions: 16 Batches: 17 Memory Usage: 163993kB Disk Usage: 243832kB 10000000 64MB 8144.712 8089.529 8164.517 ms
Planned Partitions: 8 Batches: 9 Memory Usage: 311385kB Disk Usage: 195776kB 10000000 128MB 8258.068 8346.843 8250.325 ms
Planned Partitions: 4 Batches: 5 Memory Usage: 630841kB Disk Usage: 113920kB 10000000 256MB 7946.322 8081.977 7913.341 ms