Re: Use virtual tuple slot for Unique node

Heikki Linnakangas Fri, 22 Sep 2023 06:22:10 -0700

I did a little more perf testing with this. I'm seeing the same benefit with the query you posted. But can we find a case where it's not beneficial? If I understand correctly, when the input slot is a virtual slot, it's cheaper to copy it to another virtual slot than to form a minimal tuple. Like in your test case. What if the input is a minimal tuple?


On master:


postgres=# set enable_hashagg=off;
SET
postgres=# explain analyze select distinct g::text, 'a', 'b', 'c','d','e','f','g','h' from generate_series(1, 5000000) g;
                                                                    QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=2630852.42..2655852.42 rows=200 width=288) (actual time=4525.212..6576.992 rows=5000000 loops=1)
   ->  Sort  (cost=2630852.42..2643352.42 rows=5000000 width=288) (actual time=4525.211..5960.967 rows=5000000 loops=1)
         Sort Key: ((g)::text)
         Sort Method: external merge  Disk: 165296kB
         ->  Function Scan on generate_series g  (cost=0.00..75000.00 rows=5000000 width=288) (actual time=518.914..1194.702 rows=5000000 loops=1)
 Planning Time: 0.036 ms
 JIT:
   Functions: 5
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 0.242 ms (Deform 0.035 ms), Inlining 63.457 ms, Optimization 29.764 ms, Emission 20.592 ms, Total 114.056 ms
 Execution Time: 6766.399 ms
(11 rows)


With the patch:

postgres=# set enable_hashagg=off;
SET
postgres=# explain analyze select distinct g::text, 'a', 'b', 'c','d','e','f','g','h' from generate_series(1, 5000000) g;
                                                                    QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=2630852.42..2655852.42 rows=200 width=288) (actual time=4563.639..7362.467 rows=5000000 loops=1)
   ->  Sort  (cost=2630852.42..2643352.42 rows=5000000 width=288) (actual time=4563.637..6069.000 rows=5000000 loops=1)
         Sort Key: ((g)::text)
         Sort Method: external merge  Disk: 165296kB
         ->  Function Scan on generate_series g  (cost=0.00..75000.00 rows=5000000 width=288) (actual time=528.060..1191.483 rows=5000000 loops=1)
 Planning Time: 0.720 ms
 JIT:
   Functions: 5
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 0.406 ms (Deform 0.065 ms), Inlining 68.385 ms, Optimization 21.656 ms, Emission 21.033 ms, Total 111.480 ms
 Execution Time: 7585.306 ms
(11 rows)

So not a win in this case: execution time went from 6766 ms on master to 7585 ms with the patch. Could you peek at the outer slot type, and use the same kind of slot for the Unique's result? Or some more complicated logic, like use a virtual slot if all the values are pass-by-val? I'd also like to keep this simple, though...
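To make the two alternatives concrete, here is a rough standalone sketch of the decision logic only. It is a toy model, not executor code: the SlotKind enum and both function names are made up for illustration, and a real patch would instead have to choose between the executor's actual slot implementations when the Unique node is initialized.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the slot implementations; not PostgreSQL's API. */
typedef enum SlotKind
{
    SLOT_VIRTUAL,
    SLOT_MINIMAL_TUPLE
} SlotKind;

/*
 * Alternative 1: peek at the outer slot type and use the same kind of slot
 * for the Unique's result.
 */
SlotKind
same_kind_as_outer(SlotKind outer_kind)
{
    return outer_kind;
}

/*
 * Alternative 2: use a virtual slot only if all the values are pass-by-val,
 * otherwise fall back to a minimal tuple.  att_byval[i] says whether
 * column i is pass-by-value (assumed to be known from the tuple descriptor).
 */
SlotKind
virtual_if_all_byval(const bool *att_byval, int natts)
{
    assert(natts >= 0);

    for (int i = 0; i < natts; i++)
    {
        if (!att_byval[i])
            return SLOT_MINIMAL_TUPLE;
    }
    return SLOT_VIRTUAL;
}
```

With the query above, the Sort hands minimal tuples to the Unique and the first column is text, so both rules would keep a minimal tuple slot there, which is the behavior on master. Whether either rule is actually a win in other cases would still need measuring.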


Would this kind of optimization make sense elsewhere?

--
Heikki Linnakangas
Neon (https://neon.tech)
