Re: Performance of Query 4 on TPC-DS Benchmark

Andrei Lepikhov Mon, 11 Nov 2024 01:41:27 -0800

On 11/11/24 02:35, Ba Jinsheng wrote:

Hi all,


Please see this case:


Query 4 on TPC-DS benchmark:

Thank you for the interesting example!
Looking at the EXPLAIN outputs, I see two Sort nodes:
->  Sort  (cost=794037.94..794037.95 rows=1 width=132)
   (actual time=3024403.310..3024403.313 rows=8 loops=1)
->  Sort  (cost=794033.93..794033.94 rows=1 width=132)
   (actual time=8068.869..8068.872 rows=8 loops=1)

The costs are almost the same, yet the execution times differ by orders of magnitude. So I think the core of the problem is the accuracy of selectivity estimation.

In this specific example I see lots of composite scan filters:
- ((sale_type = 'w'::text) AND (dyear = 2002))

- ((year_total > '0'::numeric) AND (sale_type = 'w'::text) AND (dyear = 2001))

- ((year_total > '0'::numeric) AND (sale_type = 's'::text) AND (dyear = 2001))

It is always a challenge for PostgreSQL to estimate such a filter, because it has no information on the joint distribution of the columns.
Could you investigate this direction by building extended statistics on these clauses? That could move the plan in a more optimal direction.
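
To illustrate why a composite filter is hard to estimate without joint statistics, here is a minimal Python sketch. It is not PostgreSQL code: the rows are hypothetical toy data, not TPC-DS data, and the helper name `selectivity` is made up for this example. It compares the estimate obtained by multiplying per-column selectivities (the independence assumption) with the real fraction of matching rows when the columns are correlated.

```python
# Hypothetical toy rows of (sale_type, dyear); NOT taken from TPC-DS.
# The columns are deliberately correlated: 'w' rows fall mostly in 2002.
rows = (
    [("w", 2002)] * 40
    + [("w", 2001)] * 10
    + [("s", 2001)] * 40
    + [("s", 2002)] * 10
)


def selectivity(predicate):
    """Return the fraction of rows that satisfy the predicate."""
    return sum(1 for r in rows if predicate(r)) / len(rows)


# Per-column selectivities, as single-column statistics would give them.
sel_type = selectivity(lambda r: r[0] == "w")
sel_year = selectivity(lambda r: r[1] == 2002)

# Estimate under the independence assumption: multiply the two.
independent_estimate = sel_type * sel_year

# Actual selectivity of the composite filter on the correlated data.
actual = selectivity(lambda r: r[0] == "w" and r[1] == 2002)

print(f"independent estimate: {independent_estimate:.2f}")
print(f"actual selectivity:   {actual:.2f}")
```

With this toy data the independence assumption underestimates the matching fraction (0.25 against 0.40). With three clauses and stronger correlation the gap can be much larger, which is the kind of error extended statistics are meant to reduce.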


--
regards, Andrei Lepikhov
