Re: [PoC] Reducing planning time when tables have many partitions

David Rowley Wed, 23 Mar 2022 19:04:18 -0700

On Fri, 18 Mar 2022 at 23:32, Yuya Watari <[email protected]> wrote:
> I found a problem that planning takes too much time when the tables
> have many child partitions. According to my observation, the planning
> time increases in the order of O(n^2). Here, n is the number of child
> partitions. I attached the patch to solve this problem. Please be
> noted that this patch is a PoC.


> 3. How to Solve?

I think a better way to solve this would be just to have a single hash
table over all EquivalenceClasses that allows fast lookups of
EquivalenceMember->em_expr.  I think there's no reason that a given
Expr should appear in more than one non-merged EquivalenceClass. The
EquivalenceClass a given Expr belongs to would need to be updated
during the merge process.

For functions such as get_eclass_for_sort_expr() and
process_equivalence(), that would become a fairly fast hashtable
lookup instead of having nested loops to find if an EquivalenceMember
already exists for the given Expr. We might not want to build the hash
table for all queries. Maybe we could just do it if we get to
something like ~16 EquivalenceMember in total.

As of now, we don't have any means to hash Exprs, so all that
infrastructure would need to be built first.  Peter Eisentraut is
working on a patch [1] which is a step towards having this.

Here's a simple setup to show the pain of this problem:

create table lp (a int, b int) partition by list(a);
select 'create table lp'||x::text|| ' partition of lp for values
in('||x::text||');' from generate_Series(0,4095)x;
\gexec
explain analyze select * from lp where a=b order by a;

 Planning Time: 510.248 ms
 Execution Time: 264.659 ms

David

[1] https://commitfest.postgresql.org/37/3182/

Re: [PoC] Reducing planning time when tables have many partitions

Reply via email to