Re: Use merge-based matching for MCVs in eqjoinsel

2025-09-08 Thread Ilia Evdokimov
On 08.09.2025 13:35, Ilia Evdokimov wrote: Based on these results, I’d prefer the hash lookup implementation, so I think it makes sense to improve your patch further and bring it into good shape. Shall I take care of that, or would you prefer to do it yourself? I realized I mistakenly cop

Re: Use merge-based matching for MCVs in eqjoinsel

2025-09-08 Thread David Geier
Hi! On 08.09.2025 15:45, Ilia Evdokimov wrote: > I reran the benchmark on a clean cluster and collected the top slowest > JOB queries — now the effect is clearly visible. > > Merge (sum of all JOB queries) > == > default_statistics_target | Planner Speedup (×) | Planner Before (ms

Re: Use merge-based matching for MCVs in eqjoinsel

2025-09-08 Thread Ilia Evdokimov
On 08.09.2025 13:08, David Geier wrote: Hi Ilia! On 05.09.2025 16:03, David Geier wrote: I propose an optimization: when the column datatype supports ordering (i.e., has < and >), we can sort both MCV lists and apply a merge-style algorithm to detect matches. This reduces runtime from O(N^2) to O(
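
The sort-then-merge idea quoted above can be sketched roughly as follows. This is only an illustration in Python, not the patch itself: the function name `match_mcvs_merge` and its parameters are hypothetical, and it assumes each MCV list holds distinct, non-NULL values with a frequency stored at the same index.

```python
def match_mcvs_merge(values1, freqs1, values2, freqs2):
    """Sum freq1 * freq2 over values present in both MCV lists.

    Hypothetical sketch: assumes values within each list are distinct
    and mutually comparable with <.
    """
    # Sort index arrays rather than the values themselves, so each
    # value stays associated with its frequency.
    order1 = sorted(range(len(values1)), key=lambda i: values1[i])
    order2 = sorted(range(len(values2)), key=lambda j: values2[j])

    total = 0.0
    a = b = 0
    # Classic merge walk: advance whichever side holds the smaller value.
    while a < len(order1) and b < len(order2):
        i, j = order1[a], order2[b]
        if values1[i] < values2[j]:
            a += 1
        elif values2[j] < values1[i]:
            b += 1
        else:
            # Equal values: record the match and advance both sides,
            # which is safe only because values are distinct per list.
            total += freqs1[i] * freqs2[j]
            a += 1
            b += 1
    return total
```

The two sorts dominate the cost, and the merge walk itself visits each entry at most once, compared with a nested loop that compares every pair.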

Re: Use merge-based matching for MCVs in eqjoinsel

2025-09-08 Thread David Geier
Hi Ilia! > I have read all the previous messages - and yes, you are right. I don’t > know why I didn’t consider using a hash table approach initially. Your > idea makes a lot of sense. Your solution would be beneficial on top, for cases where the data type is not hashable. But I think that's over
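
For comparison, the hash table approach mentioned in this message might look like the sketch below. It is an assumption-laden illustration, not code from either patch: the name `match_mcvs_hash` and its parameters are hypothetical, and it assumes the values are hashable, distinct within each list, and non-NULL.

```python
def match_mcvs_hash(values1, freqs1, values2, freqs2):
    """Sum freq1 * freq2 over values present in both MCV lists.

    Hypothetical sketch: assumes values are hashable and distinct
    within each list.
    """
    # Build the table on the shorter list to keep it small; the sum of
    # frequency products is symmetric, so swapping sides is harmless.
    if len(values1) > len(values2):
        values1, freqs1, values2, freqs2 = values2, freqs2, values1, freqs1

    table = dict(zip(values1, freqs1))

    total = 0.0
    # Probe the table once per entry of the longer list.
    for value, freq in zip(values2, freqs2):
        hit = table.get(value)
        if hit is not None:
            total += hit * freq
    return total
```

Unlike the merge variant, this needs only equality and a hash function, not an ordering, which is why the two approaches cover different sets of data types.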

Re: Use merge-based matching for MCVs in eqjoinsel

2025-09-08 Thread David Geier
On 08.09.2025 12:45, Ilia Evdokimov wrote: > > I realized I mistakenly copied the wrong results for the hash-map > version in my previous draft. Sorry about that. Here are the correct > benchmark results: > > Merge > > default_statistics_target | Planner Speedup (×) | Planner Before (ms) | > Pla

Re: Use merge-based matching for MCVs in eqjoinsel

2025-09-06 Thread David Geier
Hi Ilia! On 29.07.2025 16:07, Ilia Evdokimov wrote: > > On 21.07.2025 16:55, Ilia Evdokimov wrote: >> >> While analyzing planner performance on JOB with >> default_statistics_target = 1000, I noticed that a significant portion >> of planning time is spent inside the eqjoinsel() function. Accordin

Re: Use merge-based matching for MCVs in eqjoinsel

2025-09-04 Thread Ilia Evdokimov
On 03.09.2025 23:26, Tom Lane wrote: Ilia Evdokimov writes: I’ve attached v3 of the patch. This version adds a check for NULL values when comparing MCV entries, ensuring correctness in edge cases. Um ... what edge cases would those be? We do not put NULL into MCV arrays. You're right - M

Re: Use merge-based matching for MCVs in eqjoinsel

2025-09-03 Thread Tom Lane
Ilia Evdokimov writes: > I’ve attached v3 of the patch. This version adds a check for NULL values > when comparing MCV entries, ensuring correctness in edge cases. Um ... what edge cases would those be? We do not put NULL into MCV arrays. regards, tom lane

Re: Use merge-based matching for MCVs in eqjoinsel

2025-09-03 Thread Ilia Evdokimov
Following up on my previous messages about optimizing eqjoinsel() and eqjoinsel_semi() for Var1 = Var2 clauses, I’d like to share detailed profiling results showing the effect of the patch on JOB for different values of default_statistics_target. The first table shows the total planner time (s

Re: Use merge-based matching for MCVs in eqjoinsel

2025-07-29 Thread Ilia Evdokimov
On 21.07.2025 16:55, Ilia Evdokimov wrote: While analyzing planner performance on JOB with default_statistics_target = 1000, I noticed that a significant portion of planning time is spent inside the eqjoinsel() function. According to perf, in most JOB queries at default_statistics_target = 1