Re: Apply the "LIMIT 1" optimization to partial DISTINCT

2024-01-30 Thread Richard Guo
On Wed, Jan 31, 2024 at 12:26 PM David Rowley  wrote:

> On Fri, 26 Jan 2024 at 21:14, David Rowley  wrote:
> > However, having said that. Parallel plans are often picked when there
> > is some highly selective qual as parallel_tuple_cost has to be applied
> > to fewer tuples for such plans, so probably this is worth doing.
>
> I was messing around with your test case and didn't manage to get any
> plan that had any rows to use the partial path with the LIMIT.  I
> ended up dropping the test that was checking the results were empty as
> I didn't think it added much more value over the EXPLAIN output.
>
> I pushed the result.


Thanks for pushing it!

Thanks
Richard


Re: Apply the "LIMIT 1" optimization to partial DISTINCT

2024-01-30 Thread David Rowley
On Fri, 26 Jan 2024 at 21:14, David Rowley  wrote:
> However, having said that. Parallel plans are often picked when there
> is some highly selective qual as parallel_tuple_cost has to be applied
> to fewer tuples for such plans, so probably this is worth doing.

I was messing around with your test case and didn't manage to get any
plan that had any rows to use the partial path with the LIMIT.  I
ended up dropping the test that was checking the results were empty as
I didn't think it added much more value over the EXPLAIN output.

I pushed the result.

Thanks for working on this.

David




Re: Apply the "LIMIT 1" optimization to partial DISTINCT

2024-01-26 Thread David Rowley
On Fri, 26 Jan 2024 at 20:42, Richard Guo  wrote:
>
> In 5543677ec9 we introduced an optimization that uses Limit instead of
> Unique to implement DISTINCT when all the DISTINCT pathkeys have been
> marked as redundant.  I happened to notice that this optimization was
> not applied to partial DISTINCT, which I think should be.

It seems very likely that the parallel plan would only be chosen if
the planner estimated there'd just be 1 row before the distinct.
Otherwise, the non-partial path's LIMIT would come out so cheap that
it would be unlikely that the parallel plan would be picked.

I think your test case only chooses the parallel plan because you're
doing FROM tenk1 WHERE four=4.  And that column only contains values
0..3.

However, having said that. Parallel plans are often picked when there
is some highly selective qual as parallel_tuple_cost has to be applied
to fewer tuples for such plans, so probably this is worth doing.

David




Apply the "LIMIT 1" optimization to partial DISTINCT

2024-01-25 Thread Richard Guo
In 5543677ec9 we introduced an optimization that uses Limit instead of
Unique to implement DISTINCT when all the DISTINCT pathkeys have been
marked as redundant.  I happened to notice that this optimization was
not applied to partial DISTINCT, which I think should be.  This can
improve plans in some cases, such as

-- on master
explain (costs off) select distinct four from tenk1 where four = 4;
  QUERY PLAN
--
 Limit
   ->  Gather
 Workers Planned: 4
 ->  Unique
   ->  Parallel Seq Scan on tenk1
 Filter: (four = 4)
(6 rows)

-- patched
explain (costs off) select distinct four from tenk1 where four = 4;
  QUERY PLAN
--
 Limit
   ->  Gather
 Workers Planned: 4
 ->  Limit
   ->  Parallel Seq Scan on tenk1
 Filter: (four = 4)
(6 rows)

Such queries might not be that common, but it's very cheap to apply this
optimization.

Attached is a patch for that.

Thanks
Richard


v1-0001-Apply-the-LIMIT-1-optimization-to-partial-DISTINCT.patch
Description: Binary data