On Thu, Apr 13, 2023 at 6:09 AM David Rowley <dgrowle...@gmail.com> wrote:
> .On Thu, 13 Apr 2023 at 02:28, Andy Fan <zhihui.fan1...@gmail.com> wrote: > > The concept of startup_tuples for a WindowAgg looks good to me, but I > > can't follow up with the below line: > > > > + return clamp_row_est(partition_tuples * DEFAULT_INEQ_SEL); > > > > # select count(*) over() from tenk1 limit 1; > > count > > ------- > > 10000 --> We need to scan all the tuples. > > > > Should we just return clamp_row_est(partition_tuples)? > > For the case you've shown, it will. It's handled by this code: > > if (wc->orderClause == NIL) > return clamp_row_est(partition_tuples); > > My fault. I should have real debugging to double check my understanding, surely I will next time. It would take something like the following to hit the code you're > concerned about: > > explain select count(*) over(order by unique1 rows between unbounded > preceding and 10*random() following) from tenk1; > QUERY PLAN > > -------------------------------------------------------------------------------------------- > WindowAgg (cost=140.23..420.29 rows=10000 width=12) > -> Index Only Scan using tenk1_unique1 on tenk1 > (cost=0.29..270.29 rows=10000 width=4) > (2 rows) > > You can see the startup cost is about 33% of the total cost for that, > which is from the DEFAULT_INEQ_SEL. I'm not exactly set on that > having to be DEFAULT_INEQ_SEL, but I'm not really sure what we could > put that's better. I don't really follow why assuming all rows are > required is better. That'll just mean we favour cheap startup plans > less, but there might be a case where a cheap startup plan is > favourable. I was opting for a happy medium when I thought to use > DEFAULT_INEQ_SEL. > That looks reasonable to me. My suggestion came from my misreading before, It was a bit late in my time zone when writing. Thanks for the detailed explanation! > > I also see I might need to do a bit more work on this as the following > is not handled correctly: > > select count(*) over(rows between unbounded preceding and 10 > following) from tenk1; > > it's assuming all rows due to lack of ORDER BY, but it seems like it > should be 10 rows due to the 10 FOLLOWING end bound. > > True to me. -- Best Regards Andy Fan