On Mon, Aug 1, 2022 at 10:16 AM Rick Otten <rottenwindf...@gmail.com> wrote:

>
>> The other problem I ran into, which I'm still building a test case for
>> and I fear might be a bug if I can easily reproduce it,
>> is if I did the original select in a CTE, and then did a sort outside of
>> the CTE, even though the CTE found 0 rows, the database
>> still spent a _ton_ of time sorting those 0 rows:
>> ```
>>    ->  Sort  (cost=70.03..72.53 rows=1000 width=112) (actual
>> time=84848.452..84848.453 rows=0 loops=1)
>> ```
>> Once I can reproduce this on test data I'll be able to pin down more
>> closely what is happening and tell if I'm just reading
>> the explain plan wrong or if something is broken.  It was getting mixed
>> up with the lack of pruning/index usage problem.
>>
>> I'll report back again next week.  Anyway it is looking to me like it
>> doesn't really matter (within reason) from a performance
>> perspective how many partitions we use for our data set and query
>> patterns.  We should be able to pick the most convenient
>> from an archiving and data management perspective instead.
>>
>>
> This behavior is definitely consistent.  0 rows end up slower than when I
> find some rows in my CTE:
> ```
>    ->  Sort  (cost=109.44..113.19 rows=1500 width=112) (actual
> time=87110.841..87110.842 rows=0 loops=1)
>    ->  Sort  (cost=109.44..113.19 rows=1500 width=112) (actual
> time=25367.867..25367.930 rows=840 loops=1)
> ```
> The only thing I changed in the query was the date range.  It is actually
> the CTE scan step inside the Sort block that is slower when no rows are
> returned than when rows are returned.  It also only happens when all the
> partitions are sequence scanned instead of being partition pruned.
>
> I'm still writing up a test case that can demo this without using
> proprietary data.
>

After a bunch of experiments I can explain this now.  :-)

I had a `limit` clause in my test CTE.  When sequence scanning a bunch of
partitions, if the limit is reached, the subsequent partitions are marked
with `never executed` and not scanned.  On the other hand, when no rows are
found, all of the partitions are scanned.

Therefore, with many millions of rows in the partitions, and being forced
to sequence scan because I put the `at time zone` clause in the `where`,
the case when rows are found is always noticeably faster than the case when
rows aren't found as long as at least one partition hasn't been scanned yet
when the limit is hit.

I'm now satisfied this is a good thing, and will move on to other
problems.  Thanks for hearing me out.  I was scratching my head for a while
over that one.

Reply via email to