On Thu, Oct 10, 2019 at 7:22 PM David Rowley <david.row...@2ndquadrant.com>
wrote:

> The planner might be able to get a better estimate on the number of
> matching rows if the now() - interval '10 days' expression was
> replaced with 'now'::timestamptz - interval '10 days'. However, care
> would need to be taken to ensure the plan is never prepared since
> 'now' is evaluated during parse. The same care must be taken when
> creating views, functions, stored procedures and the like.
>

You are on to something here I think with the now() function, even if above
suggestion is not exactly right as you said further down.  I am finding a
hard-coded timestamp gives the right query plan.  I also tested same with
even bigger window (last 16 days) and it yet still chooses the brin index.

foo_prod=# EXPLAIN
foo_prod-# SELECT
foo_prod-#  category, source, MIN(rec_insert_time) OVER (partition by
source order by rec_insert_time) AS first_source_time, MAX(rec_insert_time)
OVER (partition by source order by rec_insert_time) AS last_source_time
foo_prod-# FROM (SELECT DISTINCT ON (brand_id, last_change, log_id)
foo_prod(# category, source(field1) AS source, rec_insert_time
foo_prod(# FROM log_table l
foo_prod(# INNER JOIN public.small_join_table filter ON filter.category =
l.category
foo_prod(# WHERE field1 IS NOT NULL AND l.category = 'music'
foo_prod(# AND l.rec_insert_time >= now() - interval '10 days'
foo_prod(# ORDER BY brand_id, last_change, log_id, rec_insert_time DESC)
unique_cases;

   QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 WindowAgg  (cost=24436329.10..24436343.56 rows=643 width=120)
   ->  Sort  (cost=24436329.10..24436330.70 rows=643 width=104)
         Sort Key: unique_cases.source, unique_cases.rec_insert_time
         ->  Subquery Scan on unique_cases  (cost=24436286.24..24436299.10
rows=643 width=104)
               ->  Unique  (cost=24436286.24..24436292.67 rows=643
width=124)
                     ->  Sort  (cost=24436286.24..24436287.85 rows=643
width=124)
                           Sort Key: l.brand_id, l.last_change, l.log_id,
l.rec_insert_time DESC
                           ->  Nested Loop  (cost=0.00..24436256.25
rows=643 width=124)
                                 Join Filter: ((l.category)::text =
filter.category)
                                 ->  Seq Scan on small_join_table filter
 (cost=0.00..26.99 rows=1399 width=8)
                                 ->  Materialize  (cost=0.00..24420487.02
rows=643 width=99)
                                       ->  Seq Scan on log_table l
 (cost=0.00..24420483.80 rows=643 width=99)
                                             Filter: ((field1 IS NOT NULL)
AND (category = 'music'::name) AND (rec_insert_time >= (now() - '10
days'::interval)))
(13 rows)

foo_prod=# SELECT now() - interval '10 days';
           ?column?
-------------------------------
 2019-10-01 08:20:38.115471-05
(1 row)

foo_prod=# EXPLAIN
SELECT
 category, source, MIN(rec_insert_time) OVER (partition by source order by
rec_insert_time) AS first_source_time, MAX(rec_insert_time) OVER (partition
by source order by rec_insert_time) AS last_source_time
FROM (SELECT DISTINCT ON (brand_id, last_change, log_id)
category, source(field1) AS source, rec_insert_time
FROM log_table l
INNER JOIN public.small_join_table filter ON filter.category = l.category
WHERE field1 IS NOT NULL AND l.category = 'music'
AND l.rec_insert_time >= '2019-10-01 08:20:38.115471-05'
ORDER BY brand_id, last_change, log_id, rec_insert_time DESC) unique_cases;

  QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 WindowAgg  (cost=19664576.17..19664590.63 rows=643 width=120)
   ->  Sort  (cost=19664576.17..19664577.77 rows=643 width=104)
         Sort Key: unique_cases.source, unique_cases.rec_insert_time
         ->  Subquery Scan on unique_cases  (cost=19664533.31..19664546.17
rows=643 width=104)
               ->  Unique  (cost=19664533.31..19664539.74 rows=643
width=124)
                     ->  Sort  (cost=19664533.31..19664534.92 rows=643
width=124)
                           Sort Key: l.brand_id, l.last_change, l.log_id,
l.rec_insert_time DESC
                           ->  Nested Loop  (cost=3181.19..19664503.32
rows=643 width=124)
                                 ->  Gather  (cost=3180.91..19662574.92
rows=643 width=99)
                                       Workers Planned: 3
                                       ->  Parallel Bitmap Heap Scan on
log_table l  (cost=2180.91..19661510.62 rows=207 width=99)
                                             Recheck Cond: (rec_insert_time
>= '2019-10-01 08:20:38.115471-05'::timestamp with time zone)
                                             Filter: ((field1 IS NOT NULL)
AND (category = 'music'::name))
                                             ->  Bitmap Index Scan on
rec_insert_time_brin_1000  (cost=0.00..2180.75 rows=142602171 width=0)
                                                   Index Cond:
(rec_insert_time >= '2019-10-01 08:20:38.115471-05'::timestamp with time
zone)


Let me know if this rings any bells!  I will respond to other comments with
other replies.

Thanks,
Jeremy

Reply via email to