We've been looking at breaking out some of our larger logging-type tables (INSERT only, other than DELETEs for regularly removing old rows for maintenance) using constraint partitioning to avoid the need for heavy vacuuming. For some cases it's working well. In others, we're running into performance issues because the planner no longer uses indexes for some queries. I've included a couple of examples where the planner should have enough information to take advantage of the available indexes.

Here's the example schema.

=# \d foo.bars
                Table "foo.bars"
    Column |           Type           |       Modifiers
-----------+--------------------------+------------------------
 user_id   | integer                  | not null
 bar       | ip4                      | not null
 logged_at | timestamp with time zone | not null default now()
Triggers:
    partition_bars BEFORE INSERT ON foo.bars FOR EACH ROW EXECUTE PROCEDURE foo.partition_bars()

=# \d foo.bars_20081013 -- example of one of 11 tables inheriting from foo.bars, each holding one week's worth of data

            Table "foo.bars_20081013"
    Column |           Type           |       Modifiers
-----------+--------------------------+------------------------
 user_id   | integer                  | not null
 bar       | ip4                      | not null
 logged_at | timestamp with time zone | not null default now()
Indexes:
    "bars_20081013_bar_idx" btree (bar)
    "bars_20081013_bar_logged_at_idx" btree (bar, logged_at)
    "bars_20081013_logged_at_idx" btree (logged_at)
Check constraints:
    "bars_20081013_logged_at_check" CHECK (logged_at >= '2008-10-13 00:00:00-04'::timestamp with time zone AND logged_at < '2008-10-20 00:00:00-04'::timestamp with time zone)
Inherits: foo.bars
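(The body of foo.partition_bars() isn't shown above; for context, the following is only a hypothetical sketch of the usual date-based dispatch such a trigger function performs, written to work on 8.3-era PL/pgSQL:)

```sql
-- Hypothetical sketch of foo.partition_bars(); the real function body
-- isn't shown in this post. It routes each row into the child table for
-- its week and suppresses the insert into the parent.
CREATE OR REPLACE FUNCTION foo.partition_bars() RETURNS trigger AS $$
BEGIN
    EXECUTE 'INSERT INTO foo.bars_'
            || to_char(date_trunc('week', NEW.logged_at), 'YYYYMMDD')
            || ' VALUES ('
            || NEW.user_id || ', '
            || quote_literal(NEW.bar::text) || ', '
            || quote_literal(NEW.logged_at::text) || ')';
    RETURN NULL;  -- nothing is stored in the parent table itself
END;
$$ LANGUAGE plpgsql;
```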


Case 1: ORDER BY indexed_column LIMIT some_limit

Depending on the value of some_limit, we should see a win by returning the top some_limit rows from each inheriting table as candidate rows, and then determining the top some_limit from among the candidates. (Note in the explain that not all of the inheriting tables have data.)
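In the meantime the plan we're after can be spelled out by hand; a sketch, showing only two of the actual children for brevity:

```sql
-- Manual workaround: take the top row from each child via its
-- logged_at index, then pick the overall winner from the candidates.
SELECT * FROM (
    (SELECT * FROM foo.bars_20081006 ORDER BY logged_at DESC LIMIT 1)
    UNION ALL
    (SELECT * FROM foo.bars_20081013 ORDER BY logged_at DESC LIMIT 1)
    -- ... one branch per inheriting table ...
) AS candidates
ORDER BY logged_at DESC
LIMIT 1;
```

Each parenthesized branch gets the backward index scan shown in the single-table plan below, so the outer sort only has to order one candidate row per child.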


production=# explain select * from foo.bars order by logged_at desc limit 1;
                                                            QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1023367.34..1023367.34 rows=1 width=16)
   ->  Sort  (cost=1023367.34..1134058.31 rows=44276389 width=16)
         Sort Key: foo.bars.logged_at
         ->  Result  (cost=0.00..801985.39 rows=44276389 width=16)
         ->  Append  (cost=0.00..801985.39 rows=44276389 width=16)
               ->  Seq Scan on bars  (cost=0.00..32.70 rows=1770 width=16)
               ->  Seq Scan on bars_20080915 bars  (cost=0.00..101199.40 rows=5586490 width=16)
               ->  Seq Scan on bars_20080922 bars  (cost=0.00..215666.84 rows=11907734 width=16)
               ->  Seq Scan on bars_20080908 bars  (cost=0.00..32.70 rows=1770 width=16)
               ->  Seq Scan on bars_20080901 bars  (cost=0.00..32.70 rows=1770 width=16)
               ->  Seq Scan on bars_20080825 bars  (cost=0.00..32.70 rows=1770 width=16)
               ->  Seq Scan on bars_20080818 bars  (cost=0.00..32.70 rows=1770 width=16)
               ->  Seq Scan on bars_20080811 bars  (cost=0.00..32.70 rows=1770 width=16)
               ->  Seq Scan on bars_20080804 bars  (cost=0.00..32.70 rows=1770 width=16)
               ->  Seq Scan on bars_20080929 bars  (cost=0.00..215029.52 rows=11872652 width=16)
               ->  Seq Scan on bars_20081006 bars  (cost=0.00..223559.96 rows=12343346 width=16)
               ->  Seq Scan on bars_20081013 bars  (cost=0.00..46300.77 rows=2553777 width=16)
(17 rows)

production=# explain select * from foo.bars_20080915 order by logged_at desc limit 1;
                                                                  QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..0.03 rows=1 width=16)
   ->  Index Scan Backward using bars_20080915_logged_at_idx on bars_20080915  (cost=0.00..167466.14 rows=5586490 width=16)
(2 rows)


Case 2: Return [min|max](indexed_column)

Again, by finding candidate rows from each inheriting table using the available index and then determining the min/max from among the candidates, we should get much better performance than performing a seq scan on each inheriting table.
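As with Case 1, the desired plan can be forced by hand; a sketch, again showing only two of the actual children:

```sql
-- Manual workaround: each per-child min() can use that child's
-- logged_at index, and the outer min() reduces the candidates.
SELECT min(candidate) FROM (
    SELECT min(logged_at) AS candidate FROM foo.bars_20081006
    UNION ALL
    SELECT min(logged_at) AS candidate FROM foo.bars_20081013
    -- ... one branch per inheriting table ...
) AS candidates;
```

Each inner min() should get the index-scan-plus-limit plan shown for the single-table query below, rather than a seq scan.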

production=# explain select min(logged_at) from foo.bars_20080915;
                                                                QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=0.03..0.04 rows=1 width=0)
   InitPlan
     ->  Limit  (cost=0.00..0.03 rows=1 width=8)
           ->  Index Scan using bars_20080915_logged_at_idx on bars_20080915  (cost=0.00..167466.14 rows=5586490 width=8)
                 Filter: (logged_at IS NOT NULL)
(5 rows)

production=# explain select min(logged_at) from foo.bars;
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=913072.22..913072.23 rows=1 width=8)
   ->  Append  (cost=0.00..802333.27 rows=44295577 width=8)
         ->  Seq Scan on bars  (cost=0.00..32.70 rows=1770 width=8)
         ->  Seq Scan on bars_20080915 bars  (cost=0.00..101199.40 rows=5586490 width=8)
         ->  Seq Scan on bars_20080922 bars  (cost=0.00..215666.84 rows=11907734 width=8)
         ->  Seq Scan on bars_20080908 bars  (cost=0.00..32.70 rows=1770 width=8)
         ->  Seq Scan on bars_20080901 bars  (cost=0.00..32.70 rows=1770 width=8)
         ->  Seq Scan on bars_20080825 bars  (cost=0.00..32.70 rows=1770 width=8)
         ->  Seq Scan on bars_20080818 bars  (cost=0.00..32.70 rows=1770 width=8)
         ->  Seq Scan on bars_20080811 bars  (cost=0.00..32.70 rows=1770 width=8)
         ->  Seq Scan on bars_20080804 bars  (cost=0.00..32.70 rows=1770 width=8)
         ->  Seq Scan on bars_20080929 bars  (cost=0.00..215029.52 rows=11872652 width=8)
         ->  Seq Scan on bars_20081006 bars  (cost=0.00..223559.96 rows=12343346 width=8)
         ->  Seq Scan on bars_20081013 bars  (cost=0.00..46648.65 rows=2572965 width=8)
(14 rows)

I thought Greg Stark had looked at some performance tweaks wrt constraint partitioning, but I can't recall the details. Was that related to this at all?

Michael Glaesemann
[EMAIL PROTECTED]


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers