We've been looking at breaking out some of our larger logging-type tables (INSERT only, other than DELETEs for regularly removing old rows for maintenance) using constraint partitioning to avoid the need for heavy vacuuming. For some cases it's working well. In others, we're running into performance issues because the planner no longer uses indexes for some queries. I've included a couple of examples where the planner should have enough information to take advantage of the available indexes.

Here's the example schema.

=# \d foo.bars
                Table "foo.bars"
    Column |           Type           |       Modifiers
-----------+--------------------------+------------------------
 user_id   | integer                  | not null
 bar       | ip4                      | not null
 logged_at | timestamp with time zone | not null default now()
Triggers:
    partition_bars BEFORE INSERT ON foo.bars FOR EACH ROW EXECUTE PROCEDURE foo.partition_bars()

=# \d foo.bars_20081013 -- example of one of 11 tables inheriting from foo.bars, each holding one week's worth of data

            Table "foo.bars_20081013"
    Column |           Type           |       Modifiers
-----------+--------------------------+------------------------
 user_id   | integer                  | not null
 bar       | ip4                      | not null
 logged_at | timestamp with time zone | not null default now()
Indexes:
    "bars_20081013_bar_idx" btree (bar)
    "bars_20081013_bar_logged_at_idx" btree (bar, logged_at)
    "bars_20081013_logged_at_idx" btree (logged_at)
Check constraints:
    "bars_20081013_logged_at_check" CHECK (logged_at >= '2008-10-13 00:00:00-04'::timestamp with time zone AND logged_at < '2008-10-20 00:00:00-04'::timestamp with time zone)
Inherits: foo.bars
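(The body of foo.partition_bars() isn't shown above; for context, the following is only a hypothetical sketch of the usual date-based dispatch such a trigger function performs, written to work on 8.3-era PL/pgSQL:)

```sql
-- Hypothetical sketch of foo.partition_bars(); the real function body
-- isn't shown in this post. It routes each row into the child table for
-- its week and suppresses the insert into the parent.
CREATE OR REPLACE FUNCTION foo.partition_bars() RETURNS trigger AS $$
BEGIN
    EXECUTE 'INSERT INTO foo.bars_'
            || to_char(date_trunc('week', NEW.logged_at), 'YYYYMMDD')
            || ' VALUES ('
            || NEW.user_id || ', '
            || quote_literal(NEW.bar::text) || ', '
            || quote_literal(NEW.logged_at::text) || ')';
    RETURN NULL;  -- nothing is stored in the parent table itself
END;
$$ LANGUAGE plpgsql;
```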


Case 1: ORDER BY indexed_column LIMIT some_limit

Depending on the value of some_limit, we should see a win by returning the top some_limit rows from each inheriting table as candidate rows, and then determining the top some_limit from among the candidates. (Note in the explain that not all of the inheriting tables have data.)
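In the meantime the plan we're after can be spelled out by hand; a sketch, showing only two of the actual children for brevity:

```sql
-- Manual workaround: take the top row from each child via its
-- logged_at index, then pick the overall winner from the candidates.
SELECT * FROM (
    (SELECT * FROM foo.bars_20081006 ORDER BY logged_at DESC LIMIT 1)
    UNION ALL
    (SELECT * FROM foo.bars_20081013 ORDER BY logged_at DESC LIMIT 1)
    -- ... one branch per inheriting table ...
) AS candidates
ORDER BY logged_at DESC
LIMIT 1;
```

Each parenthesized branch gets the backward index scan shown in the single-table plan below, so the outer sort only has to order one candidate row per child.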


production=# explain select * from foo.bars order by logged_at desc limit 1;
                                                            QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1023367.34..1023367.34 rows=1 width=16)
   ->  Sort  (cost=1023367.34..1134058.31 rows=44276389 width=16)
         Sort Key: foo.bars.logged_at
         ->  Result  (cost=0.00..801985.39 rows=44276389 width=16)
         ->  Append  (cost=0.00..801985.39 rows=44276389 width=16)
               ->  Seq Scan on bars  (cost=0.00..32.70 rows=1770 width=16)
               ->  Seq Scan on bars_20080915 bars  (cost=0.00..101199.40 rows=5586490 width=16)
               ->  Seq Scan on bars_20080922 bars  (cost=0.00..215666.84 rows=11907734 width=16)
               ->  Seq Scan on bars_20080908 bars  (cost=0.00..32.70 rows=1770 width=16)
               ->  Seq Scan on bars_20080901 bars  (cost=0.00..32.70 rows=1770 width=16)
               ->  Seq Scan on bars_20080825 bars  (cost=0.00..32.70 rows=1770 width=16)
               ->  Seq Scan on bars_20080818 bars  (cost=0.00..32.70 rows=1770 width=16)
               ->  Seq Scan on bars_20080811 bars  (cost=0.00..32.70 rows=1770 width=16)
               ->  Seq Scan on bars_20080804 bars  (cost=0.00..32.70 rows=1770 width=16)
               ->  Seq Scan on bars_20080929 bars  (cost=0.00..215029.52 rows=11872652 width=16)
               ->  Seq Scan on bars_20081006 bars  (cost=0.00..223559.96 rows=12343346 width=16)
               ->  Seq Scan on bars_20081013 bars  (cost=0.00..46300.77 rows=2553777 width=16)
(17 rows)

production=# explain select * from foo.bars_20080915 order by logged_at desc limit 1;
                                                                  QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..0.03 rows=1 width=16)
   ->  Index Scan Backward using bars_20080915_logged_at_idx on bars_20080915  (cost=0.00..167466.14 rows=5586490 width=16)
(2 rows)


Case 2: Return [min|max](indexed_column)

Again, by finding candidate rows from each inheriting table using the available index and then determining the min/max from among the candidates, we should get much better performance than performing a seq scan on each inheriting table.
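As with Case 1, the desired plan can be forced by hand; a sketch, again showing only two of the actual children:

```sql
-- Manual workaround: each per-child min() can use that child's
-- logged_at index, and the outer min() reduces the candidates.
SELECT min(candidate) FROM (
    SELECT min(logged_at) AS candidate FROM foo.bars_20081006
    UNION ALL
    SELECT min(logged_at) AS candidate FROM foo.bars_20081013
    -- ... one branch per inheriting table ...
) AS candidates;
```

Each inner min() should get the index-scan-plus-limit plan shown for the single-table query below, rather than a seq scan.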

production=# explain select min(logged_at) from foo.bars_20080915;
                                                                QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=0.03..0.04 rows=1 width=0)
   InitPlan
     ->  Limit  (cost=0.00..0.03 rows=1 width=8)
           ->  Index Scan using bars_20080915_logged_at_idx on bars_20080915  (cost=0.00..167466.14 rows=5586490 width=8)
                 Filter: (logged_at IS NOT NULL)
(5 rows)

production=# explain select min(logged_at) from foo.bars;
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=913072.22..913072.23 rows=1 width=8)
   ->  Append  (cost=0.00..802333.27 rows=44295577 width=8)
         ->  Seq Scan on bars  (cost=0.00..32.70 rows=1770 width=8)
         ->  Seq Scan on bars_20080915 bars  (cost=0.00..101199.40 rows=5586490 width=8)
         ->  Seq Scan on bars_20080922 bars  (cost=0.00..215666.84 rows=11907734 width=8)
         ->  Seq Scan on bars_20080908 bars  (cost=0.00..32.70 rows=1770 width=8)
         ->  Seq Scan on bars_20080901 bars  (cost=0.00..32.70 rows=1770 width=8)
         ->  Seq Scan on bars_20080825 bars  (cost=0.00..32.70 rows=1770 width=8)
         ->  Seq Scan on bars_20080818 bars  (cost=0.00..32.70 rows=1770 width=8)
         ->  Seq Scan on bars_20080811 bars  (cost=0.00..32.70 rows=1770 width=8)
         ->  Seq Scan on bars_20080804 bars  (cost=0.00..32.70 rows=1770 width=8)
         ->  Seq Scan on bars_20080929 bars  (cost=0.00..215029.52 rows=11872652 width=8)
         ->  Seq Scan on bars_20081006 bars  (cost=0.00..223559.96 rows=12343346 width=8)
         ->  Seq Scan on bars_20081013 bars  (cost=0.00..46648.65 rows=2572965 width=8)
(14 rows)

I thought Greg Stark had looked at some performance tweaks wrt constraint partitioning, but I can't recall the details. Was that related to this at all?

Michael Glaesemann
[EMAIL PROTECTED]


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers