Re: [PERFORM] query against single partition uses index, against master table does seq scan

Mike Broers Wed, 21 Sep 2016 14:01:31 -0700

When I excluded the non indexed search criteria the query on aggregate used
the indexes on each partition, without specifying the constraint exclusion
criteria.  When I added the constraint exclusion criteria to the non
indexed criteria, it still used seq scans.


I ended up getting an acceptable plan by using a subquery on the indexed
partition and using those results to scan for the unindexed value.

On Wed, Sep 21, 2016 at 12:37 PM, Mike Broers <[email protected]> wrote:

> Thanks for your response - Is 'selectively choosing what partition'
> different than utilizing each partitions index when scanning each
> partition?  To clarify, I expect to find results in each partition, but to
> have postgres use each partitions index instead of full table scans. It
> seems redundant to add a where clauses to match each exclusion criteria but
> i will try that and report back - thank you for the suggestion.
>
> On Wed, Sep 21, 2016 at 12:15 PM, Ganesh Kannan <ganesh.kannan@
> weatheranalytics.com> wrote:
>
>> Postgres does not have capability to selectively choose child tables
>> unless the query's "WHERE" clause is simple, and it matches (exactly) the
>> CHECK constraint definition.  I have resolved similar issue by explicitly
>> adding check constraint expression in every SQL against the master table.
>> This is also determined by the constraint_exclusion setting value. Check
>> the manual (9.5): https://www.postgresql.org/docs/current/static/ddl-pa
>> rtitioning.html.
>>
>>
>> I would try tweaking WHERE clause to match Check constraint definition.
>> Global partitioning index (like in Oracle) would help, but its just my wish.
>>
>>
>>
>> Regards,
>> Ganesh Kannan
>>
>>
>>
>> ------------------------------
>> *From:* [email protected] <
>> [email protected]> on behalf of Mike Broers <
>> [email protected]>
>> *Sent:* Wednesday, September 21, 2016 12:53 PM
>> *To:* [email protected]
>> *Subject:* [PERFORM] query against single partition uses index, against
>> master table does seq scan
>>
>> Hello, I am curious about the performance of queries against a master
>> table that seem to do seq scans on each child table.  When the same query
>> is issued at a partition directly it uses the partition index and is very
>> fast.
>>
>> The partition constraint is in the query criteria.  We have non
>> overlapping check constraints and constraint exclusion is set to partition.
>>
>> Here is the master table
>>           Column                      Type
>>    Modifiers
>> aggregate_id               bigint                      not null default
>> nextval('seq_aggregate'::regclass)
>> landing_id                 integer                     not null
>> client_program_id          integer
>> sequence_number            bigint
>> start_datetime             timestamp without time zone not null
>> end_datetime               timestamp without time zone not null
>> body                       jsonb                       not null
>> client_parsing_status_code character(1)
>> validation_status_code     character(1)
>> client_parsing_datetime    timestamp without time zone
>> validation_datetime        timestamp without time zone
>> latest_flag_datetime       timestamp without time zone
>> latest_flag                boolean                     not null
>> Indexes:
>>     "pk_aggregate" PRIMARY KEY, btree (aggregate_id)
>>     "ix_aggregate_landing_id_aggregate_id_parsing_status" btree
>> (landing_id, aggregate_id, client_parsing_status_code)
>>     "ix_aggregate_landing_id_start_datetime" btree (landing_id,
>> start_datetime)
>>     "ix_aggregate_latest_flag" btree (latest_flag_datetime) WHERE
>> latest_flag = false
>>     "ix_aggregate_validation_status_code" btree (validation_datetime)
>> WHERE validation_status_code = 'P'::bpchar AND latest_flag = true
>> Check constraints:
>>     "ck_aggregate_client_parsing_status_code" CHECK
>> (client_parsing_status_code IS NULL OR (client_parsing_status_code = ANY
>> (ARRAY['P'::bpchar, 'F'::bpchar, 'I'::bpchar])))
>>     "ck_aggregate_validation_status_code" CHECK (validation_status_code
>> IS NULL OR (validation_status_code = ANY (ARRAY['P'::bpchar, 'F'::bpchar,
>> 'I'::bpchar])))
>> Foreign-key constraints:
>>     "fk_aggregate_client_program" FOREIGN KEY (client_program_id)
>> REFERENCES client_program(client_program_id)
>>     "fk_aggregate_landing" FOREIGN KEY (landing_id) REFERENCES
>> landing(landing_id)
>> Number of child tables: 17 (Use \d+ to list them.)
>>
>> and here is a child table showing a check constraint
>>                                      Table "stage.aggregate__00007223"
>>           Column                      Type
>>    Modifiers
>> ────────────────────────── ───────────────────────────
>> aggregate_id               bigint                      not null default
>> nextval('seq_aggregate'::regclass)
>> landing_id                 integer                     not null
>> client_program_id          integer
>> sequence_number            bigint
>> start_datetime             timestamp without time zone not null
>> end_datetime               timestamp without time zone not null
>> body                       jsonb                       not null
>> client_parsing_status_code character(1)
>> validation_status_code     character(1)
>> client_parsing_datetime    timestamp without time zone
>> validation_datetime        timestamp without time zone
>> latest_flag_datetime       timestamp without time zone
>> latest_flag                boolean                     not null
>> Indexes:
>>     "pk_aggregate__00007223" PRIMARY KEY, btree (aggregate_id),
>> tablespace "archive"
>>     "ix_aggregate__00007223_landing_id_aggregate_id_parsing_status"
>> btree (landing_id, aggregate_id, client_parsing_status_code), tablespace
>> "archive"
>>     "ix_aggregate__00007223_landing_id_start_datetime" btree
>> (landing_id, start_datetime), tablespace "archive"
>>     "ix_aggregate__00007223_latest_flag" btree (latest_flag_datetime)
>> WHERE latest_flag = false, tablespace "archive"
>>     "ix_aggregate__00007223_validation_status_code" btree
>> (validation_datetime) WHERE validation_status_code = 'P'::bpchar AND
>> latest_flag = true, tablespace "archive"
>> Check constraints:
>>     "ck_aggregate__00007223_landing_id" CHECK (landing_id >= 7223 AND
>> landing_id < 9503)
>>     "ck_aggregate_client_parsing_status_code" CHECK
>> (client_parsing_status_code IS NULL OR (client_parsing_status_code = ANY
>> (ARRAY['P'::bpchar, 'F'::bpchar, 'I'::bpchar])))
>>     "ck_aggregate_validation_status_code" CHECK (validation_status_code
>> IS NULL OR (validation_status_code = ANY (ARRAY['P'::bpchar, 'F'::bpchar,
>> 'I'::bpchar])))
>> Inherits: aggregate
>> Tablespace: "archive"
>>
>> Here is an example of the query explain plan against the master table:
>>
>> select landing_id from landing L
>> where exists
>> (
>> select landing_id
>> from stage.aggregate A
>> WHERE (A.body#>>'{Cost}')::BIGINT >= 1000000000
>> and L.landing_id = A.Landing_id
>> )
>> and L.source_id = 36
>>
>>
>> Hash Join  (cost=59793745.91..59793775.14 rows=28 width=4)
>>   Hash Cond: (a.landing_id = l.landing_id)
>>   ->  HashAggregate  (cost=59792700.41..59792721.46 rows=2105 width=4)
>>         Group Key: a.landing_id
>>         ->  Append  (cost=0.00..59481729.32 rows=124388438 width=4)
>>               ->  Seq Scan on aggregate a  (cost=0.00..0.00 rows=1
>> width=4)
>>                     Filter: (((body #>> '{Cost}'::text[]))::bigint >=
>> 1000000000)
>>               ->  Seq Scan on aggregate__00000000 a_1
>>  (cost=0.00..1430331.50 rows=2105558 width=4)
>>                     Filter: (((body #>> '{Cost}'::text[]))::bigint >=
>> 1000000000)
>>               ->  Seq Scan on aggregate__00000470 a_2
>>  (cost=0.00..74082.10 rows=247002 width=4)
>>                     Filter: (((body #>> '{Cost}'::text[]))::bigint >=
>> 1000000000)
>>               ->  Seq Scan on aggregate__00001435 a_3
>>  (cost=0.00..8174909.44 rows=17610357 width=4)
>>                     Filter: (((body #>> '{Cost}'::text[]))::bigint >=
>> 1000000000)
>>               ->  Seq Scan on aggregate__00001685 a_4
>>  (cost=0.00..11011311.44 rows=23516624 width=4)
>>                     Filter: (((body #>> '{Cost}'::text[]))::bigint >=
>> 1000000000)
>>               ->  Seq Scan on aggregate__00003836 a_5
>>  (cost=0.00..5833050.44 rows=13102557 width=4)
>>                     Filter: (((body #>> '{Cost}'::text[]))::bigint >=
>> 1000000000)
>>               ->  Seq Scan on aggregate__00005638 a_6
>>  (cost=0.00..5950768.16 rows=12342003 width=4)
>>                     Filter: (((body #>> '{Cost}'::text[]))::bigint >=
>> 1000000000)
>>               ->  Seq Scan on aggregate__00007223 a_7
>>  (cost=0.00..6561806.24 rows=13203237 width=4)
>>                     Filter: (((body #>> '{Cost}'::text[]))::bigint >=
>> 1000000000)
>>               ->  Seq Scan on aggregate__00009503 a_8
>>  (cost=0.00..5420961.64 rows=10931794 width=4)
>>                     Filter: (((body #>> '{Cost}'::text[]))::bigint >=
>> 1000000000)
>>               ->  Seq Scan on aggregate__00011162 a_9
>>  (cost=0.00..4262902.64 rows=8560011 width=4)
>>                     Filter: (((body #>> '{Cost}'::text[]))::bigint >=
>> 1000000000)
>>               ->  Seq Scan on aggregate__00012707 a_10
>>  (cost=0.00..4216271.28 rows=9077921 width=4)
>>                     Filter: (((body #>> '{Cost}'::text[]))::bigint >=
>> 1000000000)
>>               ->  Seq Scan on aggregate__00014695 a_11
>>  (cost=0.00..3441205.72 rows=7674495 width=4)
>>                     Filter: (((body #>> '{Cost}'::text[]))::bigint >=
>> 1000000000)
>>               ->  Seq Scan on aggregate__00016457 a_12
>>  (cost=0.00..688010.74 rows=1509212 width=4)
>>                     Filter: (((body #>> '{Cost}'::text[]))::bigint >=
>> 1000000000)
>>               ->  Seq Scan on aggregate__00016805 a_13
>>  (cost=0.00..145219.14 rows=311402 width=4)
>>                     Filter: (((body #>> '{Cost}'::text[]))::bigint >=
>> 1000000000)
>>               ->  Seq Scan on aggregate__00016871 a_14  (cost=0.00..21.40
>> rows=190 width=4)
>>                     Filter: (((body #>> '{Cost}'::text[]))::bigint >=
>> 1000000000)
>>               ->  Seq Scan on aggregate__00016874 a_15
>>  (cost=0.00..478011.62 rows=1031110 width=4)
>>                     Filter: (((body #>> '{Cost}'::text[]))::bigint >=
>> 1000000000)
>>               ->  Seq Scan on aggregate__00017048 a_16  (cost=0.00..21.40
>> rows=190 width=4)
>>                     Filter: (((body #>> '{Cost}'::text[]))::bigint >=
>> 1000000000)
>>               ->  Seq Scan on aggregate__00017049 a_17
>>  (cost=0.00..1792844.42 rows=3164774 width=4)
>>                     Filter: (((body #>> '{Cost}'::text[]))::bigint >=
>> 1000000000)
>>   ->  Hash  (cost=1042.69..1042.69 rows=225 width=4)
>>         ->  Seq Scan on landing l  (cost=0.00..1042.69 rows=225 width=4)
>>               Filter: (source_id = 36)
>>
>> And here is an example of the query using the index when ran against a
>> partition directly
>>
>> select landing_id from landing L
>> where exists
>> (
>> select landing_id
>> from stage.aggregate__00007223 A
>> WHERE (A.body#>>'{Cost}')::BIGINT >= 1000000000
>> and L.landing_id = A.Landing_id
>> )
>> and L.source_id = 36
>>
>> Nested Loop Semi Join  (cost=0.56..3454.75 rows=5 width=4)
>>   ->  Seq Scan on landing l  (cost=0.00..1042.69 rows=225 width=4)
>>         Filter: (source_id = 36)
>>   ->  Index Scan using ix_aggregate__00007223_landing_id_start_datetime
>> on aggregate__00007223 a  (cost=0.56..359345.74 rows=36173 width=4)
>>         Index Cond: (landing_id = l.landing_id)
>>         Filter: (((body #>> '{Cost}'::text[]))::bigint >= 1000000000)
>>
>>
>> The parent table never had rows, and pg_class had relpages=0.  I saw a
>> suggestion in a different thread about updating this value to greater than
>> 0 so I tried that but didnt get a different plan.  We have
>> autovacuum/analyze enabled and also run nightly vacuum/analyze on the
>> database to keep stats up to date.
>>
>> I'm new to troubleshooting partition query performance and not sure what
>> I am missing here.  Any advice is appreciated.
>>
>
>

Re: [PERFORM] query against single partition uses index, against master table does seq scan

Reply via email to