Re: [PERFORM] Query not using Index
On Sun, Mar 27, 2016 at 9:12 AM, Wei Shan wrote:
> Hi Andreas,
>
> The tablespace is not on SSD although I intend to do it within the next
> week. I actually tried reducing the random_page_cost to 0.2 but it doesn't
> help.

Setting random_page_cost to less than seq_page_cost is nonsensical. You could try to increase cpu_tuple_cost to 0.015 or 0.02.

Cheers,

Jeff

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
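Jeff's cpu_tuple_cost suggestion can be tried per session before touching postgresql.conf. A minimal sketch (cpu_tuple_cost is a real PostgreSQL planner setting, default 0.01; the query is the one from this thread):

```sql
-- Raise the per-tuple CPU cost estimate for this session only,
-- which makes a 180-million-row sequential scan look relatively
-- more expensive than the index plan.
SET cpu_tuple_cost = 0.02;          -- default is 0.01
EXPLAIN ANALYZE
SELECT max(sequencenumber) FROM testtable WHERE nodeid = 'eleanor1';
RESET cpu_tuple_cost;
```

If the planner then switches to the index plan and it is actually faster, the value can be made persistent in postgresql.conf.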
Re: [PERFORM] Query not using Index
Hi Andreas,

The tablespace is not on SSD although I intend to do it within the next week. I actually tried reducing the random_page_cost to 0.2 but it doesn't help.

On 26 March 2016 at 22:13, Andreas Kretschmer wrote:
> Wei Shan wrote:
> > Hi all,
> >
> > Please provide some advice on the following query not using the index:
> > I have 2 questions:
> >
> > 1. Why does the optimizer choose not to use the index when it will run faster?
>
> Because of the estimated costs:
>
> Seq Scan on testdb auditrecor0_ (cost=0.00..18147465.00
> Bitmap Heap Scan on testdb auditrecor0_ (cost=2291521.32..19046381.97
>
> The estimated costs for the index scan are higher.
>
> > 2. How do I ensure the optimizer will use the index without setting
> > enable_seqscan='off'
>
> You have a dedicated tablespace for indexes; is this an SSD? You can try
> to reduce random_page_cost from the default 4 to maybe 2 (depends on the
> hardware). This would reduce the estimated costs for the index scan and
> prefer the index scan.
>
> Regards, Andreas Kretschmer
> --
> Andreas Kretschmer
> http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Regards,
Ang Wei Shan
Re: [PERFORM] Query not using Index
Wei Shan wrote:
> Hi all,
>
> Please provide some advice on the following query not using the index:
> I have 2 questions:
>
> 1. Why does the optimizer choose not to use the index when it will run faster?

Because of the estimated costs:

Seq Scan on testdb auditrecor0_ (cost=0.00..18147465.00
Bitmap Heap Scan on testdb auditrecor0_ (cost=2291521.32..19046381.97

The estimated costs for the index scan are higher.

> 2. How do I ensure the optimizer will use the index without setting
> enable_seqscan='off'

You have a dedicated tablespace for indexes; is this an SSD? You can try to reduce random_page_cost from the default 4 to maybe 2 (depends on the hardware). This would reduce the estimated costs for the index scan and prefer the index scan.

Regards, Andreas Kretschmer
--
Andreas Kretschmer
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
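Andreas's random_page_cost experiment can be scoped to one session, so a badly chosen value never affects other queries. A sketch (the values are illustrative, not a recommendation):

```sql
-- Lower the estimated cost of a random page read for this session only.
-- It should stay >= seq_page_cost (default 1); setting it below that
-- is nonsensical, as Jeff points out later in the thread.
SET random_page_cost = 2;           -- default is 4
EXPLAIN ANALYZE
SELECT max(sequencenumber) FROM testtable WHERE nodeid = 'eleanor1';
RESET random_page_cost;
```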
[PERFORM] Query not using Index
Hi all,

Please provide some advice on the following query not using the index:

pgsql version: 9.2.4
OS version: RedHat 6.5
RAM: 64 GB
rows in testdb: 180 million
shared_buffers: 16GB
effective_cache_size: 32GB
work_mem: 32MB

I executed the query below after I vacuum analyzed the table. I have 2 questions:

1. Why does the optimizer choose not to use the index when it will run faster?
2. How do I ensure the optimizer will use the index without setting enable_seqscan='off'?

Table structure:

testdb=# \d testtable
            Table "public.testtable"
      Column       |  Type   | Modifiers
-------------------+---------+-----------
 pk                | text    | not null
 additionaldetails | text    |
 authtoken         | text    | not null
 customid          | text    |
 eventstatus       | text    | not null
 eventtype         | text    | not null
 module            | text    | not null
 nodeid            | text    | not null
 rowprotection     | text    |
 rowversion        | integer | not null
 searchdetail1     | text    |
 searchdetail2     | text    |
 sequencenumber    | bigint  | not null
 service           | text    | not null
 timestamp         | bigint  | not null
Indexes:
    "testtable_pkey" PRIMARY KEY, btree (pk)
    "testtable_nodeid_eleanor1_idx" btree (nodeid) WHERE nodeid = 'eleanor1'::text, tablespace "tablespace_index"
    "testtable_nodeid_eleanor2_idx" btree (nodeid) WHERE nodeid = 'eleanor2'::text, tablespace "tablespace_index"
    "testtable_nodeid_eleanor3_idx" btree (nodeid) WHERE nodeid = 'eleanor3'::text, tablespace "tablespace_index"

Explain plan with enable_seqscan='on':

testdb=# explain analyze select max ( auditrecor0_.sequenceNumber ) AS col_0_0_ From testdb auditrecor0_ where auditrecor0_.nodeid = 'eleanor1';
                                                         QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=18291486.05..18291486.06 rows=1 width=8) (actual time=484907.446..484907.446 rows=1 loops=1)
   ->  Seq Scan on testdb auditrecor0_  (cost=0.00..18147465.00 rows=57608421 width=8) (actual time=0.166..473959.126 rows=57801797 loops=1)
         Filter: (nodeid = 'eleanor1'::text)
         Rows Removed by Filter: 126233820
 Total runtime: 484913.013 ms
(5 rows)

Explain plan with enable_seqscan='off':

testdb=# explain analyze select max ( auditrecor0_.sequenceNumber ) AS col_0_0_ From testdb auditrecor0_ where auditrecor0_.nodeid = 'eleanor3';
                                                         QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=19226040.50..19226040.51 rows=1 width=8) (actual time=388293.245..388293.245 rows=1 loops=1)
   ->  Bitmap Heap Scan on testdb auditrecor0_  (cost=2291521.32..19046381.97 rows=71863412 width=8) (actual time=15626.372..375378.362 rows=71412687 loops=1)
         Recheck Cond: (nodeid = 'eleanor3'::text)
         Rows Removed by Index Recheck: 900820
         ->  Bitmap Index Scan on testdb_nodeid_eleanor3_idx  (cost=0.00..2273555.47 rows=71863412 width=0) (actual time=15503.465..15503.465 rows=71412687 loops=1)
               Index Cond: (nodeid = 'eleanor3'::text)
 Total runtime: 388294.378 ms
(7 rows)

Thanks!

--
Regards,
Ang Wei Shan
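One option not raised in the thread, sketched here as an assumption rather than a tested fix: the partial indexes above contain only nodeid, so even when one is used, max(sequencenumber) still has to visit every matching heap row. A partial index keyed on sequencenumber would let the aggregate be answered by a single probe at the high end of the index (index names are hypothetical):

```sql
-- Hypothetical index, one per nodeid value, mirroring the existing
-- partial indexes but keyed on the aggregated column.
CREATE INDEX testtable_eleanor1_seq_idx
    ON testtable (sequencenumber)
    WHERE nodeid = 'eleanor1';

-- max() can then be satisfied by a backward scan of the index tip
-- instead of scanning ~58 million rows.
SELECT max(sequencenumber) FROM testtable WHERE nodeid = 'eleanor1';
```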
Re: [PERFORM] query not using index
On 23/12/13 21:58, Johann Spies wrote:
> On 19 December 2013 16:48, Tom Lane <t...@sss.pgh.pa.us> wrote:
>
>     Johann Spies <johann.sp...@gmail.com> writes:
>     > I would appreciate some help optimising the following query:
>
>     It's a mistake to imagine that indexes are going to help much with a
>     join of this size. Hash or merge join is going to be a lot better
>     than nestloop. What you need to do is make sure those will perform
>     as well as possible, and to that end, it'd likely help to raise
>     work_mem. I'm not sure if you can sanely put it high enough to make
>     the query operate totally in memory --- it looks like you'd need
>     work_mem of 500MB or more to prevent any of the sorts or hashes from
>     spilling to disk, and keep in mind that this query is going to use
>     several times work_mem because there are multiple sorts/hashes going
>     on. But if you can transiently dedicate a lot of RAM to this query,
>     that should help some. I'd suggest increasing work_mem via a SET
>     command in the particular session running this query --- you don't
>     want such a high value to be the global default.
>
> Thanks Tom. Raising work_mem from 384MB to 512MB made a significant
> difference.
>
> You said hash or merge join is going to be a lot better than nestloop.
> Is that purely in the hands of the query planner, or what can I do to
> get the planner to use those options apart from raising work_mem?

You can disable the hash and merge join options by doing:

SET enable_hashjoin=off;
SET enable_mergejoin=off;

before running the query again. Timing it (or EXPLAIN ANALYZE) should demonstrate whether the planner made the right call by choosing hash or merge in the first place.

regards

Mark
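Mark's procedure end to end, as a sketch (the SET/RESET commands are real PostgreSQL settings; the trivial SELECT stands in for Johann's full CTE query):

```sql
-- Force the planner away from hash and merge joins for this session,
-- leaving nested loop as the remaining join strategy, then compare
-- timings against the unrestricted plan.
SET enable_hashjoin = off;
SET enable_mergejoin = off;
EXPLAIN ANALYZE SELECT 1;           -- substitute the real query here
RESET enable_hashjoin;
RESET enable_mergejoin;
```

If the forced nested-loop plan is slower, the planner's original choice of hash or merge join was correct.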
Re: [PERFORM] query not using index
On 19 December 2013 16:48, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Johann Spies <johann.sp...@gmail.com> writes:
> > I would appreciate some help optimising the following query:
>
> It's a mistake to imagine that indexes are going to help much with a
> join of this size. Hash or merge join is going to be a lot better than
> nestloop. What you need to do is make sure those will perform as well
> as possible, and to that end, it'd likely help to raise work_mem. I'm
> not sure if you can sanely put it high enough to make the query operate
> totally in memory --- it looks like you'd need work_mem of 500MB or
> more to prevent any of the sorts or hashes from spilling to disk, and
> keep in mind that this query is going to use several times work_mem
> because there are multiple sorts/hashes going on. But if you can
> transiently dedicate a lot of RAM to this query, that should help some.
> I'd suggest increasing work_mem via a SET command in the particular
> session running this query --- you don't want such a high value to be
> the global default.

Thanks Tom. Raising work_mem from 384MB to 512MB made a significant difference.

You said hash or merge join is going to be a lot better than nestloop. Is that purely in the hands of the query planner, or what can I do to get the planner to use those options apart from raising work_mem?

Regards
Johann
--
Because experiencing your loyal love is better than life itself, my lips will praise you. (Psalm 63:3)
[PERFORM] query not using index
I would appreciate some help optimising the following query:

with subject_journals as (
    select A.sq
      from isi.rissue A,
           isi.rsc_joern_link C
     WHERE C.sc_id in ('d0963875-e438-4923-b3fa-f462e8975221',
                       '04e14284-09c8-421a-b1ad-c8238051601a',
                       '04e2189f-cd2a-44f0-b98d-52f6bb5dcd78',
                       'f5521c65-ec49-408a-9a42-8a69d47703cd',
                       '2e47ae2f-2c4d-433e-8bdf-9983eeeafc42',
                       '5d3639b1-04c2-4d94-a99a-5323277fd2b7')
       AND C.rj_id = A.uuid),
subject_articles as (
    SELECT B.article_id as art_id
      FROM isi.isi_l1_publication B,
           subject_journals A,
           isi.ritem C
     WHERE A.sq = B.journal_id
       AND B.publication_year <= '2012'
       AND B.publication_year >= '2000'
       AND C.ut = B.article_id
       AND C.dt in ('@ Article', 'Review')),
country_articles as (
    SELECT A.art_id
      FROM isi.art_country_link A
     WHERE A.countrycode = 'ZA')
select art_id from subject_articles
INTERSECT
select art_id from country_articles;

EXPLAIN ANALYZE shows that it is not using the indexes on either isi.isi_l1_publication or isi.ritem (both tables with more than 43 million records):

HashSetOp Intersect  (cost=10778065.50..11227099.44 rows=200 width=48) (actual time=263120.868..263279.467 rows=4000 loops=1)
  Output: "*SELECT* 1".art_id, (0)
  Buffers: shared hit=627401 read=4347235, temp read=234498 written=234492
  CTE subject_journals
    ->  Hash Join  (cost=12846.55..17503.27 rows=28818 width=8) (actual time=99.762..142.439 rows=30291 loops=1)
          Output: a.sq
          Hash Cond: ((c.rj_id)::text = (a.uuid)::text)
          Buffers: shared hit=12232
          ->  Bitmap Heap Scan on isi.rsc_joern_link c  (cost=1020.92..5029.23 rows=28818 width=37) (actual time=4.238..15.806 rows=30291 loops=1)
                Output: c.id, c.rj_id, c.sc_id
                Recheck Cond: ((c.sc_id)::text = ANY ('{d0963875-e438-4923-b3fa-f462e8975221,04e14284-09c8-421a-b1ad-c8238051601a,04e2189f-cd2a-44f0-b98d-52f6bb5dcd78,f5521c65-ec49-408a-9a42-8a69d47703cd,2e47ae2f-2c4d-433e-8bdf-9983eeeafc42,5d3639b1-04c2-4 (...)
                Buffers: shared hit=3516
                ->  Bitmap Index Scan on rsc_joern_link_sc_id_idx  (cost=0.00..1013.72 rows=28818 width=0) (actual time=3.722..3.722 rows=30291 loops=1)
                      Index Cond: ((c.sc_id)::text = ANY ('{d0963875-e438-4923-b3fa-f462e8975221,04e14284-09c8-421a-b1ad-c8238051601a,04e2189f-cd2a-44f0-b98d-52f6bb5dcd78,f5521c65-ec49-408a-9a42-8a69d47703cd,2e47ae2f-2c4d-433e-8bdf-9983eeeafc42,5d3639b1-04 (...)
                      Buffers: shared hit=237
          ->  Hash  (cost=10098.06..10098.06 rows=138206 width=45) (actual time=95.495..95.495 rows=138206 loops=1)
                Output: a.sq, a.uuid
                Buckets: 16384  Batches: 1  Memory Usage: 10393kB
                Buffers: shared hit=8716
                ->  Seq Scan on isi.rissue a  (cost=0.00..10098.06 rows=138206 width=45) (actual time=0.005..58.225 rows=138206 loops=1)
                      Output: a.sq, a.uuid
                      Buffers: shared hit=8716
  CTE subject_articles
    ->  Merge Join  (cost=9660996.21..9896868.27 rows=13571895 width=16) (actual time=229449.020..259557.073 rows=2513896 loops=1)
          Output: b.article_id
          Merge Cond: ((a.sq)::text = (b.journal_id)::text)
          Buffers: shared hit=519891 read=4347235, temp read=234498 written=234492
          ->  Sort  (cost=2711.01..2783.05 rows=28818 width=32) (actual time=224.901..230.615 rows=30288 loops=1)
                Output: a.sq
                Sort Key: a.sq
                Sort Method: quicksort  Memory: 2188kB
                Buffers: shared hit=12232
                ->  CTE Scan on subject_journals a  (cost=0.00..576.36 rows=28818 width=32) (actual time=99.764..152.459 rows=30291 loops=1)
                      Output: a.sq
                      Buffers: shared hit=12232
          ->  Materialize  (cost=9658285.21..9722584.29 rows=12859816 width=24) (actual time=229223.851..253191.308 rows=14664245 loops=1)
                Output: b.article_id, b.journal_id
                Buffers: shared hit=507659 read=4347235, temp read=234498 written=234492
                ->  Sort  (cost=9658285.21..9690434.75 rows=12859816 width=24) (actual time=229223.846..251142.167 rows=14072645 loops=1)
                      Output: b.article_id, b.journal_id
                      Sort Key: b.journal_id
                      Sort Method: external merge  Disk: 467704kB
                      Buffers: shared hit=507659 read=4347235, temp read=234498 written=234492
Re: [PERFORM] query not using index
Johann Spies <johann.sp...@gmail.com> writes:
> I would appreciate some help optimising the following query:

It's a mistake to imagine that indexes are going to help much with a join of this size. Hash or merge join is going to be a lot better than nestloop. What you need to do is make sure those will perform as well as possible, and to that end, it'd likely help to raise work_mem. I'm not sure if you can sanely put it high enough to make the query operate totally in memory --- it looks like you'd need work_mem of 500MB or more to prevent any of the sorts or hashes from spilling to disk, and keep in mind that this query is going to use several times work_mem because there are multiple sorts/hashes going on. But if you can transiently dedicate a lot of RAM to this query, that should help some.

I'd suggest increasing work_mem via a SET command in the particular session running this query --- you don't want such a high value to be the global default.

regards, tom lane
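Tom's session-local SET can be scoped even more tightly with SET LOCAL, which reverts automatically at the end of the transaction; a sketch (the value mirrors the 512MB that worked for Johann):

```sql
BEGIN;
SET LOCAL work_mem = '512MB';       -- applies to this transaction only
-- run the expensive query here; each sort/hash gets the larger budget,
-- so the "external merge Disk" sorts can stay in memory
COMMIT;                             -- work_mem reverts to its previous value
```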
Re: [PERFORM] Query not using index
Hi all,

Thanks for the reply. I made some more tests and found out that the problem is with the <<= operator for the network type. Can I create an index which works with <<=? Because if I use = the index is used, but not for <<=.

iplog=# explain analyze SELECT *
iplog-#    FROM croute
iplog-#    WHERE '193.68.0.10/32' <<= network;
                                               QUERY PLAN
---------------------------------------------------------------------------------------------------------
 Seq Scan on croute  (cost=0.00..707.27 rows=4891 width=103) (actual time=10.313..29.621 rows=2 loops=1)
   Filter: ('193.68.0.10/32'::cidr <<= network)
 Total runtime: 29.729 ms
(3 rows)

iplog=# explain analyze SELECT *
iplog-#    FROM croute
iplog-#    WHERE '193.68.0.10/32' = network;
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Index Scan using croute_network_all on croute  (cost=0.00..17.99 rows=4 width=103) (actual time=0.053..0.059 rows=1 loops=1)
   Index Cond: ('193.68.0.10/32'::cidr = network)
 Total runtime: 0.167 ms
(3 rows)

Waiting for replies. Thanks to all in advance.

Kaloyan Iliev

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

               http://archives.postgresql.org
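For later readers: at the time of this 2005 thread, a btree index on cidr could serve only equality and ordering comparisons, which is why the containment operator fell back to a sequential scan. Since PostgreSQL 9.4 the inet_ops GiST operator class supports the containment operators directly; a sketch (the index name is hypothetical):

```sql
-- Requires PostgreSQL 9.4 or later.
CREATE INDEX croute_network_gist ON croute USING gist (network inet_ops);

-- Containment lookups such as <<= / >>= can now use the index:
EXPLAIN SELECT * FROM croute WHERE '193.68.0.10/32' <<= network;
```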
[PERFORM] Query not using index
Hi all,

I have a problem with a query which doesn't want to use indexes. I tried to create different indexes but nothing helped. Can anyone suggest what index I need? This query is executed 1.5 million times per day and I need it to be very fast. I made my test on 8.0.0 beta but the production database is still 7.4.6, so I need suggestions for 7.4.6. I will post the table with the indexes and the query plans.

iplog=# \d croute
                 Table "public.croute"
   Column    |           Type           | Modifiers
-------------+--------------------------+-----------
 confid      | integer                  |
 network     | cidr                     |
 comment     | text                     |
 router      | text                     |
 port        | text                     |
 valid_at    | timestamp with time zone |
 archived_at | timestamp with time zone |
Indexes:
    croute_netwo btree (network) WHERE confid > 0 AND archived_at IS NULL
    croute_netwokr_valid_at btree (network, valid_at)
    croute_network btree (network) WHERE archived_at IS NULL
    croute_network_all btree (network)

iplog=# select version();
                                                  version
------------------------------------------------------------------------------------------------------------
 PostgreSQL 8.0.0beta1 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.2 (Mandrake Linux 10.0 3.3.2-6mdk)
(1 row)

THIS IS THE QUERY!

customer=> explain analyze SELECT *
customer->    FROM croute
customer->    WHERE '193.68.0.8/32' <<= network AND
customer->          (archived_at is NULL OR archived_at > '17-11-2005') AND
customer->          valid_at <= '1-12-2005'::date AND
customer->          confid > 0;
                                               QUERY PLAN
---------------------------------------------------------------------------------------------------------
 Seq Scan on croute  (cost=0.00..441.62 rows=413 width=102) (actual time=14.131..37.515 rows=1 loops=1)
   Filter: (('193.68.0.8/32'::cidr <<= network) AND ((archived_at IS NULL) OR (archived_at > '2005-11-17 00:00:00+02'::timestamp with time zone)) AND (valid_at <= ('2005-12-01'::date)::timestamp with time zone) AND (confid > 0))
 Total runtime: 37.931 ms
(3 rows)

customer=> select count(*) from croute;
 count
-------
 10066
(1 row)

This is the result of the query:

 confid |    network     | comment | router | port |        valid_at        | archived_at
--------+----------------+---------+--------+------+------------------------+-------------
  19971 | xx.xx.xx.xx/32 | x       | ?      | ?    | 2005-03-11 00:00:00+02 |
(1 row)

And last, I tried to stop the sequential scan but it doesn't help. I suppose I don't have the right index.

iplog=# set enable_seqscan = off;
SET
iplog=# explain analyze SELECT *
iplog-#    FROM croute
iplog-#    WHERE '193.68.0.8/32' <<= network AND
iplog-#          (archived_at is NULL OR archived_at > '17-11-2005') AND
iplog-#          valid_at <= '1-12-2005'::date AND
iplog-#          confid > 0;
                                               QUERY PLAN
---------------------------------------------------------------------------------------------------------
 Seq Scan on croute  (cost=1.00..10780.64 rows=1030 width=103) (actual time=29.593..29.819 rows=1 loops=1)
   Filter: (('193.68.0.8/32'::cidr <<= network) AND ((archived_at IS NULL) OR (archived_at > '2005-11-17 00:00:00+02'::timestamp with time zone)) AND (valid_at <= '2005-12-01'::date) AND (confid > 0))
 Total runtime: 29.931 ms
(3 rows)

I tried creating one last index on all fields but it doesn't help.

iplog=# CREATE INDEX croute_all on croute(network,archived_at,valid_at,confid);
CREATE INDEX
iplog=# explain analyze SELECT *
iplog-#    FROM croute
iplog-#    WHERE '193.68.0.8/32' <<= network AND
iplog-#          (archived_at is NULL OR archived_at > '17-11-2005') AND
iplog-#          valid_at <= '1-12-2005'::date AND
iplog-#          confid > 0;
Re: [PERFORM] Query not using index
On 12/9/05, Kaloyan Iliev <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> I have a problem with a query which doesn't want to use indexes. I tried
> to create different indexes but nothing helped. Can anyone suggest what
> index I need? This query is executed 1.5 million times per day and I need
> it to be very fast. I made my test on 8.0.0 beta but the production
> database is still 7.4.6, so I need suggestions for 7.4.6. I will post the
> table with the indexes and the query plans.
>
> [table definition, EXPLAIN ANALYZE output, and index experiments quoted in
> full from the original message above]