Re: [PERFORM] force the use of a particular index
Scott Cain <[EMAIL PROTECTED]> writes: > So, the question is, is there a way I can force the query planner to use > the index I want it to use? No (and I don't think there should be). Given that it *can* generate the plan you want, this is clearly an estimation failure. What is the index it does use? Would you show us EXPLAIN ANALYZE results when using each index? regards, tom lane ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [NOVICE] [PERFORM] Extreme high load averages
Shridhar Daithankar wrote: On 10 Jul 2003 at 0:43, Martin Foster wrote: As for creating a new table, that in itself is a nice idea. But it would cause issues for people currently in the realm. Their posts would essentially dissapear from site and cause more confusion then its worth. No they won't. Say you have a base table and your current post table is child of that. You can query on base table and get rows from child table. That way all the data would always be there. While inserting posts, you would insert in child table. While qeurying you would query on base table. That way things will be optimal. Inheritance would work, but the database would essentially just grow and grow and grow right? Right. But there are two advantages. 1. It will always contain valid posts. No dead tuples. 2. You can work in chuncks of data. Each child table can be dealt with separately without affecting other child tables, whereas in case of a single large table, entire site is affected.. Deleting 100K posts from 101K rows table is vastly different than deleting 10K posts from 2M rows table. Later one would unnecessary starve the table with dead tuples and IO whereas in former case you can do create table as select from and drop the original.. HTH Bye Shridhar -- "[In 'Doctor' mode], I spent a good ten minutes telling Emacs what Ithought of it. (The response was, 'Perhaps you could try to be lessabusive.')"(By Matt Welsh) When I ran EXPLAIN on the views and queries making use of the inherited tables, I noticed that everything worked based on sequence scans and it avoided all indexes.While making use of ONLY kicked in full indexes. This is even after having created a child table with the same indexes as the parent. Is this a known issue, or just some sort of oddity on my setup? Tables still cannot be removed easily, but I found a way to work around it for a day-to-day basis. Essentailly I just clean out the tables containing old rows and delete them later. However based on the above, I doubt performance would get any better. Thanks for the advice however! Martin Foster Creator/Designer Ethereal Realms [EMAIL PROTECTED] ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] force the use of a particular index
Hi Tom, Embarrassingly, I can't. I've been monkeying with the database so much that I can't seem to get it back to the state where I reproduce the behavior I want. A database drop and reload may be the only way, but since that is a time consuming thing to do, I won't be able to do it until this evening. Thanks, Scott On Fri, 2003-07-11 at 11:24, Tom Lane wrote: > Scott Cain <[EMAIL PROTECTED]> writes: > > So, the question is, is there a way I can force the query planner to use > > the index I want it to use? > > No (and I don't think there should be). Given that it *can* generate > the plan you want, this is clearly an estimation failure. What is the > index it does use? Would you show us EXPLAIN ANALYZE results when > using each index? > > regards, tom lane -- Scott Cain, Ph. D. [EMAIL PROTECTED] GMOD Coordinator (http://www.gmod.org/) 216-392-3087 Cold Spring Harbor Laboratory ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] force the use of a particular index
The problem (at least as it appears to me) is not that it is performing a table scan instead of an index scan, it is that it is using the wrong index. Here is the output from EXPLAIN ANALYZE: QUERY PLAN Unique (cost=494008.47..494037.59 rows=166 width=54) (actual time=114660.37..114660.38 rows=1 loops=1) -> Sort (cost=494008.47..494012.63 rows=1664 width=54) (actual time=114660.37..114660.37 rows=1 loops=1) Sort Key: f.name, fl.fmin, fl.fmax, fl.strand, f.type_id, f.feature_id -> Nested Loop (cost=0.00..493919.44 rows=1664 width=54) (actual time=2596.13..114632.90 rows=1 loops=1) -> Index Scan using feature_pkey on feature f (cost=0.00..134601.43 rows=52231 width=40) (actual time=105.74..56048.87 rows=13825 loops=1) Filter: (type_id = 219) -> Index Scan using featureloc_idx1 on featureloc fl (cost=0.00..6.87 rows=1 width=14) (actual time=4.23..4.23 rows=0 loops=13825) Index Cond: ("outer".feature_id = fl.feature_id) Filter: ((srcfeature_id = 6) AND (fmin <= 2585581) AND (fmax >= 2565581)) Total runtime: 114660.91 msec This is the same regardless of enable_seqscan's setting. The index that it is using on featureloc (featureloc_idx1) is on the foreign key feature_id. It should instead be using another index, featureloc_idx3, which is built on (srcfeature_id, fmin, fmax). I should also mention that I've done a VACUUM FULL ANALYZE on this database, and I've been using it for a while, and this is the primary type of query I perform on the database. Thanks, Scott On Fri, 2003-07-11 at 06:51, Rod Taylor wrote: > On Thu, 2003-07-10 at 15:18, Scott Cain wrote: > > Hello, > > > > I am wondering if there is a way to force the use of a particular index > > when doing a query. I have two tables that are pretty big (each >3 > > million rows), and when I do a join between them the performance is > > generally quite poor as it does not use the indexes that I think it > > should use. Here is an example query: > > Please send the EXPLAIN ANALYZE results for that query with and without > sequential scans enabled. > > set enable_seqscan = true; > EXPLAIN ANALYZE ; > > set enable_seqscan = false; > EXPLAIN ANALYZE ; -- Scott Cain, Ph. D. [EMAIL PROTECTED] GMOD Coordinator (http://www.gmod.org/) 216-392-3087 Cold Spring Harbor Laboratory ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] force the use of a particular index
Rod, I see what you mean about the scan on the feature_pkey taking a long time. I tried several things to remedy that. I created an index on feature (feature_id,type_id) (which I don't think makes sense since feature_id is the primary key, so add another column really doesn't help). I also created a index on feature (type_id, feature_id), but the planner doesn't use it. Also, there was an already existing index on feature (type_id) that the planner never used. One thing I tried that changed the query plan and improved performance slightly (but still nowhere near what I need) was to add a partial index on featureloc on (fmin,fmax) where scrfeature_id=6. This is something I could realistically do since there are relatively few (>30) srcfeature_ids that I am interested in, so putting in place a partial index for each of them would not be a big deal. Nevertheless, the performance is still not there. Here is the EXPLAIN ANALYZE for this situation: QUERY PLAN - Unique (cost=156172.23..156200.11 rows=159 width=54) (actual time=63631.93..63631.93 rows=1 loops=1) -> Sort (cost=156172.23..156176.21 rows=1594 width=54) (actual time=63631.93..63631.93 rows=1 loops=1) Sort Key: f.name, fl.fmin, fl.fmax, fl.strand, f.type_id, f.feature_id -> Hash Join (cost=135100.30..156087.46 rows=1594 width=54) (actual time=63631.29..63631.79 rows=1 loops=1) Hash Cond: ("outer".feature_id = "inner".feature_id) -> Index Scan using featureloc_src_6 on featureloc fl (cost=0.00..18064.99 rows=101883 width=14) (actual time=26.11..430.00 rows=570 loops=1) Index Cond: ((fmin <= 2585581) AND (fmax >= 2565581)) Filter: (srcfeature_id = 6) -> Hash (cost=134601.43..134601.43 rows=48347 width=40) (actual time=63182.86..63182.86 rows=0 loops=1) -> Index Scan using feature_pkey on feature f (cost=0.00..134601.43 rows=48347 width=40) (actual time=69.98..62978.27 rows=13825 loops=1) Filter: (type_id = 219) Total runtime: 63632.28 msec (12 rows) Any other ideas? Thanks, Scott On Fri, 2003-07-11 at 09:38, Rod Taylor wrote: > On Fri, 2003-07-11 at 13:17, Scott Cain wrote: > > The problem (at least as it appears to me) is not that it is performing > > a table scan instead of an index scan, it is that it is using the wrong > > index. Here is the output from EXPLAIN ANALYZE: > > > > QUERY PLAN > > > > Unique (cost=494008.47..494037.59 rows=166 width=54) (actual > > time=114660.37..114660.38 rows=1 loops=1) > >-> Sort (cost=494008.47..494012.63 rows=1664 width=54) (actual > > time=114660.37..114660.37 rows=1 loops=1) > > Sort Key: f.name, fl.fmin, fl.fmax, fl.strand, f.type_id, f.feature_id > > -> Nested Loop (cost=0.00..493919.44 rows=1664 width=54) (actual > > time=2596.13..114632.90 rows=1 loops=1) > >-> Index Scan using feature_pkey on feature f > > (cost=0.00..134601.43 rows=52231 width=40) (actual time=105.74..56048.87 > > rows=13825 loops=1) > > Filter: (type_id = 219) > >-> Index Scan using featureloc_idx1 on featureloc fl > > (cost=0.00..6.87 rows=1 width=14) (actual time=4.23..4.23 rows=0 loops=13825) > > Index Cond: ("outer".feature_id = fl.feature_id) > > Filter: ((srcfeature_id = 6) AND (fmin <= 2585581) AND (fmax > > >= 2565581)) > > Total runtime: 114660.91 msec > > > it is using on featureloc (featureloc_idx1) is on the foreign key > > feature_id. It should instead be using another index, featureloc_idx3, > > which is built on (srcfeature_id, fmin, fmax). > > Nope.. The optimizer is right in the decision to use featureloc_idx1. > You will notice it is expecting to retrieve a single row from this > index, but the featureloc_idx3 is bound to be larger (due to indexing > more data), thus take more disk reads for the exact same information (or > in this case, lack thereof). > > What is taking a long time is the scan on feature_pkey. It looks like it > is throwing away a ton of rows that are not type_id = 219. Either that, > or you do a pile of deletes and haven't run REINDEX recently. > > Create an index consisting of (feature_id, type_id). This will probably > make a significant different in execution t
Re: [PERFORM] force the use of a particular index
On Fri, 2003-07-11 at 11:36, Scott Cain wrote: > Rod, > > I see what you mean about the scan on the feature_pkey taking a long > time. I tried several things to remedy that. I created an index on > feature (feature_id,type_id) (which I don't think makes sense since > feature_id is the primary key, so add another column really doesn't It may be the primary key, but the system looked like it was throwing away many rows based on type_id. If it was throwing away many more rows than found, the index with type_id may have been cheaper. It is difficult to tell from an EXPLAIN ANALYZE as it doesn't tell you exactly how many rows were filtered, just the cost to read them and how many were used after the filter. > help). I also created a index on feature (type_id, feature_id), but the > planner doesn't use it. Also, there was an already existing index on > feature (type_id) that the planner never used. It cannot use more than one index for a given table scan at the moment. There are proposals on how to 'fix' that, but those require significant overhauls of various systems. > Any other ideas? Out of curiosity, what do you get if you disable hash joins? set enable_hashjoin = false; How about a partial index on (feature_id) where type_id = 219? signature.asc Description: This is a digitally signed message part
Re: [PERFORM] force the use of a particular index
On Fri, 2003-07-11 at 14:14, Rod Taylor wrote: > > > set enable_hashjoin = false; > > > > BINGO! > > I'm not so sure about that. Your dataset seems to have changed fairly > significantly since the last test. > > > -> Index Scan using feature_pkey on feature f (cost=0.00..134592.43 rows=47912 > > width=39) (actual time=0.46..502.50 rows=431 loops=1) > > Notice it only pulled out 431 rows where prior runs pulled out several > thousand (~13000). I think what really happened was something came > along and deleted a bunch of stuff, then vacuum ran. There is nearly a zero chance that happened. This database is accessible only by me, I haven't deleted anything. The only things I have done is to create and drop various indexes and run vacuum. Is there anything else that could explain the difference? Is the index scan on feature_pkey using information from the index scan on featureloc_src_6 to limit the number of rows to get from feature? Scott -- Scott Cain, Ph. D. [EMAIL PROTECTED] GMOD Coordinator (http://www.gmod.org/) 216-392-3087 Cold Spring Harbor Laboratory ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] force the use of a particular index
Scott Cain <[EMAIL PROTECTED]> writes: > Embarrassingly, I can't. I've been monkeying with the database so much > that I can't seem to get it back to the state where I reproduce the > behavior I want. If the thing works as desired after a VACUUM ANALYZE, then I suggest the estimation failure was just due to out-of-date statistics ... regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly