Re: [GENERAL] Index not being used unless enable_seqscan=false
On Wed, 2005-08-10 at 12:01 -0700, Shane wrote:
> Hello all, I am working with a simple table and query but cannot seem
> to get it to use the index I have created.  However, if I set
> enable_seqscan=false, the index is used and the query is much faster.
> I have tried a vacuum analyze but to no avail.
>
> Table layout:
>
>               Table "public.seen"
>   Column  |              Type              | Modifiers
> ----------+--------------------------------+-----------
>  group_id | integer                        | not null
>  msgid    | text                           | not null
>  msgtime  | timestamp(0) without time zone | not null
> Indexes:
>     "seen_group_id_key" unique, btree (group_id, msgid)
>     "seen_msgtime" btree (msgtime)
> Foreign-key constraints:
>     "$1" FOREIGN KEY (group_id) REFERENCES groups(id) ON UPDATE CASCADE ON DELETE CASCADE
>
> explain analyze with enable_seqscan=true:
>
> explain analyze select msgid from seen
>   where msgtime < cast(now() - interval '6 months' as timestamp(0) without time zone);
>
>                                                   QUERY PLAN
> --------------------------------------------------------------------------------------------------------------
>  Seq Scan on seen  (cost=0.00..107879.45 rows=1081044 width=46) (actual time=7597.387..27000.777 rows=28907 loops=1)
>    Filter: (msgtime < ((now() - '6 mons'::interval))::timestamp(0) without time zone)
>  Total runtime: 27096.337 ms
> (3 rows)
>
> Same query with enable_seqscan=false:
>
>                                                   QUERY PLAN
> --------------------------------------------------------------------------------------------------------------
>  Index Scan using seen_msgtime on seen  (cost=0.00..3818325.78 rows=1081044 width=46) (actual time=0.140..156.222 rows=28907 loops=1)
>    Index Cond: (msgtime < ((now() - '6 mons'::interval))::timestamp(0) without time zone)
>  Total runtime: 248.737 ms
> (3 rows)
>
> Any ideas on how I can fix this?  I get this problem now and again
> with other databases, but a vacuum usually fixes it.

Right off the bat (if I am interpreting the results of your explain analyze correctly), it looks like the planner is choosing the seqscan because it thinks it needs to filter over 1 million rows (versus the 29,000 rows that are actually pulled). Perhaps increasing the statistics target on msgtime and then analyzing the table may help. Depending on your hardware, decreasing random_page_cost in your postgresql.conf just a touch may help too.
Sven

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
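Before changing any settings, the enable_seqscan trick from the original post is useful purely as a per-session diagnostic to compare the two plans side by side. A sketch (table and predicate taken from the post; this is not something to leave set in production):

```sql
-- Planner's default choice:
EXPLAIN ANALYZE SELECT msgid FROM seen
 WHERE msgtime < cast(now() - interval '6 months' AS timestamp(0) without time zone);

SET enable_seqscan = off;   -- diagnostic only: steers the planner away from seqscans
EXPLAIN ANALYZE SELECT msgid FROM seen
 WHERE msgtime < cast(now() - interval '6 months' AS timestamp(0) without time zone);
RESET enable_seqscan;       -- restore normal planning for the rest of the session
```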
Re: [GENERAL] Index not being used unless enable_seqscan=false
On Wed, 2005-08-10 at 12:01 -0700, Shane wrote:
> Hello all, I am working with a simple table and query but cannot seem
> to get it to use the index I have created.  However, if I set
> enable_seqscan=false, the index is used and the query is much faster.
> I have tried a vacuum analyze but to no avail.
> [snip]
> explain analyze select msgid from seen
>   where msgtime < cast(now() - interval '6 months' as timestamp(0) without time zone);
>
>                                                   QUERY PLAN
> --------------------------------------------------------------------------------------------------------------
>  Seq Scan on seen  (cost=0.00..107879.45 rows=1081044 width=46) (actual time=7597.387..27000.777 rows=28907 loops=1)
>    Filter: (msgtime < ((now() - '6 mons'::interval))::timestamp(0) without time zone)
>  Total runtime: 27096.337 ms
> (3 rows)
>
> Same query with enable_seqscan=false
> [snip faster plan]
>
> Any ideas on how I can fix this?  I get this problem now and again
> with other databases, but a vacuum usually fixes it.

The planner is not very good at estimating the selectivity of a single inequality. If you can specify a range in the WHERE clause, you might have better luck:

  ... WHERE msgtime < cast(now() - interval '6 months' as timestamp(0) without time zone)
      AND msgtime >= '2000-01-01'

Also, you might want to try increasing the STATISTICS target of msgtime. Sometimes an ORDER BY clause can help the planner choose an indexscan, although in this case the difference in estimated cost is so high that I doubt it.

gnari
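gnari's range-bounded version of the query, written out in full (the lower bound '2000-01-01' is just a placeholder cutoff below any real data; a bounded range gives the planner a second endpoint to estimate selectivity from):

```sql
SELECT msgid
  FROM seen
 WHERE msgtime <  cast(now() - interval '6 months' AS timestamp(0) without time zone)
   AND msgtime >= '2000-01-01';   -- example lower bound; pick one below your oldest row
```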
Re: [GENERAL] Index not being used unless enable_seqscan=false
On Wed, 2005-08-10 at 12:58 -0700, Shane wrote:
> On Wed, Aug 10, 2005 at 03:31:27PM -0400, Sven Willenberger wrote:
> > Right off the bat (if I am interpreting the results of your explain
> > analyze correctly) it looks like the planner is basing its decision
> > to seqscan as it thinks that it needs to filter over 1 million rows
> > (versus the 29,000 rows that actually are pulled).  Perhaps
> > increasing stats on msgtime and then analyzing the table may help.
> > Depending on your hardware, decreasing random_page_cost in your
> > postgresql.conf just a touch may help too.
>
> Thanks for the pointers.  I tried increasing the stats from the
> default of 10 to 25 with no change.  How high would you bring it?
> Also, I've never played with the various cost variables.  The database
> sits on a raid5 partition composed of 4 15k u320 SCSI drives, dual
> xeon 2.8 (HT enabled), 2GB RAM.  I suppose this might actually
> increase the cost of fetching a random disk page, as it may well be on
> another physical disk and wouldn't be in the readahead cache.  Any
> idea as to what it should be on this sort of system?

Try increasing stats to 100 on just the msgtime column, not the default (changing the default will only have an effect on newly created columns -- you may want to change the default back to 10):

ALTER TABLE seen ALTER msgtime SET STATISTICS 100;

After running that command, analyze the table again and see if that helps. I am assuming the culprit is this particular column, as your index and search criteria are based on that one. The default random_page_cost I believe is 4.0; on your system you could probably easily drop it to 3, possibly lower, and see how that performs.

Sven

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?
               http://www.postgresql.org/docs/faq
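Putting Sven's two suggestions together as a session-level experiment (the statistics target 100 and random_page_cost of 3 are the values from the message above; trying the cost change with SET first avoids editing postgresql.conf until the effect is confirmed):

```sql
ALTER TABLE seen ALTER msgtime SET STATISTICS 100;
ANALYZE seen;               -- re-analyze so the new statistics target takes effect
SET random_page_cost = 3;   -- per-session trial; persist in postgresql.conf if it helps
```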
Re: [GENERAL] Index not being used unless enable_seqscan=false
Shane [EMAIL PROTECTED] writes:
> I am working with a simple table and query but cannot seem to get it
> to use the index I have created.
> ...
> explain analyze select msgid from seen
>   where msgtime < cast(now() - interval '6 months' as timestamp(0) without time zone);

As some other people already pointed out, the problem is the horrible misestimate of the number of matching rows. You did not say your Postgres version, but I'm betting it's pre-8.0. Versions before 8.0 would not assume that they could get any useful statistical info from an expression involving now() (or, in general, any non-immutable function). The default assumption in such cases is that a lot of rows are retrieved --- too many for an indexscan.

If you cannot update to 8.0.* at the moment, a workaround is to do the timestamp calculation on the client side, so that you can send over a query that's just a comparison to a constant:

  ... where msgtime < '2005-02-14 ...'::timestamp;

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
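Tom's client-side workaround can be sketched in any client language. A minimal Python illustration (the 30-day approximation of a month and the `cutoff_literal` helper name are assumptions made for this sketch, not from the thread; a real client would pass the value as a bound parameter rather than interpolate it):

```python
from datetime import datetime, timedelta

def cutoff_literal(months=6, now=None):
    """Compute the cutoff timestamp on the client, so the server sees a
    plain constant instead of an expression involving now().  Months are
    approximated as 30 days for this sketch."""
    now = now or datetime.now()
    cutoff = now - timedelta(days=30 * months)
    return cutoff.strftime("%Y-%m-%d %H:%M:%S")

# The query text now compares msgtime against a constant, which pre-8.0
# planners can estimate from the column statistics:
query = "SELECT msgid FROM seen WHERE msgtime < '%s'::timestamp" % cutoff_literal()
```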