Re: [PERFORM] Index not used
> Or, better, persuade the app to label the value "public.push_guid" since that is the column's type...a type you haven't defined for us. If you get to add explicit casts this should be easy...but I'm not familiar with the framework you are using.

push_guid was a CHARACTER(36) column. I ended up converting it to CHARACTER VARYING(36). The index is now being used and performance is as expected.

Thanks a lot,
Meike
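For the record, the conversion Meike describes would look roughly like the following (a hypothetical sketch; depending on how the push_guid domain is defined, the ALTER may need a USING clause, and it rewrites the table and rebuilds the index):

```sql
-- re-type the column so comparisons against text literals resolve
-- directly, letting the rebuilt btree index be used
ALTER TABLE push_topic
  ALTER COLUMN guid TYPE character varying(36);
```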
Re: [PERFORM] Index not used
On Thu, Jun 16, 2016 at 11:05 AM, Tom Lane wrote:
> meike.talb...@women-at-work.org writes:
> > When I query this through pgsql, the queries are fast as expected.
> > select * from push_topic where guid = 'DD748CCD-B8A4-3B9F-8F60-67F1F673CFE5'
> > Index Scan using push_topic_idx_topicguid on push_topic (cost=0.42..8.44 rows=1 width=103) (actual time=0.117..0.121 rows=1 loops=1)
> >   Index Cond: ((guid)::bpchar = 'DD748CCD-B8A4-3B9F-8F60-67F1F673CFE5'::bpchar)
> >   Buffers: shared hit=3 read=1
> > Total runtime: 0.191 ms
> > However when I run the exact query through a different application (CodeSynthesis ORM) the query is very slow (~115 ms logged)
> > I noted this is due to a sequential scan happening on the table instead of an index scan.
>
> It looks like what that app is actually issuing is something different from what you tested by hand, to wit
>
> select * from push_topic where guid = 'DD748CCD-B8A4-3B9F-8F60-67F1F673CFE5'::text
>
> which causes the comparison to be resolved as texteq not bpchareq, ie you effectively have
>
> select * from push_topic where guid::text = 'DD748CCD-B8A4-3B9F-8F60-67F1F673CFE5'::text
>
> and that doesn't match a bpchar index. If you can't persuade the app to label the comparison value as bpchar not text, the easiest fix would be to create an additional index on "guid::text".

Or, better, persuade the app to label the value "public.push_guid" since that is the column's type...a type you haven't defined for us. If you get to add explicit casts this should be easy...but I'm not familiar with the framework you are using.

David J.
Re: [PERFORM] Index not used
meike.talb...@women-at-work.org writes:
> When I query this through pgsql, the queries are fast as expected.
> select * from push_topic where guid = 'DD748CCD-B8A4-3B9F-8F60-67F1F673CFE5'
> Index Scan using push_topic_idx_topicguid on push_topic (cost=0.42..8.44 rows=1 width=103) (actual time=0.117..0.121 rows=1 loops=1)
>   Index Cond: ((guid)::bpchar = 'DD748CCD-B8A4-3B9F-8F60-67F1F673CFE5'::bpchar)
>   Buffers: shared hit=3 read=1
> Total runtime: 0.191 ms
> However when I run the exact query through a different application (CodeSynthesis ORM) the query is very slow (~115 ms logged)
> I noted this is due to a sequential scan happening on the table instead of an index scan.

It looks like what that app is actually issuing is something different from what you tested by hand, to wit

select * from push_topic where guid = 'DD748CCD-B8A4-3B9F-8F60-67F1F673CFE5'::text

which causes the comparison to be resolved as texteq not bpchareq, ie you effectively have

select * from push_topic where guid::text = 'DD748CCD-B8A4-3B9F-8F60-67F1F673CFE5'::text

and that doesn't match a bpchar index. If you can't persuade the app to label the comparison value as bpchar not text, the easiest fix would be to create an additional index on "guid::text".

regards, tom lane

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
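Tom's two suggested fixes can be sketched like this (hypothetical statements using the names from the thread, untested against the original schema):

```sql
-- Fix 1: an expression index that matches the text comparison the ORM emits
CREATE INDEX push_topic_idx_topicguid_text ON push_topic ((guid::text));

-- Fix 2: label the comparison value with the column's own type so the
-- existing bpchar index applies
SELECT * FROM push_topic
 WHERE guid = 'DD748CCD-B8A4-3B9F-8F60-67F1F673CFE5'::public.push_guid;
```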
Re: [PERFORM] Index not used
When you run psql, are you running that on the application server or the database server? Does the application run on the same server as the database, and how is the application connecting to the database (JDBC, ODBC, etc.)? In other words, is there a difference in network time between the two?

Also, the queries are not exactly the same. With psql you use "select *" and the application specifies what columns it wants returned and the order to return them. Try running the exact query on both.

Regards,
John

-----Original Message-----
From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of meike.talb...@women-at-work.org
Sent: Thursday, June 16, 2016 12:59 AM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Index not used

Hello,

I've a basic table with about 100K rows:

CREATE TABLE "public"."push_topic" (
  "id" Serial PRIMARY KEY,
  "guid" public.push_guid NOT NULL,
  "authenticatorsending" Varchar(32) NOT NULL,
  "authenticatorsubscription" Varchar(32) NOT NULL,
  "countpushed" Integer NOT NULL,
  "datecreated" timestamp NOT NULL,
  "datelastpush" timestamp
)

CREATE UNIQUE INDEX push_topic_idx_topicguid ON push_topic USING btree (guid)

When I query this through pgsql, the queries are fast as expected. This is the query:

select * from push_topic where guid = 'DD748CCD-B8A4-3B9F-8F60-67F1F673CFE5'

And the plan:

Index Scan using push_topic_idx_topicguid on push_topic (cost=0.42..8.44 rows=1 width=103) (actual time=0.117..0.121 rows=1 loops=1)
  Index Cond: ((guid)::bpchar = 'DD748CCD-B8A4-3B9F-8F60-67F1F673CFE5'::bpchar)
  Buffers: shared hit=3 read=1
Total runtime: 0.191 ms

However when I run the exact query through a different application (CodeSynthesis ORM) the query is very slow (~115 ms logged). I noted this is due to a sequential scan happening on the table instead of an index scan.
[PERFORM] Index not used
Hello,

I've a basic table with about 100K rows:

CREATE TABLE "public"."push_topic" (
  "id" Serial PRIMARY KEY,
  "guid" public.push_guid NOT NULL,
  "authenticatorsending" Varchar(32) NOT NULL,
  "authenticatorsubscription" Varchar(32) NOT NULL,
  "countpushed" Integer NOT NULL,
  "datecreated" timestamp NOT NULL,
  "datelastpush" timestamp
)

CREATE UNIQUE INDEX push_topic_idx_topicguid ON push_topic USING btree (guid)

When I query this through pgsql, the queries are fast as expected. This is the query:

select * from push_topic where guid = 'DD748CCD-B8A4-3B9F-8F60-67F1F673CFE5'

And the plan:

Index Scan using push_topic_idx_topicguid on push_topic (cost=0.42..8.44 rows=1 width=103) (actual time=0.117..0.121 rows=1 loops=1)
  Index Cond: ((guid)::bpchar = 'DD748CCD-B8A4-3B9F-8F60-67F1F673CFE5'::bpchar)
  Buffers: shared hit=3 read=1
Total runtime: 0.191 ms

However when I run the exact query through a different application (CodeSynthesis ORM) the query is very slow (~115 ms logged). I noted this is due to a sequential scan happening on the table instead of an index scan.
This is query plan in the log file: LOG: plan: DETAIL: {PLANNEDSTMT :commandType 1 :queryId 0 :hasReturning false :hasModifyingCTE false :canSetTag true :transientPlan false :planTree {SEQSCAN :startup_cost 0.00 :total_cost 2877.58 :plan_rows 429 :plan_width 103 :targetlist ( {TARGETENTRY :expr {VAR :varno 1 :varattno 1 :vartype 23 :vartypmod -1 :varcollid 0 :varlevelsup 0 :varnoold 1 :varoattno 1 :location 7 } :resno 1 :resname id :ressortgroupref 0 :resorigtbl 16393 :resorigcol 1 :resjunk false } {TARGETENTRY :expr {VAR :varno 1 :varattno 2 :vartype 16385 :vartypmod -1 :varcollid 100 :varlevelsup 0 :varnoold 1 :varoattno 2 :location 26 } :resno 2 :resname guid :ressortgroupref 0 :resorigtbl 16393 :resorigcol 2 :resjunk false } {TARGETENTRY :expr {VAR :varno 1 :varattno 3 :vartype 1043 :vartypmod 36 :varcollid 100 :varlevelsup 0 :varnoold 1 :varoattno 3 :location 47 } :resno 3 :resname authenticatorsending :ressortgroupref 0 :resorigtbl 16393 :resorigcol 3 :resjunk false } {TARGETENTRY :expr {VAR :varno 1 :varattno 4 :vartype 1043 :vartypmod 36 :varcollid 100 :varlevelsup 0 :varnoold 1 :varoattno 4 :location 84 } :resno 4 :resname authenticatorsubscription :ressortgroupref 0 :resorigtbl 16393 :resorigcol 4 :resjunk false } {TARGETENTRY :expr {VAR :varno 1 :varattno 5 :vartype 23 :vartypmod -1 :varcollid 0 :varlevelsup 0 :varnoold 1 :varoattno 5 :location 126 } :resno 5 :resname countpushed :ressortgroupref 0 :resorigtbl 16393 :resorigcol 5 :resjunk false } {TARGETENTRY :expr {VAR :varno 1 :varattno 6 :vartype 1114 :vartypmod -1 :varcollid 0 :varlevelsup 0
Re: [PERFORM] index not used again
Stephan Szabo schrieb:
> Did you reset the table contents between these two (remember that explain analyze actually runs the query)? The second appears to be changing no rows from the output.

I for myself did not, but as there are automatic jobs running periodically I can't tell if one ran while I was testing (but I guess not). At the start of my tests all rows contained a zero for all tokens and there should be no ones at all.

In my case rows with token set to one are really rare, about one in a thousand rows. I was looking for a fast way to find these rows. I'll try again after a successful run - not resetting the token (not using analyse this time).

Cheers,
Jan
Re: [PERFORM] index not used again
On Sun, 2 Apr 2006, Jan Kesten wrote:
> Stephan Szabo schrieb:
> > Did you reset the table contents between these two (remember that explain analyze actually runs the query)? The second appears to be changing no rows from the output.
>
> I for myself did not, but as there are automatic jobs running periodically I can't tell if one ran while I was testing (but I guess not). At the start of my tests all rows contained a zero for all tokens and there should be no ones at all.

The reason I asked is that the explain analyze output for the first query on fak6 (using a seqscan) seemed to imply 24k rows actually matched the condition and were updated, so comparisons to the later times may be skewed.

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[PERFORM] index not used again
Hi folks!

I have just an issue again with unused indexes. I have a database with a couple of tables and I have to do a sync job with them. For marking which rows have to be transferred I added a new column token (integer, I will need some more tokens in the near future) to every table.

Before determining which rows to mark I first had a simple

update table set token=0;

Okay, this uses a seq scan of course. For speeding things up, I created a partial index on every table like this:

create index table_idx_token on table using btree (token) where token=1;

After that I ran vacuum analyse to update statistics and changed my update to:

update table set token=0 where token=1;

I think this should be able to use my index, and indeed on one table this works quite fine:

transfer=# explain analyse update ku set token=0 where token=1;
QUERY PLAN
 Index Scan using ku_idx_token on ku (cost=0.00..1.01 rows=1 width=1871) (actual time=0.169..0.169 rows=0 loops=1)
   Index Cond: (token = 1)
 Total runtime: 3.816 ms
(3 rows)

But on most of the other tables a seq scan is still used:

transfer=# explain analyse update fak6 set token=0 where token=1;
QUERY PLAN
 Seq Scan on fak6 (cost=0.00..301618.71 rows=24217 width=1895) (actual time=96987.417..127020.919 rows=24251 loops=1)
   Filter: (token = 1)
 Total runtime: 181828.281 ms
(3 rows)

So I tried to force using an index by setting enable_seqscan to off; here are the results:

transfer=# set enable_seqscan to off;
SET
transfer=# explain analyse update fak6 set token=0 where token=1;
QUERY PLAN
 Index Scan using fak6_idx_token on fak6 (cost=0.00..301697.93 rows=24217 width=1895) (actual time=1271.273..1271.273 rows=0 loops=1)
   Index Cond: (token = 1)
 Total runtime: 1272.572 ms
(3 rows)

transfer=# set enable_seqscan to on;
SET
transfer=# explain analyse update fak6 set token=0 where token=1;
QUERY PLAN
 Seq Scan on fak6 (cost=0.00..301618.71 rows=24217 width=1895) (actual time=93903.379..93903.379 rows=0 loops=1)
   Filter: (token = 1)
 Total runtime: 93904.679 ms
(3 rows)
transfer=# set enable_seqscan to off;
SET
transfer=# explain analyse update fak6 set token=0 where token=1;
QUERY PLAN
 Index Scan using fak6_idx_token on fak6 (cost=0.00..301697.93 rows=24217 width=1895) (actual time=223.721..223.721 rows=0 loops=1)
   Index Cond: (token = 1)
 Total runtime: 226.851 ms
(3 rows)

Now I'm a bit confused. The costs are nearly the same whether using the index or not - but the runtime is about 70 times faster? Any idea how I can fix this issue - I thought a partial index would be the right way?

Cheers,
Jan
Re: [PERFORM] index not used again
On Fri, 31 Mar 2006, Jan Kesten wrote:

> Hi folks!
>
> I have just an issue again with unused indexes. I have a database with a couple of tables and I have to do a sync job with them. For marking which rows have to be transferred I added a new column token (integer, I will need some more tokens in the near future) to every table.
>
> Before determining which rows to mark I first had a simple
>
> update table set token=0;
>
> Okay, this uses a seq scan of course. For speeding things up, I created a partial index on every table like this:
>
> create index table_idx_token on table using btree (token) where token=1;
>
> After that I ran vacuum analyse to update statistics and changed my update to:
>
> update table set token=0 where token=1;
>
> I think this should be able to use my index, and indeed on one table this works quite fine:
>
> transfer=# explain analyse update ku set token=0 where token=1;
>  Index Scan using ku_idx_token on ku (cost=0.00..1.01 rows=1 width=1871) (actual time=0.169..0.169 rows=0 loops=1)
>    Index Cond: (token = 1)
>  Total runtime: 3.816 ms
> (3 rows)
>
> But on most of the other tables a seq scan is still used:
>
> transfer=# explain analyse update fak6 set token=0 where token=1;
>  Seq Scan on fak6 (cost=0.00..301618.71 rows=24217 width=1895) (actual time=96987.417..127020.919 rows=24251 loops=1)
>    Filter: (token = 1)
>  Total runtime: 181828.281 ms
> (3 rows)
>
> So I tried to force using an index by setting enable_seqscan to off; here are the results:
>
> transfer=# set enable_seqscan to off;
> SET
> transfer=# explain analyse update fak6 set token=0 where token=1;
>  Index Scan using fak6_idx_token on fak6 (cost=0.00..301697.93 rows=24217 width=1895) (actual time=1271.273..1271.273 rows=0 loops=1)
>    Index Cond: (token = 1)
>  Total runtime: 1272.572 ms
> (3 rows)

Did you reset the table contents between these two (remember that explain analyze actually runs the query)? The second appears to be changing no rows from the output.
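One way to check the plan without tripping over the explain-analyze-runs-the-query pitfall Stephan describes is to use plain EXPLAIN, which only plans the statement (a sketch, reusing the table names from the thread):

```sql
-- partial index covering only the rows the sync job touches
CREATE INDEX fak6_idx_token ON fak6 (token) WHERE token = 1;
ANALYZE fak6;

-- plain EXPLAIN shows the chosen plan but does not execute the UPDATE,
-- so the token=1 rows stay in place between tests
EXPLAIN UPDATE fak6 SET token = 0 WHERE token = 1;
```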
Re: [PERFORM] Index isn't used during a join.
On Wed, Jan 11, 2006 at 12:56:55AM -0700, Michael Fuhr wrote:
> WHERE ...
> AND doy >= EXTRACT(doy FROM now() - '24 hour'::interval)
> AND doy <= EXTRACT(doy FROM now())

To work on 1 Jan this should be more like

WHERE ...
AND (doy >= EXTRACT(doy FROM now() - '24 hour'::interval)
  OR doy <= EXTRACT(doy FROM now()))

In any case the point is to add conditions to the WHERE clause that will use an index on the table for which you're currently getting a sequential scan.

--
Michael Fuhr
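Note that the AND form over-restricts at the year boundary and the OR form under-restricts mid-year (when start <= end, `doy >= start OR doy <= end` matches every day). A predicate that stays correct in both cases has to pick the combination depending on whether the window wraps; a hypothetical, untested sketch against the thread's tables:

```sql
SELECT w.*, unmunge_time(w.time_group) AS time
  FROM minute.windspeed w
  JOIN doy_agg d ON EXTRACT(doy FROM unmunge_time(w.time_group)) = d.doy
 WHERE unmunge_time(w.time_group) > now() - '24 hour'::interval
   AND CASE WHEN EXTRACT(doy FROM now() - '24 hour'::interval)
                 <= EXTRACT(doy FROM now())
        -- window lies within one year: a closed interval
        THEN d.doy BETWEEN EXTRACT(doy FROM now() - '24 hour'::interval)
                       AND EXTRACT(doy FROM now())
        -- window wraps past 31 Dec: the two half-intervals
        ELSE d.doy >= EXTRACT(doy FROM now() - '24 hour'::interval)
          OR d.doy <= EXTRACT(doy FROM now())
        END
 ORDER BY w.time_group;
```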
Re: [PERFORM] Index isn't used during a join.
When grilled further on (Wed, 11 Jan 2006 00:56:55 -0700), Michael Fuhr [EMAIL PROTECTED] confessed:

> On Tue, Jan 10, 2006 at 10:10:55PM -0700, Robert Creager wrote:
> > The query is now correct, but still is slow because of lack of index usage. I don't know how to structure the query correctly to use the index.
>
> Have you tried adding restrictions on doy in the WHERE clause? Something like this, I think:

I cannot. That's what I thought I would get from the join. The query shown will always have two days involved, and only grows from there. The data is graphed at http://www.logicalchaos.org/weather/index.html, and I'm looking at adding historical data to the graphs.

Oops, never mind. You hit the nail on the head:

weather=# SELECT *, unmunge_time( time_group ) AS time,
weather-#   EXTRACT( doy FROM unmunge_time( time_group ) )
weather-# FROM minute.windspeed
weather-# JOIN doy_agg ON( EXTRACT( doy FROM unmunge_time( time_group ) ) = doy )
weather-# WHERE unmunge_time( time_group ) > ( now() - '24 hour'::interval )
weather-#   AND doy BETWEEN EXTRACT( doy FROM now() - '24 hour'::interval )
weather-#   AND EXTRACT( doy FROM now() )
weather-# ORDER BY time_group;
QUERY PLAN
Sort (cost=21914.09..21914.10 rows=1 width=48) (actual time=76.595..76.662 rows=286 loops=1)
  Sort Key: windspeed.time_group
  -> Hash Join (cost=21648.19..21914.08 rows=1 width=48) (actual time=64.656..75.562 rows=286 loops=1)
        Hash Cond: (date_part('doy'::text, unmunge_time("outer".time_group)) = "inner".doy)
        -> Bitmap Heap Scan on windspeed (cost=2.27..267.40 rows=74 width=28) (actual time=0.585..1.111 rows=286 loops=1)
              Recheck Cond: (unmunge_time(time_group) > (now() - '24:00:00'::interval))
              -> Bitmap Index Scan on minute_windspeed_unmunge_index (cost=0.00..2.27 rows=74 width=0) (actual time=0.566..0.566 rows=287 loops=1)
                    Index Cond: (unmunge_time(time_group) > (now() - '24:00:00'::interval))
        -> Hash (cost=21645.92..21645.92 rows=3 width=20) (actual time=63.849..63.849 rows=2 loops=1)
              -> HashAggregate (cost=21645.84..21645.89 rows=3 width=20) (actual time=63.832..63.834 rows=2 loops=1)
                    -> Bitmap Heap Scan on readings (cost=59.21..21596.85 rows=6532 width=20) (actual time=15.174..53.249 rows=7613 loops=1)
                          Recheck Cond: ((date_part('doy'::text, "when") >= date_part('doy'::text, (now() - '24:00:00'::interval))) AND (date_part('doy'::text, "when") <= date_part('doy'::text, now())))
                          -> Bitmap Index Scan on readings_doy_index (cost=0.00..59.21 rows=6532 width=0) (actual time=12.509..12.509 rows=10530 loops=1)
                                Index Cond: ((date_part('doy'::text, "when") >= date_part('doy'::text, (now() - '24:00:00'::interval))) AND (date_part('doy'::text, "when") <= date_part('doy'::text, now())))
Total runtime: 77.177 ms

What I had thought is that PG would (could?) be smart enough to realize that one query was restricted, and apply that restriction to the other based on the join. I know it works in other cases (using indexes on both tables using the join)...

> Something else occurred to me: do you (or will you) have more than one year of data? If so then matching on doy could be problematic unless you also check for the year, or unless you want to match more than one year.

Yes and yes. I'm doing both aggregate by day of the year for all data, and aggregate by day of year within each year. The examples are:

weather=# select * from doy_agg where doy = extract( doy from now() );
 doy |  avg_windspeed   | max_windspeed
-----+------------------+---------------
  11 | 6.14058239764748 |            69
(1 row)

weather=# select * from doy_day_agg where extract( doy from day ) = extract( doy from now() );
         day         |  avg_windspeed   | max_windspeed
---------------------+------------------+---------------
 2004-01-11 00:00:00 | 5.03991313397539 |            17
 2006-01-11 00:00:00 |  18.532050716667 |            69
 2005-01-11 00:00:00 |  3.6106763448041 |            13

Thanks for your help Michael.

Cheers,
Rob

-- 07:07:30 up 3 days, 23:34, 9 users, load average: 2.29, 2.44, 2.43 Linux 2.6.12-12-2 #4 SMP Tue Jan 3 19:56:19 MST 2006
Re: [PERFORM] Index isn't used during a join.
When grilled further on (Wed, 11 Jan 2006 07:26:59 -0700), Robert Creager [EMAIL PROTECTED] confessed:

> weather=# SELECT *, unmunge_time( time_group ) AS time,
> weather-#   EXTRACT( doy FROM unmunge_time( time_group ) )
> weather-# FROM minute.windspeed
> weather-# JOIN doy_agg ON( EXTRACT( doy FROM unmunge_time( time_group ) ) = doy )
> weather-# WHERE unmunge_time( time_group ) > ( now() - '24 hour'::interval )
> weather-#   AND doy BETWEEN EXTRACT( doy FROM now() - '24 hour'::interval )
> weather-#   AND EXTRACT( doy FROM now() )
> weather-# ORDER BY time_group;

The more I think about it, the more I believe PG is missing an opportunity. The query is adequately constrained without the BETWEEN clause. Why doesn't PG see that? I realize I'm a hack and my db organization shows that...

The query is wrong as stated, as it won't work when the interval crosses a year boundary, but it's a stop gap for now.

Cheers,
Rob

-- 07:58:30 up 4 days, 25 min, 9 users, load average: 2.13, 2.15, 2.22 Linux 2.6.12-12-2 #4 SMP Tue Jan 3 19:56:19 MST 2006
Re: [PERFORM] Index isn't used during a join.
Robert Creager [EMAIL PROTECTED] writes:
> What I had thought is that PG would (could?) be smart enough to realize that one query was restricted, and apply that restriction to the other based on the join. I know it works in other cases (using indexes on both tables using the join)...

The planner understands about transitivity of equality, ie given a = b and b = c it can infer a = c. It doesn't do any such thing for inequalities though, nor does it deduce f(a) = f(b) for arbitrary functions f. The addition Michael suggested requires much more understanding of the properties of the functions in your query than I think would be reasonable to put into the planner.

regards, tom lane
Re: [PERFORM] Index isn't used during a join.
On Wed, Jan 11, 2006 at 08:02:37AM -0700, Robert Creager wrote:
> The query is wrong as stated, as it won't work when the interval crosses a year boundary, but it's a stop gap for now.

Yeah, I realized that shortly after I posted the original and posted a correction.

http://archives.postgresql.org/pgsql-performance/2006-01/msg00104.php

--
Michael Fuhr
Re: [PERFORM] Index isn't used during a join.
When grilled further on (Wed, 11 Jan 2006 10:33:03 -0500), Tom Lane [EMAIL PROTECTED] confessed:

> The planner understands about transitivity of equality, ie given a = b and b = c it can infer a = c. It doesn't do any such thing for inequalities though, nor does it deduce f(a) = f(b) for arbitrary functions f. The addition Michael suggested requires much more understanding of the properties of the functions in your query than I think would be reasonable to put into the planner.

OK. I think I've reached a point where I need to re-organize how the data is stored, maybe ridding myself of the schema and switching entirely to views. At that point, I likely could rid myself of the function (unmunge_time) I'm using, and work with times and doy fields.

Thanks,
Rob

-- 21:17:00 up 4 days, 13:43, 9 users, load average: 2.02, 2.18, 2.23 Linux 2.6.12-12-2 #4 SMP Tue Jan 3 19:56:19 MST 2006
Re: [PERFORM] Index isn't used during a join.
When grilled further on (Mon, 9 Jan 2006 22:58:18 -0700), Michael Fuhr [EMAIL PROTECTED] confessed:

> On Mon, Jan 09, 2006 at 09:23:38PM -0700, Robert Creager wrote:
> > I'm working with a query to get more info out with a join. The base query works great speed wise because of index usage. When the join is tossed in, the index is no longer used, so the query performance tanks.
>
> The first query you posted returns 285 rows and the second returns over one million; index usage aside, that difference surely accounts for a performance penalty. And as is often pointed out, index scans aren't always faster than sequential scans: the more of a table a query has to fetch, the more likely a sequential scan will be faster.

Thanks for pointing out the obvious that I missed. Too much data in the second query. It's supposed to match (row wise) what was returned from the first query. Just ignore me for now...

Thanks,
Rob

-- 08:15:24 up 3 days, 42 min, 9 users, load average: 2.07, 2.20, 2.25 Linux 2.6.12-12-2 #4 SMP Tue Jan 3 19:56:19 MST 2006
Re: [PERFORM] Index isn't used during a join.
Ok, I'm back, and in a little better shape. The query is now correct, but still is slow because of lack of index usage. I don't know how to structure the query correctly to use the index.

Taken individually:

weather=# explain analyze select * from doy_agg where doy = extract( doy from now() );
QUERY PLAN
HashAggregate (cost=13750.67..13750.71 rows=2 width=20) (actual time=123.134..123.135 rows=1 loops=1)
  -> Bitmap Heap Scan on readings (cost=25.87..13720.96 rows=3962 width=20) (actual time=6.384..116.559 rows=4175 loops=1)
        Recheck Cond: (date_part('doy'::text, "when") = date_part('doy'::text, now()))
        -> Bitmap Index Scan on readings_doy_index (cost=0.00..25.87 rows=3962 width=0) (actual time=5.282..5.282 rows=4215 loops=1)
              Index Cond: (date_part('doy'::text, "when") = date_part('doy'::text, now()))
Total runtime: 123.366 ms

produces the data:

weather=# select * from doy_agg where doy = extract( doy from now() );
 doy |  avg_windspeed   | max_windspeed
-----+------------------+---------------
  10 | 8.53403056583666 |            59

and:

weather=# EXPLAIN ANALYZE
weather-# SELECT *,
weather-#   unmunge_time( time_group ) AS time
weather-# FROM minute.windspeed
weather-# WHERE unmunge_time( time_group ) > ( now() - '24 hour'::interval )
weather-# ORDER BY time_group;
QUERY PLAN
Sort (cost=595.33..595.77 rows=176 width=28) (actual time=4.762..4.828 rows=283 loops=1)
  Sort Key: time_group
  -> Bitmap Heap Scan on windspeed (cost=2.62..588.76 rows=176 width=28) (actual time=0.901..3.834 rows=283 loops=1)
        Recheck Cond: (unmunge_time(time_group) > (now() - '24:00:00'::interval))
        -> Bitmap Index Scan on minute_windspeed_unmunge_index (cost=0.00..2.62 rows=176 width=0) (actual time=0.745..0.745 rows=284 loops=1)
              Index Cond: (unmunge_time(time_group) > (now() - '24:00:00'::interval))
Total runtime: 5.108 ms

produces:

 time_group | min_reading | max_reading |    avg_reading    |        time
------------+-------------+-------------+-------------------+---------------------
 1136869500 |         0.8 |           6 |  2.62193548387097 | 2006-01-09 22:05:00
 1136869800 |           0 |           3 | 0.406021505376343 | 2006-01-09 22:10:00
 1136870100 |           0 |           5 |              1.68 | 2006-01-09 22:15:00
...

But I want the composite of the two queries, and I'm stuck on:

weather=# EXPLAIN ANALYZE
weather-# SELECT *,
weather-#   unmunge_time( time_group ) AS time
weather-# FROM minute.windspeed
weather-# JOIN doy_agg ON( EXTRACT( doy FROM unmunge_time( time_group ) ) = doy )
weather-# WHERE unmunge_time( time_group ) > ( now() - '24 hour'::interval )
weather-# ORDER BY time_group;
QUERY PLAN
Sort (cost=153627.67..153628.48 rows=322 width=48) (actual time=10637.681..10637.748 rows=286 loops=1)
  Sort Key: windspeed.time_group
  -> Merge Join (cost=153604.82..153614.26 rows=322 width=48) (actual time=10633.375..10636.728 rows=286 loops=1)
        Merge Cond: ("outer".?column5? = "inner".doy)
        -> Sort (cost=594.89..595.33 rows=176 width=28) (actual time=5.539..5.612 rows=286 loops=1)
              Sort Key: date_part('doy'::text, unmunge_time(windspeed.time_group))
              -> Bitmap Heap Scan on windspeed (cost=2.62..588.32 rows=176 width=28) (actual time=0.918..4.637 rows=286 loops=1)
                    Recheck Cond: (unmunge_time(time_group) > (now() - '24:00:00'::interval))
                    -> Bitmap Index Scan on minute_windspeed_unmunge_index (cost=0.00..2.62 rows=176 width=0) (actual time=0.739..0.739 rows=287 loops=1)
                          Index Cond: (unmunge_time(time_group) > (now() - '24:00:00'::interval))
        -> Sort (cost=153009.93..153010.84 rows=366 width=20) (actual time=10627.699..10627.788 rows=295 loops=1)
              Sort Key: doy_agg.doy
              -> HashAggregate (cost=152984.28..152990.69 rows=366 width=20) (actual time=10625.649..10626.601 rows=366 loops=1)
                    -> Seq Scan on readings (cost=0.00..145364.93 rows=1015914 width=20) (actual
Re: [PERFORM] Index isn't used during a join.
On Tue, Jan 10, 2006 at 10:10:55PM -0700, Robert Creager wrote:
> The query is now correct, but still is slow because of lack of index usage. I don't know how to structure the query correctly to use the index.

Have you tried adding restrictions on doy in the WHERE clause? Something like this, I think:

WHERE ...
AND doy >= EXTRACT(doy FROM now() - '24 hour'::interval)
AND doy <= EXTRACT(doy FROM now())

Something else occurred to me: do you (or will you) have more than one year of data? If so then matching on doy could be problematic unless you also check for the year, or unless you want to match more than one year.

--
Michael Fuhr
[PERFORM] Index isn't used during a join.
Hey folks,

I'm working with a query to get more info out with a join. The base query works great speed wise because of index usage. When the join is tossed in, the index is no longer used, so the query performance tanks. Can anyone advise on how to get the index usage back?

weather=# select version();
 version
---
 PostgreSQL 8.1.1 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.0.1 (4.0.1-5mdk for Mandriva Linux release 2006.0)
(1 row)

The base query is:

weather=# EXPLAIN ANALYZE
weather-# SELECT min_reading, max_reading, avg_reading, -- doy,
weather-#   unmunge_time( time_group ) AS time
weather-# FROM minute.windspeed
weather-# --JOIN readings_doy ON EXTRACT( doy FROM unmunge_time( time_group ) ) = doy
weather-# WHERE unmunge_time( time_group ) > ( now() - '24 hour'::interval )
weather-# ORDER BY time_group;
QUERY PLAN
Sort (cost=10995.29..11155.58 rows=64117 width=28) (actual time=4.509..4.574 rows=285 loops=1)
  Sort Key: time_group
  -> Bitmap Heap Scan on windspeed (cost=402.42..5876.05 rows=64117 width=28) (actual time=0.784..3.639 rows=285 loops=1)
        Recheck Cond: (unmunge_time(time_group) > (now() - '24:00:00'::interval))
        -> Bitmap Index Scan on minute_windspeed_index (cost=0.00..402.42 rows=64117 width=0) (actual time=0.675..0.675 rows=285 loops=1)
              Index Cond: (unmunge_time(time_group) > (now() - '24:00:00'::interval))
Total runtime: 4.880 ms
(7 rows)

When I add in the join, the query tosses out the nice quick index in favor of sequence scans:

weather=# EXPLAIN ANALYZE
weather-# SELECT min_reading, max_reading, avg_reading, -- doy,
weather-#   unmunge_time( time_group ) AS time
weather-# FROM minute.windspeed
weather-# JOIN readings_doy ON EXTRACT( doy FROM unmunge_time( time_group ) ) = doy
weather-# WHERE unmunge_time( time_group ) > ( now() - '24 hour'::interval )
weather-# ORDER BY time_group;
QUERY PLAN
Sort (cost=98239590.88..99052623.66 rows=325213113 width=28) (actual time=60136.484..61079.845 rows=1030656 loops=1)
  Sort Key: windspeed.time_group
  -> Merge Join (cost=265774.21..8396903.54 rows=325213113 width=28) (actual time=34318.334..47113.277 rows=1030656 loops=1)
        Merge Cond: ("outer".?column5? = "inner".?column2?)
        -> Sort (cost=12997.68..13157.98 rows=64120 width=28) (actual time=2286.155..2286.450 rows=284 loops=1)
              Sort Key: date_part('doy'::text, unmunge_time(windspeed.time_group))
              -> Seq Scan on windspeed (cost=0.00..7878.18 rows=64120 width=28) (actual time=2279.275..2285.271 rows=284 loops=1)
                    Filter: (unmunge_time(time_group) > (now() - '24:00:00'::interval))
        -> Sort (cost=252776.54..255312.51 rows=1014389 width=8) (actual time=32001.370..33473.407 rows=1051395 loops=1)
              Sort Key: date_part('doy'::text, readings."when")
              -> Seq Scan on readings (cost=0.00..142650.89 rows=1014389 width=8) (actual time=0.053..13759.015 rows=1014448 loops=1)
Total runtime: 61720.935 ms
(12 rows)

weather=# \d minute.windspeed
      Table "minute.windspeed"
   Column    |       Type       | Modifiers
-------------+------------------+-----------
 time_group  | integer          | not null
 min_reading | double precision | not null
 max_reading | double precision | not null
 avg_reading | double precision | not null
Indexes:
    "windspeed_pkey" PRIMARY KEY, btree (time_group)
    "minute_windspeed_index" btree (unmunge_time(time_group))

CREATE OR REPLACE FUNCTION unmunge_time( integer ) RETURNS timestamp AS '
DECLARE
    input ALIAS FOR $1;
BEGIN
    RETURN (''epoch''::timestamptz + input * ''1sec''::interval)::timestamp;
END;
' LANGUAGE plpgsql IMMUTABLE STRICT;

weather=# \d readings
                Table "public.readings"
    Column    |            Type             | Modifiers
--------------+-----------------------------+-----------
 when         | timestamp without time zone | not null default (timeofday())::timestamp without time zone
 hour_group   | integer                     |
 minute_group
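A side note on the schema above: the expression index minute_windspeed_unmunge_index is only possible because unmunge_time() is declared IMMUTABLE, and the planner will only consider it when its row estimates are sane. A sketch of how that index is built and probed (restating the thread's own objects; hypothetical session):

```sql
-- expression index requires the function to be IMMUTABLE
CREATE INDEX minute_windspeed_unmunge_index
    ON minute.windspeed (unmunge_time(time_group));

-- refresh statistics so the planner's row estimates are realistic
ANALYZE minute.windspeed;

-- the WHERE clause must use the exact indexed expression
EXPLAIN SELECT * FROM minute.windspeed
 WHERE unmunge_time(time_group) > now() - '24 hour'::interval;
```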
Re: [PERFORM] Index isn't used during a join.
On Mon, Jan 09, 2006 at 09:23:38PM -0700, Robert Creager wrote:
> I'm working with a query to get more info out with a join. The base query
> works great speed wise because of index usage. When the join is tossed in,
> the index is no longer used, so the query performance tanks.

The first query you posted returns 285 rows and the second returns over one million; index usage aside, that difference surely accounts for a performance penalty. And as is often pointed out, index scans aren't always faster than sequential scans: the more of a table a query has to fetch, the more likely a sequential scan will be faster.

Have the tables been vacuumed and analyzed? The planner's estimates for windspeed are pretty far off, which could be affecting the query plan:

-> Sort (cost=12997.68..13157.98 rows=64120 width=28) (actual time=2286.155..2286.450 rows=284 loops=1)
     Sort Key: date_part('doy'::text, unmunge_time(windspeed.time_group))
     -> Seq Scan on windspeed (cost=0.00..7878.18 rows=64120 width=28) (actual time=2279.275..2285.271 rows=284 loops=1)
          Filter: (unmunge_time(time_group) > (now() - '24:00:00'::interval))

That's a small amount of the total query time, however, so although an index scan might help it probably won't provide the big gain you're looking for. Have you done any tests with enable_seqscan disabled? That'll show whether an index or bitmap scan would be faster. And have you verified that the join condition is correct? Should the query be returning over a million rows?

-- Michael Fuhr

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
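The enable_seqscan experiment suggested above can be run like this, a minimal sketch scoped to one transaction so the setting doesn't leak into other sessions' work:

```sql
BEGIN;
-- Planner experiment only: forbid sequential scans for this transaction
-- so EXPLAIN ANALYZE shows what the index/bitmap plan would cost.
SET LOCAL enable_seqscan = off;

EXPLAIN ANALYZE
SELECT min_reading, max_reading, avg_reading,
       unmunge_time(time_group) AS time
FROM minute.windspeed
JOIN readings_doy ON EXTRACT(doy FROM unmunge_time(time_group)) = doy
WHERE unmunge_time(time_group) > (now() - '24 hour'::interval)
ORDER BY time_group;

ROLLBACK;  -- discard the setting along with the transaction
```

Comparing the actual times with and without the override shows whether the planner's choice of sequential scans is actually wrong, or merely mispriced.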
[PERFORM] Index not used on group by
Hello all,

I have a table ma_data that contains above 300,000 rows. This table has primary key id, and field alias_id. I created a btree index on this field and set its statistics target:

ALTER TABLE public.ma_data ALTER COLUMN alias_id SET STATISTICS 998;

So, when I do something like

SELECT alias_id FROM ma_data GROUP BY alias_id

I have (with enable_seqscan off):

Group (cost=0.00..1140280.63 rows=32 width=4) (actual time=0.159..2640.090 rows=32 loops=1)
  -> Index Scan using reference_9_fk on ma_data (cost=0.00..1139526.57 rows=301624 width=4) (actual time=0.120..1471.128 rows=301624 loops=1)
Total runtime: 2640.407 ms
(3 rows)

As I understand it there are some problems with visibility of records, but some other DBMSs use indexes without problems (for example Firebird)? Or maybe some other information would be helpful for me and the community.

-- Best regards, Andrey Repko mailto:[EMAIL PROTECTED]

---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Index not used on group by
Andrey Repko wrote:
> I have a table ma_data that contains above 300,000 rows. This table has
> primary key id, and field alias_id. I created a btree index on this field
> and set its statistics target:
> ALTER TABLE public.ma_data ALTER COLUMN alias_id SET STATISTICS 998;
> So, when I do something like
> SELECT alias_id FROM ma_data GROUP BY alias_id

Why are you using GROUP BY without any aggregate functions?

What happens if you use something like

SELECT DISTINCT alias_id FROM ma_data;

-- Richard Huxton Archonet Ltd

---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Index not used on group by
Hello Richard,

Tuesday, September 27, 2005, 1:48:15 PM, you wrote:

RH> Why are you using GROUP BY without any aggregate functions?
RH> What happens if you use something like
RH>    SELECT DISTINCT alias_id FROM ma_data;

sart_ma=# EXPLAIN ANALYZE SELECT DISTINCT alias_id FROM ma_data;
QUERY PLAN
---
Unique (cost=65262.63..66770.75 rows=32 width=4) (actual time=16780.214..18250.761 rows=32 loops=1)
  -> Sort (cost=65262.63..66016.69 rows=301624 width=4) (actual time=16780.204..17255.129 rows=301624 loops=1)
       Sort Key: alias_id
       -> Seq Scan on ma_data (cost=0.00..37811.24 rows=301624 width=4) (actual time=6.896..15321.023 rows=301624 loops=1)
Total runtime: 18292.542 ms
(5 rows)

sart_ma=# EXPLAIN ANALYZE SELECT alias_id FROM ma_data GROUP BY alias_id;
QUERY PLAN
-
HashAggregate (cost=38565.30..38565.62 rows=32 width=4) (actual time=15990.863..15990.933 rows=32 loops=1)
  -> Seq Scan on ma_data (cost=0.00..37811.24 rows=301624 width=4) (actual time=3.446..14572.141 rows=301624 loops=1)
Total runtime: 15991.244 ms
(3 rows)

-- Best regards, Andrey Repko mailto:[EMAIL PROTECTED]

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Index not used on group by
Hello Richard,

Tuesday, September 27, 2005, 2:08:31 PM, you wrote:

> sart_ma=# EXPLAIN ANALYZE SELECT alias_id FROM ma_data GROUP BY alias_id;
> HashAggregate (cost=38565.30..38565.62 rows=32 width=4) (actual time=15990.863..15990.933 rows=32 loops=1)
>   -> Seq Scan on ma_data (cost=0.00..37811.24 rows=301624 width=4) (actual time=3.446..14572.141 rows=301624 loops=1)
> Total runtime: 15991.244 ms

RH> OK - the planner thinks it's doing the right thing, your cost estimates
RH> are way off. If you look back at where you got an index-scan, its cost
RH> was 1.1 million.
RH>    Index Scan using reference_9_fk on ma_data (cost=0.00..1139526.57

But why does PG scan _all_ the records in the table? As I understand it, we can just select the information from the index, without scanning the whole table? Of course, if we select ALL records from the table, the index can't help us.

If I write something like:

SELECT (SELECT alias_id FROM ma_data WHERE alias_id=1 LIMIT 1)
UNION ALL
SELECT (SELECT alias_id FROM ma_data WHERE alias_id=2 LIMIT 1)
UNION ALL
SELECT (SELECT alias_id FROM ma_data WHERE alias_id=3 LIMIT 1)
UNION ALL
SELECT (SELECT alias_id FROM ma_data WHERE alias_id=4 LIMIT 1)
UNION ALL
SELECT (SELECT alias_id FROM ma_data WHERE alias_id=5 LIMIT 1)
UNION ALL
SELECT (SELECT alias_id FROM ma_data WHERE alias_id=6 LIMIT 1)
UNION ALL
SELECT (SELECT alias_id FROM ma_data WHERE alias_id=7 LIMIT 1)
UNION ALL
SELECT (SELECT alias_id FROM ma_data WHERE alias_id=8 LIMIT 1)
UNION ALL
SELECT (SELECT alias_id FROM ma_data WHERE alias_id=9 LIMIT 1)
...
UNION ALL
SELECT (SELECT alias_id FROM ma_data WHERE alias_id=max_alias_id LIMIT 1)

it works better, much better.

RH> That's way above the numbers for seq-scan+hash/sort, so if the cost
RH> estimate was right PG would be making the right choice. Looks like you
RH> need to check your configuration settings. Have you read:
RH>    http://www.powerpostgresql.com/PerfList
RH> or
RH>    http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

Thanks.

-- Best regards, Andrey Repko mailto:[EMAIL PROTECTED]

---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Index not used on group by
Andrey Repko wrote:
>RH> What happens if you use something like
>RH>    SELECT DISTINCT alias_id FROM ma_data;
> sart_ma=# EXPLAIN ANALYZE SELECT DISTINCT alias_id FROM ma_data;
> QUERY PLAN
> ---
> Unique (cost=65262.63..66770.75 rows=32 width=4) (actual time=16780.214..18250.761 rows=32 loops=1)
>   -> Sort (cost=65262.63..66016.69 rows=301624 width=4) (actual time=16780.204..17255.129 rows=301624 loops=1)
>        Sort Key: alias_id
>        -> Seq Scan on ma_data (cost=0.00..37811.24 rows=301624 width=4) (actual time=6.896..15321.023 rows=301624 loops=1)
> Total runtime: 18292.542 ms
> sart_ma=# EXPLAIN ANALYZE SELECT alias_id FROM ma_data GROUP BY alias_id;
> QUERY PLAN
> -
> HashAggregate (cost=38565.30..38565.62 rows=32 width=4) (actual time=15990.863..15990.933 rows=32 loops=1)
>   -> Seq Scan on ma_data (cost=0.00..37811.24 rows=301624 width=4) (actual time=3.446..14572.141 rows=301624 loops=1)
> Total runtime: 15991.244 ms

OK - the planner thinks it's doing the right thing; your cost estimates are way off. If you look back at where you got an index scan, its cost was 1.1 million:

Index Scan using reference_9_fk on ma_data (cost=0.00..1139526.57

That's way above the numbers for seq-scan+hash/sort, so if the cost estimate was right PG would be making the right choice. Looks like you need to check your configuration settings. Have you read:

http://www.powerpostgresql.com/PerfList
or
http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

-- Richard Huxton Archonet Ltd

---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
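The configuration pages linked above largely come down to the planner's cost settings. A hedged sketch of the usual experiment (the values below are illustrative, not recommendations; tune them for the actual hardware):

```sql
-- A lower random_page_cost and a realistic effective_cache_size make the
-- planner price index scans closer to their true cost on cached data.
SET random_page_cost = 2.0;        -- stock default in this era was 4.0
SET effective_cache_size = 65536;  -- measured in 8kB pages, i.e. ~512MB

-- Re-run the problem query and see whether the plan (and its cost) changes.
EXPLAIN ANALYZE SELECT alias_id FROM ma_data GROUP BY alias_id;
```

If the index plan's estimated cost drops toward its actual cost, the settings were the culprit; SET changes only the current session, so nothing is committed until the values are put in postgresql.conf.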
Re: [PERFORM] Index not used on group by
Andrey Repko wrote:
> Hello Richard,
> Tuesday, September 27, 2005, 2:08:31 PM, you wrote:
> sart_ma=# EXPLAIN ANALYZE SELECT alias_id FROM ma_data GROUP BY alias_id;
> HashAggregate (cost=38565.30..38565.62 rows=32 width=4) (actual time=15990.863..15990.933 rows=32 loops=1)
>   -> Seq Scan on ma_data (cost=0.00..37811.24 rows=301624 width=4) (actual time=3.446..14572.141 rows=301624 loops=1)
> Total runtime: 15991.244 ms
>RH> OK - the planner thinks it's doing the right thing, your cost estimates
>RH> are way off. If you look back at where you got an index-scan, its cost
>RH> was 1.1 million.
>RH>    Index Scan using reference_9_fk on ma_data (cost=0.00..1139526.57
> But why does PG scan _all_ the records in the table? As I understand it we
> can just select the information from the index, without scanning the whole
> table? Of course if we select ALL records from the table the index can't
> help us.

Actually, if you select more than 5-10% of the rows (in general) you are better off using a seq-scan. PostgreSQL estimates the total cost of possible query plans and picks the cheapest. In your case your configuration settings seem to be pushing the cost of an index scan much higher than it really is. So, it picks the sequential scan.

> If I write something like:
> SELECT (SELECT alias_id FROM ma_data WHERE alias_id=1 LIMIT 1)
> UNION ALL
> SELECT (SELECT alias_id FROM ma_data WHERE alias_id=2 LIMIT 1)
> UNION ALL
> ...
> UNION ALL
> SELECT (SELECT alias_id FROM ma_data WHERE alias_id=max_alias_id LIMIT 1)
> it works better, much better.

Of course - it will always choose index queries here - it can see you are only fetching one row in each subquery. Correct your configuration settings so PG estimates the cost of an index query correctly and all should be well.

-- Richard Huxton Archonet Ltd

---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
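The hand-built UNION ALL chain in this thread is essentially a manual "loose index scan". On later PostgreSQL releases (8.4 and up, which add WITH RECURSIVE; not available on the 8.x servers discussed here) the same trick can be written generically, without knowing max_alias_id in advance:

```sql
-- Sketch: emulate a loose index scan over ma_data(alias_id).
-- Each step probes the btree index once for the next distinct value.
WITH RECURSIVE t AS (
    (SELECT alias_id FROM ma_data ORDER BY alias_id LIMIT 1)
    UNION ALL
    SELECT (SELECT alias_id FROM ma_data
            WHERE alias_id > t.alias_id
            ORDER BY alias_id LIMIT 1)
    FROM t
    WHERE t.alias_id IS NOT NULL   -- stop when the probe runs off the end
)
SELECT alias_id FROM t WHERE alias_id IS NOT NULL;
```

With 32 distinct values in 300,000+ rows, this does roughly 33 index probes instead of a full scan, which is the behaviour the poster was reaching for with the UNION ALL chain.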
[PERFORM] Index not used with prepared statement
Hi.

I have a performance problem with prepared statements (JDBC prepared statement).

This query:

PreparedStatement st = conn.prepareStatement("SELECT id FROM dga_dienstleister WHERE plz like '45257'");

does use an index.

This query:

String plz = "45257";
PreparedStatement st = conn.prepareStatement("SELECT id FROM dga_dienstleister WHERE plz like ?");
st.setString(1, plz);

does NOT use an index.

As it should in the end result in absolutely the same statement, the index should be used all the time. I have to set protocolVersion=2 and use the JDBC2 driver to get it working (but then the statements are created like in the first query, so no solution, only a workaround). I'm not sure whether this is a bug (I think it is) or a problem of understanding. Known problem?

I have tried PG 8.0.1, 8.0.3, 8.1beta with the JDBC drivers:

- postgresql-8.0-312.jdbc2.jar -- okay with protocolVersion=2 in the URL
- postgresql-8.0-312.jdbc3.jar -- not okay whatever I do

I'm on Mac OS X, if that matters.

cug

---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Index not used with prepared statement
Guido Neitzer wrote:
> I have a performance problem with prepared statements (JDBC prepared
> statement).
> This query:
> PreparedStatement st = conn.prepareStatement("SELECT id FROM dga_dienstleister WHERE plz like '45257'");
> does use an index.
> This query:
> String plz = "45257";
> PreparedStatement st = conn.prepareStatement("SELECT id FROM dga_dienstleister WHERE plz like ?");
> st.setString(1, plz);
> does NOT use an index.
> As it should in the end result in absolutely the same statement, the index
> should be used all the time.

I'm not perfectly sure, but since the index could only be used with a subset of all possible parameters (the pattern for like has to be left-anchored), I could imagine the planner has to avoid the index in order to produce a universal plan (the thing behind a prepared statement).

Is there a reason you are using the like operator at all? IMO using the =-operator instead in your example should produce an index-using prepared statement.

HTH Andreas

---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
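For the left-anchored case specifically, an operator-class index is worth knowing about. A sketch (the index name is hypothetical): it lets literal prefix patterns use a btree even in a non-C locale, though it does not by itself fix the generic-plan problem with a parameterized LIKE on these server versions.

```sql
-- text_pattern_ops lets a btree index serve LIKE 'prefix%' lookups
-- even when the database locale is not C.
CREATE INDEX dga_dienstleister_plz_pattern_idx
    ON dga_dienstleister (plz text_pattern_ops);

-- A literal, left-anchored pattern can now be answered from the index:
EXPLAIN SELECT id FROM dga_dienstleister WHERE plz LIKE '45257%';
```

Equality lookups would still want a plain-opclass index on plz alongside this one, since text_pattern_ops indexes don't serve the ordinary comparison operators.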
Re: [PERFORM] Index not used with prepared statement
On 11.09.2005, at 11:03 Uhr, Andreas Seltenreich wrote:
> I'm not perfectly sure, but since the index could only be used with a
> subset of all possible parameters (the pattern for like has to be
> left-anchored), I could imagine the planner has to avoid the index in
> order to produce a universal plan (the thing behind a prepared statement).

Hmm. Now I get it. So I have to make sure that my framework doesn't produce a PreparedStatement and instead builds a complete statement string. Weird.

> Is there a reason you are using the like operator at all? IMO using the
> =-operator instead in your example should produce an index-using prepared
> statement.

Yes, you are right, but then I can't pass anything like '45%' to the query. It would just return nothing. I use like because I build the queries on the fly and add a % at the end where necessary. And, to be clear: this is a minimal example; most of my queries are generated by a framework. This was just an example to test the behaviour.

Okay, I had misunderstood prepared statements on the client and the server side. What I thought was that I get a PreparedStatement from JDBC which inserts the values into the string, and that this complete statement is executed on the server side.

cug

---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[PERFORM] Index not used on join with inherited tables
Hi all,

I'm having another problem with a query that takes too long, because the appropriate index is not used. I found some solutions to this problem, but I think Postgres should do an index scan in all cases. To show the problem I've attached a small script with a testcase.

Thanks in advance
Sebastian

/* tables */
CREATE TABLE test (
    id SERIAL PRIMARY KEY,
    test TEXT
);
CREATE TABLE test1 ( id INTEGER PRIMARY KEY ) INHERITS (test);
CREATE TABLE test2 ( id INTEGER PRIMARY KEY ) INHERITS (test);
CREATE TABLE test3 ( id INTEGER PRIMARY KEY ) INHERITS (test);
CREATE TABLE bug ( id INTEGER PRIMARY KEY );

/* views */
CREATE OR REPLACE VIEW working_fast AS
    SELECT * FROM test WHERE id = 1;
CREATE OR REPLACE VIEW working_fast_2 AS
    SELECT test1.* FROM test1 JOIN bug AS bug1 ON TRUE WHERE test1.id = bug1.id
    UNION ALL
    SELECT test2.* FROM test2 JOIN bug AS bug2 ON TRUE WHERE test2.id = bug2.id
    UNION ALL
    SELECT test3.* FROM test3 JOIN bug AS bug3 ON TRUE WHERE test3.id = bug3.id;
CREATE OR REPLACE VIEW working_slow AS
    SELECT test.* FROM test JOIN bug ON TRUE WHERE test.id = bug.id;

/* data */
CREATE OR REPLACE FUNCTION data () RETURNS BOOLEAN AS $$
BEGIN
    FOR i IN 1..1 LOOP
        INSERT INTO test1 (id,test) VALUES (DEFAULT,i);
        INSERT INTO test2 (id,test) VALUES (DEFAULT,i);
        INSERT INTO test3 (id,test) VALUES (DEFAULT,i);
    END LOOP;
    RETURN TRUE;
END;
$$ LANGUAGE plpgsql;

SELECT data();
INSERT INTO bug VALUES ('1');
ANALYZE;

EXPLAIN ANALYZE SELECT * from working_fast;
EXPLAIN ANALYZE SELECT * from working_fast_2;
EXPLAIN ANALYZE SELECT * from working_slow;

---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [PERFORM] Index not used on join with inherited tables
Sebastian, I'm having another problem with a query that takes to long, because the appropriate index is not used. PostgreSQL is not currently able to push down join criteria into UNIONed subselects. It's a TODO. Also, if you're using inherited tables, it's unnecessary to use UNION; just select from the parent. -- Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Index not used on join with inherited tables
Josh Berkus wrote: Sebastian, I'm having another problem with a query that takes to long, because the appropriate index is not used. PostgreSQL is not currently able to push down join criteria into UNIONed subselects. It's a TODO. And the appends in a SELECT * from parent are UNIONs, aren't they? Also, if you're using inherited tables, it's unnecessary to use UNION; just select from the parent. Yes, but then no index is used... Sebastian ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[PERFORM] index not used
Hi folks,

I'm doing a simple lookup in a small table by a unique id, and I'm wondering why EXPLAIN tells me a seqscan is used instead of the key. The table looks like:

id bigint primary key,
a  varchar,
b  varchar,
c  varchar

and I'm querying: select * from foo where id = 2;

I've got only 15 records in this table, but I want it to be as fast as possible since it's used (as a map between IDs and names) for larger queries.

thx
-- Enrico Weigelt == metux IT service phone: +49 36207 519931 www: http://www.metux.de/ fax: +49 36207 519932 email: [EMAIL PROTECTED] cellphone: +49 174 7066481 -- DSL ab 0 Euro. -- statische IP -- UUCP -- Hosting -- Webshops --

---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] index not used
If id is the PK, the query should return 1 row only...

--- Enrico Weigelt [EMAIL PROTECTED] wrote:
> Hi folks,
> I'm doing a simple lookup in a small table by a unique id, and I'm
> wondering why EXPLAIN tells me a seqscan is used instead of the key.
> The table looks like:
> id bigint primary key,
> a varchar,
> b varchar,
> c varchar
> and I'm querying: select * from foo where id = 2;
> I've got only 15 records in this table, but I want it to be as fast as
> possible since it's used (as a map between IDs and names) for larger
> queries.
> thx

---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] index not used
On Thu, 21 Apr 2005, Enrico Weigelt wrote:
> I'm doing a simple lookup in a small table by a unique id, and I'm
> wondering why EXPLAIN tells me a seqscan is used instead of the key.
> The table looks like:
> id bigint primary key,
> a varchar,
> b varchar,
> c varchar
> and I'm querying: select * from foo where id = 2;
> I've got only 15 records in this table, but I want it to be as fast as
> possible since it's used (as a map between IDs and names) for larger
> queries.

Two general things:

For 15 records, an index scan may not be faster. For simple tests you can play with enable_seqscan to see, but for more complicated queries it's a little harder to tell.

If you're using a version earlier than 8.0, you'll need to quote or cast the value you're searching for due to problems with cross-type comparisons (the 2 would be treated as int4).

---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
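For pre-8.0 servers, the quote-or-cast advice above looks like this against the table definition in the original post:

```sql
-- The bare literal 2 is typed int4 and, before 8.0, won't match an
-- index on a bigint column. Either form fixes the type mismatch:
SELECT * FROM foo WHERE id = '2';        -- quoted: resolved against bigint
SELECT * FROM foo WHERE id = 2::bigint;  -- explicit cast
```

On 8.0 and later the cross-type comparison is handled automatically and the plain literal works as expected.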
[PERFORM] Index not used with or condition
Hi all,

I am facing a strange problem when I run EXPLAIN against a table having more than 10 records. The query has a lot of OR conditions, and when parts of the query are removed it uses the index. To analyse it I created a table with a single column, inserted 10 records (random numbers), created an index, and ran a query which returns 1 record and has no OR condition; it used the index. I added an OR condition and it used a sequential scan. I set enable_seqscan to off, ran the tests again, and it used an index scan. So which one do I have to use? Is this a bug in EXPLAIN?

rgds Antony Paul.

---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Index not used with or condition
On Mon, Feb 07, 2005 at 04:44:07PM +0530, Antony Paul wrote: On more investigation I found that index scan is not used if the query have a function in it like lower() and an index exist for lower() column. What version are you using? 8.0 had fixes for this situation. /* Steinar */ -- Homepage: http://www.sesse.net/ ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [PERFORM] Index not used with or condition
It depends on many circumstances, but, at first, simple question: Did you run vacuum analyze? I am satisfied with functional indexes - it works in my pg 7.4.x. Antony Paul wrote: On more investigation I found that index scan is not used if the query have a function in it like lower() and an index exist for lower() column. rgds Antony Paul On Mon, 7 Feb 2005 14:37:15 +0530, Antony Paul [EMAIL PROTECTED] wrote: Hi all, I am facing a strange problem when I run EXPLAIN against a table having more than 10 records. The query have lot of OR conditions and when parts of the query is removed it is using index. To analyse it I created a table with a single column, inserted 10 records(random number) in it created index and run a query which returns 1 record which have no or condition and it was using index. I added an OR conditon and is using sequential scan. I set the enable_seqscan to off. I ran the tests again and is using index scan. So which one I have to use. Is this any bug in Explain. rgds Antony Paul. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [PERFORM] Index not used with or condition
I ran analyze; several times. rgds Antony Paul On Mon, 07 Feb 2005 12:53:30 +0100, Jan Poslusny pajout@gingerall.cz wrote: It depends on many circumstances, but, at first, simple question: Did you run vacuum analyze? I am satisfied with functional indexes - it works in my pg 7.4.x. Antony Paul wrote: On more investigation I found that index scan is not used if the query have a function in it like lower() and an index exist for lower() column. rgds Antony Paul On Mon, 7 Feb 2005 14:37:15 +0530, Antony Paul [EMAIL PROTECTED] wrote: Hi all, I am facing a strange problem when I run EXPLAIN against a table having more than 10 records. The query have lot of OR conditions and when parts of the query is removed it is using index. To analyse it I created a table with a single column, inserted 10 records(random number) in it created index and run a query which returns 1 record which have no or condition and it was using index. I added an OR conditon and is using sequential scan. I set the enable_seqscan to off. I ran the tests again and is using index scan. So which one I have to use. Is this any bug in Explain. rgds Antony Paul. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] index not used if using IN or OR
Mario Ivankovits wrote:
> Hello! Sorry if this has been discussed before, it is just hard to find in
> the archives using the words "or" or "in" :-o
> I use postgres-8.0 beta4 for windows. I broke down my problem to a very
> simple table - two columns primary_key and secondary_key. Creates and
> Inserts you will find below.
> If I query the _empty_ freshly created table I get the following explain
> result:
> select * from tt where seckey = 1;
> Index Scan using seckey_key on tt (cost=0.00..17.07 rows=5 width=12)
>   Index Cond: (seckey = 1)
> If I use OR (or IN) things get worse:
> select * from tt where seckey = 1 or seckey = 2
> Seq Scan on tt (cost=0.00..0.00 rows=1 width=12)
>   Filter: ((seckey = 1) OR (seckey = 2))
> Note the Seq Scan instead of using the index.

But as you said, your table is *empty* - why would an index be faster? Try running EXPLAIN ANALYSE on these queries and look at the actual times.

> After populating the table with 8920 records and analyzing, the scenario
> gets even worse:
> select * from tt where seckey = 1;
> Seq Scan on tt (cost=0.00..168.50 rows=1669 width=12) (actual time=0.000..15.000 rows=1784 loops=1)
>   Filter: (seckey = 1)
> Total runtime: 31.000 ms
> Now also this simple query uses a Seq Scan.

Well, it thinks it's going to be returning 1669 rows. If that's roughly right, then scanning the table probably is faster. Run the queries again with EXPLAIN ANALYSE. Also try issuing

set enable_seqscan=false;

This will force the planner to use any indexes it finds. Compare the times with and without, and don't forget to account for the effects of caching.

-- Richard Huxton Archonet Ltd

---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] index not used if using IN or OR
Mario Ivankovits [EMAIL PROTECTED] writes:
> After populating the table with 8920 records and analyzing, the scenario
> gets even worse:
> select * from tt where seckey = 1;
> Seq Scan on tt (cost=0.00..168.50 rows=1669 width=12) (actual time=0.000..15.000 rows=1784 loops=1)
>   Filter: (seckey = 1)
> Total runtime: 31.000 ms
> Now also this simple query uses a Seq Scan.

Which is exactly what it *should* do, considering that it is selecting 1784 out of 8920 records. Indexscans only win for small selectivities --- the rule of thumb is that retrieving more than about 1% of the records should use a seqscan.

regards, tom lane

---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[PERFORM] index not used if using IN or OR
Hello!

Sorry if this has been discussed before; it is just hard to find in the archives using the words "or" or "in" :-o

I use postgres-8.0 beta4 for windows. I broke down my problem to a very simple table - two columns primary_key and secondary_key. Creates and Inserts you will find below.

If I query the _empty_ freshly created table I get the following explain result:

select * from tt where seckey = 1;
Index Scan using seckey_key on tt (cost=0.00..17.07 rows=5 width=12)
  Index Cond: (seckey = 1)

If I use OR (or IN) things get worse:

select * from tt where seckey = 1 or seckey = 2
Seq Scan on tt (cost=0.00..0.00 rows=1 width=12)
  Filter: ((seckey = 1) OR (seckey = 2))

Note the Seq Scan instead of using the index.

After populating the table with 8920 records and analyzing, the scenario gets even worse:

select * from tt where seckey = 1;
Seq Scan on tt (cost=0.00..168.50 rows=1669 width=12) (actual time=0.000..15.000 rows=1784 loops=1)
  Filter: (seckey = 1)
Total runtime: 31.000 ms

Now also this simple query uses a Seq Scan.

Now the questions are:
a) Why is the index not used if I use OR or IN?
b) Why is the index not used after analyze?

Any help is very appreciated!

Thanks, Mario

// The table and data
CREATE TABLE tt (
  pkey int4 NOT NULL DEFAULT nextval('public."tt_PKEY_seq"'::text),
  seckey int8,
  CONSTRAINT pkey_key PRIMARY KEY (pkey)
) WITHOUT OIDS;
CREATE INDEX seckey_key ON tt USING btree (seckey);

// inserted many-many times
insert into tt values (default, 1);
insert into tt values (default, 2);
insert into tt values (default, 3);
insert into tt values (default, 4);
insert into tt values (default, 5);

---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Index not used in query. Why?
Is there a solution to make it faster? In the end I need only id_status 4 and 6 in the query, but if I write id_status IN (4,6) in the WHERE condition, EXPLAIN shows the same (slow) plan. For example:

SELECT count(o.id)
FROM orders o
INNER JOIN report r ON o.id=r.id_order
INNER JOIN status s ON o.id_status=s.id
INNER JOIN contact c ON o.id_ag=c.id
INNER JOIN endkunde e ON o.id_endkunde=e.id
INNER JOIN zufriden z ON r.id_zufriden=z.id
INNER JOIN plannung v ON v.id=o.id_plannung
INNER JOIN mpsworker w ON v.id_worker=w.id
INNER JOIN person p ON p.id = w.id_person
WHERE o.id_status IN (4,6);

The plan for this query also comes without index searches. I really have to make this query a little faster. Suggestions?

Regards, Andy.

----- Original Message ----- From: Tom Lane [EMAIL PROTECTED] To: Andrei Bintintan [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Sent: Tuesday, October 19, 2004 7:52 PM Subject: Re: [PERFORM] Index not used in query. Why?

> Andrei Bintintan [EMAIL PROTECTED] writes:
>> Hi to all! I have the following query. The execution time is very big,
>> it doesn't use the indexes and I don't understand why...
> Indexes are not necessarily the best way to do a large join.
>> If I use the following query the indexes are used:
> The key reason this wins seems to be that the id_status = 4 condition is
> far more selective than id_status > 3 (the estimates are 52 and 36967
> rows respectively ... is that accurate?) which means that the second
> query is inherently about 1/700th as much work. This, and not the use of
> indexes, is the fundamental reason why it's faster.
> regards, tom lane
> ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster

---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
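One avenue worth testing, sketched here under the assumption (supported by the estimates in this thread, where id_status = 4 matches only ~52 rows) that statuses 4 and 6 together match a small fraction of orders: a partial index whose predicate matches the WHERE clause exactly.

```sql
-- Hypothetical partial index; only pays off if few rows have status 4 or 6.
CREATE INDEX orders_status_4_6_idx
    ON orders (id_status)
    WHERE id_status IN (4, 6);

ANALYZE orders;

-- The planner can use the partial index when the query's predicate
-- implies the index predicate:
EXPLAIN ANALYZE
SELECT count(o.id) FROM orders o WHERE o.id_status IN (4, 6);
```

If most rows have status 4 or 6, the seqscan remains the right plan, per the selectivity argument in the reply quoted above.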
[PERFORM] Index not used in query. Why?
Hi to all! I have the following query. The execution time is very big, it doesn't use the indexes and I don't understand why...

SELECT count(o.id) FROM orders o
INNER JOIN report r ON o.id=r.id_order
INNER JOIN status s ON o.id_status=s.id
INNER JOIN contact c ON o.id_ag=c.id
INNER JOIN endkunde e ON o.id_endkunde=e.id
INNER JOIN zufriden z ON r.id_zufriden=z.id
INNER JOIN plannung v ON v.id=o.id_plannung
INNER JOIN mpsworker w ON v.id_worker=w.id
INNER JOIN person p ON p.id = w.id_person
WHERE o.id_status <> 3

The query explain:

Aggregate (cost=32027.38..32027.38 rows=1 width=4)
  -> Hash Join (cost=23182.06..31944.82 rows=33022 width=4)
       Hash Cond: ("outer".id_person = "inner".id)
       -> Hash Join (cost=23179.42..31446.85 rows=33022 width=8)
            Hash Cond: ("outer".id_endkunde = "inner".id)
            -> Hash Join (cost=21873.54..28891.42 rows=33022 width=12)
                 Hash Cond: ("outer".id_ag = "inner".id)
                 -> Hash Join (cost=21710.05..28067.50 rows=33021 width=16)
                      Hash Cond: ("outer".id_status = "inner".id)
                      -> Hash Join (cost=21708.97..27571.11 rows=33021 width=20)
                           Hash Cond: ("outer".id_worker = "inner".id)
                           -> Hash Join (cost=21707.49..27074.31 rows=33021 width=20)
                                Hash Cond: ("outer".id_zufriden = "inner".id)
                                -> Hash Join (cost=21706.34..26564.09 rows=35772 width=24)
                                     Hash Cond: ("outer".id_plannung = "inner".id)
                                     -> Hash Join (cost=20447.15..23674.04 rows=35771 width=24)
                                          Hash Cond: ("outer".id = "inner".id_order)
                                          -> Seq Scan on orders o (cost=0.00..1770.67 rows=36967 width=20)
                                               Filter: (id_status <> 3)
                                          -> Hash (cost=20208.32..20208.32 rows=37132 width=8)
                                               -> Seq Scan on report r (cost=0.00..20208.32 rows=37132 width=8)
                                     -> Hash (cost=913.15..913.15 rows=54015 width=8)
                                          -> Seq Scan on plannung v (cost=0.00..913.15 rows=54015 width=8)
                                -> Hash (cost=1.12..1.12 rows=12 width=4)
                                     -> Seq Scan on zufriden z (cost=0.00..1.12 rows=12 width=4)
                           -> Hash (cost=1.39..1.39 rows=39 width=8)
                                -> Seq Scan on mpsworker w (cost=0.00..1.39 rows=39 width=8)
                      -> Hash (cost=1.06..1.06 rows=6 width=4)
                           -> Seq Scan on status s (cost=0.00..1.06 rows=6 width=4)
                 -> Hash (cost=153.19..153.19 rows=4119 width=4)
                      -> Seq Scan on contact c (cost=0.00..153.19 rows=4119 width=4)
            -> Hash (cost=1077.91..1077.91 rows=38391 width=4)
                 -> Seq Scan on endkunde e (cost=0.00..1077.91 rows=38391 width=4)
       -> Hash (cost=2.51..2.51 rows=51 width=4)
            -> Seq Scan on person p (cost=0.00..2.51 rows=51 width=4)

As you can see, no index is used. I created indexes everywhere the joins are made. If I use the following query, the indexes are used:

SELECT count(o.id) FROM orders o
INNER JOIN report r ON o.id=r.id_order
INNER JOIN status s ON o.id_status=s.id
INNER JOIN contact c ON o.id_ag=c.id
INNER JOIN endkunde e ON o.id_endkunde=e.id
INNER JOIN zufriden z ON r.id_zufriden=z.id
INNER JOIN plannung v ON v.id=o.id_plannung
INNER JOIN mpsworker w ON v.id_worker=w.id
INNER JOIN person p ON p.id = w.id_person
WHERE o.id_status = 4

Aggregate (cost=985.55..985.55 rows=1 width=4)
  -> Hash Join (cost=5.28..985.42 rows=50 width=4)
       Hash Cond: ("outer".id_person = "inner".id)
       -> Hash Join (cost=2.64..982.03 rows=50 width=8)
            Hash Cond: ("outer".id_worker = "inner".id)
            -> Nested Loop (cost=1.15..979.79 rows=50 width=8)
                 -> Nested Loop (cost=1.15..769.64 rows=49
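(For reference, a minimal sketch of the join-column indexes the poster describes creating; the index names are assumed, not taken from the original message:)

```sql
-- Hypothetical names: one index per join/filter column used in the query above.
CREATE INDEX orders_id_status_idx  ON orders (id_status);
CREATE INDEX report_id_order_idx   ON report (id_order);
CREATE INDEX plannung_id_worker_idx ON plannung (id_worker);
-- With these in place the planner *can* use index scans, but will still
-- prefer sequential scans and hash joins when a predicate matches most rows.
```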
Re: [PERFORM] Index not used in query. Why?
Andrei Bintintan [EMAIL PROTECTED] writes:
> Hi to all! I have the following query. The execution time is very big, it doesn't use the indexes and I don't understand why...

Indexes are not necessarily the best way to do a large join.

> If I use the following query the indexes are used:

The key reason this wins seems to be that the id_status = 4 condition is far more selective than id_status <> 3 (the estimates are 52 and 36967 rows respectively ... is that accurate?), which means that the second query is inherently about 1/700th as much work. This, and not the use of indexes, is the fundamental reason why it's faster.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
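(The selectivity effect Tom describes can be reproduced on a throwaway table; this is a sketch with assumed names, not the poster's schema:)

```sql
-- Demo table: id_status is one of a handful of values, so "<> 3" matches
-- most rows while "= 4" matches only a small fraction.
CREATE TABLE orders_demo (id serial PRIMARY KEY, id_status int);
INSERT INTO orders_demo (id_status)
  SELECT (random() * 5)::int FROM generate_series(1, 100000);
CREATE INDEX orders_demo_status_idx ON orders_demo (id_status);
ANALYZE orders_demo;

-- Non-selective predicate: reading nearly the whole table, a seq scan is
-- cheaper than an index scan regardless of which indexes exist.
EXPLAIN SELECT count(id) FROM orders_demo WHERE id_status <> 3;

-- Selective predicate: few matching rows, so the index pays off.
EXPLAIN SELECT count(id) FROM orders_demo WHERE id_status = 4;
```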
Re: [PERFORM] index not used when using function
Maybe add an ORDER BY artist to force a GroupAggregate?

> Hi all, a small question: I've got this table songs and an index on column
> artist. Since there's about one distinct artist for every 10 rows, it would
> be nice if it could use this index when counting artists. It doesn't however:
>
> lyrics=> EXPLAIN ANALYZE SELECT count(DISTINCT artist) FROM songs;
> Aggregate (cost=31961.26..31961.26 rows=1 width=14) (actual time=808.863..808.864 rows=1 loops=1)
>   -> Seq Scan on songs (cost=0.00..31950.41 rows=4341 width=14) (actual time=26.801..607.172 rows=25207 loops=1)
> Total runtime: 809.106 ms
>
> Even with enable_seqscan set to off, it just can't seem to use the index.
> The same query without the count() works just fine:
>
> lyrics=> EXPLAIN ANALYZE SELECT DISTINCT artist FROM songs;
> Unique (cost=0.00..10814.96 rows=828 width=14) (actual time=0.029..132.903 rows=3280 loops=1)
>   -> Index Scan using songs_artist_key on songs (cost=0.00..10804.11 rows=4341 width=14) (actual time=0.027..103.448 rows=25207 loops=1)
> Total runtime: 135.697 ms
>
> Of course I can just take the number of rows from the latter query, but I'm
> still wondering why it can't use indexes with functions.

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
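(The workaround the poster alludes to, counting the rows of the index-friendly DISTINCT query, can be written as a single statement; a sketch:)

```sql
-- Wrap the DISTINCT (which the planner serves via an index scan above)
-- in a subquery and count its rows, instead of count(DISTINCT artist).
SELECT count(*) FROM (SELECT DISTINCT artist FROM songs) AS s;
```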
[PERFORM] index not used when using function
Hi all, a small question: I've got this table songs and an index on column artist. Since there's about one distinct artist for every 10 rows, it would be nice if it could use this index when counting artists. It doesn't however:

lyrics=> EXPLAIN ANALYZE SELECT count(DISTINCT artist) FROM songs;
Aggregate (cost=31961.26..31961.26 rows=1 width=14) (actual time=808.863..808.864 rows=1 loops=1)
  -> Seq Scan on songs (cost=0.00..31950.41 rows=4341 width=14) (actual time=26.801..607.172 rows=25207 loops=1)
Total runtime: 809.106 ms

Even with enable_seqscan set to off, it just can't seem to use the index. The same query without the count() works just fine:

lyrics=> EXPLAIN ANALYZE SELECT DISTINCT artist FROM songs;
Unique (cost=0.00..10814.96 rows=828 width=14) (actual time=0.029..132.903 rows=3280 loops=1)
  -> Index Scan using songs_artist_key on songs (cost=0.00..10804.11 rows=4341 width=14) (actual time=0.027..103.448 rows=25207 loops=1)
Total runtime: 135.697 ms

Of course I can just take the number of rows from the latter query, but I'm still wondering why it can't use indexes with functions.

Thanks
-- Shiar - http://www.shiar.org
"Facts are the enemy of truth"

---(end of broadcast)---
TIP 8: explain analyze is your friend