[SQL] populate value of column
I have a table

CREATE TABLE meter
(
  meter_id integer NOT NULL,
  area_no integer NOT NULL,
  CONSTRAINT meter_pkey PRIMARY KEY (meter_id)
);

INSERT INTO meter(meter_id, no_of_bays) VALUES (1001, 4);
INSERT INTO meter(meter_id, no_of_bays) VALUES (1012, 6);

select meter_id, area_no from meter;

 meter_id | no_of_bays
----------+------------
     1001 |          4
     1012 |          6

How can I generate the following result?

 meter_id | bay
----------+-----
     1001 | 01
     1001 | 02
     1001 | 03
     1001 | 04
     1012 | 01
     1012 | 02
     1012 | 03
     1012 | 04
     1012 | 05
     1012 | 06
Re: [SQL] populate value of column
Thank you :-)) That's so quick! Apologies for the error in my DDL statement.

On 08/08/07, Phillip Smith <[EMAIL PROTECTED]> wrote:
> > How can I generate the following result?
> >
> >  meter_id | bay
> > ----------+-----
> >      1001 | 01
> >      1001 | 02
> >      1001 | 03
> >      1001 | 04
> >      1012 | 01
> >      1012 | 02
> >      1012 | 03
> >      1012 | 04
> >      1012 | 05
> >      1012 | 06
>
> I even tested this one too ;)
>
> SELECT meter_id,
>        LPAD(GENERATE_SERIES(1, area_no), 2, '0')
> FROM   meter;
>
> Works on 8.2.4 - you didn't say what version you were using.
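One caveat for anyone reusing that query: PostgreSQL 8.3 removed the implicit cast from integer to text, so on 8.3 and later LPAD() no longer accepts the integer output of GENERATE_SERIES() directly and needs an explicit cast. A minimal sketch, assuming the corrected column name no_of_bays from the follow-up:

SELECT meter_id,
       -- explicit ::text cast required on 8.3+
       LPAD(GENERATE_SERIES(1, no_of_bays)::text, 2, '0') AS bay
FROM meter;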
[SQL] Join query help
Hi,

We have the following three tables.

safety=> SELECT record_id, record_date FROM record;
 record_id |      record_date
-----------+------------------------
         1 | 2007-07-23 11:30:37+10
         2 | 2007-07-27 11:30:14+10
         3 | 2007-07-17 13:15:03+10
(3 rows)

safety=> SELECT observation_id, record_id, score_id FROM observation;
 observation_id | record_id | score_id
----------------+-----------+----------
           3240 |         1 |        1
           3239 |         1 |        1
           3238 |         1 |        2
           3237 |         1 |        1
           2872 |         2 |        1
           2869 |         2 |        2
           2870 |         2 |        1
           2871 |         2 |        1
           3218 |         3 |        2
           3217 |         3 |        1
(10 rows)

safety=> SELECT * FROM SCORE;
 score_id | score_description
----------+-------------------
        0 | NA
        1 | SAFE
        2 | AT RISK
(3 rows)

What query do I write to generate the following?

 week_no  | count(record_id) | count(observation_id) | sum(score_id) where score_id = '1'
----------+------------------+-----------------------+------------------------------------
 2007, 30 |                2 |                     8 |                                  6
 2007, 29 |                1 |                     2 |                                  1
Re: [SQL] [GENERAL] Join query help
Try 2: Here are my DDLs & DMLs

-- Start
CREATE TABLE record
(
  record_id integer PRIMARY KEY,
  record_date timestamp with time zone NOT NULL
);

INSERT INTO record(record_id, record_date) VALUES ('1', '2007-07-23 11:30:37');
INSERT INTO record(record_id, record_date) VALUES ('2', '2007-07-27 11:30:14');
INSERT INTO record(record_id, record_date) VALUES ('3', '2007-07-17 13:15:03');

CREATE TABLE score
(
  score_id integer PRIMARY KEY,
  score_description character(7) NOT NULL
);

INSERT INTO score(score_id, score_description) VALUES ('0', 'NA');
INSERT INTO score(score_id, score_description) VALUES ('1', 'SAFE');
INSERT INTO score(score_id, score_description) VALUES ('2', 'AT RISK');

CREATE TABLE observation
(
  observation_id integer PRIMARY KEY,
  record_id integer REFERENCES record (record_id),
  score_id integer REFERENCES score (score_id)
);

INSERT INTO observation(observation_id, record_id, score_id) VALUES ('3240', '1', '1');
INSERT INTO observation(observation_id, record_id, score_id) VALUES ('3239', '1', '1');
INSERT INTO observation(observation_id, record_id, score_id) VALUES ('3238', '1', '2');
INSERT INTO observation(observation_id, record_id, score_id) VALUES ('3237', '1', '1');
INSERT INTO observation(observation_id, record_id, score_id) VALUES ('2872', '2', '1');
INSERT INTO observation(observation_id, record_id, score_id) VALUES ('2869', '2', '2');
INSERT INTO observation(observation_id, record_id, score_id) VALUES ('2870', '2', '1');
INSERT INTO observation(observation_id, record_id, score_id) VALUES ('2871', '2', '1');
INSERT INTO observation(observation_id, record_id, score_id) VALUES ('3218', '3', '2');
INSERT INTO observation(observation_id, record_id, score_id) VALUES ('3217', '3', '1');
-- End

By executing

SELECT week, COUNT(record) as records, SUM(inspection) as inspections
FROM (
  SELECT TO_CHAR(record.record_date, 'YYYY, WW') as week,
         RECORD.RECORD_ID AS RECORD,
         COUNT(OBSERVATION_ID) AS INSPECTION
  FROM OBSERVATION
  LEFT JOIN record ON record.record_id = observation.record_id
  GROUP BY RECORD, WEEK
) A
GROUP BY WEEK;

I get

   week   | records | inspections
----------+---------+-------------
 2007, 30 |       2 |           8
 2007, 29 |       1 |           2
(2 rows)

I'd like to query for a result set that will also have the sum(score_id) where score_id = '1', like the following:

   week   | records | inspections | score
----------+---------+-------------+-------
 2007, 30 |       2 |           8 |     6
 2007, 29 |       1 |           2 |     1
(2 rows)

This will help identify that there were 6 SAFE observations found from the 8 inspections in week 30. I hope this isn't so confusing that I have to re-design the whole schema.
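A single-pass alternative for this kind of summary is a conditional count. This is a sketch against the DDL above (not the solution given later in the thread); it counts score_id = 1 ('SAFE') directly rather than summing the surrogate key:

SELECT TO_CHAR(r.record_date, 'YYYY, WW') AS week,
       COUNT(DISTINCT r.record_id)       AS records,
       COUNT(o.observation_id)           AS inspections,
       -- count SAFE observations instead of summing score_id
       SUM(CASE WHEN o.score_id = 1 THEN 1 ELSE 0 END) AS safe_count
FROM record r
JOIN observation o ON o.record_id = r.record_id
GROUP BY 1
ORDER BY 1 DESC;

Against the sample data this returns 2 records / 8 inspections / 6 safe for week 30 and 1 / 2 / 1 for week 29.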
Re: [SQL] Join query help
On 21/08/07, Michael Glaesemann <[EMAIL PROTECTED]> wrote:
> Now here's where I started having trouble. I can't figure out how to
> get 2 observations for week 29 (record_id 1 & 3) and 8 for week 30
> (record_id 2). Assuming the data is wrong (which is admittedly a poor
> assumption), I moved ahead.
>
> >  week_no  | count(record_id) | count(observation_id) | sum(score_id) where score_id = '1'
> >  2007, 30 |                2 |                     8 |                                  6
> >  2007, 29 |                1 |                     2 |                                  1

For week 29, there was 1 SAFE observation out of the 2 observations, and for week 30, there were 6 SAFE observations out of the 8 observations. Hope this helps. Thanks!
Re: [SQL] [GENERAL] Join query help
Many many thanks for all the advice =)

On 21/08/07, Michael Glaesemann <[EMAIL PROTECTED]> wrote:
>
> On Aug 20, 2007, at 19:52 , novice wrote:
>
> > Try 2: Here are my DDLs & DMLs
>
> Thanks for the data. It looks like the data you provided in the first
> set was a little different, and the queries I supplied in my previous
> message give you the results you want.
>
> > CREATE TABLE record
> > (
> >   record_id integer PRIMARY KEY,
> >   record_date timestamp with time zone NOT NULL
> > );
> >
> > INSERT INTO record(record_id, record_date) VALUES ('1', '2007-07-23 11:30:37');
>
> Note: record_id is an integer, yet you're quoting the value ('1').
> This causes the server to cast the text value to an integer. Here
> it's not going to cause much of a problem, just a couple CPU cycles.
> In table definitions (and possibly queries?) it can cause the server
> to ignore otherwise usable indexes when planning queries. (I see
> below your other inserts also quote integer values: you should drop
> the quotes there as well.)
>
> > CREATE TABLE score
> > (
> >   score_id integer PRIMARY KEY,
> >   score_description character(7) NOT NULL
> > );
>
> Unless you have a strict business rule that score_description can
> have no more than 7 characters, you should just use text instead of
> character(7). It provides you more freedom (for example, you don't
> have to change the column type if you ever want to use longer
> descriptions). character(7) does not gain you anything in terms of
> performance.
>
> I'd also recommend adding a UNIQUE constraint to score_description.
> It appears your score_id has no meaning other than use as a surrogate
> key. There's nothing to prevent INSERT INTO score (score_id,
> score_description) VALUES (5, 'SAFE'), which would be quite confusing,
> I should think. And if you don't have any particular reason to use a
> surrogate key, you could just use score_description as the primary
> key of the table, dropping score_id altogether. It would reduce the
> number of joins you need to do to have easily interpretable query
> results. (This all goes for your record table as well.)
>
> > I'd like to query for a result set that will also have the sum(score_id)
> > where score_id = '1', like the following
> >
> >    week   | records | inspections | score
> > ----------+---------+-------------+-------
> >  2007, 30 |       2 |           8 |     6
> >  2007, 29 |       1 |           2 |     1
> > (2 rows)
>
> (repeating from earlier post)
>
> SELECT to_char(record_week, 'IYYY, IW') as formatted_record_week
>      , count(DISTINCT record_id) AS record_count
>      , count(DISTINCT observation_id) AS observation_count
>      , safe_score_sum
> FROM record_with_week
> NATURAL JOIN observation
> NATURAL JOIN (
>     SELECT record_week, sum(score_id) as safe_score_sum
>     FROM record_with_week
>     NATURAL JOIN observation
>     NATURAL JOIN score
>     WHERE score_description = 'SAFE'
>     GROUP BY record_week
> ) safe_observation
> GROUP BY record_week, safe_score_sum;
>
>  formatted_record_week | record_count | observation_count | safe_score_sum
> -----------------------+--------------+-------------------+----------------
>  2007, 29              |            1 |                 2 |              1
>  2007, 30              |            2 |                 8 |              6
> (2 rows)
>
> > This will help identify that there were 6 SAFE observations found from
> > the 8 inspections in week 30.
>
> Yeah, I thought so: you're actually looking for the *count* of SAFE
> observations, not the sum of the score_id for 'SAFE'.
> So what you really want is:
>
> SELECT TO_CHAR(record_week, 'IYYY, IW') AS formatted_record_week
>      , count(DISTINCT record_id) AS record_count
>      , count(DISTINCT observation_id) AS observation_count
>      , count(DISTINCT safe_observation_id) as safe_observation_count
> FROM record_with_week
> NATURAL JOIN observation
> NATURAL JOIN (
>     SELECT record_week
>          , observation_id as safe_observation_id
>     FROM record_with_week
>     NATURAL JOIN observation
>     NATURAL JOIN score
>     WHERE score_description = 'SAFE'
> ) safe_observation
> GROUP BY record_week;
>
>  formatted_record_week | record_count | observation_count | safe_observation_count
> -----------------------+--------------+-------------------+------------------------
>  2007, 29              |            1 |                 2 |                      1
>  2007, 30              |            2 |                 8 |                      6
> (2 rows)
>
> Michael Glaesemann
> grzm seespotcode net
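The record_with_week relation those queries join against was defined in an earlier message that isn't quoted here. A plausible sketch of it, assuming it simply adds a truncated week to record (the NATURAL JOINs only need record_id and record_week):

-- hypothetical reconstruction of the missing definition
CREATE VIEW record_with_week AS
SELECT record_id,
       record_date,
       date_trunc('week', record_date) AS record_week  -- Monday of the ISO week
FROM record;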
[SQL] raw data into table process
I am trying to record the following entries into a table. I'm curious to know if there's a more efficient/effective way of doing this? These entries come from an ancient datalogger (note: fields separated by spaces, and it uses YY/MM/DD format to record the date).

Plain file sample.dat:

3665   OK           BS 07/08/16 07:28
3665   CC           BS 07/08/16 07:29
3665   CS           BS 07/08/16 07:29
3665   CS           BS 07/08/16 07:29
4532   OK           BS 07/08/16 07:34
4004   OK           BS 07/08/16 07:51
3991   OK           BS 07/08/16 07:54

This is the table that I'm adding the entries to:

CREATE TABLE maintenance
(
  maintenance_id SERIAL PRIMARY KEY,
  meter_id integer,
  status text,
  inspector text,
  inspection_date timestamp with time zone
);

-- Begin SQL script

-- First table to dump the records in
CREATE TABLE dataload1 (data text);

-- Dump records using \copy
\copy dataload1 FROM sample.dat

-- Second table to import unique records ONLY
CREATE TABLE dataload2 AS SELECT DISTINCT data FROM dataload1;

-- Now I insert the unique records into the maintenance table
-- maintenance_id is SERIAL so it will be populated automatically
INSERT INTO maintenance(meter_id, status, inspector, inspection_date)
SELECT substr("data", 1, 4)::int
     , substr("data", 8, 3)
     , substr("data", 21, 2)
     , (20 || substr("data", 24, 2) || '-' || substr("data", 27, 2) || '-' ||
        substr("data", 30, 2) || ' ' || substr("data", 33, 5))::timestamp as inspection_date
FROM dataload2
-- So the new records will also be in timestamp order
ORDER BY inspection_date;

-- Some housekeeping
VACUUM FULL VERBOSE ANALYZE maintenance;

-- Finally, drop the temporary tables
DROP TABLE dataload1;
DROP TABLE dataload2;

-- End SQL script

Any thoughts and suggestions welcome.
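One simplification worth considering (a sketch, assuming the fixed-width offsets used in the script above are correct for the real file): to_timestamp() can parse the two-digit year directly, which avoids the manual string concatenation. In to_timestamp, a two-digit YY is expanded to the nearest century, so 07 becomes 2007:

SELECT substr(data, 1, 4)::int AS meter_id,
       substr(data, 8, 3)      AS status,
       substr(data, 21, 2)     AS inspector,
       -- positions 24..37 hold "YY/MM/DD HH:MM"
       to_timestamp(substr(data, 24, 14), 'YY/MM/DD HH24:MI') AS inspection_date
FROM dataload2;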
[SQL] work hour calculations
Hello All,

SELECT notification_time, finished_time,
       sum(finished_time - notification_time) as actual
FROM log
GROUP BY notification_time, finished_time;

gives me:

   notification_time    |     finished_time      |     actual
------------------------+------------------------+-----------------
 2007-07-06 15:50:00+10 | 2007-07-09 07:10:00+10 | 2 days 15:20:00
 2007-07-07 12:30:00+10 | 2007-07-09 07:20:00+10 | 1 day 18:50:00

How can I write a query to calculate the duration using custom work hours, which are Monday 7am / Friday 5pm? The result I'm expecting for the above is:

   notification_time    |     finished_time      |  actual
------------------------+------------------------+----------
 2007-07-06 15:50:00+10 | 2007-07-09 07:10:00+10 | 00:20:00
 2007-07-07 12:30:00+10 | 2007-07-09 07:20:00+10 | 00:20:00

Thanks.
Re: [SQL] work hour calculations
correction:

> The result I'm expecting for the above is:
>
>    notification_time    |     finished_time      |  actual
> ------------------------+------------------------+----------
>  2007-07-06 15:50:00+10 | 2007-07-09 07:10:00+10 | 01:20:00
>  2007-07-07 12:30:00+10 | 2007-07-09 07:20:00+10 | 00:20:00
Re: [SQL] [GENERAL] work hour calculations
On 07/09/2007, Filip Rembiałkowski <[EMAIL PROTECTED]> wrote:
> 2007/9/5, Raj A <[EMAIL PROTECTED]>:
> > correction:
> >
> > > The result I'm expecting for the above is:
> > >
> > >    notification_time    |     finished_time      |  actual
> > > ------------------------+------------------------+----------
> > >  2007-07-06 15:50:00+10 | 2007-07-09 07:10:00+10 | 01:20:00
> > >  2007-07-07 12:30:00+10 | 2007-07-09 07:20:00+10 | 00:20:00
>
> Could you please show the data structures (create table commands +
> any constraints you have)?
>
> If you do this, you have a much bigger chance of getting an answer :)

Sure:

create table log
(
  id integer PRIMARY KEY,
  notification_time timestamp with time zone,
  finished_time timestamp with time zone
);

INSERT INTO log values (1, '2007-07-06 15:50', '2007-07-09 07:10');
INSERT INTO log values (2, '2007-07-07 12:30', '2007-07-09 07:20');

SELECT notification_time, finished_time,
       sum(finished_time - notification_time) as actual
FROM log
GROUP BY notification_time, finished_time;
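No answer survives in this archive, so here is a sketch of one possible approach. It assumes the working window runs continuously from Monday 07:00 to Friday 17:00 each week (one reading of "Monday 7am / Friday 5pm", and the one that reproduces the corrected expected results above), and it uses LATERAL, so it needs PostgreSQL 9.3 or later:

SELECT l.id,
       l.notification_time,
       l.finished_time,
       -- clamp the interval to each week's Mon 07:00 .. Fri 17:00 window,
       -- floor negative overlaps at zero, and sum across weeks
       sum(GREATEST(LEAST(l.finished_time, wk.week_start + interval '4 days 10 hours')
                    - GREATEST(l.notification_time, wk.week_start),
                    interval '0')) AS actual
FROM log l
CROSS JOIN LATERAL (
    -- one row per week touched by the interval; week_start = Monday 07:00
    SELECT g + interval '7 hours' AS week_start
    FROM generate_series(date_trunc('week', l.notification_time),
                         date_trunc('week', l.finished_time),
                         interval '1 week') AS g
) wk
GROUP BY l.id, l.notification_time, l.finished_time;

For row 1 this yields 1:10 on Friday (15:50 to 17:00) plus 0:10 on Monday (07:00 to 07:10) = 01:20:00, and for row 2 only Monday's 0:20, matching the corrected expectation.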
[SQL] count question
I have a table

CREATE TABLE meter
(
  meter_id integer NOT NULL,
  no_of_bays integer,
  CONSTRAINT meter_pkey PRIMARY KEY (meter_id)
);

INSERT INTO meter(meter_id, no_of_bays) VALUES (5397, 2);
INSERT INTO meter(meter_id, no_of_bays) VALUES (5409, 3);

select meter_id, no_of_bays from meter;

 meter_id | no_of_bays
----------+------------
     5397 |          2
     5409 |          3

Is it possible to write a query to produce:

 meter_id | no_of_bays | bay_id
----------+------------+---------
     5397 |          2 | 5397-01
     5397 |          2 | 5397-02
     5409 |          3 | 5409-01
     5409 |          3 | 5409-02
     5409 |          3 | 5409-03
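No reply is preserved for this one; a minimal sketch, reusing the generate_series/LPAD idiom from the earlier "populate value of column" thread (the set-returning function in the select list multiplies the rows):

SELECT meter_id,
       no_of_bays,
       -- concatenate the meter id with a zero-padded bay number
       meter_id || '-' || lpad(generate_series(1, no_of_bays)::text, 2, '0') AS bay_id
FROM meter;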
[SQL] Results with leading zero
I have a table

CREATE TABLE problem (
  problem_id integer,
  solution_count integer
);

INSERT INTO problem VALUES (1001, 4);
INSERT INTO problem VALUES (1012, 11);

SELECT * from problem;

 problem_id | solution_count
------------+----------------
       1001 |              4
       1012 |             11
(2 rows)

Is there a way I could write a query to produce the following? I will need the leading zero for solutions < 10.

 problem_id | solution
------------+----------
       1001 | 01
       1001 | 02
       1001 | 03
       1001 | 04
       1012 | 01
       1012 | 02
       1012 | 03
       1012 | 04
       1012 | 05
       1012 | 06
       1012 | 07
       1012 | 08
       1012 | 09
       1012 | 10
       1012 | 11
(15 rows)

Thanks.
Re: [SQL] Results with leading zero
Perfect! Thank you very much :)

2008/6/16 A. Kretschmer <[EMAIL PROTECTED]>:
> am Mon, dem 16.06.2008, um 11:48:01 +1000 mailte novice folgendes:
> > I have a table
> >
> > CREATE TABLE problem (
> >   problem_id integer,
> >   solution_count integer
> > );
> >
> > INSERT INTO problem VALUES (1001, 4);
> > INSERT INTO problem VALUES (1012, 11);
> >
> > SELECT * from problem;
> >
> >  problem_id | solution_count
> > ------------+----------------
> >        1001 |              4
> >        1012 |             11
> > (2 rows)
> >
> > Is there a way I could write a query to produce the following? I will
> > need the leading zero for solutions < 10
> >
> >  problem_id | solution
> > ------------+----------
> >  1001 | 01
> >  1001 | 02
>
> My previous answer was a little bit wrong (no leading zero for solutions
> < 10), sorry. But no problem:
>
> select problem_id, to_char(generate_series(1, solution_count), '09') as
> solution_count from problem;
>
> Andreas
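A small footnote on that answer: to_char() reserves a leading character position for the sign, so the '09' template actually produces ' 01' with a leading space. If the output has to match '01' exactly, the FM (fill mode) modifier suppresses the padding. A sketch:

select problem_id,
       -- FM suppresses the leading sign/padding space
       to_char(generate_series(1, solution_count), 'FM09') as solution
from problem;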
[SQL] date range query help
Hi,

We have two tables.

select * from mobile_custodian;

 custodian_id | user_id | issue_date | return_date |  mobile_no
--------------+---------+------------+-------------+-------------
            4 | Ben     | 2008-10-11 | 2008-10-13  | 09455225998
            5 | Josh    | 2008-10-15 |             | 09455225998
(2 rows)

select * from call;

 call_id |      datetime       |  mobile_no  | charge
---------+---------------------+-------------+--------
       2 | 2007-10-14 13:27:00 | 09455225998 |    5.2
       1 | 2007-10-12 10:00:00 | 09455225998 |    4.5
(2 rows)

Now user Ben has passed his mobile to user Josh, and we issued Josh his mobile on 2008-10-15.

1. Is it possible for me to write a query that will have the fields

   call.call_id,
   call.datetime,
   mobile_custodian.user_id,
   call.mobile_no,
   call.charge

and that will use call.datetime to look up the date range from mobile_custodian.issue_date and mobile_custodian.return_date to identify the right user for each call?

2. Do I need to change the issue_date & return_date fields to timestamp to perform the above?

Thanks.
Re: [SQL] [GENERAL] date range query help
2008/11/20 Adam Rich <[EMAIL PROTECTED]>:
>> Now user Ben has passed his mobile to user Josh, and we issued Josh his
>> mobile on 2008-10-15.
>>
>> 1. Is it possible for me to write a query that will have the fields
>> call.call_id, call.datetime, mobile_custodian.user_id, call.mobile_no,
>> call.charge and that will use call.datetime to look up the date range
>> from mobile_custodian.issue_date and mobile_custodian.return_date to
>> identify the right user for each call?
>>
>> 2. Do I need to change the issue_date & return_date fields to
>> timestamp to perform the above?
>
> No, a date will work fine. Try this:
>
> select call.call_id,
>        call.datetime,
>        mobile_custodian.user_id,
>        call.mobile_no,
>        call.charge
> from call, mobile_custodian
> where call.mobile_no = mobile_custodian.mobile_no
> and call.datetime between mobile_custodian.issue_date
>                       and mobile_custodian.return_date

sorry I get nothing :(
Re: [SQL] [GENERAL] date range query help
2008/11/20 brian <[EMAIL PROTECTED]>:
> novice wrote:
>>
>> 2008/11/20 Adam Rich <[EMAIL PROTECTED]>:
>>> [snip]
>>
>> sorry I get nothing :(
>
> How about:
>
> SELECT call.call_id,
>        call.datetime,
>        mobile_custodian.user_id,
>        call.mobile_no,
>        call.charge
> FROM call
> LEFT JOIN mobile_custodian
>   ON call.mobile_no = mobile_custodian.mobile_no
>   AND call.datetime BETWEEN mobile_custodian.issue_date
>                         AND mobile_custodian.return_date;

this gave me nothing in the user_id field :(

 call_id |      datetime       | user_id |  mobile_no  | charge
---------+---------------------+---------+-------------+--------
       1 | 2007-10-12 10:00:00 |         | 09455225998 |    4.5
       2 | 2007-10-16 13:27:00 |         | 09455225998 |    5.2
Re: [SQL] [GENERAL] date range query help
2008/11/20 Rodrigo E. De León Plicet <[EMAIL PROTECTED]>:
> On Wed, Nov 19, 2008 at 10:03 PM, novice <[EMAIL PROTECTED]> wrote:
>> sorry I get nothing :(
>
> Of course not. None of the dates you gave in the example overlap.

But shouldn't it still show the 1st entry, with the name Ben? Am I missing something?
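Two separate things defeat the join in the sample data. The calls are dated 2007 while both custodian ranges are in 2008 (presumably a typo in one or the other), so even Ben's row can't match; and Josh's return_date is NULL, which makes BETWEEN evaluate to NULL and drop the row. A sketch of the usual fix for the open-ended range, assuming the years are reconciled:

select call.call_id,
       call.datetime,
       mc.user_id,
       call.mobile_no,
       call.charge
from call
join mobile_custodian mc
  on call.mobile_no = mc.mobile_no
 and call.datetime >= mc.issue_date
 -- a NULL return_date means the phone is still issued; note that the
 -- date upper bound compares against midnight at the start of return_date
 and (mc.return_date is null or call.datetime <= mc.return_date);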
[SQL] Query Performance
Hello List,

I have a query which used to run very fast but has now turned into a show stopper.

PostgreSQL: 8.2

explain analyze select user_name, A.user_id, dnd_window_start, dnd_window_stop,
  B.subs as subs, B.city_id as city_id, B.source_type as source_type
from users A left join user_subscriptions B on (A.user_id = B.user_id)
where A.user_id in (select user_id from subs_feed where feed_id=1411 and f_sms='t');

 Nested Loop Left Join  (cost=986325.88..1094601.47 rows=11148 width=55) (actual time=132635.994..1590487.280 rows=609070 loops=1)
   ->  Nested Loop  (cost=986325.88..1062280.53 rows=11148 width=26) (actual time=132630.057..1398299.117 rows=609070 loops=1)
         ->  HashAggregate  (cost=986325.88..986437.36 rows=11148 width=4) (actual time=132591.648..133386.651 rows=609070 loops=1)
               ->  Bitmap Heap Scan on subs_feed  (cost=16316.71..985194.44 rows=452576 width=4) (actual time=20199.571..131566.494 rows=609070 loops=1)
                     Recheck Cond: (feed_id = 1411)
                     Filter: f_sms
                     ->  Bitmap Index Scan on feed_user_id  (cost=0.00..16203.57 rows=681933 width=0) (actual time=19919.512..19919.512 rows=616900 loops=1)
                           Index Cond: (feed_id = 1411)
         ->  Index Scan using users_pkey on users a  (cost=0.00..6.79 rows=1 width=26) (actual time=2.073..2.074 rows=1 loops=609070)
               Index Cond: (a.user_id = subs_feed.user_id)
   ->  Index Scan using user_subscriptions_user_id_pk on user_subscriptions b  (cost=0.00..2.89 rows=1 width=33) (actual time=0.312..0.313 rows=1 loops=609070)
         Index Cond: (a.user_id = b.user_id)
 Total runtime: 1590755.918 ms
(13 rows)

This query runs for almost half an hour. It is evident that the nested loop is taking most of the time (approx 27 minutes).

Any tips would be very useful.

Also, these tables have the row counts below:

select relname, reltuples from pg_class
where relname in ('users','user_subscriptions','subs_feed');

      relname       |  reltuples
--------------------+-------------
 user_subscriptions |     3758304
 users              | 1.95481e+07
 subs_feed          | 2.96492e+07

select n_tup_ins, n_tup_upd, n_tup_del, last_vacuum, last_analyze
from pg_stat_user_tables where relname='user_subscriptions';

 n_tup_ins | n_tup_upd | n_tup_del |           last_vacuum            |           last_analyze
-----------+-----------+-----------+----------------------------------+----------------------------------
  86371397 |  25865942 |         0 | 2009-12-06 23:00:36.355251+05:30 | 2009-12-06 23:00:36.355251+05:30

Thanks in advance for help ...
Re: [SQL] Query Performance
Tomas: Thanks for your reply. I got it running in about 3 minutes, thanks to the hash join, by rearranging the query:

explain analyze select *, subs
from (select user_name, A.user_id, dnd_window_start, dnd_window_stop
      from users A, subs_feed B
      where A.user_id = B.user_id and b.feed_id=1413 and f_sms='t') as foo
left outer join user_subscriptions u on (foo.user_id = u.user_id);

 Hash Left Join  (cost=1472105.16..2775728.42 rows=957624 width=75) (actual time=86322.190..192125.402 rows=1340957 loops=1)
   Hash Cond: (a.user_id = u.user_id)
   ->  Hash Join  (cost=1347862.70..2638877.52 rows=957624 width=26) (actual time=79407.144..181327.848 rows=1340957 loops=1)
         Hash Cond: (b.user_id = a.user_id)
         ->  Bitmap Heap Scan on subs_feed b  (cost=34518.45..1127550.37 rows=957624 width=4) (actual time=8791.019..72889.040 rows=1340957 loops=1)
               Recheck Cond: (feed_id = 1413)
               Filter: f_sms
               ->  Bitmap Index Scan on feed_user_id  (cost=0.00..34279.04 rows=1442930 width=0) (actual time=8492.067..8492.067 rows=1352180 loops=1)
                     Index Cond: (feed_id = 1413)
         ->  Hash  (cost=916273.00..916273.00 rows=19548100 width=26) (actual time=70605.366..70605.366 rows=19548560 loops=1)
               ->  Seq Scan on users a  (cost=0.00..916273.00 rows=19548100 width=26) (actual time=10.405..55676.884 rows=19548560 loops=1)
   ->  Hash  (cost=77415.54..77415.54 rows=3746154 width=49) (actual time=6909.632..6909.632 rows=3758304 loops=1)
         ->  Seq Scan on user_subscriptions u  (cost=0.00..77415.54 rows=3746154 width=49) (actual time=3.624..4265.114 rows=3758304 loops=1)
 Total runtime: 192585.437 ms

From: "t...@fuzzy.cz"
To: Postgre Novice
Cc: pgsql-sql@postgresql.org
Sent: Mon, December 7, 2009 7:23:19 PM
Subject: Re: [SQL] Query Performance

Yes, the problem is the nested loop scan - it's scanning users 609070 times, which is awful.

Could you provide the explain plan that executed fast? Was it executed with the same parameter values, or did the parameters change (maybe it's slow only for some parameter values)?

Have you tried rewriting the subselect as a join? I.e. something like this:

select user_name, A.user_id, dnd_window_start, dnd_window_stop,
       B.subs as subs, B.city_id as city_id, B.source_type as source_type
from users A
left join user_subscriptions B on (A.user_id = B.user_id)
join subs_feed C ON (A.user_id = C.user_id)
where feed_id=1411 and f_sms='t';

But I guess it won't solve the issue (it seems PostgreSQL did this rewrite on its own).

Tomas
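One thing the two plans make visible: the slow plan expected 11,148 rows from the IN-subquery but got 609,070, and that misestimate is what steered the planner into the nested loop. When estimates drift that far, refreshing statistics is a cheap first step. A sketch of one option (not a guaranteed fix):

ANALYZE subs_feed;
-- if the feed_id estimate is still far off, raise the column's statistics
-- target and re-analyze (the 1000 here is an arbitrary example value)
ALTER TABLE subs_feed ALTER COLUMN feed_id SET STATISTICS 1000;
ANALYZE subs_feed;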