Re: [SQL] complex query
Hi, Scott. I'd like to kick in this thread to ask you some advice, as you are experienced in optimizing queries. I also use extensively joins and unions (less than joins though). Anyway, my response times are somewhat behind miliseconds, they are situated on seconds range, and sometimes they exceed one minute. I have some giant tables with over 100 000 000 records collected for more than 6 years. Most of my queries are made over recent data, so I'm considering partitioning the tables. But I believe that my problem arises from misplaced indexes... I have an index on every PRK. But if the join is not made using the PRKs, perhaps, should I place an index also on the joined columns? The application is not a hard real time one, but if you can do it much faster than I do, then I'm positive that I must have been doin something wrong. Could you please let me know about your thoughts on this? Thanks in advance Best, Oliver - Original Message - From: "Scott Marlowe" To: "Mark Fenbers" Cc: Sent: Sunday, October 28, 2012 2:20 AM Subject: Re: [SQL] complex query On Sat, Oct 27, 2012 at 7:56 PM, Mark Fenbers wrote: I'd do somethings like: select * from ( select id, sum(col1), sum(col2) from tablename group by yada ) as a [full, left, right, outer] join ( select id, sum(col3), sum(col4) from tablename group by bada ) as b on (a.id=b.id); and choose the join type as appropriate. Thanks! Your idea worked like a champ! Mark The basic rules for mushing together data sets is to join them to put the pieces of data into the same row (horiztonally extending the set) and use unions to pile the rows one on top of the other. One of the best things about PostgreSQL is that it's very efficient at making these kinds of queries efficient and fast. I've written 5 or 6 page multi-join multi-union queries that still ran in hundreds of milliseconds, returning thousands of rows. -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
[SQL] pull in most recent record in a view
I know I've asked a similar question before but I can't find it. I'm doing a new project for a charity which involves managing a skills / requesit matrix. For each skill type / staff id I need to keep a record of when the skill was aquired/renewed and when it expires. Put simply skills=# select * from staff; st_id | st_name ---+- 1 | Gary (1 row) skills=# select * from skills; sk_id | sk_desc | sk_renewal ---+-+ 1 | Medical | 5 years (1 row) skills=# select * from qualifications ; st_id | sk_id | qu_qualified | qu_renewal ---+---+--+ 1 | 1 | 2004-07-01 | 10 years 1 | 1 | 2009-05-25 | 3 years (2 rows) skills=# What is the best (cleanest SQL or fastest performance) way to produce the following view? st_id | st_name | sk_id | sk_desc | last_qualified | Renewal | Expires ---+-+---+-+|-|--- 1 | Gary| 1 | Medical | 2009-05-25 | 3 years | 2012-05-25 I've got the following which gives all but the last two fields. The problem is that the Renewal period and expires has to be from the most recent record, i.e. even though the record 1 above expires after record 2, the results of record 2 have to be used. select t.*, k.sk_id, k.sk_desc, q.last_qualified from (select st_id, sk_id, max(qu_qualified) as last_qualified from qualifications group by st_id, sk_id) q join staff t on t.st_id = q.st_id join skills k on k.sk_id = q.sk_id order by st_id, sk_id I am still at the concept stage for this project so I can change the schema if required -- Gary Stainburn Group I.T. Manager Ringways Garages http://www.ringways.co.uk -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] pull in most recent record in a view
This is my best effort so far is below. My concern is that it isn't very efficient and will slow down as record numbers increase create view current_qualifications as select q.*, (q.qu_qualified+q.qu_renewal)::date as qu_expires from qualifications q join (select st_id, sk_id, max(qu_qualified) as qu_qualified from qualifications group by st_id, sk_id) s on q.st_id=s.st_id and q.sk_id = s.sk_id and q.qu_qualified = s.qu_qualified; select t.st_id, t.st_name, k.sk_id, k.sk_desc, q.qu_qualified, q.qu_renewal, q.qu_expires from current_qualifications q join staff t on t.st_id = q.st_id join skills k on k.sk_id = q.sk_id; -- Gary Stainburn Group I.T. Manager Ringways Garages http://www.ringways.co.uk -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] pull in most recent record in a view
On Oct 26, 2012, at 5:24, Gary Stainburn wrote: > This is my best effort so far is below. My concern is that it isn't very > efficient and will slow down as record numbers increase > > create view current_qualifications as > select q.*, (q.qu_qualified+q.qu_renewal)::date as qu_expires from > qualifications q > join (select st_id, sk_id, max(qu_qualified) as qu_qualified from > qualifications group by st_id, sk_id) s > on q.st_id=s.st_id and q.sk_id = s.sk_id and q.qu_qualified = s.qu_qualified; > > > select t.st_id, t.st_name, k.sk_id, k.sk_desc, q.qu_qualified, q.qu_renewal, > q.qu_expires > from current_qualifications q > join staff t on t.st_id = q.st_id > join skills k on k.sk_id = q.sk_id; > The best way to deal with recency problems is to maintain a table that contains only the most recent records using insert/update/delete triggers. A boolean flag along with a partial index can work instead of an actual table in some cases. If using a table only the pkid needs to be stored, along with any desired metadata. It probably isn't worth the effort until you actually do encounter performance problems. David J. -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql