Re: [SQL] complex query

2012-10-28 Thread Oliveiros d'Azevedo Cristina

Hi, Scott.

I'd like to kick in this thread to ask you some advice, as you are
experienced in optimizing queries.
I also use extensively joins and unions (less than joins though).

Anyway, my response times are somewhat behind miliseconds, they are situated 
on seconds range, and sometimes they exceed one minute.
I have some giant tables with over 100 000 000 records collected for more 
than 6 years.


Most of my queries are made over recent data, so I'm considering 
partitioning the tables.


But I believe that my problem arises from misplaced indexes...
I have an index on every PRK.
But if the join is not made using the PRKs, perhaps, should I place an index 
also on the joined columns?


The application is not a hard real time one, but if you can do it much 
faster than I do, then I'm positive that I must have been doin something 
wrong.


Could you please let me know about your thoughts on this?

Thanks in advance

Best,
Oliver

- Original Message - 
From: "Scott Marlowe" 

To: "Mark Fenbers" 
Cc: 
Sent: Sunday, October 28, 2012 2:20 AM
Subject: Re: [SQL] complex query


On Sat, Oct 27, 2012 at 7:56 PM, Mark Fenbers  
wrote:

I'd do somethings like:

select * from (
select id, sum(col1), sum(col2) from tablename group by yada
   ) as a [full, left, right, outer] join (
select id, sum(col3), sum(col4) from tablename group by bada
) as b
on (a.id=b.id);

and choose the join type as appropriate.

Thanks!  Your idea worked like a champ!
Mark


The basic rules for mushing together data sets is to join them to put
the pieces of data into the same row (horiztonally extending the set)
and use unions to pile the rows one on top of the other.

One of the best things about PostgreSQL is that it's very efficient at
making these kinds of queries efficient and fast.  I've written 5 or 6
page multi-join multi-union queries that still ran in hundreds of
milliseconds, returning thousands of rows.


--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql





--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


[SQL] pull in most recent record in a view

2012-10-28 Thread Gary Stainburn
I know I've asked a similar question before but I can't find it.

I'm doing a new project for a charity which involves managing a skills / 
requesit matrix. For each skill type / staff id I need to keep a record of 
when the skill was aquired/renewed and when it expires.

Put simply

skills=# select * from staff;
 st_id | st_name 
---+-
 1 | Gary
(1 row)

skills=# select * from skills;
 sk_id | sk_desc | sk_renewal 
---+-+
 1 | Medical | 5 years
(1 row)

skills=# select * from qualifications ;
 st_id | sk_id | qu_qualified | qu_renewal 
---+---+--+
 1 | 1 | 2004-07-01   | 10 years
 1 | 1 | 2009-05-25   | 3 years
(2 rows)

skills=# 


What is the best (cleanest SQL or fastest performance) way to produce the 
following view?

 st_id | st_name | sk_id | sk_desc | last_qualified | Renewal | Expires
---+-+---+-+|-|---
 1 | Gary| 1 | Medical | 2009-05-25 | 3 years | 2012-05-25


I've got the following which gives all but the last two fields. The problem is 
that the Renewal period and expires has to be from the most recent record, 
i.e. even though the record 1 above expires after record 2, the results of 
record 2 have to be used.

select t.*, k.sk_id, k.sk_desc, q.last_qualified from
(select st_id, sk_id, max(qu_qualified) as last_qualified from qualifications 
group by st_id, sk_id) q
join staff t on t.st_id = q.st_id
join skills k on k.sk_id = q.sk_id
order by st_id, sk_id

I am still at the concept stage for this project so I can change the schema if 
required

-- 
Gary Stainburn
Group I.T. Manager
Ringways Garages
http://www.ringways.co.uk 


-- 
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] pull in most recent record in a view

2012-10-28 Thread Gary Stainburn
This is my best effort so far is below. My concern is that it isn't very 
efficient and will slow down as record numbers increase

create view current_qualifications as 
select q.*, (q.qu_qualified+q.qu_renewal)::date as qu_expires from 
qualifications q 
join (select st_id, sk_id, max(qu_qualified) as qu_qualified from 
qualifications group by st_id, sk_id) s
on q.st_id=s.st_id and q.sk_id = s.sk_id and q.qu_qualified = s.qu_qualified;


select t.st_id, t.st_name, k.sk_id, k.sk_desc, q.qu_qualified, q.qu_renewal, 
q.qu_expires 
from current_qualifications q
join staff t on t.st_id = q.st_id
join skills k on k.sk_id = q.sk_id;


-- 
Gary Stainburn
Group I.T. Manager
Ringways Garages
http://www.ringways.co.uk 


-- 
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] pull in most recent record in a view

2012-10-28 Thread David Johnston
On Oct 26, 2012, at 5:24, Gary Stainburn  wrote:

> This is my best effort so far is below. My concern is that it isn't very 
> efficient and will slow down as record numbers increase
> 
> create view current_qualifications as 
> select q.*, (q.qu_qualified+q.qu_renewal)::date as qu_expires from 
> qualifications q 
> join (select st_id, sk_id, max(qu_qualified) as qu_qualified from 
> qualifications group by st_id, sk_id) s
> on q.st_id=s.st_id and q.sk_id = s.sk_id and q.qu_qualified = s.qu_qualified;
> 
> 
> select t.st_id, t.st_name, k.sk_id, k.sk_desc, q.qu_qualified, q.qu_renewal, 
> q.qu_expires 
> from current_qualifications q
> join staff t on t.st_id = q.st_id
> join skills k on k.sk_id = q.sk_id;
> 

The best way to deal with recency problems is to maintain a table that contains 
only the most recent records using insert/update/delete triggers.  A boolean 
flag along with a partial index can work instead of an actual table in some 
cases.  If using a table only the pkid needs to be stored, along with any 
desired metadata.

It probably isn't worth the effort until you actually do encounter performance 
problems.

David J.

-- 
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql