Hi Sebastian, - First of all i think there is an open-parenthesis missing in the query V2. Maybe in the V2 version you cold restrict the results in the INNER query a bit more if you use a restriction clause like "WHERE n_issue = i.id" in that. It will certainly lower the number of rows returned by it to only 1 result.
Regards -- Helio Campos Mello de Andrade On Mon, Nov 10, 2008 at 8:43 AM, Sebastian Ritter < [EMAIL PROTECTED]> wrote: > Hi all, > > I was hoping to receive some advise on a slow running query in our > business' Issue Tracking System. To shed some light on the below mentioned > queries, here is a brief summary of how users interact with the system. The > two main components in the system are a Issues and Followups. An Issue is > created by our technical support staff when some form of action needs to be > taken in order to resolve a problem. FollowUps are entries entered against > an issue depicting the sequence of events taken to resolve the issue. There > are about 15,000 Issues in the system at present and about 95,000 FollowUps > in the system. As we need the system to be very responsive, each query > should ideally run in under 1 second. > > A lot of the reports our technical officers submit to us include a listing > of all actioned issues for a given day along with the last modified followup > of each said issue. With the number of rows in our database increasing at a > high rate, these queries are starting to run too slowly. > > Here is a condensed version of the two tables: > > Issues: > ================================= > id - integer > dt_created - timestamp > dt_modified - timestamp > t_title - varchar > t_description - varchar > > FollowUps: > ================================= > id - integer > dt_created - timestamp > dt_modified - timestamp > t_description - varchar > n_issue - foregin key to issues > > We have identified that the slowness in our queries is trying to return the > lastest followup for each actioned issue that day. Without further ado here > are two variations I have tried within the system (neither of which are > making the cut): > > V1 (correlated subquery - Very bad performance) > > (SELECT > fu.* > FROM > manage_followup fu, > manage_issue i > WHERE > i.id = fu.n_issue > AND > fu.id = (SELECT > id > FROM > manage_followup > WHERE > n_issue = i.id > ORDER BY > dt_modified DESC > LIMIT 1)) AS latestfu, > > V2 (Using Group By, "max" aggregate function and distinct- better > performance, but still bad because of distinct) > > > SELECT DISTINCT ON (fu.n_issue) > fu.id, > fu.dt_created, > fu.dt_modified, > fu.t_description, > fu.n_issue as issue_id > FROM > manage_followup fu, > (SELECT > n_issue, > max(dt_modified) as dt_modified > FROM > manage_followup > GROUP BY > n_issue) as max_modified > WHERE > max_modified.n_issue = fu.n_issue > AND > fu.dt_modified = max_modified.dt_modified) > AS latestfu ON (latestfu.issue_id = i.id), > > We must use distinct here as we sometimes use batch scripts to enter > followups, which will give them all similar, if not equal, modification > dates. We also can't use followup ids as an indicator of the latest followup > because users of the system can retrospectively go back and change older > followups. > > I was hoping some one could provide a solution that does not require a > corrolated subquery or make use of the distinct keyword. Any help would be > much appreciated. > > Kind regards, > Sebastian > > > > > >