Hi Helio, Sorry about the parenthesis - Bad copy/pasting skills! To further discuss your suggestion: Wouldn't adding n_issue=i.id as a where clause filter cause the sub-query to become correlated and thus much less efficient ? I may be wrong, or may have miss-understood your suggestion.
Thanks for you help, Sebastian On Mon, Nov 10, 2008 at 11:48 AM, Helio Campos Mello de Andrade < [EMAIL PROTECTED]> wrote: > Hi Sebastian, > > - First of all i think there is an open-parenthesis missing in the query > V2. > Maybe in the V2 version you cold restrict the results in the INNER query a > bit more if you use a restriction clause like "WHERE n_issue = i.id" in > that. It will certainly lower the number of rows returned by it to only 1 > result. > > Regards > > -- > Helio Campos Mello de Andrade > > > > On Mon, Nov 10, 2008 at 8:43 AM, Sebastian Ritter < > [EMAIL PROTECTED]> wrote: > >> Hi all, >> >> I was hoping to receive some advise on a slow running query in our >> business' Issue Tracking System. To shed some light on the below mentioned >> queries, here is a brief summary of how users interact with the system. The >> two main components in the system are a Issues and Followups. An Issue is >> created by our technical support staff when some form of action needs to be >> taken in order to resolve a problem. FollowUps are entries entered against >> an issue depicting the sequence of events taken to resolve the issue. There >> are about 15,000 Issues in the system at present and about 95,000 FollowUps >> in the system. As we need the system to be very responsive, each query >> should ideally run in under 1 second. >> >> A lot of the reports our technical officers submit to us include a listing >> of all actioned issues for a given day along with the last modified followup >> of each said issue. With the number of rows in our database increasing at a >> high rate, these queries are starting to run too slowly. >> >> Here is a condensed version of the two tables: >> >> Issues: >> ================================= >> id - integer >> dt_created - timestamp >> dt_modified - timestamp >> t_title - varchar >> t_description - varchar >> >> FollowUps: >> ================================= >> id - integer >> dt_created - timestamp >> dt_modified - timestamp >> t_description - varchar >> n_issue - foregin key to issues >> >> We have identified that the slowness in our queries is trying to return >> the lastest followup for each actioned issue that day. Without further ado >> here are two variations I have tried within the system (neither of which are >> making the cut): >> >> V1 (correlated subquery - Very bad performance) >> >> (SELECT >> fu.* >> FROM >> manage_followup fu, >> manage_issue i >> WHERE >> i.id = fu.n_issue >> AND >> fu.id = (SELECT >> id >> FROM >> manage_followup >> WHERE >> n_issue = i.id >> ORDER BY >> dt_modified DESC >> LIMIT 1)) AS latestfu, >> >> V2 (Using Group By, "max" aggregate function and distinct- better >> performance, but still bad because of distinct) >> >> >> SELECT DISTINCT ON (fu.n_issue) >> fu.id, >> fu.dt_created, >> fu.dt_modified, >> fu.t_description, >> fu.n_issue as issue_id >> FROM >> manage_followup fu, >> (SELECT >> n_issue, >> max(dt_modified) as dt_modified >> FROM >> manage_followup >> GROUP BY >> n_issue) as max_modified >> WHERE >> max_modified.n_issue = fu.n_issue >> AND >> fu.dt_modified = max_modified.dt_modified) >> AS latestfu ON (latestfu.issue_id = i.id), >> >> We must use distinct here as we sometimes use batch scripts to enter >> followups, which will give them all similar, if not equal, modification >> dates. We also can't use followup ids as an indicator of the latest followup >> because users of the system can retrospectively go back and change older >> followups. >> >> I was hoping some one could provide a solution that does not require a >> corrolated subquery or make use of the distinct keyword. Any help would be >> much appreciated. >> >> Kind regards, >> Sebastian >> >> >> >> >> >>