Re: [SQL] Query optimizing

Sebastian Ritter Mon, 10 Nov 2008 03:58:35 -0800

Hi Helio,

Sorry about the parenthesis - Bad copy/pasting skills! To further discuss
your suggestion: Wouldn't adding n_issue=i.id as a where clause filter cause
the sub-query to become correlated and thus much less efficient ? I may be
wrong, or may have miss-understood your suggestion.


Thanks for you help,
Sebastian


On Mon, Nov 10, 2008 at 11:48 AM, Helio Campos Mello de Andrade <
[EMAIL PROTECTED]> wrote:

> Hi Sebastian,
>
>  - First of all i think there is an open-parenthesis missing in the query
> V2.
> Maybe in the V2 version you cold restrict the results in the INNER query a
> bit more if you use a restriction clause like "WHERE n_issue = i.id" in
> that. It will certainly lower the number of rows returned by it to only 1
> result.
>
> Regards
>
> --
> Helio Campos Mello de Andrade
>
>
>
> On Mon, Nov 10, 2008 at 8:43 AM, Sebastian Ritter <
> [EMAIL PROTECTED]> wrote:
>
>> Hi all,
>>
>> I was hoping to receive some advise on a slow running query in our
>> business' Issue Tracking System. To shed some light on the below mentioned
>> queries, here is a brief summary of how users interact with the system. The
>> two main components in the system are a Issues and Followups. An Issue is
>> created by our technical support staff when some form of action needs to be
>> taken in order to resolve a problem. FollowUps are entries entered against
>> an issue depicting the sequence of events taken to resolve the issue. There
>> are about 15,000 Issues in the system at present and about 95,000 FollowUps
>> in the system. As we need the system to be very responsive, each query
>> should ideally run in under 1 second.
>>
>> A lot of the reports our technical officers submit to us include a listing
>> of all actioned issues for a given day along with the last modified followup
>> of each said issue. With the number of rows in our database increasing at a
>> high rate, these queries are starting to run too slowly.
>>
>> Here is a condensed version of the two tables:
>>
>> Issues:
>> =================================
>> id  - integer
>> dt_created - timestamp
>> dt_modified - timestamp
>> t_title - varchar
>> t_description - varchar
>>
>> FollowUps:
>> =================================
>> id  - integer
>> dt_created - timestamp
>> dt_modified - timestamp
>> t_description - varchar
>> n_issue - foregin key to issues
>>
>> We have identified that the slowness in our queries is trying to return
>> the lastest followup for each actioned issue that day. Without further ado
>> here are two variations I have tried within the system (neither of which are
>> making the cut):
>>
>> V1 (correlated subquery - Very bad performance)
>>
>>  (SELECT
>>          fu.*
>>   FROM
>>         manage_followup fu,
>>         manage_issue i
>>   WHERE
>>          i.id = fu.n_issue
>>          AND
>>          fu.id = (SELECT
>>                         id
>>                    FROM
>>                         manage_followup
>>                     WHERE
>>                          n_issue = i.id
>>                     ORDER BY
>>                          dt_modified DESC
>>                     LIMIT 1)) AS latestfu,
>>
>> V2 (Using Group By, "max" aggregate function  and distinct- better
>> performance, but still bad because of distinct)
>>
>>
>> SELECT DISTINCT ON (fu.n_issue)
>>                                 fu.id,
>>                                 fu.dt_created,
>>                                 fu.dt_modified,
>>                                 fu.t_description,
>>                                 fu.n_issue as issue_id
>>                             FROM
>>                                 manage_followup fu,
>>                                 (SELECT
>>                                     n_issue,
>>                                     max(dt_modified) as dt_modified
>>                                  FROM
>>                                     manage_followup
>>                                  GROUP BY
>>                                     n_issue) as max_modified
>>                             WHERE
>>                                 max_modified.n_issue = fu.n_issue
>>                                 AND
>>                                 fu.dt_modified = max_modified.dt_modified)
>> AS latestfu ON (latestfu.issue_id = i.id),
>>
>> We must use distinct here as we sometimes use batch scripts to enter
>> followups, which will give them all similar, if not equal, modification
>> dates. We also can't use followup ids as an indicator of the latest followup
>> because users of the system can retrospectively go back and change older
>> followups.
>>
>> I was hoping some one could provide a solution that does not require a
>> corrolated subquery or make use of the distinct keyword. Any help would be
>> much appreciated.
>>
>> Kind regards,
>> Sebastian
>>
>>
>>
>>
>>
>>

Re: [SQL] Query optimizing

Reply via email to