Jacob,

> Hi. I'm trying to optimize a query that joins a master table with additional
> information from 4 other tables, and sorting on a date field.
> 
> Here is my query
> 
> SELECT
>   T.TransNo, T.TransDate, T.VoucherNo,
>   T.PayerNo, P.PayerNumber,
>   T.PaymentNo, Pm.PaymentName,
>   T.SupplementText,
>   T.PayTypeNo, Pt.PayTypeName,
>   T.DebitAmount, T.CreditAmount,
>   T.StudentNo, S.StudentNumber
> FROM Trans T
> LEFT OUTER JOIN Student S ON T.StudentNo = S.StudentNo JOIN Payer P ON
> T.PayerNo = P.PayerNo JOIN Payment Pm ON T.PaymentNo =
> Pm.PaymentNo JOIN PayType Pt ON T.PayTypeNo = Pt.PayTypeNo ORDER
> BY TransDate
> 
> I've got a descending index on TRANSDATE (DDL below) but not an ascending
> so I was wondering why doing
> 
> ORDER BY TransDate ASC
> 
> Wasn't faster than doing
> 
> ORDER BY TransDate DESC
> 
> So I looked at the automatically generated plan (using FlameRobin), realizing
> that the optimizer didn't use the index on TRANSDATE at all:
> 
> PLAN SORT (JOIN (JOIN (T NATURAL, S INDEX (STUDENT_PK)), P INDEX
> (PAYER_PK), PM INDEX (PAYMENT_PK), PT INDEX (PAYTYPE_PK)))

Your query doesn't have any where clause, to narrow the scope of the rows to be 
returned.  Further, you are sort a large table, as such the engine has 
determined that walking the master table (performing NATURAL scan) then sorting 
the results is a faster access method than using any TransDate index.  

In the majority of cases this is the better approach, since it minimizes the 
number of random disk IO operations which result.  Random IOs slows down query 
performance.

Also, are you testing using tables which have a realistic/live-like number of 
rows?  Testing on small tables do not always show real-world results.


> Now trying to change the all joins into left outer joins
> 
> SELECT
>   T.TransNo, T.TransDate, T.VoucherNo,
>   T.PayerNo, P.PayerNumber,
>   T.PaymentNo, Pm.PaymentName,
>   T.SupplementText,
>   T.PayTypeNo, Pt.PayTypeName,
>   T.DebitAmount, T.CreditAmount,
>   T.StudentNo, S.StudentNumber
> FROM Trans T
> LEFT OUTER JOIN Student S ON T.StudentNo = S.StudentNo LEFT OUTER JOIN
> Payer P ON T.PayerNo = P.PayerNo LEFT OUTER JOIN Payment Pm ON
> T.PaymentNo = Pm.PaymentNo LEFT OUTER JOIN PayType Pt ON
> T.PayTypeNo = Pt.PayTypeNo ORDER BY TransDate
> 
> Gives me the plan I'm after:
> 
> PLAN JOIN (JOIN (JOIN (JOIN (T ORDER TRANS_DATE_IX, S INDEX
> (STUDENT_PK)), P INDEX (PAYER_PK)), PM INDEX (PAYMENT_PK)), PT INDEX
> (PAYTYPE_PK))
> 
> And the query runs 10 times faster! Since the relations I changed are secured
> by referential integrity, there should be no difference (?) so I'm a 
> surprised,
> that the optimizer couldn't figure that out.

You need to be careful when trying to benchmark/compare query results.  There 
are several caches (FB Engine and OS) which can significantly skew results and 
make a query look faster than it is in reality -- the state of the caches when 
in production usage will not be the same as during your testing.

Personally, I always use the Classic engine when I am benchmarking (and 
disconnect/reconnect in between query runs) to get a truer picture of query 
performance -- the default engine cache size is much smaller than other 
engines, so it minimizes the 'benefit' of the engine cache, thus providing more 
of a 'worse case' usage scenario.


Sean

Reply via email to