王 旭 <[EMAIL PROTECTED]> wrote on 07/18/2005 03:12:28 AM:

> NO effect :-(
> 
> >From: pow <[EMAIL PROTECTED]>
> >To: 王 旭 <[EMAIL PROTECTED]>
> >Subject: Re: optimize a sql statement
> >Date: Mon, 18 Jul 2005 11:51:23 +0800
> >
> >Do u have composite index on order_line.ol_o_id AND order_line.ol_i_id?
> >
> >You could try that...
> >王 旭 wrote:
> >
> > > Now,I make this sql statement to easy.
> > >
> > > Follow is the sql statement:
> > > 
> 
-------------------------------------------------------------------------
> > > SELECT ol_i_id FROM orders,order_line
> > > WHERE order_line.ol_o_id = orders.o_id
> > > GROUP BY ol_i_id
> > > 
> 
-------------------------------------------------------------------------
> > >
> > > Follow is the explain output:
> > > 
> 
-------------------------------------------------------------------------
> > > 1, 'SIMPLE', 'orders', 'index', 'PRIMARY', 'PRIMARY', '4', '', 
259231,
> > > 'Using index; Using temporary; Using filesort'
> > >
> > > 1, 'SIMPLE', 'order_line', 'ref', 'PRIMARY,order_line_ol_o_id',
> > > 'PRIMARY', '4', 'tpcw.orders.o_id', 1, ''
> > > 
> 
-------------------------------------------------------------------------
> > > Can it be optimized?
> > >
> > >> From: 王 旭 <[EMAIL PROTECTED]>
> > >> To: mysql@lists.mysql.com
> > >> Subject: optimize a sql statement
> > >> Date: Sat, 16 Jul 2005 18:24:15 +0800
> > >>
> > >> Follow is my sql statement:
> > >> 
> 
-------------------------------------------------------------------------------------
> 
> > >>
> > >
> > >
> > >>
> > >> SELECT SUM(ol_qty) sumolqty, ol_i_id FROM orders, order_line WHERE
> > >> orders.o_id = order_line.ol_o_id AND orders.o_id > (SELECT
> > >> MAX(o_id)-10000 FROM orders) AND NOT (order_line.ol_i_id = 5000) 
AND
> > >> orders.o_c_id IN (SELECT o_c_id FROM orders, order_line WHERE
> > >> orders.o_id = order_line.ol_o_id and orders.o_id > (SELECT
> > >> MAX(o_id)-10000 FROM orders) AND order_line.ol_i_id = 5000) GROUP 
BY
> > >> ol_i_id ORDER BY sumolqty DESC limit 50
> > >> 
> 
--------------------------------------------------------------------------------------
> 
> > >>
> > >
> > >
> > >>
> > >> follows are explain output:
> > >> 
> 
--------------------------------------------------------------------------------------
> 
> > >>
> > >
> > >
> > >>
> > >> 1, 'PRIMARY', 'orders', 'range', 'PRIMARY', 'PRIMARY', '4', '',
> > >> 19398, 'Using where; Using temporary; Using filesort'
> > >>
> > >> 1, 'PRIMARY', 'order_line', 'ref',
> > >> 'PRIMARY,order_line_ol_i_id,order_line_ol_o_id',
> > >> 'order_line_ol_o_id', '4', 'tpcw.orders.o_id', 1, 'Using where'
> > >>
> > >> 3, 'DEPENDENT SUBQUERY', 'orders', 'ref', 'PRIMARY,orders_o_c_id',
> > >> 'orders_o_c_id', '5', 'func', 1, 'Using where; Using index'
> > >>
> > >> 3, 'DEPENDENT SUBQUERY', 'order_line', 'ref',
> > >> 'PRIMARY,order_line_ol_i_id,order_line_ol_o_id',
> > >> 'order_line_ol_o_id', '4', 'tpcw.orders.o_id', 1, 'Using where'
> > >>
> > >> 4, 'SUBQUERY', '', '', '', '', '', '', , 'Select tables optimized 
> away'
> > >>
> > >> 2, 'SUBQUERY', '', '', '', '', '', '', , 'Select tables optimized 
> away'
> > >> 
> 
------------------------------------------------------------------------------------------
> 
> > >>
> > >
> > >
> > >>
> > >>
> > >> This sql statement performance is too bad.Please help me to 
optimize
> > >> it .
> > >>
> > >> thanks!
> > >>

Your original query, reformatted only:

SELECT SUM(ol_qty) sumolqty
        , ol_i_id 
FROM orders, order_line 
WHERE orders.o_id = order_line.ol_o_id 
        AND orders.o_id > (SELECT MAX(o_id)-10000 FROM orders) 
        AND NOT (order_line.ol_i_id = 5000) 
        AND orders.o_c_id IN (
                SELECT o_c_id 
                FROM orders, order_line 
                WHERE orders.o_id = order_line.ol_o_id 
                        and orders.o_id > (SELECT MAX(o_id)-10000 FROM 
orders) 
                        AND order_line.ol_i_id = 5000
        ) 
GROUP BY ol_i_id 
ORDER BY sumolqty DESC 
limit 50

If I try to describe your query, this is how it reads to me:

You are want to get for each order, how many individual items comprises 
that order. You only want to see the top 50 item counts from within the 
10000 most recent orders ignoring those line items where the ol_i_id = 
5000 but including those where the o_c_id is equal to that of an order 
that has a line item where ol_i_id = 5000.

For starters, you could pre-compute the lowest o_id and use that value 
from a variable instead of getting it during execution (yes it will be a 
scalar and only computed once any way.... just bear with me...)

SELECT @min_o_id := max(o_id)-10000 from orders;

SELECT SUM(ol_qty) sumolqty
        , ol_i_id 
FROM orders, order_line 
WHERE orders.o_id = order_line.ol_o_id 
        AND orders.o_id > @min_o_id
        AND NOT (order_line.ol_i_id = 5000) 
        AND orders.o_c_id IN (
                SELECT o_c_id 
                FROM orders, order_line 
                WHERE orders.o_id = order_line.ol_o_id 
                        and orders.o_id > @min_o_id
                        AND order_line.ol_i_id = 5000
        ) 
GROUP BY ol_i_id 
ORDER BY sumolqty DESC 
limit 50

Just doing that eliminates two subqueries, simplifying your total query 
execution plan. If I rewrite the query to use explicit JOINs and 
eliminated the subqueries, your query turns into:

/* q1: limit queries to the last 10000 orders */
SELECT @min_o_id := max(o_id)-10000 from orders;

/*q2: locate all customers within the last 10000 orders who
have at least 5000 order_lines in a single order */
CREATE TEMPORARY TABLE tmpOCID
SELECT DISTINCT o_c_id 
FROM orders
INNER JOIN order_line 
ON orders.o_id = order_line.ol_o_id 
        and orders.o_id > @min_o_id
        AND order_line.ol_i_id = 5000;


/*q3: for each line item position, add up how many units were listed in 
each position (except position 5000) for all customers who had at least 
5000 line items in an order within the last 10000 orders. Limit the 
results to those line item positions with the top 50 qty values */
SELECT SUM(ol.ol_qty) sumolqty
        , ol_i_id 
FROM orders o
INNER JOIN order_line ol
        ON orders.o_id = order_line.ol_o_id 
        AND orders.o_id > @min_o_id
        AND NOT (order_line.ol_i_id = 5000) 
INNER JOIN tmpOCID t
        ON orders.o_c_id = t.o_c_id
GROUP BY ol_i_id 
ORDER BY sumolqty DESC 
limit 50;

DROP TEMPORARY TABLE tmpOCID;

If you are using a programming language to execute your statements, just 
make sure that you DO NOT close the connection between each statement and 
the whole thing should work as expected.

I think you wanted to know was: the top 50 orders (by ranked by total 
units ordered) within the last 10000 orders that came from customers who 
had orders of at least 5000 line items (within the last 10000 orders). 
That changes "q3:" of the last batch to read:

/* q3a: */
SELECT o_id 
        SUM(ol.ol_qty) sumolqty
FROM orders o
INNER JOIN order_line ol
        ON orders.o_id = order_line.ol_o_id 
        AND orders.o_id > @min_o_id
INNER JOIN tmpOCID t
        ON orders.o_c_id = t.o_c_id
GROUP BY o_id 
ORDER BY sumolqty DESC 
limit 50;


If I did not divine the purpose of your query correctly, just let me know. 
I am more than happy to try again.

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine

Reply via email to