Thank you SGreen.But i can'optimize the sql statement like your way because this query must describe in one sql statement.So i think i maybe optimize this sql statement through creating high efficiency index or describe this sql statement in other method. Rewrite this sql statement must in one sql statement.

Follow is the original sql statement:
-----------------------------------------------------
SELECT ol_i_id
FROM orders, order_line
WHERE orders.o_id = order_line.ol_o_id and o_id>0 AND not(order_line.ol_i_id = 5000)
 AND orders.o_c_id
     IN (SELECT o_c_id FROM orders, order_line
         WHERE orders.o_id = order_line.ol_o_id
AND orders.o_id > (SELECT MAX(o_id)-10000 FROM orders) AND order_line.ol_i_id = 5000)
GROUP BY ol_i_id
ORDER BY SUM(ol_qty)DESC limit 5
-------------------------------------------------------------
Follow is the original explain output:

--------------------------------------------------------------
1, 'PRIMARY', 'orders', 'range', 'PRIMARY', 'PRIMARY', '4', '', 519414, 'Using where; Using temporary; Using filesort'

1, 'PRIMARY', 'order_line', 'ref', 'PRIMARY,order_line_ol_i_id,order_line_ol_o_id', 'order_line_ol_o_id', '4', 'tpcw.orders.o_id', 1, 'Using where'

2, 'DEPENDENT SUBQUERY', 'orders', 'ref', 'PRIMARY,orders_o_c_id', 'orders_o_c_id', '5', 'func', 1, 'Using where; Using index'

2, 'DEPENDENT SUBQUERY', 'order_line', 'ref', 'PRIMARY,order_line_ol_i_id,order_line_ol_o_id', 'order_line_ol_o_id', '4', 'tpcw.orders.o_id', 1, 'Using where'

3, 'SUBQUERY', '', '', '', '', '', '', , 'Select tables optimized away'
-------------------------------------------------------------------------------


In the mail,Pow give me a way.He advise me to build a index on orders.o_id and orders.o_c_id.I understanding his mean is table orders join table order_line using the index.Because orders.o_c_id be limited in the where statemetn,so the index should be filter date before Join operation.In this way,the index of orders using much less data to join the table order_line.But i can't see the effect when i do like it.I don't know the reason.

  Are you have any better way?


From: [EMAIL PROTECTED]
To: 王 旭 <[EMAIL PROTECTED]>
CC: mysql@lists.mysql.com,[EMAIL PROTECTED]
Subject: Re: optimize a sql statement
Date: Tue, 19 Jul 2005 16:10:18 -0400

王 旭 <[EMAIL PROTECTED]> wrote on 07/18/2005 03:12:28 AM:

> NO effect :-(
>
> >From: pow <[EMAIL PROTECTED]>
> >To: 王 旭 <[EMAIL PROTECTED]>
> >Subject: Re: optimize a sql statement
> >Date: Mon, 18 Jul 2005 11:51:23 +0800
> >
> >Do u have composite index on order_line.ol_o_id AND
order_line.ol_i_id?
> >
> >You could try that...
> >王 旭 wrote:
> >
> > > Now,I make this sql statement to easy.
> > >
> > > Follow is the sql statement:
> > >
>
-------------------------------------------------------------------------
> > > SELECT ol_i_id FROM orders,order_line
> > > WHERE order_line.ol_o_id = orders.o_id
> > > GROUP BY ol_i_id
> > >
>
-------------------------------------------------------------------------
> > >
> > > Follow is the explain output:
> > >
>
-------------------------------------------------------------------------
> > > 1, 'SIMPLE', 'orders', 'index', 'PRIMARY', 'PRIMARY', '4', '',
259231,
> > > 'Using index; Using temporary; Using filesort'
> > >
> > > 1, 'SIMPLE', 'order_line', 'ref', 'PRIMARY,order_line_ol_o_id',
> > > 'PRIMARY', '4', 'tpcw.orders.o_id', 1, ''
> > >
>
-------------------------------------------------------------------------
> > > Can it be optimized?
> > >
> > >> From: 王 旭 <[EMAIL PROTECTED]>
> > >> To: mysql@lists.mysql.com
> > >> Subject: optimize a sql statement
> > >> Date: Sat, 16 Jul 2005 18:24:15 +0800
> > >>
> > >> Follow is my sql statement:
> > >>
>
-------------------------------------------------------------------------------------

>
> > >>
> > >
> > >
> > >>
> > >> SELECT SUM(ol_qty) sumolqty, ol_i_id FROM orders, order_line WHERE
> > >> orders.o_id = order_line.ol_o_id AND orders.o_id > (SELECT
> > >> MAX(o_id)-10000 FROM orders) AND NOT (order_line.ol_i_id = 5000)
AND
> > >> orders.o_c_id IN (SELECT o_c_id FROM orders, order_line WHERE
> > >> orders.o_id = order_line.ol_o_id and orders.o_id > (SELECT
> > >> MAX(o_id)-10000 FROM orders) AND order_line.ol_i_id = 5000) GROUP
BY
> > >> ol_i_id ORDER BY sumolqty DESC limit 50
> > >>
>
--------------------------------------------------------------------------------------

>
> > >>
> > >
> > >
> > >>
> > >> follows are explain output:
> > >>
>
--------------------------------------------------------------------------------------

>
> > >>
> > >
> > >
> > >>
> > >> 1, 'PRIMARY', 'orders', 'range', 'PRIMARY', 'PRIMARY', '4', '',
> > >> 19398, 'Using where; Using temporary; Using filesort'
> > >>
> > >> 1, 'PRIMARY', 'order_line', 'ref',
> > >> 'PRIMARY,order_line_ol_i_id,order_line_ol_o_id',
> > >> 'order_line_ol_o_id', '4', 'tpcw.orders.o_id', 1, 'Using where'
> > >>
> > >> 3, 'DEPENDENT SUBQUERY', 'orders', 'ref', 'PRIMARY,orders_o_c_id',
> > >> 'orders_o_c_id', '5', 'func', 1, 'Using where; Using index'
> > >>
> > >> 3, 'DEPENDENT SUBQUERY', 'order_line', 'ref',
> > >> 'PRIMARY,order_line_ol_i_id,order_line_ol_o_id',
> > >> 'order_line_ol_o_id', '4', 'tpcw.orders.o_id', 1, 'Using where'
> > >>
> > >> 4, 'SUBQUERY', '', '', '', '', '', '', , 'Select tables optimized
> away'
> > >>
> > >> 2, 'SUBQUERY', '', '', '', '', '', '', , 'Select tables optimized
> away'
> > >>
>
------------------------------------------------------------------------------------------

>
> > >>
> > >
> > >
> > >>
> > >>
> > >> This sql statement performance is too bad.Please help me to
optimize
> > >> it .
> > >>
> > >> thanks!
> > >>

Your original query, reformatted only:

SELECT SUM(ol_qty) sumolqty
        , ol_i_id
FROM orders, order_line
WHERE orders.o_id = order_line.ol_o_id
        AND orders.o_id > (SELECT MAX(o_id)-10000 FROM orders)
        AND NOT (order_line.ol_i_id = 5000)
        AND orders.o_c_id IN (
                SELECT o_c_id
                FROM orders, order_line
                WHERE orders.o_id = order_line.ol_o_id
                        and orders.o_id > (SELECT MAX(o_id)-10000 FROM
orders)
                        AND order_line.ol_i_id = 5000
        )
GROUP BY ol_i_id
ORDER BY sumolqty DESC
limit 50

If I try to describe your query, this is how it reads to me:

You are want to get for each order, how many individual items comprises
that order. You only want to see the top 50 item counts from within the
10000 most recent orders ignoring those line items where the ol_i_id =
5000 but including those where the o_c_id is equal to that of an order
that has a line item where ol_i_id = 5000.

For starters, you could pre-compute the lowest o_id and use that value
from a variable instead of getting it during execution (yes it will be a
scalar and only computed once any way.... just bear with me...)

SELECT @min_o_id := max(o_id)-10000 from orders;

SELECT SUM(ol_qty) sumolqty
        , ol_i_id
FROM orders, order_line
WHERE orders.o_id = order_line.ol_o_id
        AND orders.o_id > @min_o_id
        AND NOT (order_line.ol_i_id = 5000)
        AND orders.o_c_id IN (
                SELECT o_c_id
                FROM orders, order_line
                WHERE orders.o_id = order_line.ol_o_id
                        and orders.o_id > @min_o_id
                        AND order_line.ol_i_id = 5000
        )
GROUP BY ol_i_id
ORDER BY sumolqty DESC
limit 50

Just doing that eliminates two subqueries, simplifying your total query
execution plan. If I rewrite the query to use explicit JOINs and
eliminated the subqueries, your query turns into:

/* q1: limit queries to the last 10000 orders */
SELECT @min_o_id := max(o_id)-10000 from orders;

/*q2: locate all customers within the last 10000 orders who
have at least 5000 order_lines in a single order */
CREATE TEMPORARY TABLE tmpOCID
SELECT DISTINCT o_c_id
FROM orders
INNER JOIN order_line
ON orders.o_id = order_line.ol_o_id
        and orders.o_id > @min_o_id
        AND order_line.ol_i_id = 5000;


/*q3: for each line item position, add up how many units were listed in
each position (except position 5000) for all customers who had at least
5000 line items in an order within the last 10000 orders. Limit the
results to those line item positions with the top 50 qty values */
SELECT SUM(ol.ol_qty) sumolqty
        , ol_i_id
FROM orders o
INNER JOIN order_line ol
        ON orders.o_id = order_line.ol_o_id
        AND orders.o_id > @min_o_id
        AND NOT (order_line.ol_i_id = 5000)
INNER JOIN tmpOCID t
        ON orders.o_c_id = t.o_c_id
GROUP BY ol_i_id
ORDER BY sumolqty DESC
limit 50;

DROP TEMPORARY TABLE tmpOCID;

If you are using a programming language to execute your statements, just
make sure that you DO NOT close the connection between each statement and
the whole thing should work as expected.

I think you wanted to know was: the top 50 orders (by ranked by total
units ordered) within the last 10000 orders that came from customers who
had orders of at least 5000 line items (within the last 10000 orders).
That changes "q3:" of the last batch to read:

/* q3a: */
SELECT o_id
        SUM(ol.ol_qty) sumolqty
FROM orders o
INNER JOIN order_line ol
        ON orders.o_id = order_line.ol_o_id
        AND orders.o_id > @min_o_id
INNER JOIN tmpOCID t
        ON orders.o_c_id = t.o_c_id
GROUP BY o_id
ORDER BY sumolqty DESC
limit 50;


If I did not divine the purpose of your query correctly, just let me know.
I am more than happy to try again.

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine


_________________________________________________________________
与联机的朋友进行交流,请使用 MSN Messenger: http://messenger.msn.com/cn

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to