I analyzed the query plan again.
---------------------------------------------------------------
1, 'PRIMARY', 'orders', 'range', 'PRIMARY', 'PRIMARY', '4', '', 348660,
'Using where; Using temporary; Using filesort'
1, 'PRIMARY', 'order_line', 'ref',
'PRIMARY,order_line_ol_i_id,order_line_ol_o_id,order_line_test',
'order_line_ol_o_id', '4', 'tpcw.orders.o_id', 1, 'Using where'
2, 'DEPENDENT SUBQUERY', 'orders', 'ref',
'PRIMARY,orders_o_c_id','orders_o_c_id', '5', 'func', 1, 'Using where;
Using index'
2, 'DEPENDENT SUBQUERY', 'order_line', 'ref',
'PRIMARY,order_line_ol_i_id,order_line_ol_o_id,order_line_test',
'order_line_ol_o_id', '4', 'tpcw.orders.o_id', 1, 'Using where'
3, 'SUBQUERY', '', '', '', '', '', '', , 'Select tables optimized away'
---------------------------------------------------------------------
In the plan,i find there are 348660 row scan in table orders.And in the
sql statement,orders.o_id is be used to join operation and orders.o_c_id be
used in the where statement.So i create a index "orders_test" on table
orders(o_id and o_c_id).
After create the index.I execute the explain again.Follow is the
result:
-----------------------------------------------------------------------
1, 'PRIMARY', 'orders', 'range', 'PRIMARY,orders_test', 'PRIMARY', '4', '',
348660, 'Using where; Using temporary; Using filesort'
1, 'PRIMARY', 'order_line', 'ref',
'PRIMARY,order_line_ol_i_id,order_line_ol_o_id,order_line_test',
'order_line_ol_o_id', '4', 'tpcw.orders.o_id', 1, 'Using where'
2, 'DEPENDENT SUBQUERY', 'orders', 'ref',
'PRIMARY,orders_o_c_id,orders_test', 'orders_o_c_id', '5', 'func', 1,
'Using where; Using index'
2, 'DEPENDENT SUBQUERY', 'order_line', 'ref',
'PRIMARY,order_line_ol_i_id,order_line_ol_o_id,order_line_test',
'order_line_ol_o_id', '4', 'tpcw.orders.o_id', 1, 'Using where'
3, 'SUBQUERY', '', '', '', '', '', '', , 'Select tables optimized away'
----------------------------------------------------------------------------
I find the index "order_test" be set in the possible_keys but not set in
the keys.
So i alter the sql statement with "use index" statement.
Follow is the explain output after alter:
----------------------------------------------------------------------------
1, 'PRIMARY', 'orders', 'range', 'orders_test', 'orders_test', '4', '',
519210, 'Using where; Using index; Using temporary; Using filesort'
1, 'PRIMARY', 'order_line', 'ref',
'PRIMARY,order_line_ol_i_id,order_line_ol_o_id,order_line_test',
'order_line_ol_o_id', '4', 'tpcw.orders.o_id', 1, 'Using where'
2, 'DEPENDENT SUBQUERY', 'orders', 'ref',
'PRIMARY,orders_o_c_id,orders_test', 'orders_o_c_id', '5', 'func', 1,
'Using where; Using index'
2, 'DEPENDENT SUBQUERY', 'order_line', 'ref',
'PRIMARY,order_line_ol_i_id,order_line_ol_o_id,order_line_test',
'order_line_ol_o_id', '4', 'tpcw.orders.o_id', 1, 'Using where'
3, 'SUBQUERY', '', '', '', '', '', '', , 'Select tables optimized away'
---------------------------------------------------------------------------------
Performance maybe much more bad.
Now i have two question.
One,Are there any probability for optimization performance using this way?
Two,I am not able to calculate the cost time with the explain output.I have
learned the knowlege about "7.2.2. Estimating Query Performance" in
help.But i can't understand.
thanks
From: 王 旭 <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
CC: mysql@lists.mysql.com, [EMAIL PROTECTED]
Subject: Re: optimize a sql statement
Date: Thu, 21 Jul 2005 18:46:32 +0800
Thank you SGreen.But i can'optimize the sql statement like your way
because this query must describe in one sql statement.So i think i
maybe optimize this sql statement through creating high efficiency
index or describe this sql statement in other method. Rewrite this
sql statement must in one sql statement.
Follow is the original sql statement:
-----------------------------------------------------
SELECT ol_i_id
FROM orders, order_line
WHERE orders.o_id = order_line.ol_o_id and o_id>0 AND
not(order_line.ol_i_id = 5000)
AND orders.o_c_id
IN (SELECT o_c_id FROM orders, order_line
WHERE orders.o_id = order_line.ol_o_id
AND orders.o_id > (SELECT MAX(o_id)-10000 FROM orders)
AND order_line.ol_i_id = 5000)
GROUP BY ol_i_id
ORDER BY SUM(ol_qty)DESC limit 5
-------------------------------------------------------------
Follow is the original explain output:
--------------------------------------------------------------
1, 'PRIMARY', 'orders', 'range', 'PRIMARY', 'PRIMARY', '4', '',
519414, 'Using where; Using temporary; Using filesort'
1, 'PRIMARY', 'order_line', 'ref',
'PRIMARY,order_line_ol_i_id,order_line_ol_o_id',
'order_line_ol_o_id', '4', 'tpcw.orders.o_id', 1, 'Using where'
2, 'DEPENDENT SUBQUERY', 'orders', 'ref', 'PRIMARY,orders_o_c_id',
'orders_o_c_id', '5', 'func', 1, 'Using where; Using index'
2, 'DEPENDENT SUBQUERY', 'order_line', 'ref',
'PRIMARY,order_line_ol_i_id,order_line_ol_o_id',
'order_line_ol_o_id', '4', 'tpcw.orders.o_id', 1, 'Using where'
3, 'SUBQUERY', '', '', '', '', '', '', , 'Select tables optimized
away'
-------------------------------------------------------------------------------
In the mail,Pow give me a way.He advise me to build a index on
orders.o_id and orders.o_c_id.I understanding his mean is table
orders join table order_line using the index.Because orders.o_c_id
be limited in the where statemetn,so the index should be filter date
before Join operation.In this way,the index of orders using much
less data to join the table order_line.But i can't see the effect
when i do like it.I don't know the reason.
Are you have any better way?
From: [EMAIL PROTECTED]
To: 王 旭 <[EMAIL PROTECTED]>
CC: mysql@lists.mysql.com,[EMAIL PROTECTED]
Subject: Re: optimize a sql statement
Date: Tue, 19 Jul 2005 16:10:18 -0400
王 旭 <[EMAIL PROTECTED]> wrote on 07/18/2005 03:12:28 AM:
> NO effect :-(
>
> >From: pow <[EMAIL PROTECTED]>
> >To: 王 旭 <[EMAIL PROTECTED]>
> >Subject: Re: optimize a sql statement
> >Date: Mon, 18 Jul 2005 11:51:23 +0800
> >
> >Do u have composite index on order_line.ol_o_id AND
order_line.ol_i_id?
> >
> >You could try that...
> >王 旭 wrote:
> >
> > > Now,I make this sql statement to easy.
> > >
> > > Follow is the sql statement:
> > >
>
-------------------------------------------------------------------------
> > > SELECT ol_i_id FROM orders,order_line
> > > WHERE order_line.ol_o_id = orders.o_id
> > > GROUP BY ol_i_id
> > >
>
-------------------------------------------------------------------------
> > >
> > > Follow is the explain output:
> > >
>
-------------------------------------------------------------------------
> > > 1, 'SIMPLE', 'orders', 'index', 'PRIMARY', 'PRIMARY', '4',
'',
259231,
> > > 'Using index; Using temporary; Using filesort'
> > >
> > > 1, 'SIMPLE', 'order_line', 'ref',
'PRIMARY,order_line_ol_o_id',
> > > 'PRIMARY', '4', 'tpcw.orders.o_id', 1, ''
> > >
>
-------------------------------------------------------------------------
> > > Can it be optimized?
> > >
> > >> From: 王 旭 <[EMAIL PROTECTED]>
> > >> To: mysql@lists.mysql.com
> > >> Subject: optimize a sql statement
> > >> Date: Sat, 16 Jul 2005 18:24:15 +0800
> > >>
> > >> Follow is my sql statement:
> > >>
>
-------------------------------------------------------------------------------------
>
> > >>
> > >
> > >
> > >>
> > >> SELECT SUM(ol_qty) sumolqty, ol_i_id FROM orders,
order_line WHERE
> > >> orders.o_id = order_line.ol_o_id AND orders.o_id > (SELECT
> > >> MAX(o_id)-10000 FROM orders) AND NOT (order_line.ol_i_id =
5000)
AND
> > >> orders.o_c_id IN (SELECT o_c_id FROM orders, order_line
WHERE
> > >> orders.o_id = order_line.ol_o_id and orders.o_id > (SELECT
> > >> MAX(o_id)-10000 FROM orders) AND order_line.ol_i_id = 5000)
GROUP
BY
> > >> ol_i_id ORDER BY sumolqty DESC limit 50
> > >>
>
--------------------------------------------------------------------------------------
>
> > >>
> > >
> > >
> > >>
> > >> follows are explain output:
> > >>
>
--------------------------------------------------------------------------------------
>
> > >>
> > >
> > >
> > >>
> > >> 1, 'PRIMARY', 'orders', 'range', 'PRIMARY', 'PRIMARY', '4',
'',
> > >> 19398, 'Using where; Using temporary; Using filesort'
> > >>
> > >> 1, 'PRIMARY', 'order_line', 'ref',
> > >> 'PRIMARY,order_line_ol_i_id,order_line_ol_o_id',
> > >> 'order_line_ol_o_id', '4', 'tpcw.orders.o_id', 1, 'Using
where'
> > >>
> > >> 3, 'DEPENDENT SUBQUERY', 'orders', 'ref',
'PRIMARY,orders_o_c_id',
> > >> 'orders_o_c_id', '5', 'func', 1, 'Using where; Using index'
> > >>
> > >> 3, 'DEPENDENT SUBQUERY', 'order_line', 'ref',
> > >> 'PRIMARY,order_line_ol_i_id,order_line_ol_o_id',
> > >> 'order_line_ol_o_id', '4', 'tpcw.orders.o_id', 1, 'Using
where'
> > >>
> > >> 4, 'SUBQUERY', '', '', '', '', '', '', , 'Select tables
optimized
> away'
> > >>
> > >> 2, 'SUBQUERY', '', '', '', '', '', '', , 'Select tables
optimized
> away'
> > >>
>
------------------------------------------------------------------------------------------
>
> > >>
> > >
> > >
> > >>
> > >>
> > >> This sql statement performance is too bad.Please help me to
optimize
> > >> it .
> > >>
> > >> thanks!
> > >>
Your original query, reformatted only:
SELECT SUM(ol_qty) sumolqty
, ol_i_id
FROM orders, order_line
WHERE orders.o_id = order_line.ol_o_id
AND orders.o_id > (SELECT MAX(o_id)-10000 FROM orders)
AND NOT (order_line.ol_i_id = 5000)
AND orders.o_c_id IN (
SELECT o_c_id
FROM orders, order_line
WHERE orders.o_id = order_line.ol_o_id
and orders.o_id > (SELECT MAX(o_id)-10000
FROM
orders)
AND order_line.ol_i_id = 5000
)
GROUP BY ol_i_id
ORDER BY sumolqty DESC
limit 50
If I try to describe your query, this is how it reads to me:
You are want to get for each order, how many individual items
comprises
that order. You only want to see the top 50 item counts from within
the
10000 most recent orders ignoring those line items where the
ol_i_id =
5000 but including those where the o_c_id is equal to that of an
order
that has a line item where ol_i_id = 5000.
For starters, you could pre-compute the lowest o_id and use that
value
from a variable instead of getting it during execution (yes it will
be a
scalar and only computed once any way.... just bear with me...)
SELECT @min_o_id := max(o_id)-10000 from orders;
SELECT SUM(ol_qty) sumolqty
, ol_i_id
FROM orders, order_line
WHERE orders.o_id = order_line.ol_o_id
AND orders.o_id > @min_o_id
AND NOT (order_line.ol_i_id = 5000)
AND orders.o_c_id IN (
SELECT o_c_id
FROM orders, order_line
WHERE orders.o_id = order_line.ol_o_id
and orders.o_id > @min_o_id
AND order_line.ol_i_id = 5000
)
GROUP BY ol_i_id
ORDER BY sumolqty DESC
limit 50
Just doing that eliminates two subqueries, simplifying your total
query
execution plan. If I rewrite the query to use explicit JOINs and
eliminated the subqueries, your query turns into:
/* q1: limit queries to the last 10000 orders */
SELECT @min_o_id := max(o_id)-10000 from orders;
/*q2: locate all customers within the last 10000 orders who
have at least 5000 order_lines in a single order */
CREATE TEMPORARY TABLE tmpOCID
SELECT DISTINCT o_c_id
FROM orders
INNER JOIN order_line
ON orders.o_id = order_line.ol_o_id
and orders.o_id > @min_o_id
AND order_line.ol_i_id = 5000;
/*q3: for each line item position, add up how many units were
listed in
each position (except position 5000) for all customers who had at
least
5000 line items in an order within the last 10000 orders. Limit the
results to those line item positions with the top 50 qty values */
SELECT SUM(ol.ol_qty) sumolqty
, ol_i_id
FROM orders o
INNER JOIN order_line ol
ON orders.o_id = order_line.ol_o_id
AND orders.o_id > @min_o_id
AND NOT (order_line.ol_i_id = 5000)
INNER JOIN tmpOCID t
ON orders.o_c_id = t.o_c_id
GROUP BY ol_i_id
ORDER BY sumolqty DESC
limit 50;
DROP TEMPORARY TABLE tmpOCID;
If you are using a programming language to execute your statements,
just
make sure that you DO NOT close the connection between each
statement and
the whole thing should work as expected.
I think you wanted to know was: the top 50 orders (by ranked by
total
units ordered) within the last 10000 orders that came from
customers who
had orders of at least 5000 line items (within the last 10000
orders).
That changes "q3:" of the last batch to read:
/* q3a: */
SELECT o_id
SUM(ol.ol_qty) sumolqty
FROM orders o
INNER JOIN order_line ol
ON orders.o_id = order_line.ol_o_id
AND orders.o_id > @min_o_id
INNER JOIN tmpOCID t
ON orders.o_c_id = t.o_c_id
GROUP BY o_id
ORDER BY sumolqty DESC
limit 50;
If I did not divine the purpose of your query correctly, just let
me know.
I am more than happy to try again.
Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine
_________________________________________________________________
与联机的朋友进行交流,请使用 MSN Messenger:
http://messenger.msn.com/cn
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:
http://lists.mysql.com/[EMAIL PROTECTED]
_________________________________________________________________
与联机的朋友进行交流,请使用 MSN Messenger: http://messenger.msn.com/cn
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]