王 旭 <[EMAIL PROTECTED]> wrote on 07/18/2005 03:12:28 AM: > NO effect :-( > > >From: pow <[EMAIL PROTECTED]> > >To: 王 旭 <[EMAIL PROTECTED]> > >Subject: Re: optimize a sql statement > >Date: Mon, 18 Jul 2005 11:51:23 +0800 > > > >Do u have composite index on order_line.ol_o_id AND order_line.ol_i_id? > > > >You could try that... > >王 旭 wrote: > > > > > Now,I make this sql statement to easy. > > > > > > Follow is the sql statement: > > > > ------------------------------------------------------------------------- > > > SELECT ol_i_id FROM orders,order_line > > > WHERE order_line.ol_o_id = orders.o_id > > > GROUP BY ol_i_id > > > > ------------------------------------------------------------------------- > > > > > > Follow is the explain output: > > > > ------------------------------------------------------------------------- > > > 1, 'SIMPLE', 'orders', 'index', 'PRIMARY', 'PRIMARY', '4', '', 259231, > > > 'Using index; Using temporary; Using filesort' > > > > > > 1, 'SIMPLE', 'order_line', 'ref', 'PRIMARY,order_line_ol_o_id', > > > 'PRIMARY', '4', 'tpcw.orders.o_id', 1, '' > > > > ------------------------------------------------------------------------- > > > Can it be optimized? > > > > > >> From: 王 旭 <[EMAIL PROTECTED]> > > >> To: mysql@lists.mysql.com > > >> Subject: optimize a sql statement > > >> Date: Sat, 16 Jul 2005 18:24:15 +0800 > > >> > > >> Follow is my sql statement: > > >> > ------------------------------------------------------------------------------------- > > > >> > > > > > > > > >> > > >> SELECT SUM(ol_qty) sumolqty, ol_i_id FROM orders, order_line WHERE > > >> orders.o_id = order_line.ol_o_id AND orders.o_id > (SELECT > > >> MAX(o_id)-10000 FROM orders) AND NOT (order_line.ol_i_id = 5000) AND > > >> orders.o_c_id IN (SELECT o_c_id FROM orders, order_line WHERE > > >> orders.o_id = order_line.ol_o_id and orders.o_id > (SELECT > > >> MAX(o_id)-10000 FROM orders) AND order_line.ol_i_id = 5000) GROUP BY > > >> ol_i_id ORDER BY sumolqty DESC limit 50 > > >> > -------------------------------------------------------------------------------------- > > > >> > > > > > > > > >> > > >> follows are explain output: > > >> > -------------------------------------------------------------------------------------- > > > >> > > > > > > > > >> > > >> 1, 'PRIMARY', 'orders', 'range', 'PRIMARY', 'PRIMARY', '4', '', > > >> 19398, 'Using where; Using temporary; Using filesort' > > >> > > >> 1, 'PRIMARY', 'order_line', 'ref', > > >> 'PRIMARY,order_line_ol_i_id,order_line_ol_o_id', > > >> 'order_line_ol_o_id', '4', 'tpcw.orders.o_id', 1, 'Using where' > > >> > > >> 3, 'DEPENDENT SUBQUERY', 'orders', 'ref', 'PRIMARY,orders_o_c_id', > > >> 'orders_o_c_id', '5', 'func', 1, 'Using where; Using index' > > >> > > >> 3, 'DEPENDENT SUBQUERY', 'order_line', 'ref', > > >> 'PRIMARY,order_line_ol_i_id,order_line_ol_o_id', > > >> 'order_line_ol_o_id', '4', 'tpcw.orders.o_id', 1, 'Using where' > > >> > > >> 4, 'SUBQUERY', '', '', '', '', '', '', , 'Select tables optimized > away' > > >> > > >> 2, 'SUBQUERY', '', '', '', '', '', '', , 'Select tables optimized > away' > > >> > ------------------------------------------------------------------------------------------ > > > >> > > > > > > > > >> > > >> > > >> This sql statement performance is too bad.Please help me to optimize > > >> it . > > >> > > >> thanks! > > >>
Your original query, reformatted only: SELECT SUM(ol_qty) sumolqty , ol_i_id FROM orders, order_line WHERE orders.o_id = order_line.ol_o_id AND orders.o_id > (SELECT MAX(o_id)-10000 FROM orders) AND NOT (order_line.ol_i_id = 5000) AND orders.o_c_id IN ( SELECT o_c_id FROM orders, order_line WHERE orders.o_id = order_line.ol_o_id and orders.o_id > (SELECT MAX(o_id)-10000 FROM orders) AND order_line.ol_i_id = 5000 ) GROUP BY ol_i_id ORDER BY sumolqty DESC limit 50 If I try to describe your query, this is how it reads to me: You are want to get for each order, how many individual items comprises that order. You only want to see the top 50 item counts from within the 10000 most recent orders ignoring those line items where the ol_i_id = 5000 but including those where the o_c_id is equal to that of an order that has a line item where ol_i_id = 5000. For starters, you could pre-compute the lowest o_id and use that value from a variable instead of getting it during execution (yes it will be a scalar and only computed once any way.... just bear with me...) SELECT @min_o_id := max(o_id)-10000 from orders; SELECT SUM(ol_qty) sumolqty , ol_i_id FROM orders, order_line WHERE orders.o_id = order_line.ol_o_id AND orders.o_id > @min_o_id AND NOT (order_line.ol_i_id = 5000) AND orders.o_c_id IN ( SELECT o_c_id FROM orders, order_line WHERE orders.o_id = order_line.ol_o_id and orders.o_id > @min_o_id AND order_line.ol_i_id = 5000 ) GROUP BY ol_i_id ORDER BY sumolqty DESC limit 50 Just doing that eliminates two subqueries, simplifying your total query execution plan. If I rewrite the query to use explicit JOINs and eliminated the subqueries, your query turns into: /* q1: limit queries to the last 10000 orders */ SELECT @min_o_id := max(o_id)-10000 from orders; /*q2: locate all customers within the last 10000 orders who have at least 5000 order_lines in a single order */ CREATE TEMPORARY TABLE tmpOCID SELECT DISTINCT o_c_id FROM orders INNER JOIN order_line ON orders.o_id = order_line.ol_o_id and orders.o_id > @min_o_id AND order_line.ol_i_id = 5000; /*q3: for each line item position, add up how many units were listed in each position (except position 5000) for all customers who had at least 5000 line items in an order within the last 10000 orders. Limit the results to those line item positions with the top 50 qty values */ SELECT SUM(ol.ol_qty) sumolqty , ol_i_id FROM orders o INNER JOIN order_line ol ON orders.o_id = order_line.ol_o_id AND orders.o_id > @min_o_id AND NOT (order_line.ol_i_id = 5000) INNER JOIN tmpOCID t ON orders.o_c_id = t.o_c_id GROUP BY ol_i_id ORDER BY sumolqty DESC limit 50; DROP TEMPORARY TABLE tmpOCID; If you are using a programming language to execute your statements, just make sure that you DO NOT close the connection between each statement and the whole thing should work as expected. I think you wanted to know was: the top 50 orders (by ranked by total units ordered) within the last 10000 orders that came from customers who had orders of at least 5000 line items (within the last 10000 orders). That changes "q3:" of the last batch to read: /* q3a: */ SELECT o_id SUM(ol.ol_qty) sumolqty FROM orders o INNER JOIN order_line ol ON orders.o_id = order_line.ol_o_id AND orders.o_id > @min_o_id INNER JOIN tmpOCID t ON orders.o_c_id = t.o_c_id GROUP BY o_id ORDER BY sumolqty DESC limit 50; If I did not divine the purpose of your query correctly, just let me know. I am more than happy to try again. Shawn Green Database Administrator Unimin Corporation - Spruce Pine