RE: Slow when using sub-query

2010-06-03 Thread Jerry Schwartz
>-Original Message-
>From: vegiv...@gmail.com [mailto:vegiv...@gmail.com] On Behalf Of Johan De
>Meersman
>Sent: Thursday, June 03, 2010 6:52 AM
>To: je...@gii.co.jp
>Cc: mysql@lists.mysql.com
>Subject: Re: Slow when using sub-query
>
>The short answer is that the optimizer is amazingly stupid about subqueries,
>and it assumes that they are dependent even when they're not - that subquery
>gets executed for every row in your main query.
>
[JS] That's what I figured was happening -- either that, or my CPU was on 
strike for higher voltages.

>The fastest way to do this, would probably be to run your subquery, have
>your code assemble the appropriate IN clause, and then run your main query.
>
[JS] A lot of what I do is one-off things, and I usually wind up using the 
CLI. Rewriting the whole business as a JOIN to a derived table worked very 
well.

>The long answer is that there's a rather good Advanced Tuning course that
>addresses all this and more, as does Baron & C° 's excellent MySQL book.
>
[JS] No doubt. Our database is so small, and my duties are so diverse, that 
tuning only gets my attention when a problem interferes with other people's 
productivity.

Thanks.

Regards,

Jerry Schwartz
Global Information Incorporated
195 Farmington Ave.
Farmington, CT 06032

860.674.8796 / FAX: 860.674.8341

www.the-infoshop.com





-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Slow when using sub-query

2010-06-03 Thread Johan De Meersman
The short answer is that the optimizer is amazingly stupid about subqueries,
and it assumes that they are dependent even when they're not - that subquery
gets executed for every row in your main query.

The fastest way to do this, would probably be to run your subquery, have
your code assemble the appropriate IN clause, and then run your main query.

The long answer is that there's a rather good Advanced Tuning course that
addresses all this and more, as does Baron & C° 's excellent MySQL book.



On Wed, Jun 2, 2010 at 10:05 PM, Jerry Schwartz  wrote:

> I've heard that sub-queries aren't well-optimized, but this case seems
> ridiculous.
>
> First, a little setup:
>
> SELECT pub_id FROM pub WHERE pub_code = 'GD' INTO @P;
>
> === Inner Query by Itself ===
>
> us-gii >SELECT prod_pub_prod_id FROM prod
>-> WHERE pub_id = @P
>-> AND prod_discont = 0
>-> GROUP BY prod_pub_prod_id
>-> HAVING COUNT(*) > 1;
> +--+
> | prod_pub_prod_id |
> +--+
> | NULL |
> | GDAE0106ICR  |
> | GDME0002TR   |
> | GDME0023IAR  |
> | GDME0059IAR  |
> +--+
> 5 rows in set (0.05 sec)
>
> us-gii >EXPLAIN
>-> SELECT prod_pub_prod_id FROM prod
>-> WHERE pub_id = @P
>-> AND prod_discont = 0
>-> GROUP BY prod_pub_prod_id
>-> HAVING COUNT(*) > 1
>-> \G
> *** 1. row ***
>   id: 1
>  select_type: SIMPLE
>table: prod
> type: ref
> possible_keys: pub_id,pub_id_2
>  key: pub_id
>  key_len: 48
>  ref: const
> rows: 1543
>Extra: Using where; Using temporary; Using filesort
>
> === Outer Query without Inner Query ===
>
> us-gii >SELECT prod_num FROM prod
>-> WHERE pub_id = @P
>->  AND prod_pub_prod_id IN
>->  (
>->  NULL,
>->  'GDAE0106ICR',
>->  'GDME0002TR',
>->  'GDME0023IAR',
>->  'GDME0059IAR'
>->  )
>-> ;
> +--+
> | prod_num |
> +--+
> |83298 |
> |85092 |
> |88728 |
> |97231 |
> |97235 |
> |98368 |
> |   107693 |
> |   112461 |
> +--+
> 8 rows in set (0.01 sec)
>
> us-gii >EXPLAIN
>-> SELECT prod_num FROM prod
>-> WHERE pub_id = @P
>->  AND prod_pub_prod_id IN
>->  (
>->  NULL,
>->  'GDAE0106ICR',
>->  'GDME0002TR',
>->  'GDME0023IAR',
>->  'GDME0059IAR'
>->  )
>-> \G
> *** 1. row ***
>   id: 1
>  select_type: SIMPLE
>table: prod
> type: ref
> possible_keys: prod_pub_prod_id,pub_id,pub_id_2
>  key: pub_id
>  key_len: 48
>  ref: const
> rows: 1543
>Extra: Using where
>
> === Outer Query with Sub-query ===
>
> us-gii >EXPLAIN
>-> SELECT prod_num FROM prod
>-> WHERE pub_id = @P
>->  AND prod_pub_prod_id IN
>->  (SELECT prod_pub_prod_id FROM prod
>->  WHERE pub_id = @P
>->  AND prod_discont = 0
>->  GROUP BY prod_pub_prod_id
>->  HAVING COUNT(*) > 1)
>-> \G
> *** 1. row ***
>   id: 1
>  select_type: PRIMARY
>table: prod
> type: ref
> possible_keys: pub_id,pub_id_2
>  key: pub_id
>  key_len: 48
>  ref: const
> rows: 1543
>Extra: Using where
> *** 2. row ***
>   id: 2
>  select_type: DEPENDENT SUBQUERY
>table: prod
> type: index
> possible_keys: pub_id,pub_id_2
>  key: prod_pub_prod_id
>  key_len: 768
>  ref: NULL
> rows: 72
>Extra: Using where; Using filesort
>
> I don't know how long the Outer Query with Sub-query would take: I killed
> it
> after several minutes. I'm guessing that it has to do with the fact that
> the
> inner query is "dependent", but why is that happening?
>
> === Rewritten as Join ===
>
> us-gii >SELECT prod_num FROM prod JOIN
>->  (SELECT prod_pub_prod_id FROM prod
>->  WHERE pub_id = @P
>->  AND prod_discont = 0
>->  GROUP BY prod_pub_prod_id
>->  HAVING COUNT(*) > 1) AS x
>-> ON prod.prod_pub_prod_id = x.prod_pub_prod_id
>-> WHERE prod.pub_id = @P
>->  AND prod.prod_discont = 0;
> +--+
> | prod_num |
> +--+
> |98368 |
> |   107693 |
> |83298 |
> |85092 |
> |88728 |
> |97231 |
> |97235 |
> |   112461 |
> +--+
> 8 rows in set (0.05 sec)
>
> us-gii >EXPLAIN
>-> SELECT prod_num FROM prod JOIN
>->  (SELECT prod_pub_prod_id FROM prod
>->  WHERE pub_id = @P
>->  AND prod_discont = 0
>->  GROUP BY prod_pub_prod_id
>->  HAVING COUNT(*) > 1) AS x
>-> ON prod.prod_pub_prod_id = x.prod_pub_prod_id
>-> WHERE prod.pub_id = @P
>->  AND prod.prod_discont = 0
>-> \G
> *** 1. row ***
>   id: 1
>  select_type: PRIMARY
>t

Slow when using sub-query

2010-06-02 Thread Jerry Schwartz
I've heard that sub-queries aren't well-optimized, but this case seems 
ridiculous.

First, a little setup:

SELECT pub_id FROM pub WHERE pub_code = 'GD' INTO @P;

=== Inner Query by Itself ===

us-gii >SELECT prod_pub_prod_id FROM prod
-> WHERE pub_id = @P
-> AND prod_discont = 0
-> GROUP BY prod_pub_prod_id
-> HAVING COUNT(*) > 1;
+--+
| prod_pub_prod_id |
+--+
| NULL |
| GDAE0106ICR  |
| GDME0002TR   |
| GDME0023IAR  |
| GDME0059IAR  |
+--+
5 rows in set (0.05 sec)

us-gii >EXPLAIN
-> SELECT prod_pub_prod_id FROM prod
-> WHERE pub_id = @P
-> AND prod_discont = 0
-> GROUP BY prod_pub_prod_id
-> HAVING COUNT(*) > 1
-> \G
*** 1. row ***
   id: 1
  select_type: SIMPLE
table: prod
 type: ref
possible_keys: pub_id,pub_id_2
  key: pub_id
  key_len: 48
  ref: const
 rows: 1543
Extra: Using where; Using temporary; Using filesort

=== Outer Query without Inner Query ===

us-gii >SELECT prod_num FROM prod
-> WHERE pub_id = @P
->  AND prod_pub_prod_id IN
->  (
->  NULL,
->  'GDAE0106ICR',
->  'GDME0002TR',
->  'GDME0023IAR',
->  'GDME0059IAR'
->  )
-> ;
+--+
| prod_num |
+--+
|83298 |
|85092 |
|88728 |
|97231 |
|97235 |
|98368 |
|   107693 |
|   112461 |
+--+
8 rows in set (0.01 sec)

us-gii >EXPLAIN
-> SELECT prod_num FROM prod
-> WHERE pub_id = @P
->  AND prod_pub_prod_id IN
->  (
->  NULL,
->  'GDAE0106ICR',
->  'GDME0002TR',
->  'GDME0023IAR',
->  'GDME0059IAR'
->  )
-> \G
*** 1. row ***
   id: 1
  select_type: SIMPLE
table: prod
 type: ref
possible_keys: prod_pub_prod_id,pub_id,pub_id_2
  key: pub_id
  key_len: 48
  ref: const
 rows: 1543
Extra: Using where

=== Outer Query with Sub-query ===

us-gii >EXPLAIN
-> SELECT prod_num FROM prod
-> WHERE pub_id = @P
->  AND prod_pub_prod_id IN
->  (SELECT prod_pub_prod_id FROM prod
->  WHERE pub_id = @P
->  AND prod_discont = 0
->  GROUP BY prod_pub_prod_id
->  HAVING COUNT(*) > 1)
-> \G
*** 1. row ***
   id: 1
  select_type: PRIMARY
table: prod
 type: ref
possible_keys: pub_id,pub_id_2
  key: pub_id
  key_len: 48
  ref: const
 rows: 1543
Extra: Using where
*** 2. row ***
   id: 2
  select_type: DEPENDENT SUBQUERY
table: prod
 type: index
possible_keys: pub_id,pub_id_2
  key: prod_pub_prod_id
  key_len: 768
  ref: NULL
 rows: 72
Extra: Using where; Using filesort

I don't know how long the Outer Query with Sub-query would take: I killed it 
after several minutes. I'm guessing that it has to do with the fact that the 
inner query is "dependent", but why is that happening?

=== Rewritten as Join ===

us-gii >SELECT prod_num FROM prod JOIN
->  (SELECT prod_pub_prod_id FROM prod
->  WHERE pub_id = @P
->  AND prod_discont = 0
->  GROUP BY prod_pub_prod_id
->  HAVING COUNT(*) > 1) AS x
-> ON prod.prod_pub_prod_id = x.prod_pub_prod_id
-> WHERE prod.pub_id = @P
->  AND prod.prod_discont = 0;
+--+
| prod_num |
+--+
|98368 |
|   107693 |
|83298 |
|85092 |
|88728 |
|97231 |
|97235 |
|   112461 |
+--+
8 rows in set (0.05 sec)

us-gii >EXPLAIN
-> SELECT prod_num FROM prod JOIN
->  (SELECT prod_pub_prod_id FROM prod
->  WHERE pub_id = @P
->  AND prod_discont = 0
->  GROUP BY prod_pub_prod_id
->  HAVING COUNT(*) > 1) AS x
-> ON prod.prod_pub_prod_id = x.prod_pub_prod_id
-> WHERE prod.pub_id = @P
->  AND prod.prod_discont = 0
-> \G
*** 1. row ***
   id: 1
  select_type: PRIMARY
table: 
 type: ALL
possible_keys: NULL
  key: NULL
  key_len: NULL
  ref: NULL
 rows: 5
Extra:
*** 2. row ***
   id: 1
  select_type: PRIMARY
table: prod
 type: ref
possible_keys: prod_pub_prod_id,pub_id,pub_id_2
  key: prod_pub_prod_id
  key_len: 768
  ref: x.prod_pub_prod_id
 rows: 1
Extra: Using where
*** 3. row ***
   id: 2
  select_type: DERIVED
table: prod
 type: ref
possible_keys: pub_id,pub_id_2
  key: pub_id
  key_len: 48
  ref:
 rows: 1544
Extra: Using where; Using temporary; Using filesort
3 rows in set (0.03 sec)

=