I've heard that sub-queries aren't well-optimized, but this case seems 
ridiculous.

First, a little setup:

SELECT pub_id FROM pub WHERE pub_code = 'GD' INTO @P;

=== Inner Query by Itself ===

us-gii >SELECT prod_pub_prod_id FROM prod
    -> WHERE pub_id = @P
    -> AND prod_discont = 0
    -> GROUP BY prod_pub_prod_id
    -> HAVING COUNT(*) > 1;
+------------------+
| prod_pub_prod_id |
+------------------+
| NULL             |
| GDAE0106ICR      |
| GDME0002TR       |
| GDME0023IAR      |
| GDME0059IAR      |
+------------------+
5 rows in set (0.05 sec)

us-gii >EXPLAIN
    -> SELECT prod_pub_prod_id FROM prod
    -> WHERE pub_id = @P
    -> AND prod_discont = 0
    -> GROUP BY prod_pub_prod_id
    -> HAVING COUNT(*) > 1
    -> \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: prod
         type: ref
possible_keys: pub_id,pub_id_2
          key: pub_id
      key_len: 48
          ref: const
         rows: 1543
        Extra: Using where; Using temporary; Using filesort

=== Outer Query without Inner Query ===

us-gii >SELECT prod_num FROM prod
    -> WHERE pub_id = @P
    ->  AND prod_pub_prod_id IN
    ->  (
    ->  NULL,
    ->  'GDAE0106ICR',
    ->  'GDME0002TR',
    ->  'GDME0023IAR',
    ->  'GDME0059IAR'
    ->  )
    -> ;
+----------+
| prod_num |
+----------+
|    83298 |
|    85092 |
|    88728 |
|    97231 |
|    97235 |
|    98368 |
|   107693 |
|   112461 |
+----------+
8 rows in set (0.01 sec)

us-gii >EXPLAIN
    -> SELECT prod_num FROM prod
    -> WHERE pub_id = @P
    ->  AND prod_pub_prod_id IN
    ->  (
    ->  NULL,
    ->  'GDAE0106ICR',
    ->  'GDME0002TR',
    ->  'GDME0023IAR',
    ->  'GDME0059IAR'
    ->  )
    -> \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: prod
         type: ref
possible_keys: prod_pub_prod_id,pub_id,pub_id_2
          key: pub_id
      key_len: 48
          ref: const
         rows: 1543
        Extra: Using where

=== Outer Query with Sub-query ===

us-gii >EXPLAIN
    -> SELECT prod_num FROM prod
    -> WHERE pub_id = @P
    ->  AND prod_pub_prod_id IN
    ->          (SELECT prod_pub_prod_id FROM prod
    ->          WHERE pub_id = @P
    ->          AND prod_discont = 0
    ->          GROUP BY prod_pub_prod_id
    ->          HAVING COUNT(*) > 1)
    -> \G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: prod
         type: ref
possible_keys: pub_id,pub_id_2
          key: pub_id
      key_len: 48
          ref: const
         rows: 1543
        Extra: Using where
*************************** 2. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: prod
         type: index
possible_keys: pub_id,pub_id_2
          key: prod_pub_prod_id
      key_len: 768
          ref: NULL
         rows: 72
        Extra: Using where; Using filesort

I don't know how long the Outer Query with Sub-query would take: I killed it 
after several minutes. I'm guessing that it has to do with the fact that the 
inner query is "dependent", but why is that happening?

=== Rewritten as Join ===

us-gii >SELECT prod_num FROM prod JOIN
    ->  (SELECT prod_pub_prod_id FROM prod
    ->  WHERE pub_id = @P
    ->  AND prod_discont = 0
    ->  GROUP BY prod_pub_prod_id
    ->  HAVING COUNT(*) > 1) AS x
    -> ON prod.prod_pub_prod_id = x.prod_pub_prod_id
    -> WHERE prod.pub_id = @P
    ->  AND prod.prod_discont = 0;
+----------+
| prod_num |
+----------+
|    98368 |
|   107693 |
|    83298 |
|    85092 |
|    88728 |
|    97231 |
|    97235 |
|   112461 |
+----------+
8 rows in set (0.05 sec)

us-gii >EXPLAIN
    -> SELECT prod_num FROM prod JOIN
    ->  (SELECT prod_pub_prod_id FROM prod
    ->  WHERE pub_id = @P
    ->  AND prod_discont = 0
    ->  GROUP BY prod_pub_prod_id
    ->  HAVING COUNT(*) > 1) AS x
    -> ON prod.prod_pub_prod_id = x.prod_pub_prod_id
    -> WHERE prod.pub_id = @P
    ->  AND prod.prod_discont = 0
    -> \G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: <derived2>
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 5
        Extra:
*************************** 2. row ***************************
           id: 1
  select_type: PRIMARY
        table: prod
         type: ref
possible_keys: prod_pub_prod_id,pub_id,pub_id_2
          key: prod_pub_prod_id
      key_len: 768
          ref: x.prod_pub_prod_id
         rows: 1
        Extra: Using where
*************************** 3. row ***************************
           id: 2
  select_type: DERIVED
        table: prod
         type: ref
possible_keys: pub_id,pub_id_2
          key: pub_id
      key_len: 48
          ref:
         rows: 1544
        Extra: Using where; Using temporary; Using filesort
3 rows in set (0.03 sec)

=====

What a difference! I don't understand it, though.

Does anyone want to take on the challenge of educating me?

Regards,

Jerry Schwartz
Global Information Incorporated
195 Farmington Ave.
Farmington, CT 06032

860.674.8796 / FAX: 860.674.8341

www.the-infoshop.com






-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql?unsub=arch...@jab.org

Reply via email to