Sub Selects, Alias Names and stored procedures

2005-11-04 Thread Gordon Bruce
After reading one of the recent posts from Gobi [EMAIL PROTECTED] 
I took his successful query and modified it for one of my tables. It indeed 
produce the correct result, but in the process raised some questions.

1. Why do list_ID and acct_ID not have to be qualified with a table 
name or alias? 

mysql SELECT list_ID, list_Name, acct_ID, list_Qty
    - FROM   lists
    -    INNER JOIN (SELECT acct_ID, max(list_ID) AS list_ID
    -    FROM   lists
    -    GROUP BY acct_id
    -    ) AS t
    -    USING (acct_ID, list_ID)
    - WHERE  list_Active = 'Yes'
    -    AND cpny_ID = 'RER1'
    - LIMIT  100,10;
+-++-+--+
| list_ID | list_Name  | acct_ID | list_Qty |
+-++-+--+
|   3 | Farm   | BA8M    |    0 |
|  10 | Woodbury   | BA8Y    |  100 |
|   2 | Brookview Heights 03-23-04 | BA9O    |  278 |
|   4 | Magnet Mailing | BABA    |  250 |
|   2 | Fall Back  | BABM    |  223 |
|   1 | Contact list   | BACF    |   71 |
|   4 | Friends/Family | BAE2    |   10 |
|   1 | St. Michael    | BAE7    |  139 |
|   2 | JS Prospects   | BAE8    |  196 |
|   1 | Home Focus     | BAE9    |   55 |
+-++-+--+
10 rows in set (0.03 sec)


2. While the subselect does work, it appears to generate a cartesian 
product. Initial guess with 5.0 and stored procedures would be that 

    CREATING TEMPORARY TABLE
    INSERT max values in temporary
    SELECT from main table joined with temporary

    would run faster and still allow this to be done with 1 statement.

However,  even though the explains would indicate that this was so {23508 * 
7354 rows for subselect VS 6060 rows for temporary table}
actual times are {0.03 for subselect VS 0.19 for temporary table} . After doing 
some playing, it is the INSERT into temporary that adds the 
time even though the table was memory resident. Trying a similar request on a 
table with 3.5M rows still favors the subselect 
{27.50 sec for subselect VS 1 min 13.91 sec for temporary table}. 

Has EXPLAIN just not caught up with SUBSELECT logic or is there something else 
going on?



mysql EXPLAIN
    - SELECT list_ID, list_Name, acct_ID, list_Qty
    - FROM   lists
    -    INNER JOIN (SELECT acct_ID, max(list_ID) AS list_ID
    -    FROM   lists
    -    GROUP BY acct_id
    -    ) AS t
    -    USING (acct_ID, list_ID)
    - WHERE  list_Active = 'Yes'
    -    AND cpny_ID = 'RER1'
    - LIMIT  100,10;
++-+++--+-+-+-+---+-+
| id | select_type | table  | type   | possible_keys    | key | 
key_len | ref | rows  | Extra   |
++-+++--+-+-+-+---+-+
|  1 | PRIMARY | derived2 | ALL    | NULL | NULL    | 
NULL    | NULL    |  7354 | |
|  1 | PRIMARY | lists  | eq_ref | PRIMARY,acct_list_ID | PRIMARY | 
6   | t.acct_ID,t.list_ID | 1 | Using where |
|  2 | DERIVED | lists  | index  | NULL | PRIMARY | 
6   | NULL    | 23508 | Using index |
++-+++--+-+-+-+---+-+
3 rows in set (0.01 sec)

mysql SELECT list_ID, list_Name, acct_ID, list_Qty
    - FROM   lists
    -    INNER JOIN (SELECT acct_ID, max(list_ID) AS list_ID
    -    FROM   lists
    -    GROUP BY acct_id
    -    ) AS t
    -    USING (acct_ID, list_ID)
    - WHERE  list_Active = 'Yes'
    -    AND cpny_ID = 'RER1'
    - LIMIT  100,10;
+-++-+--+
| list_ID | list_Name  | acct_ID | list_Qty |
+-++-+--+
|   3 | Farm   | BA8M    |    0 |
|  10 | Woodbury   | BA8Y    |  100 |
|   2 | Brookview Heights 03-23-04 | BA9O    |  278 |
|   4 | Magnet Mailing | BABA    |  250 |
|   2 | Fall Back  | BABM    |  223 |
|   1 | Contact list   | BACF    |   71 |
|   4 | Friends/Family | BAE2    |   10 |
|   1 | St. Michael    | BAE7    |  139 |
|   2 | JS Prospects   | BAE8    |  196 |
|   1 | Home Focus     | BAE9  

Re: Sub Selects, Alias Names and stored procedures

2005-11-04 Thread Rhino
I don't have any idea about your second question but I have a thought on the
first one.

In DB2, which I use most of the time, you don't need to qualify a column
name like list_ID or acct_ID unless it is ambiguous. In this case, neither
one is ambiguous because both get used in single-table SELECT statements. If
I did those same queries in DB2, I would only expect an error message if I
did a join of two tables in which both tables had an acct_ID or list_ID;
then the SQL processor would get confused if the duplicated column names
weren't qualified in a SELECT list, GROUP BY, WHERE, or whatever.

Since DB2 and MySQL are presumably written to the same SQL standard, they
probably have the same rules; you only have to qualify a column name when it
is ambiguous.

That's my guess and I'm sticking to it until I hear otherwise from someone
more familiar with MySQL's implementation of SQL :-)

Rhino
- Original Message - 
From: Gordon Bruce [EMAIL PROTECTED]
To: MySQL List mysql@lists.mysql.com
Sent: Friday, November 04, 2005 2:51 PM
Subject: Sub Selects, Alias Names and stored procedures


After reading one of the recent posts from Gobi [EMAIL PROTECTED]
I took his successful query and modified it for one of my tables. It indeed
produce the correct result, but in the process raised some questions.

1. Why do list_ID and acct_ID not have to be qualified with a table name or
alias?

mysql SELECT list_ID, list_Name, acct_ID, list_Qty
- FROM lists
- INNER JOIN (SELECT acct_ID, max(list_ID) AS list_ID
- FROM lists
- GROUP BY acct_id
- ) AS t
- USING (acct_ID, list_ID)
- WHERE list_Active = 'Yes'
- AND cpny_ID = 'RER1'
- LIMIT 100,10;
+-++-+--+
| list_ID | list_Name | acct_ID | list_Qty |
+-++-+--+
| 3 | Farm | BA8M | 0 |
| 10 | Woodbury | BA8Y | 100 |
| 2 | Brookview Heights 03-23-04 | BA9O | 278 |
| 4 | Magnet Mailing | BABA | 250 |
| 2 | Fall Back | BABM | 223 |
| 1 | Contact list | BACF | 71 |
| 4 | Friends/Family | BAE2 | 10 |
| 1 | St. Michael | BAE7 | 139 |
| 2 | JS Prospects | BAE8 | 196 |
| 1 | Home Focus | BAE9 | 55 |
+-++-+--+
10 rows in set (0.03 sec)


2. While the subselect does work, it appears to generate a cartesian
product. Initial guess with 5.0 and stored procedures would be that

CREATING TEMPORARY TABLE
INSERT max values in temporary
SELECT from main table joined with temporary

would run faster and still allow this to be done with 1 statement.

However, even though the explains would indicate that this was so {23508 *
7354 rows for subselect VS 6060 rows for temporary table}
actual times are {0.03 for subselect VS 0.19 for temporary table} . After
doing some playing, it is the INSERT into temporary that adds the
time even though the table was memory resident. Trying a similar request on
a table with 3.5M rows still favors the subselect
{27.50 sec for subselect VS 1 min 13.91 sec for temporary table}.

Has EXPLAIN just not caught up with SUBSELECT logic or is there something
else going on?



mysql EXPLAIN
- SELECT list_ID, list_Name, acct_ID, list_Qty
- FROM lists
- INNER JOIN (SELECT acct_ID, max(list_ID) AS list_ID
- FROM lists
- GROUP BY acct_id
- ) AS t
- USING (acct_ID, list_ID)
- WHERE list_Active = 'Yes'
- AND cpny_ID = 'RER1'
- LIMIT 100,10;
++-+++--+-+-
+-+---+-+
| id | select_type | table | type | possible_keys | key | key_len | ref |
rows | Extra |
++-+++--+-+-
+-+---+-+
| 1 | PRIMARY | derived2 | ALL | NULL | NULL | NULL | NULL | 7354 | |
| 1 | PRIMARY | lists | eq_ref | PRIMARY,acct_list_ID | PRIMARY | 6 |
t.acct_ID,t.list_ID | 1 | Using where |
| 2 | DERIVED | lists | index | NULL | PRIMARY | 6 | NULL | 23508 | Using
index |
++-+++--+-+-
+-+---+-+
3 rows in set (0.01 sec)

mysql SELECT list_ID, list_Name, acct_ID, list_Qty
- FROM lists
- INNER JOIN (SELECT acct_ID, max(list_ID) AS list_ID
- FROM lists
- GROUP BY acct_id
- ) AS t
- USING (acct_ID, list_ID)
- WHERE list_Active = 'Yes'
- AND cpny_ID = 'RER1'
- LIMIT 100,10;
+-++-+--+
| list_ID | list_Name | acct_ID | list_Qty |
+-++-+--+
| 3 | Farm | BA8M | 0 |
| 10 | Woodbury | BA8Y | 100 |
| 2 | Brookview Heights 03-23-04 | BA9O | 278 |
| 4 | Magnet Mailing | BABA | 250 |
| 2 | Fall Back | BABM | 223 |
| 1 | Contact list | BACF | 71 |
| 4 | Friends/Family | BAE2 | 10 |
| 1 | St. Michael | BAE7 | 139 |
| 2 | JS Prospects | BAE8 | 196 |
| 1 | Home Focus | BAE9 | 55 |
+-++-+--+
10 rows in set (0.03 sec