Kevin Weslowski <[EMAIL PROTECTED]> wrote on 05/26/2005 02:20:46 AM:

> Hi,
> 
> I've got a query that I believe should be giving me better performance 
> than it is, since I've put indexes on all the appropriate columns...I'll 

> include as much info as I can for anyone wishing to help me optimize the 

> query/tables/indicies...thanks in advance for any assistance...
> 
> the query:
> 
> select ind_first_name, ind_last_name, fam_phone, max(iat_date) max_date 
> from tms_individual_account_transactions, tms_families, tms_individuals 
> where
>                      fam_key = ind_fam_key and
>                      ind_key = iat_ind_key and
>                      iat_date <= '2000-01-01'
>                         group by ind_key
>                            order by ind_last_name, ind_first_name;
> 
> it takes ~150 seconds on a PIII 667 MHz, with 640MB RAM...and returns 
> ~700 rows...
> 
> MySQL server version: 4.0.18
> 
> tms_families: ~1000 records
> tms_individuals: ~1700 records
> tms_individual_account_transactions: ~100000 records
> 
> the explain plan:
> 
> +--------+--------+---------------+--------+---------+--------
> +--------+--------+
> | table  | type   | possible_keys | key    | key_len | ref    | rows   | 

> Extra  |
> +--------+--------+---------------+--------+---------+--------
> +--------+--------+
> | tms_families| ALL    | PRIMARY       | [NULL] | [NULL]  | [NULL] | 993 

>     | Using temporary; Using filesort|
> | tms_individuals| ref    | PRIMARY,index_ind_fam_key| 
> index_ind_fam_key| 4       | tms_families.fam_key| 1      |        |
> | tms_individual_account_transactions| ref    | 
> index_iat_ind_key,index_iat_date| index_iat_ind_key| 4       | 
> tms_individuals.ind_key| 79     | Using where|
> +--------+--------+---------------+--------+---------+--------
> +--------+--------+
> 
> 
> 
> now, I know what's killing the query is the max() and group by, but I 
> still think I should get better performance??
> 
> now, the table definitions (sorry for the overkill, but there's 
> referential integrity I needed to maintain with extraneous tables):
> 
> CREATE TABLE `tms_marriage_statuses` (
>    `mst_key` int(11) unsigned NOT NULL default '0',
>    `mst_description` varchar(15) NOT NULL default '',
>    PRIMARY KEY  (`mst_key`),
>    KEY `unique_mst_description` (`mst_description`)
> ) TYPE=InnoDB;
> 
> CREATE TABLE `tms_families` (
>    `fam_key` int(11) unsigned NOT NULL default '0',
>    `fam_name` varchar(50) NOT NULL default '',
>    `fam_sortname` varchar(25) NOT NULL default '',
>    `fam_address` varchar(100) default '',
>    `fam_city` varchar(50) default '',
>    `fam_prov_state` char(2) default '',
>    `fam_postal_zip_code` varchar(10) default '',
>    `fam_phone` varchar(14) default '',
>    `fam_update_user` varchar(15) default '',
>    `fam_update_date` timestamp(14) NOT NULL,
>    `fam_cmts_family_no` int(11) unsigned default '0',
>    `fam_admin_comments` varchar(255) default NULL,
>    PRIMARY KEY  (`fam_key`),
>    UNIQUE KEY `unique_fam_name` (`fam_name`),
>    KEY `index_fam_cmts_family_no` (`fam_cmts_family_no`)
> ) TYPE=InnoDB;
> 
> CREATE TABLE `tms_individuals` (
>    `ind_key` int(11) unsigned NOT NULL default '0',
>    `ind_fam_key` int(11) unsigned NOT NULL default '0',
>    `ind_last_name` varchar(30) NOT NULL default '',
>    `ind_first_name` varchar(20) NOT NULL default '',
>    `ind_middle_name` varchar(20) default NULL,
>    `ind_name_title` varchar(6) default NULL,
>    `ind_name_suffix` char(3) default NULL,
>    `ind_marriage_status` int(11) unsigned NOT NULL default '0',
>    `ind_sex` char(1) NOT NULL default '',
>    `ind_join_date` date default NULL,
>    `ind_birth_date` date default NULL,
>    `ind_envelope_number` varchar(10) default NULL,
>    `ind_update_user` varchar(15) default NULL,
>    `ind_update_date` timestamp(14) NOT NULL,
>    `ind_admin_comments` varchar(255) default NULL,
>    `ind_cmts_member_no` int(11) unsigned default NULL,
>    `ind_deceased` char(1) NOT NULL default 'N',
>    `ind_member` char(1) NOT NULL default 'Y',
>    PRIMARY KEY  (`ind_key`),
>    KEY `index_ind_fam_key` (`ind_fam_key`),
>    KEY `index_ind_marriage_status` (`ind_marriage_status`),
>    KEY `index_ind_cmts_member_no` (`ind_cmts_member_no`),
>    CONSTRAINT `tms_individuals_ibfk_1` FOREIGN KEY (`ind_fam_key`) 
> REFERENCES `tms_families` (`fam_key`) ON DELETE CASCADE ON UPDATE 
CASCADE,
>    CONSTRAINT `tms_individuals_ibfk_2` FOREIGN KEY 
> (`ind_marriage_status`) REFERENCES `tms_marriage_statuses` (`mst_key`)
> ) TYPE=InnoDB;
> 
> CREATE TABLE `tms_accounts` (
>    `acc_key` int(11) unsigned NOT NULL default '0',
>    `acc_number` int(11) unsigned NOT NULL default '0',
>    `acc_description` varchar(50) NOT NULL default '',
>    `acc_update_user` varchar(15) default '',
>    `acc_update_date` timestamp(14) NOT NULL,
>    PRIMARY KEY  (`acc_key`),
>    UNIQUE KEY `unique_acc_number` (`acc_number`)
> ) TYPE=InnoDB;
> 
> CREATE TABLE `tms_individual_account_transactions` (
>    `iat_key` int(11) unsigned NOT NULL default '0',
>    `iat_ind_key` int(11) unsigned NOT NULL default '0',
>    `iat_acc_key` int(11) unsigned NOT NULL default '0',
>    `iat_date` date NOT NULL default '0000-00-00',
>    `iat_amount` decimal(10,2) NOT NULL default '0.00',
>    `iat_update_user` varchar(15) default '',
>    `iat_update_date` timestamp(14) NOT NULL,
>    `iat_comments` varchar(100) default NULL,
>    `iat_post_datetime` datetime default NULL,
>    PRIMARY KEY  (`iat_key`),
>    KEY `index_iat_ind_key` (`iat_ind_key`),
>    KEY `index_iat_acc_key` (`iat_acc_key`),
>    KEY `index_iat_date` (`iat_date`),
>    CONSTRAINT `fk_iat_acc_key` FOREIGN KEY (`iat_acc_key`) REFERENCES 
> `tms_accounts` (`acc_key`),
>    CONSTRAINT `fk_iat_ind_key` FOREIGN KEY (`iat_ind_key`) REFERENCES 
> `tms_individuals` (`ind_key`) ON DELETE CASCADE ON UPDATE CASCADE
> ) TYPE=InnoDB;
> 
> 
> 
> 
> I think that's all the pertinent info...if I missed anything else that 
> would be helpful in solving this, let me know...
> 
> Thanks all!
> 
> Kevin
> 

THANK YOU! THANK YOU! THANK YOU! for posting your SHOW CREATE TABLE 
Statements and the EXPLAIN plan for the query. If we could get everyone 
else to do just that, this list would be a much nicer place to lurk. 8-D

Here is your original query vertically reformatted (The "vertical" format 
survives message wrapping better):

SELECT ind_first_name
        , ind_last_name
        , fam_phone
        , max(iat_date) max_date 
FROM tms_individual_account_transactions
        , tms_families
        , tms_individuals 
WHERE fam_key = ind_fam_key
        and ind_key = iat_ind_key
        and iat_date <= '2000-01-01'
GROUP BY ind_key
ORDER BY ind_last_name, ind_first_name;

IF you run it without the GROUP BY and ORDER BY clauses how fast is it?
If you changed your query to say:

SELECT ind_first_name
        , ind_last_name
        , fam_phone
        , max(iat_date) max_date 
FROM tms_individual_account_transactions
        , tms_families
        , tms_individuals 
WHERE fam_key = ind_fam_key
        and ind_key = iat_ind_key
        and iat_date <= '2000-01-01'
GROUP BY 2,1,3;

you will be grouping on all of your non-aggregated columns (most RDBMS 
systems would have rejected your query on the fact that you weren't doing 
just that), not some column that is not even part of the results, AND as a 
side effect of the GROUP BY, you get sorting for free. That's why I am 
grouping on ind_last_name and ind_first_name (to duplicate your original 
ORDER BY) which leaves fam_phone listed last (because it needed to be 
there for this to be complete).

It's my pet peeve to leave queries in this format (implicit INNER JOINS). 
I firmly believe that the comma separated format is just too easily 
abused. Here is my updated query reformatted to use explicit JOIN 
statements:

SELECT ind_first_name
        , ind_last_name
        , fam_phone
        , max(iat_date) max_date 
FROM tms_individual_account_transactions
INNER JOIN tms_families
        ON fam_key = ind_fam_key
INNER JOIN tms_individuals
        ON ind_key = iat_ind_key
WHERE iat_date <= '2000-01-01'
GROUP BY 2,1,3;

Another thing you might consider is to pre-compute a table of 
MAX(idat_date) values for each iat_ind_key for the dates <='2000-01-01' 
then JOINing the other two tables to that in order to lookup the textual 
information. Assuming there is an average of 10 transactions per 
individual this will speed up your JOINs by a factor of about 10 (it will 
take appx 1/10 the time to create the final report)

CREATE TEMPORARY TABLE tmpMaxDates (KEY (iat_ind_key))
SELECT iat_ind_key, max(iat_date) max_date
FROM tms_individual_account_transactions
WHERE iat_date <= '2001-01-01'
GROUP BY 1;

SELECT ind_first_name
        , ind_last_name
        , fam_phone
        , max_date 
FROM tmpMaxDates
INNER JOIN tms_families
        ON fam_key = ind_fam_key
INNER JOIN tms_individuals
        ON ind_key = iat_ind_key
ORDER BY ind_last_name, ind_first_name;

DROP TEMPORARY TABLE tmpMaxDates;

This technique will probably reduce your total time to a few seconds or 
less (The longest part of that will be to build tmpMaxDates). Let us know 
what works and what doesn't, OK?

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine

Reply via email to