What it comes down to is that MySQL can only use 1 index per table per query. The moment your query includes OR examining different columns, a full table scan is the only option.
One typical way to implement this is to use UNIONS as Mr. Green suggested: SELECT aid, count(*) FROM ( SELECT a.id FROM athletes a LEFT JOIN matches m ON (m.team1_player_id = a.id) UNION SELECT a.id FROM athletes a LEFT JOIN matches m ON (m.team1_partner_id = a.id) UNION SELECT a.id FROM athletes a LEFT JOIN matches m ON (m.team2_player_id = a.id) UNION SELECT a.id FROM athletes a LEFT JOIN matches m ON (m.team2_partner_id = a.id) ) AS tmp GROUP BY aid On Tue, Jan 5, 2010 at 9:18 AM, Shawn Green <shawn.gr...@sun.com> wrote: > Tobias Schultze wrote: >> >> Hello, >> >> >> I'm working on an application for my bachelor thesis. >> >> I'm having a performance problem with a SQL-Query in MySQL5. >> >> I hoped someone can easily enlighten me in this issue. >> >> >> The schema: >> >> >> CREATE TABLE IF NOT EXISTS `athletes` ( >> >> `id` int(11) NOT NULL AUTO_INCREMENT, >> >> `last_name` varchar(20) NOT NULL, >> >> `first_name` varchar(20) NOT NULL, >> >> `gender` enum('male','female') NOT NULL, >> >> `birthday` date NOT NULL, >> >> `country` char(2) NOT NULL, >> >> `club_id` int(11) NOT NULL, >> >> `is_active` tinyint(1) NOT NULL, >> >> PRIMARY KEY (`id`), >> >> KEY `gender_index_idx` (`gender`), >> >> KEY `is_active_index_idx` (`is_active`), >> >> KEY `club_id_idx` (`club_id`) >> >> ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COMMENT='Sportler bzw Mitglieder des >> Verbandes'; >> >> >> -- -------------------------------------------------------- >> >> >> CREATE TABLE IF NOT EXISTS `matches` ( >> >> `id` int(11) NOT NULL AUTO_INCREMENT, >> >> `teammatch_id` int(11) NOT NULL, >> >> `match_type` varchar(10) NOT NULL, >> >> `team1_player_id` int(11) DEFAULT NULL, >> >> `team1_partner_id` int(11) DEFAULT NULL, >> >> `team2_player_id` int(11) DEFAULT NULL, >> >> `team2_partner_id` int(11) DEFAULT NULL, >> >> `team1_score` tinyint(3) unsigned DEFAULT NULL, >> >> `team2_score` tinyint(3) unsigned DEFAULT NULL, >> >> PRIMARY KEY (`id`), >> >> KEY `teammatch_id_idx` (`teammatch_id`), >> >> KEY `team1_player_id_idx` (`team1_player_id`), >> >> KEY `team1_partner_id_idx` (`team1_partner_id`), >> >> KEY `team2_player_id_idx` (`team2_player_id`), >> >> KEY `team2_partner_id_idx` (`team2_partner_id`) >> >> ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COMMENT='Spiele zwischen zwei oder >> vier Sportlern (Einzel und Doppel)' AUTO_INCREMENT=46665 ; >> >> >> >> I want to get all matches for each athlete and calculate statistics such >> as >> number of matches etc. >> >> The basic very simplified query is like >> >> >> SELECT a.id, COUNT(*) >> >> FROM athletes a >> >> LEFT JOIN matches m ON ( >> m.team1_player_id = a.id OR >> m.team1_partner_id = a.id OR >> m.team2_player_id = a.id OR >> m.team2_partner_id = a.id >> >> ) >> >> WHERE a.gender = 'female' >> >> GROUP BY a.id >> >> >> >> Now the problem is, that mysql uses a full table scan to retrieve the >> matches for an athlete, so the execution takes many seconds or even worse. >> >> An athlete can be referenced in any of the m.team1_player_id OR >> m.team1_partner_id OR m.team2_player_id OR m.team2_partner_id. (That >> allows >> doubles matches.) >> >> Why is an full table scan necessary although there is an index on each of >> these fields? So an index exists for each OR-part of the join condition... >> >> >> Here the execution plan: >> >> >> >> id >> select_type >> table >> type >> possible_keys >> key >> key_len >> ref >> rows >> Extra >> >> 1 >> >> SIMPLE >> >> a >> >> ref >> >> gender_index_idx >> >> gender_index_idx >> >> 1 >> >> const >> >> 2193 >> >> Using where; Using temporary; Using filesort >> >> >> 1 >> >> SIMPLE >> >> m >> >> ALL >> >> team1_player_id_idx,team1_partner_id_idx,team2_pla... >> >> NULL >> >> NULL >> >> NULL >> >> 46664 >> >> >> >> >> Joining on each fields like the following is very fast and uses the index >> but of course doesn't give me the expected result. >> >> >> FROM athletes a >> >> LEFT JOIN matches m ON (a.id = m.team1_player_id) >> >> LEFT JOIN matches m2 ON (a.id = m2.team2_player_id) >> >> >> >> Maybe I need to do a workaround using a UNION? >> >> But this doesn't help either: (It takes 76 seconds) >> >> >> FROM athletes a >> >> LEFT JOIN ( >> >> ( >> >> SELECT team1_player_id AS player_id, teammatch_id, match_type, >> team1_score, >> team2_score, team1_points, team2_points, no_fight >> FROM matches >> >> ) >> UNION >> ( >> >> SELECT team1_partner_id, teammatch_id, match_type, team1_score, >> team2_score, >> team1_points, team2_points, no_fight >> FROM matches >> >> ) >> >> UNION >> ( >> >> SELECT team2_player_id, teammatch_id, match_type, team1_score, >> team2_score, >> team1_points, team2_points, no_fight >> FROM matches >> >> ) >> >> UNION >> ( >> >> SELECT team2_partner_id, teammatch_id, match_type, team1_score, >> team2_score, >> team1_points, team2_points, no_fight >> FROM matches >> >> ) >> >> ) m ON (a.id = m.player_id) >> >> >> >> >> I hope someone can help me with this. >> >> Thanks in advance. >> >> > > I think the problem here is that you have an ordered vector table (main1, > partner1, main2, partner2) which requires you to optionally match on 4 > separate columns to make your JOIN condition because the athlete you want to > locate could be in any of those 4 places in the match record. > > What would be faster is to normalize your match data into two tables. The > first table would be just match information (id, date, location, etc) and > another table of match-athlete pairs (match_id, athlete_id) > > alternatively you would have to use a query like > ( > select... > FROM athletes > INNER JOIN matches > ON ... matches.member1 > ) UNION ( > select... > FROM athletes > INNER JOIN matches > ON ... matches.partner1 > ) UNION ( > select... > FROM athletes > INNER JOIN matches > ON ... matches.member2 > ) UNION ( > select... > FROM athletes > INNER JOIN matches > ON ... matches.partner2 > ) > > in order to use the indexes for the JOINS. > > -- > Shawn Green, MySQL Senior Support Engineer > Sun Microsystems, Inc. > Office: Blountville, TN > > > > -- > MySQL General Mailing List > For list archives: http://lists.mysql.com/mysql > To unsubscribe: http://lists.mysql.com/mysql?unsub=mdyk...@gmail.com > > -- - michael dykman - mdyk...@gmail.com May the Source be with you. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/mysql?unsub=arch...@jab.org