Tobias Schultze wrote:
Hello,

I'm working on an application for my bachelor thesis.

I'm having a performance problem with a SQL-Query in MySQL5.

I hoped someone can easily enlighten me in this issue.

The schema:

CREATE TABLE IF NOT EXISTS `athletes` (

  `id` int(11) NOT NULL AUTO_INCREMENT,

  `last_name` varchar(20) NOT NULL,

  `first_name` varchar(20) NOT NULL,

  `gender` enum('male','female') NOT NULL,

  `birthday` date NOT NULL,

  `country` char(2) NOT NULL,

  `club_id` int(11) NOT NULL,

  `is_active` tinyint(1) NOT NULL,

  PRIMARY KEY (`id`),

  KEY `gender_index_idx` (`gender`),

  KEY `is_active_index_idx` (`is_active`),

  KEY `club_id_idx` (`club_id`)

) ENGINE=MyISAM  DEFAULT CHARSET=utf8 COMMENT='Sportler bzw Mitglieder des
Verbandes';

-- --------------------------------------------------------

CREATE TABLE IF NOT EXISTS `matches` (

  `id` int(11) NOT NULL AUTO_INCREMENT,

  `teammatch_id` int(11) NOT NULL,

  `match_type` varchar(10) NOT NULL,

  `team1_player_id` int(11) DEFAULT NULL,

  `team1_partner_id` int(11) DEFAULT NULL,

  `team2_player_id` int(11) DEFAULT NULL,

  `team2_partner_id` int(11) DEFAULT NULL,

  `team1_score` tinyint(3) unsigned DEFAULT NULL,

  `team2_score` tinyint(3) unsigned DEFAULT NULL,

  PRIMARY KEY (`id`),

  KEY `teammatch_id_idx` (`teammatch_id`),

  KEY `team1_player_id_idx` (`team1_player_id`),

  KEY `team1_partner_id_idx` (`team1_partner_id`),

  KEY `team2_player_id_idx` (`team2_player_id`),

  KEY `team2_partner_id_idx` (`team2_partner_id`)

) ENGINE=MyISAM  DEFAULT CHARSET=utf8 COMMENT='Spiele zwischen zwei oder
vier Sportlern (Einzel und Doppel)' AUTO_INCREMENT=46665 ;

I want to get all matches for each athlete and calculate statistics such as
number of matches etc.

The basic very simplified query is like

SELECT a.id, COUNT(*)

FROM athletes a

LEFT JOIN matches m ON ( m.team1_player_id = a.id OR m.team1_partner_id = a.id OR m.team2_player_id = a.id OR
            m.team2_partner_id = a.id

)

WHERE a.gender = 'female'

GROUP BY a.id

Now the problem is, that mysql uses a full table scan to retrieve the
matches for an athlete, so the execution takes many seconds or even worse.

An athlete can be referenced in any of the m.team1_player_id OR
m.team1_partner_id OR m.team2_player_id OR m.team2_partner_id. (That allows
doubles matches.)

Why is an full table scan necessary although there is an index on each of
these fields? So an index exists for each OR-part of the join condition...

Here the execution plan:


id select_type table type possible_keys key key_len ref rows Extra

1

SIMPLE

a

ref

gender_index_idx

gender_index_idx

1

const

2193

Using where; Using temporary; Using filesort


1

SIMPLE

m

ALL

team1_player_id_idx,team1_partner_id_idx,team2_pla...

NULL

NULL

NULL

46664

Joining on each fields like the following is very fast and uses the index
but of course doesn't give me the expected result.

FROM athletes a

LEFT JOIN matches m ON (a.id = m.team1_player_id)

LEFT JOIN matches m2 ON (a.id = m2.team2_player_id)

Maybe I need to do a workaround using a UNION?

But this doesn't help either: (It takes 76 seconds)

FROM athletes a

LEFT JOIN (

(

SELECT team1_player_id AS player_id, teammatch_id, match_type, team1_score,
team2_score, team1_points, team2_points, no_fight
FROM matches

) UNION
(

SELECT team1_partner_id, teammatch_id, match_type, team1_score, team2_score,
team1_points, team2_points, no_fight
FROM matches

)

UNION
(

SELECT team2_player_id, teammatch_id, match_type, team1_score, team2_score,
team1_points, team2_points, no_fight
FROM matches

)

UNION
(

SELECT team2_partner_id, teammatch_id, match_type, team1_score, team2_score,
team1_points, team2_points, no_fight
FROM matches

)

) m ON (a.id = m.player_id)

I hope someone can help me with this.

Thanks in advance.


I think the problem here is that you have an ordered vector table (main1, partner1, main2, partner2) which requires you to optionally match on 4 separate columns to make your JOIN condition because the athlete you want to locate could be in any of those 4 places in the match record.

What would be faster is to normalize your match data into two tables. The first table would be just match information (id, date, location, etc) and another table of match-athlete pairs (match_id, athlete_id)

alternatively you would have to use a query like
(
select...
FROM athletes
INNER JOIN matches
  ON ... matches.member1
) UNION (
select...
FROM athletes
INNER JOIN matches
  ON ... matches.partner1
) UNION (
select...
FROM athletes
INNER JOIN matches
  ON ... matches.member2
) UNION (
select...
FROM athletes
INNER JOIN matches
  ON ... matches.partner2
)

in order to use the indexes for the JOINS.

--
Shawn Green, MySQL Senior Support Engineer
Sun Microsystems, Inc.
Office: Blountville, TN



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql?unsub=arch...@jab.org

Reply via email to