On Mon, Jan 24, 2011 at 2:20 PM, Kendall Gifford <zettab...@gmail.com>wrote:
> > > On Mon, Jan 24, 2011 at 3:40 AM, Joerg Bruehe <joerg.bru...@oracle.com>wrote: > >> Hi everybody! >> >> >> Shawn Green (MySQL) wrote: >> > On 1/21/2011 14:21, Kendall Gifford wrote: >> >> Hello everyone, I've got a database on an old Fedora Core 4 server >> >> running >> >> MySQL 4 (mysql-server.x86_64 4.1.12-2.FC4.1). The database in question >> >> has >> >> just two (InnoDB) tables: >> >> >> >> messages (approx 2.5 million records) >> >> recipients (approx 6.5 million records) >> >> >> >> [[ ... see the original post for the schema details ... ]] >> >> >> >> >> >> I have the following query that is just too slow: >> >> >> >>> SELECT messages.* FROM messages >> >>> INNER JOIN recipients ON recipients.message_id = messages.id >> >>> WHERE recipients.employee_id = X >> >>> GROUP BY messages.id >> >>> ORDER BY sent_at DESC >> >>> LIMIT 0, 25; >> >> >> >> This takes about 44 seconds on average. [[...]] >> >> >> > >> > You need to get rid of the GROUP BY to make this go faster. You can do >> > that by running two queries, one to pick the list of unique >> > recipients.message_id values that match your where condition then >> > another to actually retrieve the message data. [[...]] >> >> I don't want to contradict Shawn, but currently I fail to see the need >> for the GROUP BY: Joining like this >> >> messages INNER JOIN recipients ON recipients.message_id = messages.id >> WHERE recipients.employee_id = X >> >> can return only one row, unless there are multiple "recipients" records >> for the same values of "message_id" and "employee_id". >> >> I don't know whether that can happen in the poster's application, and >> whether it would cause trouble if the result line would occur multiple >> times. >> >> > In my application, there CAN in fact be several "recipients" records with > both the same "message_id" foreign key value AND the same "employee_id" > value (some employees may be a recipient of a message several times over via > alternative addresses and/or aliases). However, as I rework things, I could > probably rework application logic nuke the GROUP BY and just cope, in code, > with these extra "messages" records in my result set. (Just FYI, the SQL > query is simply the default query as created by rails or, more specifically, > ActiveRecord 2.3.9 which I can/will-be optimizing). > > I will additionally be moving this database to a new server. However, for > academic interest, I'll see if I can make time to post the query time(s) > once I change the app, before moving the database to a new (and better > configured) server. > > Just an update for posterity, simply removing the GROUP BY clause of my query above has, overall, no noticeable effect on performance. I suspect server configuration, as pointed out by Reindl, is too much of a bottleneck and is what I first need to change (working on that now). Perhaps the removal of GROUP BY would/will be noticeable if the server configuration for InnoDB tables wasn't so horrendous. I'll find out... -- Kendall Gifford zettab...@gmail.com