On Mon, Jan 24, 2011 at 2:20 PM, Kendall Gifford <zettab...@gmail.com>wrote:

>
>
> On Mon, Jan 24, 2011 at 3:40 AM, Joerg Bruehe <joerg.bru...@oracle.com>wrote:
>
>> Hi everybody!
>>
>>
>> Shawn Green (MySQL) wrote:
>> > On 1/21/2011 14:21, Kendall Gifford wrote:
>> >> Hello everyone, I've got a database on an old Fedora Core 4 server
>> >> running
>> >> MySQL 4 (mysql-server.x86_64 4.1.12-2.FC4.1). The database in question
>> >> has
>> >> just two (InnoDB) tables:
>> >>
>> >> messages (approx 2.5 million records)
>> >> recipients (approx 6.5 million records)
>> >>
>> >> [[ ... see the original post for the schema details ... ]]
>> >>
>> >>
>> >> I have the following query that is just too slow:
>> >>
>> >>> SELECT messages.* FROM messages
>> >>> INNER JOIN recipients ON recipients.message_id = messages.id
>> >>> WHERE recipients.employee_id = X
>> >>> GROUP BY messages.id
>> >>> ORDER BY sent_at DESC
>> >>> LIMIT 0, 25;
>> >>
>> >> This takes about 44 seconds on average. [[...]]
>> >>
>> >
>> > You need to get rid of the GROUP BY to make this go faster. You can do
>> > that by running two queries, one to pick the list of unique
>> > recipients.message_id values that match your where condition then
>> > another to actually retrieve the message data. [[...]]
>>
>> I don't want to contradict Shawn, but currently I fail to see the need
>> for the GROUP BY: Joining like this
>>
>>  messages INNER JOIN recipients ON recipients.message_id = messages.id
>>  WHERE recipients.employee_id = X
>>
>> can return only one row, unless there are multiple "recipients" records
>> for the same values of "message_id" and "employee_id".
>>
>> I don't know whether that can happen in the poster's application, and
>> whether it would cause trouble if the result line would occur multiple
>> times.
>>
>>
> In my application, there CAN in fact be several "recipients" records with
> both the same "message_id" foreign key value AND the same "employee_id"
> value (some employees may be a recipient of a message several times over via
> alternative addresses and/or aliases). However, as I rework things, I could
> probably rework application logic nuke the GROUP BY and just cope, in code,
> with these extra "messages" records in my result set. (Just FYI, the SQL
> query is simply the default query as created by rails or, more specifically,
> ActiveRecord 2.3.9 which I can/will-be optimizing).
>
> I will additionally be moving this database to a new server. However, for
> academic interest, I'll see if I can make time to post the query time(s)
> once I change the app, before moving the database to a new (and better
> configured) server.
>
>
Just an update for posterity, simply removing the GROUP BY clause of my
query above has, overall, no noticeable effect on performance. I suspect
server configuration, as pointed out by Reindl, is too much of a bottleneck
and is what I first need to change (working on that now). Perhaps the
removal of GROUP BY would/will be noticeable if the server configuration for
InnoDB tables wasn't so horrendous. I'll find out...

-- 
Kendall Gifford
zettab...@gmail.com

Reply via email to