Hi Michael! 

        Talking about the "query with group by issue"...

> I'll explain my reasoning below.
[...]
>  From the manual, section "7.2.8 How MySQL Optimizes LEFT 
> JOIN and RIGHT 
> JOIN" <http://dev.mysql.com/doc/mysql/en/LEFT_JOIN_optimization.html>:
> 
>    A LEFT JOIN B join_condition is implemented in MySQL as follows:
>    ...
>    * The LEFT JOIN condition is used to decide how to 
> retrieve rows from table B. (In other words, any condition in the
> WHERE clause is not used.)
>    ...
> 
> So, the "WHERE c2.id IS NULL" cannot be applied until after 
> the rows which match the ON clause (and the NULL rows) have been
> fetched.

        Thanks for your explanations here, it's clear now I was confused. I
missed this LEFT JOIN behaviour. I was in the idea the WHERE xx IS NULL was
taken into account.

[...]
> You are certainly right that "temporary" and "filesort" are 
> to be avoided. And they will be, if the table is properly indexed. Single 
> column indexing won't help much here, because the WHERE condition,
> the GROUP BY column, and the MAX column are all different.  A multi-column
> index on (content, location, date), however, will allow mysql to use the
> index to find the matching rows, find the groups, and calculate the MAX
date.

        I still refuse to use the temporary table solution, call me fussy
here. Indeed, I think giving more indexes than necessary is a bit redundant,
unless completely necessary. (You will end up having more space on indexes
than data itself). It's a "handle with care" issue for me.

>>     Anyway, I don't know if one can program an agregate UDF 
>>  called something like EXTERNAL_MAX(...) or something, so that we
>>  could do like:
>> 
>>         SELECT EXTERNAL_MAX(date, version)  ---> i.e: 
>>  Returns the "version" value for the row with MAX(date).
>>
>>     This, for sure, will be the best solution. ;-)
> 
> That would have to do the same thing behind the scenes.

        I have to beg you pardon here. ;-) Think again this solution doesn't
require a JOIN, nor a temporary table. I think the UDF solution doesn't have
to be less efficient than a MAX or AVG aggregate function by itself. I'm in
the process of creating such a monster :-) Let me know if you are curious
about and have time to test it, I will test MAX() and EXTERNAL_MAX() against
a very large table.

        Cheers,
        Jose Miguel.



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to