I agree that Michael's solution with the temporary tables is the best I have
seen so far.
I am running MySQL 4.0.15 so I don't have any subquery capability. As a
result, I went to DB2 to come up with one solution that does give the right
answer via a subquery:
select content, location, version, date
from versions
where concat(location, char(date)) in
(select concat(location, char(max(date)))
from versions
where content = 'ALPHA'
group by location);
I don't know if this will work in MySQL 4.1.x though. You may want to give
it a try if you have 4.1.x. By the way, I'm not convinced that this is the
*best* solution using a subquery; it's just the first one I could think of.
I don't have all day to spend on this ;-)
One other thought: is it really necessary to keep track of all of the
different versions you have had at each location? I can't think of a lot of
cases where I'd really care that I had version 8 of the ALPHA in Tokyo last
week or last year. Your query would be simpler - and the volume of data
would be less - if you only kept track of the current version for each
content at each location.
Rhino
----- Original Message -----
From: "Michael Stassen" <[EMAIL PROTECTED]>
To: "Jose Miguel P�rez" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Thursday, September 16, 2004 10:47 AM
Subject: Re: Query with group by
>
> Jose Miguel P�rez wrote:
>
> > Hi Vincent!
> >
> >>i'm trying to built a query that report me the latest tuple for a given
> >>field.
> >>I explain myself a bit. Let's say i have a table such this :
> >
> > There is no simple solution to your problem. Strictly speaking, it's
> > forbidden to name a column in the SELECT clause which is NOT on the
GROUP BY
> > clause. (Read chapter: "13.9.3 GROUP BY with Hidden Fields" from the
MySQL
> > Documentation.). Anyway, MySQL is relaxed from ANSI SQL in such a way
it's
> > permitted to include a column which is not in the GROUP BY clause
(hidden
> > columns). The values returned from such a query are "unpredictable".
> >
> > I started saying there is no _simple_ solution. However, you can
think a
> > little bit and your problem goes away. I can think of two solutions to
your
> > problem which neither involves a subquery.
> >
> > You are lucky in that you know for sure what your last value is (you
do
> > have a "date" field"). You already know the value of using MAX(date) in
the
> > SELECT. You can concatenate the date column with the version, and return
a
> > "SUBSTRING" from that operation, something like this:
> >
> > SELECT SUBSTRING_INDEX(MAX(CONCAT(date, ' ', version)), ' ', -1)
AS
> > correct_version,
> > location, version AS wrong_version
> > FROM cities
> > WHERE content='ALPHA'
> > GROUP BY location
>
> This is the MAX-CONCAT trick. It works, but it's inefficient, as it has
to
> do a full table scan, with calculations done on each row. An index on
date
> cannot be used in this case, because we are searching for the maximum
> CONCAT(date, ' ', version), rather than the maximum date.
>
> > You will get the following result set: (Notice the "correct_version"
> > column).
> >
> > +-----------------+----------+---------------+
> > | correct_version | location | wrong_version |
> > +-----------------+----------+---------------+
> > | 10 | NEW-YORK | 11 |
> > | 10 | PARIS | 10 |
> > | 11 | TOKYO | 10 |
> > +-----------------+----------+---------------+
> >
> > There is another option, however, and this one is what I like most.
In
> > two words, you can join the table with itself, like this:
> >
> > SELECT c1.date, c1.location, c1.version
> > FROM cities c1
> > LEFT JOIN cities c2
> > ON c1.location=c2.location AND c1.content=c2.content
> > AND c2.date>c1.date
> > WHERE c2.id IS NULL AND c1.content = 'ALPHA'
> >
> > What we have done here is first LEFT JOIN the table (cities in my
> > example) with itself so that we have in the left part those rows which
have
> > the maximum date and with NULL on the right as the value of c2. We then
> > select those saying "WHERE c2.id IS NULL". The last operation is
selecting
> > the rows for an 'ALPHA' content.
>
> This will work, but it is also somewhat inefficient. The LEFT JOIN is
> creating numerous extra, unwanted rows, only to throw them away with the
> WHERE c2.id IS NULL. Assuming n rows for a particular location value, you
> are creating 1 + (n * (n - 1)/2) rows {optimized down to n rows, if id is
> defined NOT NULL} to find the one row you want for that group. That's no
> big deal for this small sample table, but it may not scale well.
>
> > Notice that with this version we also eliminate the GROUP BY, since
all
> > rows returned from the join are unique. The result set returned with the
> > last example is this:
> >
> > +------------+----------+---------+
> > | date | location | version |
> > +------------+----------+---------+
> > | 2004-09-16 | PARIS | 10 |
> > | 2004-09-16 | NEW-YORK | 10 |
> > | 2004-09-15 | TOKYO | 11 |
> > +------------+----------+---------+
>
> The most efficient way is probably to use a temporary table.
>
> CREATE TEMPORARY TABLE max_dates
> SELECT location, MAX(date) AS max_date
> FROM temp
> WHERE content = 'ALPHA'
> GROUP BY location;
>
> SELECT t.*
> FROM temp t, max_dates m
> WHERE t.location = m.location
> AND t.date = m.max_date;
>
> DROP TABLE max_dates;
>
> The manual describes the MAX-CONCAT trick, the temporary table solution,
and
> a subquery solution for 4.1+
> <http://dev.mysql.com/doc/mysql/en/example-Maximum-column-group-row.html>.
>
> > Cheers,
> > Jose Miguel.
>
> Michael
>
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
>
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]