I agree that Michael's solution with the temporary tables is the best I have seen so far.
I am running MySQL 4.0.15 so I don't have any subquery capability. As a result, I went to DB2 to come up with one solution that does give the right answer via a subquery: select content, location, version, date from versions where concat(location, char(date)) in (select concat(location, char(max(date))) from versions where content = 'ALPHA' group by location); I don't know if this will work in MySQL 4.1.x though. You may want to give it a try if you have 4.1.x. By the way, I'm not convinced that this is the *best* solution using a subquery; it's just the first one I could think of. I don't have all day to spend on this ;-) One other thought: is it really necessary to keep track of all of the different versions you have had at each location? I can't think of a lot of cases where I'd really care that I had version 8 of the ALPHA in Tokyo last week or last year. Your query would be simpler - and the volume of data would be less - if you only kept track of the current version for each content at each location. Rhino ----- Original Message ----- From: "Michael Stassen" <[EMAIL PROTECTED]> To: "Jose Miguel Pérez" <[EMAIL PROTECTED]> Cc: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]> Sent: Thursday, September 16, 2004 10:47 AM Subject: Re: Query with group by > > Jose Miguel Pérez wrote: > > > Hi Vincent! > > > >>i'm trying to built a query that report me the latest tuple for a given > >>field. > >>I explain myself a bit. Let's say i have a table such this : > > > > There is no simple solution to your problem. Strictly speaking, it's > > forbidden to name a column in the SELECT clause which is NOT on the GROUP BY > > clause. (Read chapter: "13.9.3 GROUP BY with Hidden Fields" from the MySQL > > Documentation.). Anyway, MySQL is relaxed from ANSI SQL in such a way it's > > permitted to include a column which is not in the GROUP BY clause (hidden > > columns). The values returned from such a query are "unpredictable". > > > > I started saying there is no _simple_ solution. However, you can think a > > little bit and your problem goes away. I can think of two solutions to your > > problem which neither involves a subquery. > > > > You are lucky in that you know for sure what your last value is (you do > > have a "date" field"). You already know the value of using MAX(date) in the > > SELECT. You can concatenate the date column with the version, and return a > > "SUBSTRING" from that operation, something like this: > > > > SELECT SUBSTRING_INDEX(MAX(CONCAT(date, ' ', version)), ' ', -1) AS > > correct_version, > > location, version AS wrong_version > > FROM cities > > WHERE content='ALPHA' > > GROUP BY location > > This is the MAX-CONCAT trick. It works, but it's inefficient, as it has to > do a full table scan, with calculations done on each row. An index on date > cannot be used in this case, because we are searching for the maximum > CONCAT(date, ' ', version), rather than the maximum date. > > > You will get the following result set: (Notice the "correct_version" > > column). > > > > +-----------------+----------+---------------+ > > | correct_version | location | wrong_version | > > +-----------------+----------+---------------+ > > | 10 | NEW-YORK | 11 | > > | 10 | PARIS | 10 | > > | 11 | TOKYO | 10 | > > +-----------------+----------+---------------+ > > > > There is another option, however, and this one is what I like most. In > > two words, you can join the table with itself, like this: > > > > SELECT c1.date, c1.location, c1.version > > FROM cities c1 > > LEFT JOIN cities c2 > > ON c1.location=c2.location AND c1.content=c2.content > > AND c2.date>c1.date > > WHERE c2.id IS NULL AND c1.content = 'ALPHA' > > > > What we have done here is first LEFT JOIN the table (cities in my > > example) with itself so that we have in the left part those rows which have > > the maximum date and with NULL on the right as the value of c2. We then > > select those saying "WHERE c2.id IS NULL". The last operation is selecting > > the rows for an 'ALPHA' content. > > This will work, but it is also somewhat inefficient. The LEFT JOIN is > creating numerous extra, unwanted rows, only to throw them away with the > WHERE c2.id IS NULL. Assuming n rows for a particular location value, you > are creating 1 + (n * (n - 1)/2) rows {optimized down to n rows, if id is > defined NOT NULL} to find the one row you want for that group. That's no > big deal for this small sample table, but it may not scale well. > > > Notice that with this version we also eliminate the GROUP BY, since all > > rows returned from the join are unique. The result set returned with the > > last example is this: > > > > +------------+----------+---------+ > > | date | location | version | > > +------------+----------+---------+ > > | 2004-09-16 | PARIS | 10 | > > | 2004-09-16 | NEW-YORK | 10 | > > | 2004-09-15 | TOKYO | 11 | > > +------------+----------+---------+ > > The most efficient way is probably to use a temporary table. > > CREATE TEMPORARY TABLE max_dates > SELECT location, MAX(date) AS max_date > FROM temp > WHERE content = 'ALPHA' > GROUP BY location; > > SELECT t.* > FROM temp t, max_dates m > WHERE t.location = m.location > AND t.date = m.max_date; > > DROP TABLE max_dates; > > The manual describes the MAX-CONCAT trick, the temporary table solution, and > a subquery solution for 4.1+ > <http://dev.mysql.com/doc/mysql/en/example-Maximum-column-group-row.html>. > > > Cheers, > > Jose Miguel. > > Michael > > -- > MySQL General Mailing List > For list archives: http://lists.mysql.com/mysql > To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] > -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]