Re: simple but frustrating query

2004-10-15 Thread Jeff Mathis
i don't think i missed any points raised by anyone in this discussion. 
email is often a difficult medium for technical issues.

for most of our purposes, we run multiple queries in order to make sure 
we are actually getting the data we want. it makes the code simpler, 
easier to understand and less error prone. this was one example where 
conceptually we thought it should be able to be done -- and it can, if 
you use a nested select.

thanks again to all for helped. we're on to bigger and better things...
jeff
Michael Stassen wrote:
Jeff Mathis wrote:
well, obviously some, if not all, of what you are saying is true. your 

I don't believe I said anything untrue.  Did you have something in mind?
table example below though is not the same as mine. My table stores 
time series data. for every symbol, there are a series of rows all 
with different dates. there is a unique constraint on the combination 
of symbol and close_date. so, for every symbol, there is one and only 
one maximum date. ...

You've missed my point.  My example table wasn't meant to be equivalent 
to yours.  Rather, it was a simple example to illustrate the idea that 
the MAX() function does not look for the row with the largest value.  
Instead, it tells you what the largest value is with no reference to 
row.  That is, it is an aggregate function for use with GROUP BY.  In 
other words, it is a summary statistic, not a data point.

Now, you may know that in your particular situation there will be only 
one row for each group's max date, but the MAX() function doesn't know 
that. Even if it did (because your dates are unique), it still wouldn't 
make sense for MAX() to think in terms of finding a row, because it is 
perfectly reasonable to ask for the MIN() in the same query.

Look at my example again.  I asked for the max, the min, and the 
average. Even if we change that to use your table rather than mine, 
which row should be pulled?  The one with the max date, the one with the 
min date, or the one with the average date (which probably doesn't even 
exist)?

... i want that row and the name field it contains.

Right, I got that.
your example using the subquery works. when we used the subquery 
approach, we forgot to include the equivalent of t1.symbol = t2.symbol.

I'm glad it worked.  I was confident that it would.
if we use:
select max(close_date), symbol, name from TD
where symbol in (quoted char string) group by symbol,
name order by symbol;
we end up getting multiple rows for each symbol if the names change 
over time. but that's ok for now -- we can parse the query output 
within our application and get the one row with the most recent date.

Right.  Adding name to the GROUP BY makes it legitimate to have name in 
the list of columns to select, but this query doesn't do what you want.  
It gives the maximum close_date for each symbol-name combination.  As 
you say, you can parse the results to find the max close_date for each 
symbol subsection, but why would you do that when you already have a 
query which gives exactly the result you want?

what we want to get is conceptually simple, but perhaps not so in 
terms of SQL.

It's easy to say, but describe how to do it:  For a given symbol, you 
have to look at all the close_date values to find the max, then you have 
to find the row with that value; or sort by close_date, then take the 
row at the high end of the sorted list; or compare the rows two at a 
time, storing the rest of the row for the winner of each comparison.  
They all amount to the same thing: a 2 step process.  Those two steps 
are accomplished by the 2 queries in the temp table method, or by the 
query + subquery.

Doing this in SQL, however, is tricky enough that yours is a frequently 
asked question.

jeff

Michael
Michael Stassen wrote:
No, Shawn's answer is correct.  You are starting from a false 
assumption. You are expecting that MAX(closedate) corresponds to a 
row.  It does not. Consider the table

Table=stuff:
  sym val  note
  --- ---  
  AAA   2   one
  AAA   2   two
  AAA   4   three
  AAA   6   four
  AAA  12   five
  AAA   7   six
  AAA  12   seven
  BBB   1   eight
  BBB   2   nine
  BBB   3   ten
Now consider the query
  SELECT sym, note, MAX(val), MIN(val), AVG(val)
  FROM stuff
  GROUP BY sym;
Which row should be returned for sym='AAA'?  Do you see the problem?  
It is clear that, for sym=AAA, MIN(val) is 2, MAX(val) is 12, and 
AVG(val) is about 6.43.  Which row is that?  The answer is that it's 
not a row.  MAX(), MIN(), and AVG() are aggregate functions.  They do 
not return rows, they return summary stats about a set of rows.

Many dbs wouldn't even allow a query like that, because we are asking 
for a column not included in the GROUP BY.  Mysql allows that, but 
the manual warns that it is pointless to do so if the extra column 
does not have a unique value per group 
http://dev.mysql.com/doc/mysql/en/GROUP-BY-hidden-fields.html.

Finding the notes which correspond to the maximum val 

RE: simple but frustrating query

2004-10-14 Thread Ed Lazor
 -Original Message-
 what we want is the value for the name field corresponding to the row 
 with the most recent close_date.

Based on that comment, I'd

select name from TD order by close_date DESC limit 1


 
 something like this:
 
 select max(close_date), symbol, name from TD where symbol in 
 (list of 
 characters) group by symbol;
 
 this returns the max close_date value, but the name that is 
 returned is 
 garbage and seems to point to the earliest row in the table.
 
 any suggestions?
 
 jeff


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: simple but frustrating query

2004-10-14 Thread Jeff Mathis
Ed Lazor wrote:
-Original Message-
what we want is the value for the name field corresponding to the row 
with the most recent close_date.

Based on that comment, I'd
select name from TD order by close_date DESC limit 1
except, we run into problems when there is a list of values for symbol 
in the query.

for example
select max(close_date), symbol, name from TD where symbol in
('aa','bb','cc','dd','ee') 
in fact this is the real problem. for a single value of symbol, we can 
do this query. but we want to feed in a list of values for symbol




something like this:
select max(close_date), symbol, name from TD where symbol in 
(list of 
characters) group by symbol;

this returns the max close_date value, but the name that is 
returned is 
garbage and seems to point to the earliest row in the table.

any suggestions?
jeff


--
Jeff Mathis, Ph.D.  505-955-1434
Prediction Company  [EMAIL PROTECTED]
525 Camino de los Marquez, Ste 6http://www.predict.com
Santa Fe, NM 87505
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


Re: simple but frustrating query

2004-10-14 Thread Joe Audette
How about
 
select close_date, symbol, name 
from TD 
where symbol in (list of 
characters)  limit 1
ORDER BY close_date desc
 


Jeff Mathis [EMAIL PROTECTED] wrote:
hello query gurus.

we have a table TD with the following columns:

close_date
symbol
name

close_date is just a date field

there is a unique constraint on the combination of close_date and symbol.

what we want is the value for the name field corresponding to the row 
with the most recent close_date.

something like this:

select max(close_date), symbol, name from TD where symbol in (
characters) group by symbol;

this returns the max close_date value, but the name that is returned is 
garbage and seems to point to the earliest row in the table.

any suggestions?

jeff

ps we're using mysql 4.1.3 with the innodb engine


-- 
Jeff Mathis, Ph.D. 505-955-1434
Prediction Company [EMAIL PROTECTED]
525 Camino de los Marquez, Ste 6 http://www.predict.com
Santa Fe, NM 87505


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]



[EMAIL PROTECTED]
http://www.joeaudette.com
http://www.mojoportal.com

Re: simple but frustrating query

2004-10-14 Thread Jeff Mathis
I'll be more explicit:
select max(close_date), symbol, name from TD
where symbol in ('aa','bb','cc') group by symbol, name order by symbol;
returns
+-++-+
| max(close_date) | symbol | name|
+-++-+
| 2004-10-05  | aa | cmptrhw |
| 2004-10-05  | bb | biotech |
| 2002-05-03  | bb | drugs   |
| 2002-02-05  | bb | medprovr|
| 2004-10-05  | cc | biotech |
| 2002-05-03  | cc | drugs   |
| 2002-02-05  | cc | infosvcs|
+-++-+
now, leaving off name from the group by clause
(select max(close_date), symbol, name from TD
where symbol in ('aa','bb','cc') group by symbol;)
gives
+-++-+
| max(close_date) | symbol | name|
+-++-+
| 2004-10-05  | aa | cmptrhw |
| 2004-10-05  | bb | drugs   |
| 2004-10-05  | cc | infosvcs|
+-++-+
which is wrong. what we want is
+-++-+
| max(close_date) | symbol | name|
+-++-+
| 2004-10-05  | aa | cmptrhw |
| 2004-10-05  | bb | biotech |
| 2004-10-05  | cc | biotech |
+-++-+
but we can't seem to fomrulate the query.


Jeff Mathis wrote:
Ed Lazor wrote:
-Original Message-
what we want is the value for the name field corresponding to the row 
with the most recent close_date.

Based on that comment, I'd
select name from TD order by close_date DESC limit 1

except, we run into problems when there is a list of values for symbol 
in the query.

for example
select max(close_date), symbol, name from TD where symbol in
('aa','bb','cc','dd','ee') 
in fact this is the real problem. for a single value of symbol, we can 
do this query. but we want to feed in a list of values for symbol




something like this:
select max(close_date), symbol, name from TD where symbol in (list 
of characters) group by symbol;

this returns the max close_date value, but the name that is returned 
is garbage and seems to point to the earliest row in the table.

any suggestions?
jeff




--
Jeff Mathis, Ph.D.  505-955-1434
Prediction Company  [EMAIL PROTECTED]
525 Camino de los Marquez, Ste 6http://www.predict.com
Santa Fe, NM 87505
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


Re: simple but frustrating query

2004-10-14 Thread SGreen
It takes two steps: first determine the max(closedate) for each symbol, 
then use those results to get the name field. You could do this with a 
subquery (both steps in the one statement) because you are using a version 
of MySQL  4.0.0 but here is a temp table implementation that will work 
with just about anyone.

CREATE TEMPORARY TABLE tmpSymbols
SELECT symbol, max(close_date) as last_date
FROM TD
WHERE symbol in (list of symbols)
GROUP BY symbol;

SELECT ts.symbol, ts.last_date, TD.name
FROM tmpSymbols ts
INNER JOIN TD
ON TD.symbol = ts.symbol
AND TD.close_date = ts.last_date;


Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine



Jeff Mathis [EMAIL PROTECTED] wrote on 10/14/2004 02:22:32 PM:

 hello query gurus.
 
 we have a table TD with the following columns:
 
 close_date
 symbol
 name
 
 close_date is just a date field
 
 there is a unique constraint on the combination of close_date and 
symbol.
 
 what we want is the value for the name field corresponding to the row 
 with the most recent close_date.
 
 something like this:
 
 select max(close_date), symbol, name from TD where symbol in (list of 
 characters) group by symbol;
 
 this returns the max close_date value, but the name that is returned is 
 garbage and seems to point to the earliest row in the table.
 
 any suggestions?
 
 jeff
 
 ps we're using mysql 4.1.3 with the innodb engine
 
 
 -- 
 Jeff Mathis, Ph.D. 505-955-1434
 Prediction Company [EMAIL PROTECTED]
 525 Camino de los Marquez, Ste 6   http://www.predict.com
 Santa Fe, NM 87505
 
 
 -- 
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
 


Re: simple but frustrating query

2004-10-14 Thread Joe Audette

+-++-+
| max(close_date) | symbol | name|
+-++-+
| 2004-10-05  | aa | cmptrhw |
| 2004-10-05  | bb | biotech |
| 2004-10-05  | cc | biotech |
+-++-+

.
 
OK I see what you want, this should do it
select close_date, symbol, name from TD
where symbol in ('aa','bb','cc')  AND close_date = (SELECT max(close_date) FROM TD)
 



Jeff Mathis [EMAIL PROTECTED] wrote:
I'll be more explicit:

select max(close_date), symbol, name from TD
where symbol in ('aa','bb','cc') group by symbol, name order by symbol;

returns

+-++-+
| max(close_date) | symbol | name |
+-++-+
| 2004-10-05 | aa | cmptrhw |
| 2004-10-05 | bb | biotech |
| 2002-05-03 | bb | drugs |
| 2002-02-05 | bb | medprovr |
| 2004-10-05 | cc | biotech |
| 2002-05-03 | cc | drugs |
| 2002-02-05 | cc | infosvcs |
+-++-+

now, leaving off name from the group by clause
(select max(close_date), symbol, name from TD
where symbol in ('aa','bb','cc') group by symbol;)

gives

+-++-+
| max(close_date) | symbol | name |
+-++-+
| 2004-10-05 | aa | cmptrhw |
| 2004-10-05 | bb | drugs |
| 2004-10-05 | cc | infosvcs |
+-++-+

which is wrong. what we want is

+-++-+
| max(close_date) | symbol | name |
+-++-+
| 2004-10-05 | aa | cmptrhw |
| 2004-10-05 | bb | biotech |
| 2004-10-05 | cc | biotech |
+-++-+

but we can't seem to fomrulate the query.






Jeff Mathis wrote:
 Ed Lazor wrote:
 
 -Original Message-
 what we want is the value for the name field corresponding to the row 
 with the most recent close_date.



 Based on that comment, I'd

 select name from TD order by close_date DESC limit 1
 
 
 except, we run into problems when there is a list of values for symbol 
 in the query.
 
 for example
 
 select max(close_date), symbol, name from TD where symbol in
 ('aa','bb','cc','dd','ee') 
 
 in fact this is the real problem. for a single value of symbol, we can 
 do this query. but we want to feed in a list of values for symbol
 
 
 



 something like this:

 select max(close_date), symbol, name from TD where symbol in (
 of characters) group by symbol;

 this returns the max close_date value, but the name that is returned 
 is garbage and seems to point to the earliest row in the table.

 any suggestions?

 jeff



 
 


-- 
Jeff Mathis, Ph.D. 505-955-1434
Prediction Company [EMAIL PROTECTED]
525 Camino de los Marquez, Ste 6 http://www.predict.com
Santa Fe, NM 87505


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]



[EMAIL PROTECTED]
http://www.joeaudette.com
http://www.mojoportal.com

Re: simple but frustrating query

2004-10-14 Thread Jeff Mathis
we really don't want to issue two queries. this should be able to be 
done in one, and without using temp tables, but maybe not.

thanks for the help though
jeff
[EMAIL PROTECTED] wrote:
It takes two steps: first determine the max(closedate) for each symbol, 
then use those results to get the name field. You could do this with a 
subquery (both steps in the one statement) because you are using a version 
of MySQL  4.0.0 but here is a temp table implementation that will work 
with just about anyone.

CREATE TEMPORARY TABLE tmpSymbols
SELECT symbol, max(close_date) as last_date
FROM TD
X-Mozilla-Status: 8000
X-Mozilla-Status2: 
WHERE symbol in (list of symbols)
GROUP BY symbol;
SELECT ts.symbol, ts.last_date, TD.name
FROM tmpSymbols ts
X-Mozilla-Status: 8000
X-Mozilla-Status2: 
INNER JOIN TD
ON TD.symbol = ts.symbol
AND TD.close_date = ts.last_date;
Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine

Jeff Mathis [EMAIL PROTECTED] wrote on 10/14/2004 02:22:32 PM:

hello query gurus.
we have a table TD with the following columns:
close_date
symbol
name
close_date is just a date field
there is a unique constraint on the combination of close_date and 
symbol.
what we want is the value for the name field corresponding to the row 
with the most recent close_date.

something like this:
select max(close_date), symbol, name from TD where symbol in (list of 
characters) group by symbol;

this returns the max close_date value, but the name that is returned is 
garbage and seems to point to the earliest row in the table.

any suggestions?
jeff
ps we're using mysql 4.1.3 with the innodb engine
--
Jeff Mathis, Ph.D. 505-955-1434
Prediction Company [EMAIL PROTECTED]
525 Camino de los Marquez, Ste 6   http://www.predict.com
Santa Fe, NM 87505
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


--
Jeff Mathis, Ph.D.  505-955-1434
Prediction Company  [EMAIL PROTECTED]
525 Camino de los Marquez, Ste 6http://www.predict.com
Santa Fe, NM 87505
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


Re: simple but frustrating query

2004-10-14 Thread Brad Eacker
Jeff Mathis writes:
now, leaving off name from the group by clause
(select max(close_date), symbol, name from TD
where symbol in ('aa','bb','cc') group by symbol;)

gives

+-++-+
| max(close_date) | symbol | name|
+-++-+
| 2004-10-05  | aa | cmptrhw |
| 2004-10-05  | bb | drugs   |
| 2004-10-05  | cc | infosvcs|
+-++-+

Jeff,
 What version of MySQL are you using?  I ran your query on
4.0.18 and got a different answer...

mysql select max(close_date), symbol, name from TD
- where symbol in ('aa','bb','cc') group by symbol;
+-++-+
| max(close_date) | symbol | name|
+-++-+
| 2004-10-05  | aa | cmptrhw |
| 2004-10-05  | bb | biotech |
| 2004-10-05  | cc | biotech |
+-++-+
3 rows in set (0.01 sec)

Brad Eacker ([EMAIL PROTECTED])



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: simple but frustrating query

2004-10-14 Thread Michael Stassen
No, Shawn's answer is correct.  You are starting from a false assumption. 
You are expecting that MAX(closedate) corresponds to a row.  It does not. 
Consider the table

Table=stuff:
  sym val  note
  --- ---  
  AAA   2   one
  AAA   2   two
  AAA   4   three
  AAA   6   four
  AAA  12   five
  AAA   7   six
  AAA  12   seven
  BBB   1   eight
  BBB   2   nine
  BBB   3   ten
Now consider the query
  SELECT sym, note, MAX(val), MIN(val), AVG(val)
  FROM stuff
  GROUP BY sym;
Which row should be returned for sym='AAA'?  Do you see the problem?  It is 
clear that, for sym=AAA, MIN(val) is 2, MAX(val) is 12, and AVG(val) is 
about 6.43.  Which row is that?  The answer is that it's not a row.  MAX(), 
MIN(), and AVG() are aggregate functions.  They do not return rows, they 
return summary stats about a set of rows.

Many dbs wouldn't even allow a query like that, because we are asking for a 
column not included in the GROUP BY.  Mysql allows that, but the manual 
warns that it is pointless to do so if the extra column does not have a 
unique value per group 
http://dev.mysql.com/doc/mysql/en/GROUP-BY-hidden-fields.html.

Finding the notes which correspond to the maximum val is fundamentally a 2 
step process.  First you must find the maximum val, then you must find the 
rows(s) which have that val.  This is what Shawn was telling you.

The manual suggests 3 ways to solve this problem 
http://dev.mysql.com/doc/mysql/en/example-Maximum-column-group-row.html. 
The most efficient solution, and the one that works in all versions of 
mysql, is to use a temporary table, as Shawn described.  As you have mysql 
4.1, you could accomplish the same thing with a subquery.  In your case, 
that would be

  SELECT close_date, symbol, name
  FROM TD t1
  WHERE close_date = (SELECT MAX(t2.close_date)
  FROM TD t2
  WHERE t1.symbol = t2.symbol)
  AND symbol IN (list of characters);
Note that this is still really a 2 step process.  The subquery handles the 
first step, finding the max close_date, while the parent query handles step 
2, finding the matching rows.

There is a third way, the MAX-CONCAT trick.  It does it in one query without 
subqueries, and is very inefficient.  See the manual for the details.

In other words, this wasn't such a simple query, after all.
Michael
Jeff Mathis wrote:
we really don't want to issue two queries. this should be able to be 
done in one, and without using temp tables, but maybe not.

thanks for the help though
jeff
[EMAIL PROTECTED] wrote:
It takes two steps: first determine the max(closedate) for each 
symbol, then use those results to get the name field. You could do 
this with a subquery (both steps in the one statement) because you are 
using a version of MySQL  4.0.0 but here is a temp table 
implementation that will work with just about anyone.

CREATE TEMPORARY TABLE tmpSymbols
SELECT symbol, max(close_date) as last_date
FROM TD
WHERE symbol in (list of symbols)
GROUP BY symbol;
SELECT ts.symbol, ts.last_date, TD.name
FROM tmpSymbols ts
INNER JOIN TD
ON TD.symbol = ts.symbol
AND TD.close_date = ts.last_date;
Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine

Jeff Mathis [EMAIL PROTECTED] wrote on 10/14/2004 02:22:32 PM:

hello query gurus.
we have a table TD with the following columns:
close_date
symbol
name
close_date is just a date field
there is a unique constraint on the combination of close_date and 

symbol.
what we want is the value for the name field corresponding to the row 
with the most recent close_date.

something like this:
select max(close_date), symbol, name from TD where symbol in (list 
of characters) group by symbol;

this returns the max close_date value, but the name that is returned 
is garbage and seems to point to the earliest row in the table.

any suggestions?
jeff
ps we're using mysql 4.1.3 with the innodb engine
--
Jeff Mathis, Ph.D. 505-955-1434
Prediction Company [EMAIL PROTECTED]
525 Camino de los Marquez, Ste 6   http://www.predict.com
Santa Fe, NM 87505

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


Re: simple but frustrating query

2004-10-14 Thread Jeff Mathis
4.1.3 and the innodb engine on solaris 5.8
Brad Eacker wrote:
Jeff Mathis writes:
now, leaving off name from the group by clause
(select max(close_date), symbol, name from TD
where symbol in ('aa','bb','cc') group by symbol;)
gives
+-++-+
| max(close_date) | symbol | name|
+-++-+
| 2004-10-05  | aa | cmptrhw |
| 2004-10-05  | bb | drugs   |
| 2004-10-05  | cc | infosvcs|
+-++-+

Jeff,
 What version of MySQL are you using?  I ran your query on
4.0.18 and got a different answer...
mysql select max(close_date), symbol, name from TD
- where symbol in ('aa','bb','cc') group by symbol;
+-++-+
| max(close_date) | symbol | name|
+-++-+
| 2004-10-05  | aa | cmptrhw |
| 2004-10-05  | bb | biotech |
| 2004-10-05  | cc | biotech |
+-++-+
3 rows in set (0.01 sec)
Brad Eacker ([EMAIL PROTECTED])


--
Jeff Mathis, Ph.D.  505-955-1434
Prediction Company  [EMAIL PROTECTED]
525 Camino de los Marquez, Ste 6http://www.predict.com
Santa Fe, NM 87505
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


Re: simple but frustrating query

2004-10-14 Thread Jeff Mathis
well, obviously some, if not all, of what you are saying is true. your 
table example below though is not the same as mine. My table stores time 
series data. for every symbol, there are a series of rows all with 
different dates. there is a unique constraint on the combination of 
symbol and close_date. so, for every symbol, there is one and only one 
maximum date. i want that row and the name field it contains.

your example using the subsquery works. when we used the subquery 
approach, we forgot to include the equivalent of t1.symbol = t2.symbol.

if we use:
select max(close_date), symbol, name from TD
where symbol in (quoted char string) group by symbol,
name order by symbol;
we end up getting multiple rows for each symbol if the names change over 
time. but thats ok for now -- we can parse the query output within our 
application and get the one row with the most recent date.

what we want to get is conceptually simple, but perhaps not so in terms 
of SQL.

jeff
Michael Stassen wrote:
No, Shawn's answer is correct.  You are starting from a false 
assumption. You are expecting that MAX(closedate) corresponds to a row.  
It does not. Consider the table

Table=stuff:
  sym val  note
  --- ---  
  AAA   2   one
  AAA   2   two
  AAA   4   three
  AAA   6   four
  AAA  12   five
  AAA   7   six
  AAA  12   seven
  BBB   1   eight
  BBB   2   nine
  BBB   3   ten
Now consider the query
  SELECT sym, note, MAX(val), MIN(val), AVG(val)
  FROM stuff
  GROUP BY sym;
Which row should be returned for sym='AAA'?  Do you see the problem?  It 
is clear that, for sym=AAA, MIN(val) is 2, MAX(val) is 12, and AVG(val) 
is about 6.43.  Which row is that?  The answer is that it's not a row.  
MAX(), MIN(), and AVG() are aggregate functions.  They do not return 
rows, they return summary stats about a set of rows.

Many dbs wouldn't even allow a query like that, because we are asking 
for a column not included in the GROUP BY.  Mysql allows that, but the 
manual warns that it is pointless to do so if the extra column does not 
have a unique value per group 
http://dev.mysql.com/doc/mysql/en/GROUP-BY-hidden-fields.html.

Finding the notes which correspond to the maximum val is fundamentally a 
2 step process.  First you must find the maximum val, then you must find 
the rows(s) which have that val.  This is what Shawn was telling you.

The manual suggests 3 ways to solve this problem 
http://dev.mysql.com/doc/mysql/en/example-Maximum-column-group-row.html. 
The most efficient solution, and the one that works in all versions of 
mysql, is to use a temporary table, as Shawn described.  As you have 
mysql 4.1, you could accomplish the same thing with a subquery.  In your 
case, that would be

  SELECT close_date, symbol, name
  FROM TD t1
  WHERE close_date = (SELECT MAX(t2.close_date)
  FROM TD t2
  WHERE t1.symbol = t2.symbol)
  AND symbol IN (list of characters);
Note that this is still really a 2 step process.  The subquery handles 
the first step, finding the max close_date, while the parent query 
handles step 2, finding the matching rows.

There is a third way, the MAX-CONCAT trick.  It does it in one query 
without subqueries, and is very inefficient.  See the manual for the 
details.

In other words, this wasn't such a simple query, after all.
Michael
Jeff Mathis wrote:
we really don't want to issue two queries. this should be able to be 
done in one, and without using temp tables, but maybe not.

thanks for the help though
jeff
[EMAIL PROTECTED] wrote:
It takes two steps: first determine the max(closedate) for each 
symbol, then use those results to get the name field. You could do 
this with a subquery (both steps in the one statement) because you 
are using a version of MySQL  4.0.0 but here is a temp table 
implementation that will work with just about anyone.

CREATE TEMPORARY TABLE tmpSymbols
SELECT symbol, max(close_date) as last_date
FROM TD
WHERE symbol in (list of symbols)
GROUP BY symbol;
SELECT ts.symbol, ts.last_date, TD.name
FROM tmpSymbols ts
INNER JOIN TD
ON TD.symbol = ts.symbol
AND TD.close_date = ts.last_date;
Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine

Jeff Mathis [EMAIL PROTECTED] wrote on 10/14/2004 02:22:32 PM:

hello query gurus.
we have a table TD with the following columns:
close_date
symbol
name
close_date is just a date field
there is a unique constraint on the combination of close_date and 

symbol.
what we want is the value for the name field corresponding to the 
row with the most recent close_date.

something like this:
select max(close_date), symbol, name from TD where symbol in (list 
of characters) group by symbol;

this returns the max close_date value, but the name that is returned 
is garbage and seems to point to the earliest row in the table.

any suggestions?
jeff
ps we're using mysql 4.1.3 with the innodb engine
--
Jeff Mathis, Ph.D. 505-955-1434
Prediction Company [EMAIL 

Re: simple but frustrating query

2004-10-14 Thread Michael Stassen
Jeff Mathis wrote:
well, obviously some, if not all, of what you are saying is true. your 
I don't believe I said anything untrue.  Did you have something in mind?
table example below though is not the same as mine. My table stores time 
series data. for every symbol, there are a series of rows all with 
different dates. there is a unique constraint on the combination of 
symbol and close_date. so, for every symbol, there is one and only one 
maximum date. ...
You've missed my point.  My example table wasn't meant to be equivalent to 
yours.  Rather, it was a simple example to illustrate the idea that the 
MAX() function does not look for the row with the largest value.  Instead, 
it tells you what the largest value is with no reference to row.  That is, 
it is an aggregate function for use with GROUP BY.  In other words, it is a 
summary statistic, not a data point.

Now, you may know that in your particular situation there will be only one 
row for each group's max date, but the MAX() function doesn't know that. 
Even if it did (because your dates are unique), it still wouldn't make sense 
for MAX() to think in terms of finding a row, because it is perfectly 
reasonable to ask for the MIN() in the same query.

Look at my example again.  I asked for the max, the min, and the average. 
Even if we change that to use your table rather than mine, which row should 
be pulled?  The one with the max date, the one with the min date, or the one 
with the average date (which probably doesn't even exist)?

... i want that row and the name field it contains.
Right, I got that.
your example using the subquery works. when we used the subquery 
approach, we forgot to include the equivalent of t1.symbol = t2.symbol.
I'm glad it worked.  I was confident that it would.
if we use:
select max(close_date), symbol, name from TD
where symbol in (quoted char string) group by symbol,
name order by symbol;
we end up getting multiple rows for each symbol if the names change over 
time. but that's ok for now -- we can parse the query output within our 
application and get the one row with the most recent date.
Right.  Adding name to the GROUP BY makes it legitimate to have name in the 
list of columns to select, but this query doesn't do what you want.  It 
gives the maximum close_date for each symbol-name combination.  As you say, 
you can parse the results to find the max close_date for each symbol 
subsection, but why would you do that when you already have a query which 
gives exactly the result you want?

what we want to get is conceptually simple, but perhaps not so in terms 
of SQL.
It's easy to say, but describe how to do it:  For a given symbol, you have 
to look at all the close_date values to find the max, then you have to find 
the row with that value; or sort by close_date, then take the row at the 
high end of the sorted list; or compare the rows two at a time, storing the 
rest of the row for the winner of each comparison.  They all amount to the 
same thing: a 2 step process.  Those two steps are accomplished by the 2 
queries in the temp table method, or by the query + subquery.

Doing this in SQL, however, is tricky enough that yours is a frequently 
asked question.

jeff
Michael
Michael Stassen wrote:
No, Shawn's answer is correct.  You are starting from a false 
assumption. You are expecting that MAX(closedate) corresponds to a 
row.  It does not. Consider the table

Table=stuff:
  sym val  note
  --- ---  
  AAA   2   one
  AAA   2   two
  AAA   4   three
  AAA   6   four
  AAA  12   five
  AAA   7   six
  AAA  12   seven
  BBB   1   eight
  BBB   2   nine
  BBB   3   ten
Now consider the query
  SELECT sym, note, MAX(val), MIN(val), AVG(val)
  FROM stuff
  GROUP BY sym;
Which row should be returned for sym='AAA'?  Do you see the problem?  
It is clear that, for sym=AAA, MIN(val) is 2, MAX(val) is 12, and 
AVG(val) is about 6.43.  Which row is that?  The answer is that it's 
not a row.  MAX(), MIN(), and AVG() are aggregate functions.  They do 
not return rows, they return summary stats about a set of rows.

Many dbs wouldn't even allow a query like that, because we are asking 
for a column not included in the GROUP BY.  Mysql allows that, but the 
manual warns that it is pointless to do so if the extra column does 
not have a unique value per group 
http://dev.mysql.com/doc/mysql/en/GROUP-BY-hidden-fields.html.

Finding the notes which correspond to the maximum val is fundamentally 
a 2 step process.  First you must find the maximum val, then you must 
find the rows(s) which have that val.  This is what Shawn was telling 
you.

The manual suggests 3 ways to solve this problem 
http://dev.mysql.com/doc/mysql/en/example-Maximum-column-group-row.html. 
The most efficient solution, and the one that works in all versions of 
mysql, is to use a temporary table, as Shawn described.  As you have 
mysql 4.1, you could accomplish the same thing with a subquery.  In 
your case, that would be

  SELECT close_date,