James Nobis <[EMAIL PROTECTED]> wrote on 04/12/2005 
10:19:33 AM:

> Hi all,
> 
> Essentially I was asked to look into a problem with a query at the 
> office which
> used AVG on a datetime column which produces incorrect results.  (MySQL 
4.1.10
> on RHEL 3 update 4)  As you can see below the stark difference between 
the
> correct and incorrect results.  Though, the incorrect result has the 
correct
> year (this seems actually to be coincidence and nothing more) but I 
> am not sure
> exactly what MySQL did for casting.  The query produces no errors 
orwarnings. 
> I *think* the incorrect result is an "internal" timestamp format but 
with the
> wrong data.  Is there a part of the manual that I missed which explains 
this
> unexpected behavior?
> 
> AVG(col) v FROM_UNIXTIME(AVG(UNIX_TIMESTAMP(col)))
> 20050258324839 v 2005-03-24 02:05:11
> 
> James Nobis
> Web Developer
> Academic Superstore
> 223 W. Anderson Ln. Suite A110, Austin, TX 78752
> Voice: (512) 450-1199 x453 Fax: (512) 450-0263
> http://www.academicsuperstore.com
> 
> ----------------------------------------------------------------
> This message was sent using IMP, the Internet Messaging Program.
> 

I can't find it in the online manual but it used to explain that datetime 
values will be cast as a number in the form of YYYYMMDDHHNNSS or as a 
string like 'YYYY-MM-DD HH:NN:SS'. AVG() is a numeric function so if you 
are averaging datetime values, they will appear in their numeric format. 

As an example, if you needed the AVG() of the times '2005-04-30 12:55:59' 
and '2005-04-30 12:56:01' you would have wound up averaging the following 
numbers:

20050430125559
20050430125601

Which does not numerically average out to an numeric representaion of 
'2005-04-30 12:56:00' but instead to 20050430125680 which is the numeric 
form of the time '2005-04-30 12:55:80'.  However, by first converting the 
dates to "seconds from epoch" (by using UNIX_TIMESTAMP()) you take the 
average of the numbers

1114880159
1114880161

which works out to 1114880160. Run that value through FROM_UNIXTIME() and 
you get:

select from_unixtime(1114880160);
+---------------------------+
| from_unixtime(1114880160) |
+---------------------------+
| 2005-04-30 12:56:00       |
+---------------------------+
1 row in set (0.00 sec)

Which is exactly what your test revealed. Basically, you got hosed results 
for your first column because the AVG() and SUM() aggregate functions do 
not work with dates or times in a format that is conducive to proper 
"date" math. They use the human-readable version of the date(a packed 
number) and not a machine-useful, mathematically sound value like the 
date's offset from epoch.

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine

 

Reply via email to