James Nobis <[EMAIL PROTECTED]> wrote on 04/12/2005 10:19:33 AM: > Hi all, > > Essentially I was asked to look into a problem with a query at the > office which > used AVG on a datetime column which produces incorrect results. (MySQL 4.1.10 > on RHEL 3 update 4) As you can see below the stark difference between the > correct and incorrect results. Though, the incorrect result has the correct > year (this seems actually to be coincidence and nothing more) but I > am not sure > exactly what MySQL did for casting. The query produces no errors orwarnings. > I *think* the incorrect result is an "internal" timestamp format but with the > wrong data. Is there a part of the manual that I missed which explains this > unexpected behavior? > > AVG(col) v FROM_UNIXTIME(AVG(UNIX_TIMESTAMP(col))) > 20050258324839 v 2005-03-24 02:05:11 > > James Nobis > Web Developer > Academic Superstore > 223 W. Anderson Ln. Suite A110, Austin, TX 78752 > Voice: (512) 450-1199 x453 Fax: (512) 450-0263 > http://www.academicsuperstore.com > > ---------------------------------------------------------------- > This message was sent using IMP, the Internet Messaging Program. >
I can't find it in the online manual but it used to explain that datetime values will be cast as a number in the form of YYYYMMDDHHNNSS or as a string like 'YYYY-MM-DD HH:NN:SS'. AVG() is a numeric function so if you are averaging datetime values, they will appear in their numeric format. As an example, if you needed the AVG() of the times '2005-04-30 12:55:59' and '2005-04-30 12:56:01' you would have wound up averaging the following numbers: 20050430125559 20050430125601 Which does not numerically average out to an numeric representaion of '2005-04-30 12:56:00' but instead to 20050430125680 which is the numeric form of the time '2005-04-30 12:55:80'. However, by first converting the dates to "seconds from epoch" (by using UNIX_TIMESTAMP()) you take the average of the numbers 1114880159 1114880161 which works out to 1114880160. Run that value through FROM_UNIXTIME() and you get: select from_unixtime(1114880160); +---------------------------+ | from_unixtime(1114880160) | +---------------------------+ | 2005-04-30 12:56:00 | +---------------------------+ 1 row in set (0.00 sec) Which is exactly what your test revealed. Basically, you got hosed results for your first column because the AVG() and SUM() aggregate functions do not work with dates or times in a format that is conducive to proper "date" math. They use the human-readable version of the date(a packed number) and not a machine-useful, mathematically sound value like the date's offset from epoch. Shawn Green Database Administrator Unimin Corporation - Spruce Pine