Re: timestamp and DST: impossible to backup database? *AND* bugs in TIMEDIFF, FROM_UNIXTIME, et.al.?

2004-11-29 Thread Peter Valdemar Mørch
Thank you Michael for your very thoughtful reply. I know that it takes 
time and effort to answer at the level you did.

Michael Stassen Michael.Stassen-at-verizon.net |Lists| wrote:
 You seem to have a fundamental misunderstanding of the TIMESTAMP type.
 No timezone or DST information is stored in a TIMESTAMP column.
Yup. I thought it could be used to unambiguously represent any and all 
points in time. It can't. Thats it in a nutshell.

I need to be able to sort, get and set the time unambiguously, also 
during the one problem hour in october. I need to know that if I put 
in a field with a time value I can reliably retrieve it again. And 
that if a record went in at time X and another in at time Y, Y-X is 
accurate for all values of Y and X, regardless of how we humans have 
decided to present X and Y to each other. (Standard computer stuff, no?)

DATETIME is ambiguous, seconds since epoch UTC is not.
Maybe my surprise is more: Hey, depending on now(), a 
UNIX_TIMESTAMP(2004-10-31 02:15:00) has two different interal 
values!!! (Why now() should have any effect on that is still weird to 
me... I realize *how* it ends up having an effect implementationally, 
but it *shouldn't*.) The other value is not representable at all by 
any DATETIME value. And sorting on a DATETIME gives one result now and 
another after a dump/restore cycle. And there is no way around that.

We'll change our application to int(32) unsigned and handle presentation 
client-side. I don't think I'll ever use a DATETIME again... Maybe thats 
just me.

Thanks again, Michael.
Peter
--
Peter Valdemar Mørch
http://www.morch.com
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


Re: timestamp and DST: impossible to backup database? *AND* bugs in TIMEDIFF, FROM_UNIXTIME, et.al.?

2004-11-28 Thread Michael Stassen
Peter Valdemar Mørch wrote:
 From the lack of responses I take it that nobody disagrees that the 
handling of the timestamp type is fundamentally broken in every version 
of MySQL.
Silence does not necessarily indicate assent.  You asked a complicated 
question (when the U.S. list members were on holiday rather than reading 
e-mail, I might add).  I agree that it does not do what you expect, but the 
question of fundamentally broken is not so simple.

I'll go ahead and file several bugs, and start changing our code to 
avoid the timestamp type altogther. I'm quite surprised!

Peter
Peter Valdemar Mørch swp5jhu02-at-sneakemail.com |Lists| wrote:
***
   Claim
***
I seem to have discovered that MySQL cannot handle the hour where DST 
becomes non-DST reliably (on Oct 31st in CET):

2am 3am
 |  problem  |
 |   time|
--DST-/
   /
|/
   /
 /
   /- non-DST --
Essentially, the problem seems to be that October 31, 2:15 am is 
ambiguous and MySQL cannot disambiguate between them.
Well, of course.  October 31, 2:15 am *is* ambiguous.  Which one does it 
mean?  You cannot tell from the value October 31, 2:15 am whether it is 
the first or second occurrence of that time.  No system could -- additional 
info is required.

It seems that internally MySQL stores timestamp values with all 
timezone and DST/non-DST information intact. But all operations (e.g. 
TIMEDIFF(), FROM_UNIXTIME() and UPDATE TABLE SET 
timestamp=timestamp+0) destroy this important information and operate 
only now()'s timezone ignoring the appropriate original source time zone.
You seem to have a fundamental misunderstanding of the TIMESTAMP type.  No 
timezone or DST information is stored in a TIMESTAMP column.  TIMESTAMPs are 
stored internally as seconds since epoch UTC, according to the manual, but 
they are always translated to DATETIMES in the current timezone when 
retrieved.  The operations you mention, TIMEDIFF(), FROM_UNIXTIME() and 
UPDATE TABLE SET timestamp=timestamp+0, all behave as expected, once you 
understand how they, and timestamps, work

I just find it impossibly difficult to believe I'm the first one 
finding this, so now I'm curious. Googling has come up short.

I very much hope I'm mistaken and that I don't have to change our 
code, database format and contents to store # secs since 1970 GMT in 
an int everywhere and avoid timestamps altogether because they don't 
work...
That may be exactly what you should do, because that seems to be precisely 
what you've said you want.  You want rows to be stamped with seconds since 
epoch so you can preserve order across the end of DST.  Every operation 
you've attempted seems to be based on the assumption that timestamps are 
just seconds since epoch, but they aren't.


 Evidence for claim

Essentially we have a log table using a timestamp column as a sort 
key. For our logic to work we need to be able to:

* Sort reliably, because the order of entries is very important (we 
use the last log entry to determine current state). Entries that 
are made during 2am-3am DST *must* come before log entries that are 
made 2am-3am non-DST just as they follow chronologically in real life.
Timestamps have 1 second resolution.  Unless you are guaranteed never to get 
2 log entries in the same second, you cannot use a timestamp to sort 
reliably.  Don't you have an auto_increment primary key on which to sort?

This works initially, but gets broken by a mysqldump db | mysql 
new_db cycle. Timestamp values in mysqldump output don't contain DST 
timezone information and hence of course it is lost during restore. 
There is no option to mysqldump to maintain this info in output. How 
does one backup a database then? (Other than cp or mysqlhotcopy which 
fails if host != localhost)
Right, that's a problem, though I think not quite for the reason you 
describe.  MySQL doesn't handle timezone as part of a datetime anywhere. 
Timezone is handled separately.  Hence, we can't dump the timestamps' 
timezone/DST status on a row by row basis.  The problem is that the internal 
seconds since epoch is converted to a datetime in the first place.  Two 
different internal values translate to the same external datetime, so it's 
not a 1 to 1 function.  Having lost information on output, you can't get it 
back on input.

The question is whether or not this is a problem.  That is, does this 
contradict the intended design of the timestamp column?

* Be able to do TIMEDIFF() math and INSERT INTO TABLE ... 
VALUES(FROM_UNIXTIME(?)) maintaining the correct time. This is not 
possible.
You are expecting functions to do what they aren't designed to do. 
TIMEDIFF() operates on times and datetimes, not on unix timestamps. 

Re: timestamp and DST: impossible to backup database? *AND* bugs in TIMEDIFF, FROM_UNIXTIME, et.al.?

2004-11-27 Thread Peter Valdemar Mørch
From the lack of responses I take it that nobody disagrees that the 
handling of the timestamp type is fundamentally broken in every version 
of MySQL.

I'll go ahead and file several bugs, and start changing our code to 
avoid the timestamp type altogther. I'm quite surprised!

Peter
Peter Valdemar Mørch swp5jhu02-at-sneakemail.com |Lists| wrote:
***
   Claim
***
I seem to have discovered that MySQL cannot handle the hour where DST 
becomes non-DST reliably (on Oct 31st in CET):

2am 3am
 |  problem  |
 |   time|
--DST-/
   /
|/
   /
 /
   /- non-DST --
Essentially, the problem seems to be that October 31, 2:15 am is 
ambiguous and MySQL cannot disambiguate between them.

It seems that internally MySQL stores timestamp values with all timezone 
and DST/non-DST information intact. But all operations (e.g. TIMEDIFF(), 
FROM_UNIXTIME() and UPDATE TABLE SET timestamp=timestamp+0) destroy this 
important information and operate only now()'s timezone ignoring the 
appropriate original source time zone.

I just find it impossibly difficult to believe I'm the first one finding 
this, so now I'm curious. Googling has come up short.

I very much hope I'm mistaken and that I don't have to change our code, 
database format and contents to store # secs since 1970 GMT in an int 
everywhere and avoid timestamps altogether because they don't work...


 Evidence for claim

Essentially we have a log table using a timestamp column as a sort key. 
For our logic to work we need to be able to:

* Sort reliably, because the order of entries is very important (we use 
the last log entry to determine current state). Entries that are 
made during 2am-3am DST *must* come before log entries that are made 
2am-3am non-DST just as they follow chronologically in real life.

This works initially, but gets broken by a mysqldump db | mysql new_db 
cycle. Timestamp values in mysqldump output don't contain DST timezone 
information and hence of course it is lost during restore. There is no 
option to mysqldump to maintain this info in output. How does one backup 
a database then? (Other than cp or mysqlhotcopy which fails if host != 
localhost)

* Be able to do TIMEDIFF() math and INSERT INTO TABLE ... 
VALUES(FROM_UNIXTIME(?)) maintaining the correct time. This is not 
possible.

It seems that for all MySQL functions, the 2am-3am period (DST/non-DST) 
chosen depends on now(), not the value of ?, yeilding off-by-an-hour 
errors. As illustrated by this short example (more elaborate examples 
follow below):

susan:~# cat /etc/timezone
Europe/Copenhagen
susan:~# mysql -e '
select 1099185600-1099181400;
select TIMEDIFF(FROM_UNIXTIME(1099185600), FROM_UNIXTIME(1099181400));'
+---+
| 1099185600-1099181400 |
+---+
|  4200 |
+---+
++
| TIMEDIFF(FROM_UNIXTIME(1099185600), FROM_UNIXTIME(1099181400)) |
++
| 00:10:00   |
++
But 4200 secs is 01:10:00, not 00:10:00!!!
Before we:
1) File an enhancement request against mysqldump
2) File bugs against UNIX_TIMESTAMP() and TIMEDIFF() et. al.
3) change our database format and code to avoid the timestamp type 
altogether and use secondsSince1970 int instead (huge)

, I'd love to hear that I'm mistaken and that these problems can be 
solved with current mysql code. I've tried 4.1.7 and 4.0.22.

*
  More examples
*
I've created a table only with an ID and a timestamp. I've tried filling 
it in a loop in two ways (perl source code link below):

1) modifying system time and inserting into table using implied 
timestamp=now(). This works (order by timestamp - IDs are in order):
++-+---+
| ID | timestamp   | unix_timestamp(timestamp) |
++-+---+
|  1 | 2004-10-31 01:45:00 |1099179900 |
|  2 | 2004-10-31 02:15:00 |1099181700 |
|  3 | 2004-10-31 02:45:00 |1099183500 |
|  4 | 2004-10-31 02:15:00 |1099185300 |
|  5 | 2004-10-31 02:45:00 |1099187100 |
|  6 | 2004-10-31 03:15:00 |1099188900 |
++-+---+

But TIMEDIFF() on this otherwise correct data is erroneous:
select TIMEDIFF(A.timestamp, B.timestamp)
from table as A, table as B where A.ID=4 and B.ID=2;

timestamp and DST: impossible to backup database? *AND* bugs in TIMEDIFF, FROM_UNIXTIME, et.al.?

2004-11-25 Thread Peter Valdemar Mørch
***
   Claim
***
I seem to have discovered that MySQL cannot handle the hour where DST 
becomes non-DST reliably (on Oct 31st in CET):

2am 3am
 |  problem  |
 |   time|
--DST-/
   /
|/
   /
 /
   /- non-DST --
Essentially, the problem seems to be that October 31, 2:15 am is 
ambiguous and MySQL cannot disambiguate between them.

It seems that internally MySQL stores timestamp values with all timezone 
and DST/non-DST information intact. But all operations (e.g. TIMEDIFF(), 
FROM_UNIXTIME() and UPDATE TABLE SET timestamp=timestamp+0) destroy this 
important information and operate only now()'s timezone ignoring the 
appropriate original source time zone.

I just find it impossibly difficult to believe I'm the first one finding 
this, so now I'm curious. Googling has come up short.

I very much hope I'm mistaken and that I don't have to change our code, 
database format and contents to store # secs since 1970 GMT in an int 
everywhere and avoid timestamps altogether because they don't work...


 Evidence for claim

Essentially we have a log table using a timestamp column as a sort key. 
For our logic to work we need to be able to:

* Sort reliably, because the order of entries is very important (we use 
the last log entry to determine current state). Entries that are 
made during 2am-3am DST *must* come before log entries that are made 
2am-3am non-DST just as they follow chronologically in real life.

This works initially, but gets broken by a mysqldump db | mysql new_db 
cycle. Timestamp values in mysqldump output don't contain DST timezone 
information and hence of course it is lost during restore. There is no 
option to mysqldump to maintain this info in output. How does one backup 
a database then? (Other than cp or mysqlhotcopy which fails if host != 
localhost)

* Be able to do TIMEDIFF() math and INSERT INTO TABLE ... 
VALUES(FROM_UNIXTIME(?)) maintaining the correct time. This is not 
possible.

It seems that for all MySQL functions, the 2am-3am period (DST/non-DST) 
chosen depends on now(), not the value of ?, yeilding off-by-an-hour 
errors. As illustrated by this short example (more elaborate examples 
follow below):

susan:~# cat /etc/timezone
Europe/Copenhagen
susan:~# mysql -e '
select 1099185600-1099181400;
select TIMEDIFF(FROM_UNIXTIME(1099185600), FROM_UNIXTIME(1099181400));'
+---+
| 1099185600-1099181400 |
+---+
|  4200 |
+---+
++
| TIMEDIFF(FROM_UNIXTIME(1099185600), FROM_UNIXTIME(1099181400)) |
++
| 00:10:00   |
++
But 4200 secs is 01:10:00, not 00:10:00!!!
Before we:
1) File an enhancement request against mysqldump
2) File bugs against UNIX_TIMESTAMP() and TIMEDIFF() et. al.
3) change our database format and code to avoid the timestamp type 
altogether and use secondsSince1970 int instead (huge)

, I'd love to hear that I'm mistaken and that these problems can be 
solved with current mysql code. I've tried 4.1.7 and 4.0.22.

*
  More examples
*
I've created a table only with an ID and a timestamp. I've tried filling 
it in a loop in two ways (perl source code link below):

1) modifying system time and inserting into table using implied 
timestamp=now(). This works (order by timestamp - IDs are in order):
++-+---+
| ID | timestamp   | unix_timestamp(timestamp) |
++-+---+
|  1 | 2004-10-31 01:45:00 |1099179900 |
|  2 | 2004-10-31 02:15:00 |1099181700 |
|  3 | 2004-10-31 02:45:00 |1099183500 |
|  4 | 2004-10-31 02:15:00 |1099185300 |
|  5 | 2004-10-31 02:45:00 |1099187100 |
|  6 | 2004-10-31 03:15:00 |1099188900 |
++-+---+

But TIMEDIFF() on this otherwise correct data is erroneous:
select TIMEDIFF(A.timestamp, B.timestamp)
from table as A, table as B where A.ID=4 and B.ID=2;
++
| TIMEDIFF(A.timestamp, B.timestamp) |
++
| 00:00:00   |
++
This should have been a full hour!!!
An ever-so-slight modification (+0) on above table.timestamp makes it 
loose its timezone info:
update table set timestamp=timestamp+0
IDs are now