This sounds interesting.  Anything we can steal?

---------- Forwarded message ----------
Date: Tue, 02 Mar 2004 09:13:08 -0800
From: Brian S O'Neill <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Subject: Another tz compiler
Resent-Date: Tue, 2 Mar 2004 12:12:37 -0500 (EST)
Resent-From: [EMAIL PROTECTED]

The main tz project page shows various links to other time zone database
formats and other tz compilers. I have been working on the Joda-Time
project, which is designed as a replacement for Java's standard date and
time classes. It includes a tz compiler and it has its own compact
binary format for the resulting files. I would be pleased if the
Joda-Time project were mentioned on the tz page as well.

Joda-Time project: http://joda-time.sourceforge.net/
Compiler (link is unstable):
http://joda-time.sourceforge.net/api-0.95/org/joda/time/tz/ZoneInfoCompiler.html

Most users will have no need to compile the files themselves, as the
Joda-Time distribution includes pre-compiled tz files in the jar. The
DateTimeZone class knows how to load the files and create objects for them.

I finished the tz compiler and new binary format about a year ago. I
would have just used Java's standard time zone class, but it was not
fast enough. Sun's Java v1.4 has a time zone implementation that
retrieves offsets using a binary search. Joda-Time's CachedDateTimeZone
is faster than a binary search. Caching is handled automatically, and
time zones with trivial rules are not wrapped with the cached
implementation. The cache could not be piggybacked onto Java's standard
time zone class, because that class does not provide a way to iterate
over offset transitions.

The only documentation on the binary format and the caching is in the
source code itself. One of the features of the binary format is that it
stores times with variable precision and size. It can store times with
up to millisecond precision, as 64-bit signed integers. The format
stores precalculated offset transitions up to the point where a simple
DST rule can fully describe all future transitions.

Runtime caching is implemented by breaking the time line down into
fixed-size regions of 2^32 milliseconds, or about 49.7 days. Offset
lookup is performed by retrieving one of these regions from the cache.
The region number is obtained by shifting out the lower 32 bits of the
64-bit timestamp. This value, modulo 512, is used as an array index to
retrieve the region info. A hash table with 512 slots, each holding one
49.7-day region, provides collision-free caching within periods of
about 69.7 years. The region info object contains a linked list of
offset transition instants. Since most regions have fewer than two
transitions, the linked list search is quite fast.
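
The lookup described above can be sketched roughly as follows. This is
only an illustration of the idea, not the actual CachedDateTimeZone
source: the class name RegionCacheSketch, its fields, and the use of
plain sorted arrays as the underlying transition data are all
assumptions made for the example.

```java
/**
 * Hypothetical sketch of a region cache for time zone offsets.
 * All names here are invented for illustration.
 */
final class RegionCacheSketch {
    static final int REGION_SHIFT = 32;           // region width is 2^32 ms (about 49.7 days)
    static final int CACHE_SIZE = 512;            // 512 regions span about 69.7 years
    static final int CACHE_MASK = CACHE_SIZE - 1;

    /** One linked-list node per offset transition inside a region. */
    static final class Transition {
        final long instant;
        final int offsetMillis;
        Transition next;
        Transition(long instant, int offsetMillis) {
            this.instant = instant;
            this.offsetMillis = offsetMillis;
        }
    }

    /** Cached info for one region: the offset at its start plus its transitions. */
    static final class Region {
        final long id;
        final int initialOffset;
        Transition head;
        Region(long id, int initialOffset) {
            this.id = id;
            this.initialOffset = initialOffset;
        }
    }

    private final int baseOffset;   // offset before the first transition
    private final long[] instants;  // transition instants, sorted ascending
    private final int[] offsets;    // offsets[i] takes effect at instants[i]
    private final Region[] cache = new Region[CACHE_SIZE];

    RegionCacheSketch(int baseOffset, long[] instants, int[] offsets) {
        this.baseOffset = baseOffset;
        this.instants = instants;
        this.offsets = offsets;
    }

    /** Shift out the lower 32 bits of the timestamp to get the region number. */
    static long regionId(long millis) {
        return millis >> REGION_SHIFT;
    }

    /** Region number modulo 512; the mask also keeps negative ids in range. */
    static int slot(long millis) {
        return (int) (regionId(millis) & CACHE_MASK);
    }

    int getOffset(long millis) {
        long id = regionId(millis);
        int slot = (int) (id & CACHE_MASK);
        Region region = cache[slot];
        if (region == null || region.id != id) {
            // Cache miss, or the slot holds a region 512 * k regions away.
            region = buildRegion(id);
            cache[slot] = region;
        }
        int offset = region.initialOffset;
        // Walk the (usually very short) list of transitions in this region.
        for (Transition t = region.head; t != null && t.instant <= millis; t = t.next) {
            offset = t.offsetMillis;
        }
        return offset;
    }

    /** Slow path: scan the underlying transition data once to fill a region. */
    private Region buildRegion(long id) {
        int offset = baseOffset;
        int i = 0;
        while (i < instants.length && regionId(instants[i]) < id) {
            offset = offsets[i++];
        }
        Region region = new Region(id, offset);
        Transition tail = null;
        while (i < instants.length && regionId(instants[i]) == id) {
            Transition t = new Transition(instants[i], offsets[i]);
            if (tail == null) {
                region.head = t;
            } else {
                tail.next = t;
            }
            tail = t;
            i++;
        }
        return region;
    }

    public static void main(String[] args) {
        long[] instants = {1_000L, (1L << 32) + 5L};
        int[] offsets = {3_600_000, 0};
        RegionCacheSketch zone = new RegionCacheSketch(0, instants, offsets);
        System.out.println(zone.getOffset(999L));    // prints 0
        System.out.println(zone.getOffset(1_000L));  // prints 3600000
    }
}
```

In this sketch a slot is simply overwritten when a timestamp from a
different region maps to it, so two instants more than about 69.7 years
apart can evict each other, while lookups that stay within such a
window keep hitting the cache.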

