Adriano,

Obviously the problem is an impedance mismatch between ICU and other systems that use the IANA timezone database (notably POSIX; I can't speak for Windows). This mismatch then carries over to programming languages, which typically use OS services to handle timezone information rather than ICU (which is just a library on the host system).

On POSIX/Linux, this data uses separate time zone names for DST (as far as I can tell; I did not check all of them). Historically, time zones were hour shifts from GMT, and regions/countries picked one (or more) to use depending on geographic location. These "picks" got names, so we have Central European Time (GMT+1) etc. Some are retained to this day, some were deliberately added to or removed from systems (that's how we got "Arab Standard Time" and "Arabic Standard Time" (both UTC+3) and "Arabian Standard Time" (UTC+4) in Windows). DST complicated it even more in some regions: it is a seasonal shift to another time zone, but people would not use the name of the "adjacent" timezone, so "summer" time was invented and we got CEST and the like. DST is not the only time shift in effect, it's just the most prevalent one. DST has caused a lot of confusion among people, and especially among software developers. Some hold the view / assumption that the DST shift is applied to the timezone, because it is actually a shift in local time, which usually uses some timezone, while others see it as a temporary switch to another (adjacent) timezone (hence the different names), because a timezone can't "travel": it is defined as a fixed offset from GMT.

Eventually we got named regions that effectively pair the shifts and switches in time zone (which is still a slice of the global 24h defined by an offset from UTC, formerly GMT) with a particular region on Earth. This should fix the mess caused by DST confusion. And it eventually will fix it, IF people cease to use timezone names and use regions that "observe" timezones that are offsets from UTC. Timezone names like CET or CST6CDT are DEPRECATED (check that on Wikipedia). Canonical names for timezones are those in "Etc/", for example "Etc/GMT+2". The "funny" thing is that the offset in the GMT name is actually the inverted offset from UTC, so Etc/GMT+2 is actually UTC-02:00.
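
To make the inverted sign concrete, here is a minimal sketch. It assumes Python 3.9+ with the standard zoneinfo module and a timezone database installed on the host; the date used is arbitrary, since Etc/* zones have no transitions.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo  # standard library since Python 3.9 (an assumption here)

# Arbitrary probe date: Etc/* zones are fixed offsets, so any date gives the same answer.
probe = datetime(2020, 7, 8, 12, 0, tzinfo=ZoneInfo("Etc/GMT+2"))

etc_offset = probe.utcoffset()

# The "+2" in the name means two hours *behind* UTC, i.e. UTC-02:00.
print(etc_offset == timedelta(hours=-2))  # True
```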

So, we still have a mess on our hands until the transition is complete. For example, on Linux (and other POSIX systems) the timezone info for a region still refers to timezones through old names like CET/CEST etc. when you ask for the timezone in effect for a specific timestamp in that region. On Windows it's even worse (see the "arabian nights" example mentioned earlier). Now imagine the situation of a poor developer who relies on a programming language library that typically uses host system facilities to deal with timezones. He gets what he gets.

I solved the problem in the Python driver by augmenting the tzinfo object with the required (region) metadata, and it will work on POSIX as long as only regions are used (which includes direct use of timezones in the Etc/* range). The solution for Windows is still a work in progress; I hope I'll find a solution there as well.
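
The general idea of a tzinfo that carries its region along could look roughly like the sketch below. This is not the driver's actual code: the class name RegionTzInfo and its region attribute are hypothetical, and the standard zoneinfo module (Python 3.9+) is assumed for the offset lookups.

```python
from datetime import datetime, tzinfo
from zoneinfo import ZoneInfo  # assumed backend for the lookups


class RegionTzInfo(tzinfo):
    """Hypothetical wrapper: a tzinfo that remembers the region it was built from."""

    def __init__(self, region):
        self.region = region           # e.g. "Europe/Prague", kept as metadata
        self._zone = ZoneInfo(region)  # does the actual transition lookups

    @staticmethod
    def _naive(dt):
        # Hand the wrapped zone a naive copy; replace() preserves `fold`.
        return None if dt is None else dt.replace(tzinfo=None)

    def utcoffset(self, dt):
        return self._zone.utcoffset(self._naive(dt))

    def dst(self, dt):
        return self._zone.dst(self._naive(dt))

    def tzname(self, dt):
        return self._zone.tzname(self._naive(dt))


stamp = datetime(2020, 7, 8, 15, 32, tzinfo=RegionTzInfo("Europe/Prague"))
# The abbreviation still follows DST, but the region name is no longer lost.
print(stamp.tzname(), stamp.tzinfo.region)  # CEST Europe/Prague
```

The sketch deliberately overrides only the three lookup methods; conversions such as astimezone() would need more care around transitions.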

regards
Pavel

On 08. 07. 20 at 15:32, Adriano dos Santos Fernandes wrote:

We will need to decide whether we keep things as is, drop support
for regions in TIME-TZ, or drop TIME-TZ completely.

I personally think TIME-TZ with regions is a valid thing (albeit weird
depending on the operations) because it fills a gap where one creates a
TIME and an additional region column. TIME-TZ with offsets only (no
regions) does not have the weird things by definition, but weird things
will happen when converting from timestamps.

The problem is that Firebird recalculates the time to UTC for storage, which means that some fixed date is needed to do the recalculation for regions, which may screw up the data. I would rather store the time as is than in UTC, so it would actually be a WALL CLOCK time + region info. The math / comparison of TIME WITH TZ between two regions is meaningless anyway, unless you consider them as wall clock times.
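
A small sketch of why the choice of date matters when a region-based time is recalculated to UTC. The helper wall_time_to_utc and both anchor dates are made up for illustration and say nothing about which date Firebird actually uses; Python 3.9+ with zoneinfo is assumed.

```python
from datetime import date, datetime, time, timezone
from zoneinfo import ZoneInfo


def wall_time_to_utc(wall, region, anchor):
    """Convert a wall-clock time in a region to UTC, using a chosen anchor date."""
    local = datetime.combine(anchor, wall, tzinfo=ZoneInfo(region))
    return local.astimezone(timezone.utc).time()


noon = time(12, 0)
# Same wall-clock time, same region, two different (hypothetical) anchor dates.
winter = wall_time_to_utc(noon, "Europe/Prague", date(2020, 1, 15))
summer = wall_time_to_utc(noon, "Europe/Prague", date(2020, 7, 15))
print(winter, summer)  # 11:00:00 10:00:00
```

The stored UTC value differs by an hour depending on the anchor, while the wall-clock time plus region stays unambiguous.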

I see two ways that time zones are handled:

1) Time zone name changes if the date is in DST or not
2) The one Firebird uses, where the time zone name does not change

ICU does not seem to work in the first way. I also think not all software
works like that, nor do DST time zone names exist for every country/time zone.

Certainly not all, but it's also not a marginal number. Globally, think 50:50 and you wouldn't be wrong by much.

Does every piece of software you use change CET to CEST depending on the date?
Windows? Linux? The browser?

Anything that uses the POSIX timezone database present on all POSIX systems (for example, on Linux it's in /usr/share/zoneinfo).

What will happen if in Python you try to create a date not in DST using
the CEST time zone? Or the contrary, a date in DST using CET?

The problem is that I do not create a date in CET/CEST, but in the 'Europe/Prague' region. However, when I ask for the timezone name needed by iUtil.encodeTimestampTz(), I get one of these names according to the DST state. Python does not retain the 'Europe/Prague' region name (as it's not part of the POSIX timezone database records); it's used just to locate the timezone information (transitions etc.).
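
For illustration, this is how the name flips with the DST state for a timestamp created in a region. The sketch assumes the standard zoneinfo module (Python 3.9+) only as a convenient way to show the abbreviations; other tzinfo implementations may differ in what else they retain.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

prague = ZoneInfo("Europe/Prague")

# Both timestamps are created in the region, never in "CET" or "CEST" directly.
winter = datetime(2020, 1, 15, 12, 0, tzinfo=prague)
summer = datetime(2020, 7, 15, 12, 0, tzinfo=prague)

# Asking for the timezone name yields the abbreviation in effect, not the region.
print(winter.tzname(), summer.tzname())  # CET CEST
```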

regards
Pavel


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel
