On 27 July 2015 at 14:59, R. David Murray <rdmur...@bitdance.com> wrote:
> I have a feeling that I'm completely misunderstanding things, since
> tzinfo is still a bit of a mystery to me.

You're not the only one :-)

I think the following statements are true. If they aren't, I'd appreciate clarification. I'm going to completely ignore leap seconds in the following - I hope that's OK. I don't understand leap seconds *at all* and I don't work in any application areas where they are relevant (to my knowledge), so I feel that for my situation, ignoring them (and being able to) is reasonable.

Note that I'm not talking about internal representations - this is purely about user-visible semantics.

1. "Naive" datetime arithmetic means treating a day as 24 hours, an hour as 60 minutes, etc. Basically base-24/60/60 arithmetic.
2. If you're only working in a single timezone that's defined as UTC or a fixed offset from UTC, naive arithmetic is basically all there is.
3. Converting between (fixed offset) timezones is a separate issue from calculation - but it's nothing more than applying the relevant offsets.
4. Calculations involving 2 different timezones (fixed-offset ones as above) are like any other exercise involving values on different scales. Convert both values to a common scale (in this case, a common timezone) and do the calculation there. Simple enough.
5. The problems all arise *only* with timezones whose UTC offset varies depending on the actual time (e.g., timezones that include the transition to DST and back).

Are we OK to this point? This much comprises what I would class as a "naive" (i.e. 99% of the population ;-)) understanding of datetimes.

The stdlib datetime module handles naive datetime values, and fixed-offset timezones, fine, as far as I can see. (I'm not sure that the original implementation included fixed-offset tzinfo objects, but the 3.4 docs say they are there now, so that's fine.)

Looking at the complicated cases, the only ones I'm actually aware of in practice are the ones that switch to DST and back, so typically have two offsets that differ by an hour, switching between the two at some essentially arbitrary points.
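
Before going further into the complicated cases, here is a rough sketch of points 1-4 using the stdlib's fixed-offset timezone class. The two zones, their offsets and all the variable names are made up purely for illustration.

```python
from datetime import datetime, timedelta, timezone

# Two fixed-offset zones; names and offsets are illustrative only.
UTC_PLUS_1 = timezone(timedelta(hours=1), "UTC+1")
UTC_MINUS_5 = timezone(timedelta(hours=-5), "UTC-5")

# Points 1 and 2: inside a single fixed-offset zone, arithmetic is
# plain base-24/60/60 arithmetic.
start = datetime(2015, 7, 27, 23, 30, tzinfo=UTC_PLUS_1)
later = start + timedelta(hours=1, minutes=45)  # 01:15 on 28 July, +01:00

# Point 3: conversion just applies the offsets
# (23:30 at +01:00 is 22:30 UTC, which is 17:30 at -05:00).
converted = start.astimezone(UTC_MINUS_5)

# Point 4: for values in two different zones, move both to a common
# scale (UTC here) and calculate there.
other = datetime(2015, 7, 27, 20, 0, tzinfo=UTC_MINUS_5)
manual = other.astimezone(timezone.utc) - start.astimezone(timezone.utc)

# Subtracting aware datetimes with different tzinfo objects does the
# same normalisation for us.
gap = other - start

print(later.isoformat())      # 2015-07-28T01:15:00+01:00
print(converted.isoformat())  # 2015-07-27T17:30:00-05:00
print(gap, gap == manual)     # 2:30:00 True
```

Nothing here needs more than the offsets themselves, which is why I'd say the fixed-offset case is the easy one.
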
If there are other more complex forms of timezone, I'd like to never need to know about them, please ;-)

The timezones we're talking about here are things like "Europe/London", not "GMT" or "BST" (the latter two are fixed-offset).

There are two independent issues with complex timezones:

1. Converting to and from them. That's messy because the conversion to UTC needs more information than just the date & time (typically, for example, there is a day when 01:45:00 maps to 2 distinct UTC times). This is basically the "is_dst" bit that Tim discussed in an earlier post. The semantic issue here is that users typically say "01:45" and it never occurs to them to even think about *which* 01:45 they mean. So recovering that extra information is hard (it's like dealing with byte streams where the user didn't provide details of the text encoding used). Once we have the extra information, though, doing conversions is just a matter of applying a set of rules.
2. Arithmetic within a complex timezone. Theoretically, this is simple enough (convert to UTC, do the calculation naively, and convert back). But in practice, that approach doesn't always match user expectations. So across a DST change you have 2 mutually incompatible semantic options - either 1 day after 4pm is 3pm the following day (exactly 24 hours later), or 1 day after 4pm is 4pm the following day and adding 1 day actually adds 25 hours. Either is a viable choice, and either will confuse *some* set of users. This, I think, is the one where all the debate is occurring, and the one that makes my head explode.

It seems to me that the problem is that for this latter issue, it's the *timedelta* object that's not rich enough. You can't say "add 1 day, and by 1 day I mean keep the same time tomorrow" as opposed to "add 1 day, and by that I mean 24 hours"[1]. In some ways, it's actually no different from the issue of adding 1 month to a date (which is equally ill-defined, but people "know what they mean" to just as great an extent). Python bypasses the latter by not having a timedelta for "a month".
C (and the time module) bypasses the former by limiting all time offsets to numbers of seconds - datetime gave us a richer timedelta object and hence has extra problems.

I don't have any solutions to this final issue. But hopefully the above analysis (assuming it's accurate!) helps clarify what the actual debate is about, for those bystanders like me who are interested in following the discussion. With luck, maybe it also gives the experts an alternative perspective from which to think about the problem - who knows?

Paul

[1] Well, you can, actually - you say that a timedelta of "1 day" means "the same time tomorrow" and if you want 24 hours, you say "24 hours" not "1 day". So timedelta(days=1) != timedelta(hours=24) even though they give the same result for every case except arithmetic involving complex timezones. Is that what Lennart has been trying to say in his posts?
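
To put numbers on the two issues with complex timezones, and on footnote [1], here is a minimal sketch. It deliberately avoids any real DST-aware tzinfo: BST and GMT below are fixed-offset stand-ins defined just for this example, and it assumes the UK switched from BST to GMT on 25 October 2015. The variable names are mine, not anything from the stdlib.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-ins for the two halves of "Europe/London".
# Assumption: the UK went from BST (UTC+1) to GMT (UTC+0) on 25 Oct 2015.
BST = timezone(timedelta(hours=1), "BST")
GMT = timezone(timedelta(0), "GMT")

# Issue 1: the wall-clock time 01:45 happened twice on the switch day,
# once in BST and once in GMT, an hour apart in UTC.
first_0145 = datetime(2015, 10, 25, 1, 45, tzinfo=BST)
second_0145 = datetime(2015, 10, 25, 1, 45, tzinfo=GMT)
ambiguity = second_0145 - first_0145

# Issue 2: "4pm on the day before the switch, plus 1 day".
start = datetime(2015, 10, 24, 16, 0, tzinfo=BST)

# Option A: convert to UTC, add 24 hours, convert back (GMT by then).
option_a = (start.astimezone(timezone.utc) + timedelta(hours=24)).astimezone(GMT)

# Option B: keep the same wall-clock time tomorrow.
option_b = datetime(2015, 10, 25, 16, 0, tzinfo=GMT)
elapsed_b = option_b - start

# Footnote [1]: as the stdlib stands, the two deltas compare equal,
# so timedelta alone cannot say which option was meant.
deltas_equal = timedelta(days=1) == timedelta(hours=24)

print(option_a.strftime("%H:%M %Z"))  # 15:00 GMT
print(elapsed_b)                      # 1 day, 1:00:00
print(ambiguity, deltas_equal)        # 1:00:00 True
```

Option A lands on 3pm and option B takes 25 hours, which are exactly the two answers that people argue about.
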