Hello,

This week, I've finished the work on serialization by making the deserializers 
capable of handling UTC offsets. I had to rewrite DateTimeField.to_python to 
extract and interpret timezone offsets. Still, deserialization of aware 
datetimes doesn't work with PyYAML: http://pyyaml.org/ticket/202
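
For illustration, here is a minimal sketch of how a UTC offset can be extracted from a serialized datetime string. It is not the actual `DateTimeField.to_python` code: the function name `parse_datetime_with_offset` and the regular expression are assumptions made for this example, and it relies only on the standard library.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical pattern: date, 'T' or space, time, optional fraction, optional offset.
_DATETIME_RE = re.compile(
    r"^(?P<stamp>\d{4}-\d{2}-\d{2})[T ](?P<clock>\d{2}:\d{2}:\d{2})"
    r"(?:\.(?P<frac>\d{1,6}))?"
    r"(?P<offset>Z|[+-]\d{2}(?::?\d{2})?)?$"
)


def parse_datetime_with_offset(text):
    """Parse 'YYYY-MM-DD HH:MM:SS[.ffffff][Z|+HH[:MM]]' into a datetime.

    Returns an aware datetime when an offset is present, a naive one otherwise.
    """
    match = _DATETIME_RE.match(text.strip())
    if match is None:
        raise ValueError("unrecognized datetime: %r" % text)
    value = datetime.strptime(
        match.group("stamp") + " " + match.group("clock"), "%Y-%m-%d %H:%M:%S"
    )
    frac = match.group("frac")
    if frac:
        # Right-pad so that '.5' means 500000 microseconds.
        value = value.replace(microsecond=int(frac.ljust(6, "0")))
    offset = match.group("offset")
    if offset is None:
        return value  # naive: the input carries no timezone information
    if offset == "Z":
        return value.replace(tzinfo=timezone.utc)
    sign = -1 if offset[0] == "-" else 1
    digits = offset[1:].replace(":", "")
    delta = timedelta(hours=int(digits[:2]), minutes=int(digits[2:] or 0))
    return value.replace(tzinfo=timezone(sign * delta))
```

Strings without an offset still yield naive datetimes, which is what keeps existing fixtures loadable in this sketch.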

I also implemented the storage and retrieval of aware datetime objects in 
PostgreSQL, MySQL and Oracle. Conversions happen:
        - on storage, in `connection.ops.value_to_db_datetime`, called from 
`get_db_prep_value`;
        - on retrieval, in the database adapter's conversion functions.
The code is rather straightforward. When USE_TZ is True, naive datetimes are 
interpreted as local time in TIME_ZONE, for backwards compatibility with 
existing applications.
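
The two conversions can be sketched as follows for a backend that stores naive UTC values. This is only an illustration under stated assumptions: `to_db_utc`, `from_db_utc` and `LOCAL_TZ` are hypothetical names, not the functions in the branch, and a fixed offset stands in for the TIME_ZONE setting.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical stand-in for settings.TIME_ZONE: a fixed UTC+02:00 offset.
# A real project would use a full timezone object (e.g. from pytz) so that
# DST is handled; a fixed offset keeps the sketch dependency-free.
LOCAL_TZ = timezone(timedelta(hours=2))


def to_db_utc(value, use_tz=True):
    """On storage: return a naive datetime expressed in UTC."""
    if value is None or not use_tz:
        return value
    if value.tzinfo is None:
        # Backwards compatibility: naive input is taken to be local time.
        value = value.replace(tzinfo=LOCAL_TZ)
    return value.astimezone(timezone.utc).replace(tzinfo=None)


def from_db_utc(value, use_tz=True):
    """On retrieval: tag the naive value read from the database as UTC."""
    if value is None or not use_tz:
        return value
    return value.replace(tzinfo=timezone.utc)
```

With `use_tz=False` both functions return their argument unchanged, mirroring the legacy behavior.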

SQLite is trickier because it uses 
`django.db.backends.util.typecast_timestamp` to convert strings to datetimes. 
However:
        - this function is used elsewhere in Django, in combination with the 
`needs_datetime_string_cast` flag. 
        - it performs essentially the same operations as 
`DateTimeField.to_python`.
I'll review the history of this code, and I'll try to refactor it.

Besides adding support for SQLite, I still have to:
        - check that datetimes behave correctly when they're used as query 
arguments, in aggregation functions, etc.
        - optimize django.utils.tzinfo: fix #16899, use pytz.utc as the UTC 
timezone class when pytz is available, etc.

I won't have much time for this project next week. See you in two weeks for the 
next check-in!

Best regards,

-- 
Aymeric Augustin.

On 24 sept. 2011, at 15:24, Aymeric Augustin wrote:

> Hello,
> 
> This week, I've been working on a related topic that I had missed entirely in 
> my initial proposal: serialization.
> 
> Developers will obtain aware datetimes from Django when USE_TZ = True. We 
> must ensure that they serialize correctly.
> 
> Currently, the serialization code isn't very consistent with datetimes:
>       - JSON: the serializer uses the '%Y-%m-%d %H:%M:%S' format, losing 
> microseconds and timezone information. This dates back to the initial commit 
> at r3237. See also #10201.
>       - XML: the serializer delegates to DateTimeField.value_to_string, which 
> also uses the '%Y-%m-%d %H:%M:%S' format.
>       - YAML: the serializer handles datetimes natively, and it includes 
> microseconds and UTC offset in the output.
> 
> I've hesitated between converting datetimes to UTC and rendering them as-is 
> with a UTC offset. The former would be more consistent with the database and 
> it's recommended in YAML. But the latter avoids modifying the data: not only 
> is it faster, but it's also more predictable. Serialization isn't just about 
> storing the data for further retrieval, it can be used to print arbitrary 
> data in a different format. Finally, when the data comes straight from the 
> database (the common case), it will be in UTC anyway.
> 
> In the end, I've decided to serialize aware datetimes without conversion. The 
> implementation is here:
> https://bitbucket.org/aaugustin/django/compare/..django/django
> 
> Here are the new serialization formats for datetimes:
>       - JSON: as described in the specification at 
> http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf > 
> 15.9.1.15 Date Time String Format.
>       - XML: as produced by datetime.isoformat(), ISO8601.
>       - YAML: unchanged, compatible with http://yaml.org/type/timestamp.html 
> — the canonical representation uses 'T' as separator and is in UTC, but it's 
> also acceptable to use a space and include an offset like pyyaml does.
> These formats follow the best practices described in 
> http://www.w3.org/TR/NOTE-datetime.
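
As an illustration of the JSON and XML formats listed above, here is a minimal sketch. The function names are hypothetical, and truncating to milliseconds and writing 'Z' for UTC are assumptions derived from the ECMA-262 format (which allows three fractional digits), not a description of the code in the branch.

```python
from datetime import datetime, timedelta, timezone


def to_json_datetime(value):
    """Render a datetime in the ECMA-262 style: milliseconds, 'Z' for UTC."""
    text = value.isoformat()
    if value.microsecond:
        # isoformat() emits six fractional digits; keep only the first three.
        text = text[:23] + text[26:]
    if text.endswith("+00:00"):
        text = text[:-6] + "Z"
    return text


def to_xml_datetime(value):
    """Render a datetime exactly as datetime.isoformat() does (ISO 8601)."""
    return value.isoformat()


stamp = datetime(2011, 9, 24, 15, 24, 0, 123456, tzinfo=timezone.utc)
print(to_json_datetime(stamp))  # 2011-09-24T15:24:00.123Z
print(to_xml_datetime(stamp))   # 2011-09-24T15:24:00.123456+00:00
```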
> 
> This fix is backwards-incompatible for the JSON and XML serializers: it 
> includes fractional seconds and timezone information, and it uses the 
> normalized separator, 'T', between the date and time parts. However, I've 
> made sure that existing fixtures will load properly with the new code. I'll 
> mention all this in the release notes.
> 
> Unrelatedly, I have switched the SQLite backend to supports_timezones = 
> False, because it really doesn't make sense to write the UTC offset but 
> ignore it when reading back the data.
> 
> Best regards,
> 
> -- 
> Aymeric Augustin.
> 
> On 17 sept. 2011, at 09:59, Aymeric Augustin wrote:
> 
>> Hello,
>> 
>> This week, I've gathered all the information I need about how the database 
>> engines and adapters supported by Django handle datetime objects. I'm 
>> attaching my findings.
>> 
>> The good news is that the database representations currently used by Django 
>> are already optimal for my proposal. I'll store data in UTC:
>> - with an explicit timezone on PostgreSQL,
>> - without timezone on SQLite and MySQL because the database engine doesn't 
>> support it,
>> - without timezone on Oracle because the database adapter doesn't support it.
>> 
>> 
>> Currently, Django sets the "supports_timezones" feature to True for SQLite. 
>> I'm skeptical about this choice. Indeed, the time zone is stored: SQLite 
>> just saves the output of "<datetime>.isoformat()", which includes the UTC 
>> offset for aware datetime objects. However, the timezone information is 
>> ignored when reading the data back from the database, thus yielding 
>> incorrect data when it's different from the local time defined by 
>> settings.TIME_ZONE.
>> 
>> As far as I can tell, the "supports_timezones" and the 
>> "needs_datetime_string_cast" database features are incompatible, at least 
>> with the current implementation of "typecast_timestamp". There's a comment 
>> about this problem that dates back to the merge of magic-removal, possibly 
>> before:
>> https://code.djangoproject.com/browser/django/trunk/django/db/backends/util.py?annotate=blame#L79
>> 
>> SQLite is the only engine that has these two flags set to True. I think 
>> "supports_timezones" should be False. Does anyone know why it's True? Is it 
>> just a historical artifact?
>> 
>> 
>> Finally, I have read the document that describes "to_python", 
>> "value_to_string", and r"get_(db_)?prep_(value|save|lookup)". The next step 
>> is to adjust these functions in DateTimeField, depending on the value of 
>> settings.USE_TZ.
>> 
>> Best regards,
>> 
>> -- 
>> Aymeric Augustin.
>> 
>> <DATABASE-NOTES.html>
>> 
>> On 11 sept. 2011, at 23:18, Aymeric Augustin wrote:
>> 
>>> Hello,
>>> 
>>> Given the positive feedback received here and on IRC, I've started the 
>>> implementation.
>>> 
>>> Being most familiar with Mercurial, I've forked the Bitbucket mirror. This 
>>> page compares my branch to trunk:
>>> https://bitbucket.org/aaugustin/django/compare/..django/django
>>> 
>>> I've read a lot of code in django.db, and also the documentation of 
>>> PostgreSQL, MySQL and SQLite regarding date/time types.
>>> 
>>> I've written some tests that validate the current behavior of Django. Their 
>>> goal is to guarantee backwards-compatibility when USE_TZ = False.
>>> 
>>> At first they failed because runtests.py doesn't set os.environ['TZ'] and 
>>> doesn't call time.tzset(), so the tests ran with my system local time. I 
>>> fixed that in setUp and tearDown. Maybe we should call them in runtests.py 
>>> too for consistency?
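
The setUp/tearDown fix described above could look roughly like this. It is a sketch, not the test code from the branch: the class name and the `test_tz` attribute are hypothetical, and `time.tzset()` is only available on Unix.

```python
import os
import time
import unittest


class FixedTimezoneTestCase(unittest.TestCase):
    """Hypothetical base class that pins the process timezone for each test."""

    # POSIX TZ string meaning UTC+03:00; it needs no tz database on disk.
    test_tz = "EAT-3"

    def setUp(self):
        self._old_tz = os.environ.get("TZ")
        os.environ["TZ"] = self.test_tz
        time.tzset()  # Unix only

    def tearDown(self):
        # Restore whatever the environment looked like before the test.
        if self._old_tz is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = self._old_tz
        time.tzset()

    def test_offset_is_pinned(self):
        # time.timezone counts seconds *west* of UTC, hence the negative value.
        self.assertEqual(time.timezone, -3 * 3600)
```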
>>> 
>>> By the way, since everything is supposed to be in UTC internally when 
>>> USE_TZ is True, it is theoretically possible to get rid of os.environ['TZ'] 
>>> and time.tzset(). They are only useful to make timezone-dependent functions 
>>> respect the TIME_ZONE setting. However, for backwards compatibility (in 
>>> particular with third-party apps), it's better to keep them and interpret 
>>> naive datetimes in the timezone defined by settings.TIME_ZONE (instead of 
>>> rejecting them outright). For this reason, I've decided to keep 
>>> os.environ['TZ'] and time.tzset() even when USE_TZ is True.
>>> 
>>> Best regards,
>>> 
>>> -- 
>>> Aymeric Augustin.
>>> 
>>> 
>>> On 3 sept. 2011, at 17:40, Aymeric Augustin wrote:
>>> 
>>>> Hello,
>>>> 
>>>> The GSoC proposal "Multiple timezone support for datetime representation" 
>>>> wasn't picked up in 2010 or 2011. Although I'm not a student and the 
>>>> summer is over, I'd like to tackle this problem, and I would appreciate it 
>>>> very much if a core developer agreed to mentor me during this work, 
>>>> GSoC-style.
>>>> 
>>>> Here is my proposal, following the GSoC guidelines. I apologize for the 
>>>> wall of text; this has been discussed many times in the past 4 years and 
>>>> I've tried to address as many concerns and objections as possible.
>>>> 
>>>> Definition of success
>>>> ---------------------
>>>> 
>>>> The goal is to resolve ticket #2626 in Django 1.4 or 1.5 (depending on 
>>>> when 1.4 is released).
>>>> 
>>>> Design specification
>>>> --------------------
>>>> 
>>>> Some background on timezones in Django and Python
>>>> .................................................
>>>> 
>>>> Currently, Django stores datetime objects in local time in the database, 
>>>> local time being defined by the TIME_ZONE setting. It retrieves them as 
>>>> naive datetime objects. As a consequence, developers work with naive 
>>>> datetime objects in local time.
>>>> 
>>>> This approach sort of works when all the users are in the same timezone 
>>>> and don't care about data loss (inconsistencies) when DST kicks in or out. 
>>>> Unfortunately, these assumptions aren't true for many Django projects: for 
>>>> instance, one may want to log sessions (login/logout) for security 
>>>> purposes: that's a 24/7 flow of important data. Read tickets #2626 and 
>>>> #10587 for more details.
>>>> 
>>>> Python's standard library provides limited support for timezones, but this 
>>>> gap is filled by pytz <http://pytz.sourceforge.net/>. If you aren't 
>>>> familiar with the topic, I strongly recommend reading this page before my 
>>>> proposal. It explains the problems of working in local time and the 
>>>> limitations of Python's APIs. It has a lot of examples, too.
>>>> 
>>>> Django should use timezone-aware UTC datetimes internally
>>>> .........................................................
>>>> 
>>>> Example: datetime.datetime(2011, 9, 23, 8, 34, 12, tzinfo=pytz.utc)
>>>> 
>>>> In my opinion, the problem of local time is strikingly similar to the 
>>>> problem of character encodings. Django uses only unicode internally and 
>>>> converts at the borders (HTTP requests/responses and database). I propose 
>>>> a similar solution: Django should always use UTC internally, and 
>>>> conversion should happen at the borders, i.e. when rendering the templates 
>>>> and processing POST data (in form fields/widgets). I'll discuss the 
>>>> database in the next section.
>>>> 
>>>> Quoting pytz' docs: "The preferred way of dealing with times is to always 
>>>> work in UTC, converting to localtime only when generating output to be 
>>>> read by humans." I think we can trust pytz' developers on this topic.
>>>> 
>>>> Note that a timezone-aware UTC datetime is different from a naive 
>>>> datetime. If we were using naive datetimes, and assuming we're using pytz, 
>>>> a developer could write:
>>>> 
>>>> mytimezone.localize(datetime_django_gave_me)
>>>> 
>>>> which is incorrect, because it will interpret the naive datetime as local 
>>>> time in "mytimezone". With timezone-aware UTC datetimes, this kind of 
>>>> error can't happen, and the equivalent code is:
>>>> 
>>>> datetime_django_gave_me.astimezone(mytimezone)
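
The difference between the two calls can be shown with the standard library alone. This sketch swaps pytz for a fixed UTC+02:00 offset (an assumption, to stay dependency-free); with such an offset, `replace(tzinfo=...)` has the same effect as pytz's `localize()`.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical user timezone: a fixed UTC+02:00 offset standing in for a pytz zone.
mytimezone = timezone(timedelta(hours=2))

# What Django would hand out if it used naive datetimes holding UTC values.
naive_utc = datetime(2011, 9, 23, 8, 34, 12)
# Attaching the timezone (what localize() does) treats 08:34 as *local* time.
wrong = naive_utc.replace(tzinfo=mytimezone)

# What Django hands out under this proposal: an aware UTC datetime.
aware_utc = datetime(2011, 9, 23, 8, 34, 12, tzinfo=timezone.utc)
right = aware_utc.astimezone(mytimezone)

print(wrong.isoformat())  # 2011-09-23T08:34:12+02:00, i.e. 06:34:12 UTC
print(right.isoformat())  # 2011-09-23T10:34:12+02:00, i.e. 08:34:12 UTC
```

The first result is silently off by two hours, while the second denotes the same instant as the original value.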
>>>> 
>>>> Django should store datetimes in UTC in the database
>>>> ....................................................
>>>> 
>>>> This horse has been beaten to death on this mailing-list so many times 
>>>> that I'll keep the argumentation short. If Django handles everything as 
>>>> UTC internally, it isn't useful to convert to anything else for storage, 
>>>> and re-convert to UTC at retrieval.
>>>> 
>>>> In order to make the database portable and interoperable:
>>>> - in databases that support timezones (at least PostgreSQL), the timezone 
>>>> should be set to UTC, so that the data is unambiguous;
>>>> - in databases that don't (at least SQLite), storing data in UTC is the 
>>>> most reasonable choice: if there's a "default timezone", that's UTC.
>>>> 
>>>> I don't intend to change the storage format of datetimes. It has been 
>>>> proposed on this mailing-list to store datetimes with original timezone 
>>>> information. However, I suspect that in many cases, datetimes don't have a 
>>>> significant "original timezone" by themselves. Furthermore, there are many 
>>>> different ways to implement this outside of Django's core. One is to 
>>>> store a local date + a local time + a place or timezone + is_dst flag and 
>>>> skip datetime entirely. Another is to store a UTC datetime + a place or 
>>>> timezone. In the end, since there's no obvious and consensual way to 
>>>> implement this idea, I've chosen to exclude it from my proposal. See the 
>>>> "Timezone-aware storage of DateTime" thread on this mailing list for a 
>>>> long and non-conclusive discussion of this idea.
>>>> 
>>>> I'm expecting to take some flak because of this choice :) Indeed, if 
>>>> you're writing a multi-timezone calendaring application, my work isn't 
>>>> going to resolve all your problems — but it won't hurt either. It may even 
>>>> provide a saner foundation to build upon. Once again, there's more than 
>>>> one way to solve this problem, and I'm afraid that choosing one would 
>>>> offend some people sufficiently to get the entire proposal rejected.
>>>> 
>>>> Django should convert between UTC and local time in the templates and forms
>>>> ...........................................................................
>>>> 
>>>> I regard the problem of local time (in which time zone is my user?) as 
>>>> very similar to internationalization (which language does my user read?), 
>>>> and even more to localization (in which country does my user live?), 
>>>> because localization happens both on output and on input.
>>>> 
>>>> I want controllable conversion to local time when rendering a datetime in 
>>>> a template. I will introduce:
>>>> - a template tag, {% localtime on|off %}, that works exactly like {% 
>>>> localize on|off %}; it will be available with {% load tz %};
>>>> - two template filters, {{ datetime|localtime }} and {{ datetime|utctime 
>>>> }}, that work exactly like {{ value|localize }} and {{ value|unlocalize }}.
>>>> 
>>>> I will convert datetimes to local time when rendering a DateTimeInput 
>>>> widget, and also handle SplitDateTimeWidget and SplitHiddenDateTimeWidget 
>>>> which are more complicated.
>>>> 
>>>> Finally, I will convert datetimes entered by end-users in forms to UTC. I 
>>>> can't think of cases where you'd want an interface in local time but user 
>>>> input in UTC. As a consequence, I don't plan to introduce the equivalent 
>>>> of the `localize` keyword argument in form fields, unless someone brings 
>>>> up a sufficiently general use case.
>>>> 
>>>> How to set each user's timezone
>>>> ...............................
>>>> 
>>>> Internationalization and localization are based on the LANGUAGES setting. 
>>>> There's a widely accepted standard to select automatically the proper 
>>>> language and country, the Accept-Language header.
>>>> 
>>>> Unfortunately, some countries like the USA have more than one timezone, so 
>>>> country information isn't enough to select a timezone. To the best of my 
>>>> knowledge, there isn't a widely accepted way to determine the timezones of 
>>>> the end users on the web.
>>>> 
>>>> I intend to use the TIME_ZONE setting by default and to provide an 
>>>> equivalent of `translation.activate()` for setting the timezone. With this 
>>>> feature, developers can implement their own middleware to set the timezone 
>>>> for each user, for instance they may want to use 
>>>> <http://pytz.sourceforge.net/#country-information>.
>>>> 
>>>> This means I'll have to introduce another thread local. I know this is 
>>>> frowned upon. I'd be very interested if someone has a better idea.
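
A thread-local "current timezone" along these lines might look like the sketch below. All names (`activate`, `deactivate`, `get_current_timezone`, `DEFAULT_TZ`) are hypothetical and only mirror the shape of `translation.activate()`; `DEFAULT_TZ` stands in for the TIME_ZONE setting.

```python
import threading
from datetime import timedelta, timezone

# Hypothetical stand-in for the timezone built from settings.TIME_ZONE.
DEFAULT_TZ = timezone.utc

# Each thread (and so each request, in a threaded server) gets its own slot.
_active = threading.local()


def activate(tz):
    """Set the timezone used by the current thread."""
    _active.value = tz


def deactivate():
    """Forget the current thread's timezone and fall back to the default."""
    if hasattr(_active, "value"):
        del _active.value


def get_current_timezone():
    """Return the current thread's timezone, or the default if none is set."""
    return getattr(_active, "value", DEFAULT_TZ)
```

A middleware would call `activate()` at the start of each request with the user's timezone and `deactivate()` at the end, so that one user's choice never leaks into another request handled by the same thread.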
>>>> 
>>>> It might be no longer necessary to set os.environ['TZ'] and run 
>>>> time.tzset() at all. That would avoid a number of problems and make 
>>>> Windows as well supported as Unix-based OSes — there's a bunch of tickets 
>>>> in Trac about this.
>>>> 
>>>> I'm less familiar with this part of the project and I'm interested in 
>>>> advice about how to implement it properly.
>>>> 
>>>> Backwards compatibility
>>>> .......................
>>>> 
>>>> Most previous attempts to resolve this issue have stumbled upon this problem.
>>>> 
>>>> I propose to introduce a USE_TZ setting (yes, I know, yet another 
>>>> setting) that works exactly like USE_L10N. If set to False, the default, 
>>>> you will get the legacy (current) behavior. Thus, existing websites won't 
>>>> be affected. If set to True, you will get the new behavior described above.
>>>> 
>>>> I will also explain in the release notes how to migrate a database — which 
>>>> means shifting all datetimes to UTC. I will attempt to develop a script to 
>>>> automate this task.
>>>> 
>>>> Dependency on pytz
>>>> ..................
>>>> 
>>>> I plan to make pytz a mandatory dependency when USE_TZ is True. This would 
>>>> be similar to the dependency on gettext when USE_I18N is True.
>>>> 
>>>> pytz gets a new release every time the Olson database is updated. For this 
>>>> reason, it's better not to copy it in Django, unlike simplejson and 
>>>> unittest2.
>>>> 
>>>> It was split from Zope some time ago. It's a small amount of clean code 
>>>> and it could be maintained within Django if it was abandoned (however 
>>>> unlikely that sounds).
>>>> 
>>>> Miscellaneous
>>>> .............
>>>> 
>>>> The following items have caused bugs in the past and should be checked 
>>>> carefully:
>>>> 
>>>> - caching: add timezone to cache key? See #5691.
>>>> - functions that use LocalTimezone: naturaltime, timesince, timeuntil, 
>>>> dateformat.
>>>> - os.environ['TZ']. See #14264.
>>>> - time.tzset() isn't supported on Windows. See #7062.
>>>> 
>>>> Finally, my proposal shares some ideas with 
>>>> https://github.com/brosner/django-timezones; I didn't find any 
>>>> documentation, but I intend to review the code.
>>>> 
>>>> About me
>>>> --------
>>>> 
>>>> I've been working with Django since 2008. I'm doing a lot of triage in 
>>>> Trac, I've written some patches (notably r16349, r16539, r16548, also some 
>>>> documentation improvements and bug fixes), and I've helped to set up 
>>>> continuous integration (especially for Oracle). In my day job, I'm 
>>>> producing enterprise software based on Django with a team of ten 
>>>> developers.
>>>> 
>>>> Work plan
>>>> ---------
>>>> 
>>>> Besides the research that's about 50% done, and discussion that's going to 
>>>> take place now, I expect the implementation and tests to take me around 
>>>> 80h. Given how much free time I can devote to Django, this means three to 
>>>> six months.
>>>> 
>>>> Here's an overview of my work plan:
>>>> 
>>>> - Implement the USE_TZ flag and database support — this requires checking 
>>>> the capabilities of each supported database in terms of datetime types and 
>>>> time zone support. Write tests, especially to ensure backwards 
>>>> compatibility. Write docs. (20h)
>>>> 
>>>> - Implement timezone localization in templates. Write tests. Write docs. 
>>>> (10h)
>>>> 
>>>> - Implement timezone localization in widgets and forms. Check the admin 
>>>> thoroughly. Write tests. Write docs. (15h)
>>>> 
>>>> - Implement the utilities to set the user's timezone. Write tests. Write 
>>>> docs. (15h)
>>>> 
>>>> - Reviews, etc. (20h)
>>>> 
>>>> What's next?
>>>> ------------
>>>> 
>>>> Constructive criticism, obviously :) Remember that the main problems here 
>>>> are backwards-compatibility and keeping things simple.
>>>> 
>>>> Best regards,
>>>> 
>>>> -- 
>>>> Aymeric.
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> Annex: Research notes
>>>> ---------------------
>>>> 
>>>> Wiki
>>>> ....
>>>> 
>>>> [GSOC] 
>>>> https://code.djangoproject.com/wiki/SummerOfCode2011#Multipletimezonesupportfordatetimerepresentation
>>>> 
>>>> Relevant tickets
>>>> ................
>>>> 
>>>> #2626: canonical ticket for this issue
>>>> 
>>>> #2447: dupe, an alternative solution
>>>> #8953: dupe, not much info
>>>> #10587: dupe, a fairly complete proposal, but doesn't address backwards 
>>>> compatibility for existing data
>>>> 
>>>> Relevant related tickets
>>>> ........................
>>>> 
>>>> #14253: how should "now" behave in the admin when "client time" != "server 
>>>> time"?
>>>> 
>>>> Irrelevant related tickets
>>>> ..........................
>>>> 
>>>> #11385: make it possible to enter data in a different timezone in 
>>>> DateTimeField
>>>> #12666: timezone in the 'Date:' headers of outgoing emails - independent 
>>>> resolution
>>>> 
>>>> Relevant threads
>>>> ................
>>>> 
>>>> 2011-05-31  Timezone-aware storage of DateTime
>>>> http://groups.google.com/group/django-developers/browse_thread/thread/76e2b486d561ab79
>>>> 
>>>> 2010-08-16  Datetimes with timezones for mysql
>>>> https://groups.google.com/group/django-developers/browse_thread/thread/5e220687b7af26f5
>>>> 
>>>> 2009-03-23  Django internal datetime handling
>>>> https://groups.google.com/group/django-developers/browse_thread/thread/ca023360ab457b91
>>>> 
>>>> 2008-06-25  Proposal: PostgreSQL backends should *stop* using 
>>>> settings.TIME_ZONE
>>>> http://groups.google.com/group/django-developers/browse_thread/thread/b8c885389374c040
>>>> 
>>>> 2007-12-02  Timezone aware datetimes and MySQL (ticket #5304)
>>>> https://groups.google.com/group/django-developers/browse_thread/thread/a9d765f83f552fa4
>>>> 
>>>> Relevant related threads
>>>> ........................
>>>> 
>>>> 2009-11-24  Why not datetime.utcnow() in auto_now/auto_now_add
>>>> http://groups.google.com/group/django-developers/browse_thread/thread/4ca560ef33c88bf3
>>>> 
>>>> Irrelevant related threads
>>>> ..........................
>>>> 
>>>> 2011-07-25  "c" date formating and Internet usage
>>>> https://groups.google.com/group/django-developers/browse_thread/thread/61296125a4774291
>>>> 
>>>> 2011-02-10  GSoC 2011 student contribution
>>>> https://groups.google.com/group/django-developers/browse_thread/thread/0596b562cdaeac97/585ce1b04632198a?#585ce1b04632198a
>>>> 
>>>> 2010-11-04  Changing settings per test
>>>> https://groups.google.com/group/django-developers/browse_thread/thread/65aabb45687e572e
>>>> 
>>>> 2009-09-15  What is the status of auto_now and auto_now_add?
>>>> https://groups.google.com/group/django-developers/browse_thread/thread/cd1a76bca6055179
>>>> 
>>>> 2009-03-09  TimeField broken in Oracle
>>>> https://groups.google.com/group/django-developers/browse_thread/thread/bba2f80a2ca9b068
>>>> 
>>>> 2009-01-12  Rolling back tests -- status and open issues
>>>> https://groups.google.com/group/django-developers/browse_thread/thread/1e4f4c840b180895
>>>> 
>>>> 2008-08-05  Transactional testsuite
>>>> https://groups.google.com/group/django-developers/browse_thread/thread/49aa551ad41fb919
>>>> 
>>> 
>> 
> 
