Greetings all,

Now that PEP 495 <https://www.python.org/dev/peps/pep-0495/> (the fold
attribute, added in Python 3.6) and PEP 615
<https://www.python.org/dev/peps/pep-0615/> (the zoneinfo module, added
in Python 3.9) have been accepted, there's not much reason to continue
using pytz, and its non-standard interface is a major source of bugs
<https://blog.ganssle.io/articles/2018/03/pytz-fastest-footgun.html>. In
fact, the creator and maintainer of pytz said during the PEP discussions
that he was looking forward to deprecating pytz
<https://discuss.python.org/t/pep-615-support-for-the-iana-time-zone-database-in-the-standard-library/3468/16>.
After merging zoneinfo upstream into CPython, I turned the reference
implementation into a backport to Python 3.6+
<https://github.com/pganssle/zoneinfo>, which I continue to maintain
alongside the upstream zoneinfo module, so it is already possible to
migrate Django /today/.

Right now, Django continues to use pytz as its source of time zones,
even for simple zones like UTC; I would like to propose starting the
migration to zoneinfo /now/, to enable Django users to start doing their
own migrations, and so there can be ample time for proper deprecation
cycles. I apologize in advance for the huge wall of text; the TL;DR
summary is:

  * It's important to start this migration as soon as possible, because
    it will take a long time and some of the bugs it fixes get worse as
    time goes on; right now Django's assumption that all time zones are
    from pytz will likely block Django users from migrating before
    Django does.
  * The combination of pytz's non-standard interface and Hyrum's Law
    <https://www.hyrumslaw.com/> makes this less straightforward than
    one would hope. Likely the best solution is to adopt "zoneinfo"
    immediately with a wrapper that deprecates the pytz-specific
    interface. I have provided a library to do this
    <https://pypi.org/project/pytz-deprecation-shim/>.
  * There are some open questions (which may only be open because I
    don't know Django's process well enough) in the migration plan:
      o Should this change be put under a feature flag of some sort? If
        so, what should be the default values for coming releases?
      o Should /all/ the pytz-specific tests be kept with zoneinfo tests
        added, or should they be migrated and pytz tests confined to a
        smaller subset?

*Rationale*

Before I get into the detailed migration plan, I'd like to lay out the
case for /why/ this needs to be done. I expect it to be at least
somewhat painful, and I'd prefer it if y'all were convinced that
it's the right thing to do /before/ you hear about the costs 😉. I am
not the kind of person who thinks that just because something is in the
standard library it is automatically "better" than what's available on
PyPI, but in the case of pytz, I've been recommending people move away
from it for a long time, because most people don't know how to use it
correctly. The issue is that pytz was originally designed to work around
issues with ambiguous and imaginary times that were fixed by PEP 495 in
Python 3.6, and the compromise it made for correctness is that it is not
really compatible with the standard tzinfo interface. That's why you
aren't supposed to directly attach a timezone with the tzinfo argument,
and why datetime arithmetic requires normalization. Any substantial code
base I've ever seen that uses pytz either has bugs related to this or
had a bunch of bugs related to this until someone noticed the problem
and did some sort of large-scale change to fix all the bugs, then
imposed some sort of linting rules.
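
For contrast, here is a minimal sketch of how the same operations look
with a PEP 495-aware zone from zoneinfo. It assumes Python 3.9+ and that
IANA time zone data is available (from the system or the tzdata package);
the America/New_York zone and the 2020 dates are only illustrative choices.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo  # Python 3.9+; needs system tz data or tzdata

NYC = ZoneInfo("America/New_York")  # illustrative zone, not from the proposal

# With zoneinfo, attaching the zone through the tzinfo argument is fine.
before = datetime(2020, 10, 31, 12, 0, tzinfo=NYC)
offset_before = before.utcoffset()  # daylight time, UTC-4

# Wall-clock arithmetic across the DST transition needs no normalize() step.
after = before + timedelta(days=1)
offset_after = after.utcoffset()  # standard time, UTC-5

# 01:30 on 2020-11-01 occurs twice; the fold attribute selects which one.
first = datetime(2020, 11, 1, 1, 30, tzinfo=NYC)           # fold=0
second = datetime(2020, 11, 1, 1, 30, fold=1, tzinfo=NYC)  # fold=1
```

With pytz, the equivalent code would go through localize() and
normalize() instead of the constructor and plain arithmetic.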

In addition to the "pytz is hard to use" problems (which, honestly,
should probably be enough), pytz also has a few additional issues that
the maintainer has said will not be fixed (unless pytz becomes a thin
wrapper around zoneinfo, which is at best just using a slower version of
zoneinfo). The biggest issue, which is going to become increasingly
relevant in the coming years, is that it only supports the Version 1
spec in TZif files, which (among other issues) does not support
datetimes after 2038 (or before 1902). The Version 1 format was
deprecated 15 years ago, and it is unlikely that you will find modern
TZif files that don't have Version 2+ data. pytz also does not support
sub-minute offsets,
which is mostly relevant only in historical time zones. And, of course,
pytz is not compatible with PEP 495 and in some ways really cannot be
made compatible with PEP 495 (at least not easily).
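
The 1902 and 2038 limits follow from Version 1 TZif data storing
transition times as signed 32-bit counts of seconds since the Unix epoch.
A quick sketch of that arithmetic, using only the standard library (the
variable names are mine):

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Range of a signed 32-bit integer, interpreted as seconds since the epoch.
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

earliest = EPOCH + timedelta(seconds=INT32_MIN)
latest = EPOCH + timedelta(seconds=INT32_MAX)

print(earliest.isoformat())  # 1901-12-13T20:45:52+00:00
print(latest.isoformat())    # 2038-01-19T03:14:07+00:00
```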

Of course, one reasonable objection here is that Django is a huge
install base with very long release cycles, and it doesn't make sense
for Django to be an experimental "early adopter" of the new library.
This is a reasonable response, but it's actually /because/ it has a huge
install base and long release cycles that it's important for Django to
migrate early, because it can't "turn on a dime" like that. The long
release cycles mean that changes made /now/ won't be universal for many
years, and it's important for users to have a long notice period that
change is coming (particularly since, as time goes on, Year 2038 bugs
will become more and more common). The huge install base means that /at
a minimum/, zoneinfo should be /supported/, to allow users to start
their own migrations today.

*Migration Plan*

I am fairly certain this is going to be a tricky migration and will
inevitably come with /some/ user pain. I don't think this will be Python
2 → 3 style pain, but some users who have been doing the "right thing"
with pytz will need to make changes to their code in the long run, which
is unfortunate. I think there are several stages of "support" for
zoneinfo, not all of which are mandatory.

 1. Drop all requirements that time zones support pytz-specific
    interfaces internally — make it so that end users have the /option/
    to use zoneinfo and datetime.timezone.
 2. Document that users /should/ use zoneinfo, even if it's not the default.
 3. Convert all uses of pytz to using a shim that supports pytz's
    interface, but raises warnings whenever something pytz-specific is
    used. [0] I have already created a library for this purpose
    <https://pypi.org/project/pytz-deprecation-shim/>. [1] This has
    three sub-stages:
     1. Make this opt-in: off by default, /enabled/ by a feature flag. [2]
     2. Make this opt-out: on by default, /disabled/ by a feature flag.
     3. Remove all feature flags related to this, make it mandatory.
 4. Remove the shims, making zoneinfo the default for all options, but
    maintaining support for user-supplied pytz zones (again this can be
    rolled out under feature flags if desired).
 5. Remove support for pytz entirely.

I am not sufficiently familiar with Django's development process to know
if feature flags are frequently used. I recommend /at a minimum/ doing
#1 immediately and adding tests for compatibility with the zoneinfo
module. Preferably that would be accompanied with #2 as well.
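
To illustrate what stage 1 could mean in practice, here is a rough
sketch of a helper that accepts any tzinfo rather than assuming pytz.
The names `attach_zone` and `FakePytzZone` are hypothetical and are not
taken from Django or from the proof-of-concept PR; the fake zone exists
only so the sketch runs without pytz installed.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def attach_zone(naive, tz):
    """Hypothetical helper: make a naive datetime aware in ``tz``."""
    if naive.tzinfo is not None:
        raise ValueError("expected a naive datetime")
    localize = getattr(tz, "localize", None)
    if localize is not None:
        # pytz-style zone: localize() is the supported way to attach it.
        return localize(naive)
    # Standard tzinfo (zoneinfo, datetime.timezone, ...): attach directly.
    return naive.replace(tzinfo=tz)


class FakePytzZone(tzinfo):
    """Stand-in with a pytz-like localize(), fixed at UTC+1."""

    def utcoffset(self, dt):
        return timedelta(hours=1)

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return "FAKE"

    def localize(self, dt):
        return dt.replace(tzinfo=self)


naive = datetime(2020, 1, 1, 12, 0)
aware_std = attach_zone(naive, timezone.utc)     # direct attachment path
aware_pytz = attach_zone(naive, FakePytzZone())  # localize() path
```

The point of the design is that the pytz-specific branch is isolated in
one place, so it can later be wrapped in a warning and eventually removed.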

I have created a Proof of Concept PR
<https://github.com/django/django/pull/13073> using the deprecation
shims (#3), but with no feature flags, which I think is the /fastest/
you should move on this. It was a fairly minimal change, which was
encouraging, but a great many of the tests still have an explicit
dependency on pytz. This /will break some users/, as is probably evident
from the fact that I needed to make changes /other than/ the
constructors and fixes to warnings. Encouragingly, the only test I
needed to touch was to fix a warning, not an error; to the extent that
the Django test suite describes Django's public interface, this is a
"non-breaking change", but as I mention in the pytz-deprecation-shim
migration guide
<https://pytz-deprecation-shim.readthedocs.io/en/latest/migration.html>,
there are some semantic differences that you could easily encounter if
you are counting on `django.utils.timezone.get_current_timezone()`
returning a pytz time zone instead of a shim class. The majority of
people won't encounter these, but... Hyrum's Law.

One remaining question here is what to do with the pytz-specific tests.
Until Django gets to stage 5 (pytz zones aren't even /supported/), there
should be at least /some/ tests for pytz support, but presumably most of
the tests that explicitly use pytz are only doing so incidentally (e.g.
almost all uses of `pytz.utc`). In stage 3, I would think that most
pytz-specific tests should either be parameterized to test with both
pytz and zoneinfo or tested only with zoneinfo.
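
As a rough sketch of what "parameterized to test with both" could look
like, the test below runs the same assertions against every UTC
implementation it can find. The function and class names are
hypothetical, and this uses plain unittest rather than anything specific
to Django's test suite; zoneinfo and pytz are treated as optional so the
sketch runs wherever at least the standard library is present.

```python
import unittest
from datetime import datetime, timedelta, timezone


def available_utc_zones():
    """Collect whichever UTC implementations are usable here."""
    zones = [("datetime.timezone", timezone.utc)]
    try:
        from zoneinfo import ZoneInfo
        zones.append(("zoneinfo", ZoneInfo("UTC")))
    except (ImportError, KeyError):  # no module, or no tz data installed
        pass
    try:
        import pytz
        zones.append(("pytz", pytz.utc))
    except ImportError:
        pass
    return zones


class UTCBehaviorTests(unittest.TestCase):
    def test_utc_offset_is_zero(self):
        for label, tz in available_utc_zones():
            # subTest reports each zone implementation separately.
            with self.subTest(zone=label):
                dt = datetime(2020, 6, 1, 12, 0, tzinfo=tz)
                self.assertEqual(dt.utcoffset(), timedelta(0))
                self.assertEqual(dt.astimezone(timezone.utc).hour, 12)
```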

I am not terribly familiar with the release schedule or backwards
compatibility guarantees that Django makes in point versions, etc. I
read the Django documentation on stability, but it suffers from the same
problems that SemVer in general suffers from (and that there's no
avoiding, really), which is that breaking changes are in the eye of the
beholder. I leave it to y'all to decide the roll-out schedule for this
stuff (assuming it's accepted at all), but I'm happy to offer what
advice I can on the matter.

For those of you who have made it this far, thank you for indulging me.

Best,
Paul


[0] There's an additional step between 2 and 3 that can be taken, which
is the adoption of pytz-deprecation-shim (or an equivalent thereof), but
only for UTC. Most (all?) of the semantic differences between pytz and
the shims don't come up with fixed-offset zones or UTC, and a shim
around UTC to provide localize and normalize + warning can be a wrapper
around datetime.timezone.utc rather than the much newer zoneinfo.
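
A minimal sketch of what such a UTC-only shim might look like follows.
The class name `UTCShim` and the warning messages are hypothetical; this
is not the pytz-deprecation-shim implementation, just an illustration of
the idea using only the standard library.

```python
import warnings
from datetime import datetime, timedelta, timezone, tzinfo


class UTCShim(tzinfo):
    """Hypothetical UTC zone with deprecated pytz-style methods."""

    def utcoffset(self, dt):
        return timezone.utc.utcoffset(dt)

    def tzname(self, dt):
        return timezone.utc.tzname(dt)

    def dst(self, dt):
        # A zero timedelta (rather than None) keeps the default
        # tzinfo.fromutc() working, which astimezone() relies on.
        return timedelta(0)

    def localize(self, dt, is_dst=False):
        warnings.warn("localize() is pytz-specific; use replace(tzinfo=...)",
                      DeprecationWarning, stacklevel=2)
        if dt.tzinfo is not None:
            raise ValueError("expected a naive datetime")
        return dt.replace(tzinfo=self)

    def normalize(self, dt):
        warnings.warn("normalize() is pytz-specific and no longer needed",
                      DeprecationWarning, stacklevel=2)
        if dt.tzinfo is None:
            raise ValueError("expected an aware datetime")
        return dt.astimezone(self)


utc = UTCShim()
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    aware = utc.localize(datetime(2020, 1, 1, 12, 0))  # warns, still works
```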

[1] I created pytz-deprecation-shim specifically to help migrate big
code bases and popular libraries like Django and pandas off pytz. If
you are concerned about taking on an additional dependency, it's not
terribly difficult to extract the core of the library directly into
Django (for example, Django doesn't need any of the stuff for Python 2
compatibility). The downside is that whatever support exists for a
Django fork would lag behind the upstream library.

[2] Unless this is a build-time flag, there's no way to have Django
defaulting to having pytz as a required dependency with an option to not
depend on it.

