This discussion is about a known problem with pendulum, how we could deal with it, and how we (as a community) might help the author.

The library is mostly maintained by a single author, Sébastien Eustace (https://github.com/sdispater), and it seems we have run into the situation described in xkcd #2347 (https://imgs.xkcd.com/comics/dependency.png). To be honest, a library maintained mainly by one author is nothing new, and it always carries the risk that the library will stop being supported or be abandoned. Given that pendulum provides core functionality in Airflow, this could have a dramatic impact in the future.

Pendulum is a really nice library that helps a lot of developers work with dates and datetimes. However, there is one major problem: the last release happened more than 3 years ago (https://pypi.org/project/pendulum/#history), around the time Airflow 1.10.11 was released.

Fortunately, the project is not abandoned, and commits land in the master branch on a regular basis. However, these commits are not included in any final release, which is why some datetime-related things don't work as expected in Airflow.

Here is the list of issues known to me that affect Airflow:

*Memory Leak on parse*:
- https://github.com/sdispater/pendulum/issues/720, fixed 2 years ago but the fix is not available in a release yet (https://github.com/sdispater/pendulum/pull/563).

Since we parse dates in the Airflow codebase (datetime parameters and datetimes in logs), this could be a cause of the memory leaks reported in Airflow:
- https://github.com/apache/airflow/discussions/24694
- https://github.com/apache/airflow/discussions/28597

*Incorrect time zones*: known issues that should already be fixed in the master branch
- https://github.com/sdispater/pendulum/issues/700, Mexico does not use DST anymore
- https://github.com/sdispater/pendulum/issues/706, Egypt reinstated DST

We added a clarification in https://github.com/apache/airflow/pull/30467, however it seems there is no other way than patching Pendulum right now.
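To check whether the parse memory leak listed above shows up in a given environment, something like the following rough sketch might help. It is not taken from the pendulum or Airflow codebases: `memory_growth` is a hypothetical helper name, and it only uses the stdlib `tracemalloc` and `gc` modules to measure how much memory stays allocated after calling a parser many times. The demo uses `datetime.fromisoformat` as a stand-in parser so that the snippet runs without pendulum installed.

```python
import gc
import tracemalloc
from datetime import datetime
from typing import Callable


def memory_growth(parse: Callable[[str], object], value: str, iterations: int = 10_000) -> int:
    """Return the net number of bytes still allocated after repeated parse(value) calls.

    Hypothetical helper for illustration only; it is not part of pendulum or Airflow.
    """
    parse(value)  # warm-up call, so one-time caches are not counted as growth
    gc.collect()
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    for _ in range(iterations):
        parse(value)
    gc.collect()  # drop collectable garbage so only retained memory is measured
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


if __name__ == "__main__":
    # Stand-in parser from the stdlib; swap in the parser you want to inspect.
    growth = memory_growth(datetime.fromisoformat, "2023-09-20T12:00:00+00:00")
    print(f"net growth: {growth} bytes")
```

With pendulum installed, passing `pendulum.parse` as the first argument would exercise the code path from the issue. A growth figure that scales with `iterations` would suggest the leak is present in that environment, while a small constant figure would suggest it is not.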
All these issues should be solved as soon as pendulum 3 is released. The currently announced estimate is the end of September or beginning of October: https://github.com/sdispater/pendulum/issues/600#issuecomment-1711299677

So in theory we will have a fixed version of pendulum soon. It might break something in Airflow, but from my point of view that is better than the current state. However, the pendulum release might be postponed, so it may be better to have a backup plan.

What could we do in this case?

Maybe we should start using zoneinfo.ZoneInfo instead of pendulum datetime? https://github.com/apache/airflow/issues/19450

Pros:
- stdlib (Python 3.9+)
- In pendulum 3.0, Timezone is based on zoneinfo.ZoneInfo

Cons:
- The current serialization model can't deal with backport packages. E.g. timezones serialized with backports.zoneinfo can't be deserialized with zoneinfo.

Maybe we should replace datetime parsing with another solution. Does anyone know a good replacement?

Maybe someone from the Airflow community could offer help with maintaining the library:
- https://github.com/sdispater/pendulum/issues/590

Maybe we should get rid of pendulum altogether, as a last resort. I can't imagine how we could do that, because a lot of stuff depends on pendulum and removing it would be a breaking change.

----
Best Wishes
*Andrey Anshin*
