Hi,

I would like to propose that, as a community, we jointly maintain a curated set of Python packages that are known to work together. These packages would receive security updates for some time, and every couple of months a new major release of the curated set would become available. The idea is inspired by Haskell LTS, so maybe we should call this PyPI LTS?
So why a PyPI LTS?

PyPI makes available all versions of packages that were uploaded, and by default installers like pip will try to use the latest available versions of packages, unless told otherwise. With a requirements.txt file (or a future pipfile.lock) and setup.py we can pin, as tightly as we like, the requirements of the environment and of the package, respectively, thereby making a more reproducible environment possible and also fixing the API for developers. Pinning requirements is often a manual job, although one could use pip freeze or other tools.

A common problem arises when two packages in an environment require different versions of the same dependency. With a curated set of packages, developers could be encouraged to test against the latest stable and nightly of the curated package set, thereby increasing compatibility between different packages, something I think we all want.

Having a compatible set of packages is not only interesting for developers, but also for downstream distributions. All distributions try to find a set of packages that work together and release them. This is a lot of work, and I think it would be to everyone's benefit if we try to solve this issue together.

A possible solution

Downstream, that is developers and distributions, will need a set of packages that are known to work together. At minimum this would consist of, per package, the name of the package and its version, but for reproducibility I would propose adding the filename and hash as well. Because there isn't any reliable method to extract the requirements of a package, I propose also including `setup_requires`, `install_requires`, and `tests_require` explicitly. That way, distributions can automatically build recipes for the packages (although non-Python dependencies would still have to be resolved by the distribution).
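To make the per-package record a bit more concrete, here is a minimal sketch in Python of what one entry could look like and how a downloaded file could be checked against it. This is only an illustration under assumptions: the class and function names, the `sha256` field (the proposal only says "hash"), the JSON layout, and the example package are all made up, not part of any existing tool.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass
class PackageEntry:
    """One pinned package in a hypothetical curated set."""
    name: str
    version: str
    filename: str
    sha256: str  # assumption: SHA-256 is used as the hash
    setup_requires: list = field(default_factory=list)
    install_requires: list = field(default_factory=list)
    tests_require: list = field(default_factory=list)


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(data).hexdigest()


def matches_pin(entry: PackageEntry, data: bytes) -> bool:
    """Check that downloaded bytes match the hash recorded in the set."""
    return sha256_of(data) == entry.sha256


def dump_set(entries) -> str:
    """Serialize the entries to JSON, keyed by package name."""
    return json.dumps({e.name: asdict(e) for e in entries},
                      indent=2, sort_keys=True)


# Made-up payload standing in for a real sdist; no real package is implied.
payload = b"not a real sdist"
example = PackageEntry(
    name="example-pkg",
    version="1.0.0",
    filename="example-pkg-1.0.0.tar.gz",
    sha256=sha256_of(payload),
    install_requires=["other-pkg"],
)

print(dump_set([example]))
```

A distribution could read such a file, fetch each `filename`, call something like `matches_pin` before building, and generate its recipes from the three explicit requirement lists.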
The package set would be released as lts-YYYY-MM-REVISION. Developers could choose to track a specific revision, but would typically be asked to track only lts-YYYY-MM, which would resolve to the latest REVISION.

Because dependencies vary per Python language version, interpreter, and operating system, we would need one of these sets for each combination. I therefore propose having a source which evaluates to, say, a TOML/JSON file per version/interpreter/OS. How this source file should be written I don't know; while I think the Nix expression language is an excellent choice for this, it is not possible for everyone to use and is therefore likely not an option.

Open questions

There are still plenty of open questions.

- Who decides when a package is updated in a way that would break dependents? This is an issue all distributions face, so maybe we should involve them.
- How would this be integrated with pip / virtualenv / pipfile.lock / requirements.txt / setup.py? See e.g. https://github.com/pypa/pipfile/issues/10#issuecomment-262229620

References to Haskell LTS

Here are several links to some interesting documents on how Haskell LTS works.

- A blog post describing what Haskell LTS is: https://www.fpcomplete.com/blog/2014/12/backporting-bug-fixes
- Rules regarding uploading and breaking packages: https://github.com/fpco/stackage/blob/master/MAINTAINERS.md#adding-a-package
- The actual LTS files: https://github.com/fpco/lts-haskell

What do you think of this proposal? Would you be interested in this as a developer or packager?

Freddy
_______________________________________________ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig