Can you clarify the relationship to PEP 426 metadata? There's no standard for metadata in here other than what's required to run a build hook. Does that imply that each build tool would enforce its own convention for where metadata is found?
On Thu, Oct 1, 2015 at 9:53 PM, Nathaniel Smith <n...@pobox.com> wrote:
> Hi all,
>
> We realized that actually as far as we could tell, it wouldn't be that hard at this point to clean up how sdists work so that it would be possible to migrate away from distutils. So we wrote up a little draft proposal.
>
> The main question is, does this approach seem sound?
>
> -n
>
> ---
>
> PEP: ??
> Title: Standard interface for interacting with source trees
>        and source distributions
> Version: $Revision$
> Last-Modified: $Date$
> Author: Nathaniel J. Smith <n...@pobox.com>
>         Thomas Kluyver <tak...@gmail.com>
> Status: Draft
> Type: Standards-Track
> Content-Type: text/x-rst
> Created: 30-Sep-2015
> Post-History:
> Discussions-To: <distutils-sig@python.org>
>
> Abstract
> ========
>
> Distutils delenda est.
>
>
> Extended abstract
> =================
>
> While ``distutils`` / ``setuptools`` have taken us a long way, they suffer from three serious problems: (a) they're missing important features like autoconfiguration and usable build-time dependency declaration, (b) extending them is quirky, complicated, and fragile, (c) you are forced to use them anyway, because they provide the standard interface for installing python packages expected by both users and installation tools like ``pip``.
>
> Previous efforts (e.g. distutils2 or setuptools itself) have attempted to solve problems (a) and/or (b). We propose to solve (c).
>
> The goal of this PEP is to get distutils-sig out of the business of being a gatekeeper for Python build systems. If you want to use distutils, great; if you want to use something else, then the more the merrier. The difficulty of interfacing with distutils means that there aren't many such systems right now, but to give a sense of what we're thinking about see `flit <https://github.com/takluyver/flit>`_ or `bento <https://cournape.github.io/Bento/>`_. Fortunately, wheels have now solved many of the hard problems here -- e.g.
> it's no longer necessary that a build system also know about every possible installation configuration -- so pretty much all we really need from a build system is that it have some way to spit out standard-compliant wheels.
>
> We therefore propose a new, relatively minimal interface for installation tools like ``pip`` to interact with package source trees and source distributions.
>
>
> Synopsis and rationale
> ======================
>
> To limit the scope of our design, we adopt several principles.
>
> First, we distinguish between a *source tree* (e.g., a VCS checkout) and a *source distribution* (e.g., an official snapshot release like ``lxml-3.4.4.zip``).
>
> There isn't a whole lot that *source trees* can be assumed to have in common. About all you know is that they can -- via some more or less Rube-Goldbergian process -- produce one or more binary distributions. In particular, you *cannot* tell via simple static inspection:
>
> - What version number will be attached to the resulting packages (e.g. it might be determined programmatically by consulting VCS metadata -- I have here a build of numpy version "1.11.0.dev0+4a9ad17")
>
> - What build- or run-time dependencies are required (e.g. these may depend on arbitrarily complex configuration settings that are determined via a mix of manual settings and auto-probing)
>
> - Or even how many distinct binary distributions will be produced (e.g. a source distribution may always produce wheel A, but only produce wheel B when built on Unix-like systems).
>
> Therefore, when dealing with source trees, our goal is just to provide a standard UX for the core operations that are commonly performed on other people's packages; anything fancier and more developer-centric we leave at the discretion of individual package developers.
> So our source trees just provide some simple hooks to let a tool like ``pip``:
>
> - query for build dependencies
> - run a build, producing wheels as output
> - set up the current source tree so that it can be placed on ``sys.path`` in "develop mode"
>
> and that's it. We teach users that the standard way to install a package from a VCS checkout is now ``pip install .`` instead of ``python setup.py install``. (This is already a good idea anyway -- e.g., pip can do reliable uninstall / upgrades.)
>
> Next, we note that pretty much all the operations that you might want to perform on a *source distribution* are also operations that you might want to perform on a source tree, and via the same UX. The only thing you do with source distributions that you don't do with source trees is, well, distribute them. There are all kinds of metadata you could imagine including in a source distribution, but each piece of metadata puts an increased burden on source distribution generation tools, and most operations will still have to work without this metadata. So we only include extra metadata in source distributions if it helps solve specific problems that are unique to distribution. If you want wheel-style metadata, get a wheel and look at it -- they're great and getting better.
>
> Therefore, our source distributions are basically just source trees + a mechanism for signing.
>
> Finally: we explicitly do *not* have any concept of "depending on a source distribution". As in other systems like Debian, dependencies are always phrased in terms of binary distributions (wheels), and when a user runs something like ``pip install <package>``, then the long-run plan is that <package> and all its transitive dependencies should be available as wheels in a package index.
> But this is not yet realistic, so as a transitional / backwards-compatibility measure, we provide a simple mechanism for ``pip install <package>`` to handle cases where <package> is provided only as a source distribution.
>
>
> Source trees
> ============
>
> We retroactively declare the legacy source tree format involving ``setup.py`` to be "version 0". We don't try to specify it further; its de facto specification is encoded in the source code of ``distutils``, ``setuptools``, ``pip``, and other tools.
>
> A version 1-or-greater format source tree can be identified by the presence of a file ``_pypackage/_pypackage.cfg``.
>
> If both ``_pypackage/_pypackage.cfg`` and ``setup.py`` are present, then we have a version 1+ source tree, i.e., ``setup.py`` is ignored. This is necessary because we anticipate that version 1+ source trees may want to contain a ``setup.py`` file for backwards compatibility, e.g.::
>
>     #!/usr/bin/env python
>     import sys
>     print("Don't call setup.py directly!")
>     print("Use 'pip install .' instead!")
>     print("(You might have to upgrade pip first.)")
>     sys.exit(1)
>
> In the current version of the specification, the one file ``_pypackage/_pypackage.cfg`` is where pretty much all the action is (though see below). The motivation for putting it into a subdirectory is that:
>
> - the way of all standards is that cruft accumulates over time, so this way we pre-emptively have a place to put it,
>
> - real-world projects often accumulate build system cruft as well, so we might as well provide one obvious place to put it too.
>
> Of course this then creates the possibility of collisions between standard files and user files, and trying to teach arbitrary users not to scatter files around willy-nilly never works, so we adopt the convention that names starting with an underscore are reserved for official use, and non-underscored names are available for idiosyncratic use by individual projects.
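A quick note on the detection rule above: to check I'm reading it right, the installer-side logic would presumably boil down to something like this (the function name and return convention are mine, not the draft's, and a real tool would read the actual ``version`` key out of the cfg file rather than assuming 1):

```python
import os

def source_tree_format_version(tree):
    """Guess the source tree format version per the draft's rule.

    Illustrative sketch only -- the name and return convention are
    hypothetical, not part of the proposal.
    """
    cfg = os.path.join(tree, "_pypackage", "_pypackage.cfg")
    if os.path.exists(cfg):
        # Version 1+: any setup.py present is deliberately ignored.
        return 1
    if os.path.exists(os.path.join(tree, "setup.py")):
        # Legacy "version 0" tree, defined de facto by distutils/setuptools.
        return 0
    raise ValueError("%s is not a recognized source tree" % tree)
```

In particular this makes the precedence explicit: the presence of ``_pypackage/_pypackage.cfg`` wins even when ``setup.py`` exists alongside it.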
> The alternative would be to simply place the main configuration file at the top-level, create the subdirectory only when specifically needed (most trees won't need it), and let users worry about finding their own place for their cruft. Not sure which is the best approach. Plus we can have a nice bikeshed about the names in general (FIXME).
>
> _pypackage.cfg
> --------------
>
> The ``_pypackage.cfg`` file contains various settings. Another good bike-shed topic is which file format to use for storing these (FIXME), but for purposes of this draft I'll write examples using `toml <https://github.com/toml-lang/toml>`_, because you'll instantly be able to understand the semantics, it has similar expressivity to JSON while being more human-friendly (e.g., it supports comments and multi-line strings), it's better-specified than ConfigParser, and it's much simpler than YAML. Rust's package manager uses toml for similar purposes.
>
> Here's an example ``_pypackage/_pypackage.cfg``::
>
>     # Version of the "pypackage format" that this file uses.
>     # Optional. If not present then 1 is assumed.
>     # All version changes indicate incompatible changes; backwards
>     # compatible changes are indicated by just having extra stuff in
>     # the file.
>     version = 1
>
>     [build]
>     # An inline requirements file. Optional.
>     # (FIXME: I guess this means we need a spec for requirements files?)
>     requirements = """
>     mybuildtool >= 2.1
>     special_windows_tool ; sys_platform == "win32"
>     """
>     # The path to an out-of-line requirements file. Optional.
>     requirements-file = "build-requirements.txt"
>     # A hook that will be called to query build requirements. Optional.
>     requirements-dynamic = "mybuildtool:get_requirements"
>
>     # A hook that will be called to build wheels. Required.
>     build-wheels = "mybuildtool:do_build"
>
>     # A hook that will be called to do an in-place build (see below).
>     # Optional.
>     build-in-place = "mybuildtool:do_inplace_build"
>
>     # The "x" namespace is reserved for third-party extensions.
>     # To use x.foo you should own the name "foo" on pypi.
>     [x.mybuildtool]
>     spam = ["spam", "spam", "spam"]
>
> All paths are relative to the ``_pypackage/`` directory (so e.g. the build.requirements-file value above refers to a file named ``_pypackage/build-requirements.txt``).
>
> A *hook* is a Python object that is looked up using the same rules as traditional setuptools entry_points: a dotted module name, followed by a colon, followed by a dotted name that is looked up within that module. *Running a hook* means: first, find or create a python interpreter which is executing in the current venv, whose working directory is set to the ``_pypackage/`` directory, and which has the ``_pypackage/`` directory on ``sys.path``. Then, inside this interpreter, look up the hook object, and call it, with arguments as specified below.
>
> A build command like ``pip wheel <source tree>`` performs the following steps:
>
> 1) Validate the ``_pypackage.cfg`` version number.
>
> 2) Create an empty virtualenv / venv, that matches the environment that the installer is targeting (e.g. if you want wheels for CPython 3.4 on 64-bit windows, then you make a CPython 3.4 64-bit windows venv).
>
> 3) If the build.requirements key is present, then in this venv run the equivalent of ``pip install -r <a file containing its value>``, using whatever index settings are currently in effect.
>
> 4) If the build.requirements-file key is present, then in this venv run the equivalent of ``pip install -r <the named file>``, using whatever index settings are currently in effect.
>
> 5) If the build.requirements-dynamic key is present, then in this venv run the hook with no arguments, capture its stdout, and pipe it into ``pip install -r -``, using whatever index settings are currently in effect.
> If the hook raises an exception, then abort the build with an error.
>
> Note: because these steps are performed in sequence, the build.requirements-dynamic hook is allowed to use packages that are listed in build.requirements or build.requirements-file.
>
> 6) In this venv, run the build.build-wheels hook. This should be a Python function which takes one argument.
>
> This argument is an arbitrary dictionary intended to contain user-specified configuration, specified via some install-tool-specific mechanism. The intention is that tools like ``pip`` should provide some way for users to specify key/value settings that will be passed in here, analogous to the legacy ``--install-option`` and ``--global-option`` arguments.
>
> To make it easier for packages to transition from version 0 to version 1 sdists, we suggest that ``pip`` and other tools that have such existing option-setting interfaces SHOULD map them to entries in this dictionary when used -- e.g.::
>
>     pip --global-option=a --install-option=b --install-option=c
>
> could produce a dict like::
>
>     {"--global-option": ["a"], "--install-option": ["b", "c"]}
>
> The hook's return value is a list of pathnames relative to the scratch directory. Each entry names a wheel file created by this build.
>
> Errors are signaled by raising an exception.
>
> When performing an in-place build (e.g. for ``pip install -e .``), then the same steps are followed, except that instead of the build.build-wheels hook, we call the build.build-in-place hook, and instead of returning a list of wheel files, it returns the name of a directory that should be placed onto ``sys.path`` (usually this will be the source tree itself, but may not be, e.g. if a build system wants to enforce a rule where the source is always kept pristine then it could symlink the .py files into a build directory, place the extension modules and dist-info there, and return that).
> This directory must contain importable versions of the code in the source tree, along with appropriate .dist-info directories.
>
> (FIXME: in-place builds are useful but intrinsically kinda broken -- e.g. extensions / source / metadata can all easily get out of sync -- so while I think this paragraph provides a reasonable hack that preserves current functionality, maybe we should defer specifying them until after we've thought through the issues more?)
>
> When working with source trees, build tools like ``pip`` are encouraged to cache and re-use virtualenvs for performance.
>
>
> Other contents of _pypackage/
> -----------------------------
>
> _RECORD, _RECORD.jws, _RECORD.p7s: see below.
>
> _x/<pypi name>/: reserved for use by tools (e.g. _x/mybuildtool/build/, _x/pip/venv-cache/cp34-none-linux_x86_64/)
>
>
> Source distributions
> ====================
>
> A *source distribution* is a file in a well-known archive format such as zip or tar.gz, which contains a single directory, and this directory is a source tree (in the sense defined in the previous section).
>
> The ``_pypackage/`` directory in a source distribution SHOULD also contain a _RECORD file, as defined in PEP 427, and MAY also contain _RECORD.jws and/or _RECORD.p7s signature files.
>
> For official releases, source distributions SHOULD be named as ``<package>-<version>.<ext>``, and the directory they contain SHOULD be named ``<package>-<version>``, and building this source tree SHOULD produce a wheel named ``<package>-<version>-<compatibility tag>.whl`` (though it may produce other wheels as well).
>
> (FIXME: maybe we should add that if you want your sdist on PyPI then you MUST include a proper _RECORD file and use the proper naming convention?)
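Circling back to the hook definition earlier: if "same rules as traditional setuptools entry_points" means what I think, the lookup itself reduces to something like the sketch below. (This deliberately ignores the venv / working-directory setup described in the draft, which is the hard part; only the ``module:object.attr`` string format comes from the proposal.)

```python
import importlib

def load_hook(spec):
    """Resolve an entry_points-style "module.path:object.attr" string.

    Sketch only; a real tool would run this inside the target venv,
    with _pypackage/ as the working directory and on sys.path.
    """
    module_name, _, object_path = spec.partition(":")
    obj = importlib.import_module(module_name)
    # Walk the dotted attribute path within the module.
    for attr in object_path.split("."):
        obj = getattr(obj, attr)
    return obj

# e.g. load_hook("mybuildtool:do_build") would import mybuildtool
# and return its do_build attribute.
```

If that's right, it might be worth saying so explicitly in the spec rather than referencing setuptools behavior.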
> Integration tools like ``pip`` SHOULD take advantage of this convention by applying the following heuristic: when seeking a package <package>, if no appropriate wheel can be found, but an sdist named <package>-<version>.<ext> is found, then:
>
> 1) build the sdist
> 2) add the resulting wheels to the package search space
> 3) retry the original operation
>
> This handles a variety of simple and complex cases -- for example, if we need a package 'foo', and we find foo-1.0.zip which builds foo.whl and bar.whl, and foo.whl depends on bar.whl, then everything will work out. There remain other cases that are not handled, e.g. if we start out searching for bar.whl we will never discover foo-1.0.zip. We take the perspective that this is nonetheless sufficient for a transitional heuristic, and anyone who runs into this problem should just upload wheels already. If this turns out to be inadequate in practice, then it will be addressed by future extensions.
>
>
> Examples
> ========
>
> **Example 1:** While we assume that installation tools will have to continue supporting version 0 sdists for the indefinite future, it's a useful check to make sure that our new format can continue to support packages using distutils / setuptools as their build system.
> We assume that a future version of ``pip`` will take its existing knowledge of distutils internals and expose them as the appropriate hooks, and then existing distutils / setuptools packages can be ported forward by using the following ``_pypackage/_pypackage.cfg``::
>
>     [build]
>     requirements = """
>     pip >= whatever
>     wheel
>     """
>     # Applies monkeypatches, then does 'setup.py dist_info' and
>     # extracts the setup_requires
>     requirements-dynamic = "pip.pypackage_hooks:setup_requirements"
>     # Applies monkeypatches, then does 'setup.py wheel'
>     build-wheels = "pip.pypackage_hooks:build_wheels"
>     # Applies monkeypatches, then does:
>     #   setup.py dist_info && setup.py build_ext -i
>     build-in-place = "pip.pypackage_hooks:build_in_place"
>
> This is also useful for any other installation tools that may want to support version 0 sdists without having to implement bug-for-bug compatibility with pip -- if no ``_pypackage/_pypackage.cfg`` is present, they can use this as a default.
>
> **Example 2:** For packages using numpy.distutils. This is identical to the distutils / setuptools example above, except that numpy is moved into the list of static build requirements. Right now, most projects using numpy.distutils don't bother trying to declare this dependency, and instead simply error out if numpy is not already installed. This is because currently the only way to declare a build dependency is via the ``setup_requires`` argument to the ``setup`` function, and in this case the ``setup`` function is ``numpy.distutils.setup``, which... obviously doesn't work very well.
> Drop this ``_pypackage.cfg`` into an existing project like this and it will become robustly pip-installable with no further changes::
>
>     [build]
>     requirements = """
>     numpy
>     pip >= whatever
>     wheel
>     """
>     requirements-dynamic = "pip.pypackage_hooks:setup_requirements"
>     build-wheels = "pip.pypackage_hooks:build_wheels"
>     build-in-place = "pip.pypackage_hooks:build_in_place"
>
> **Example 3:** `flit <https://github.com/takluyver/flit>`_ is a tool designed to make distributing simple packages simple, but it currently has no support for sdists, and for convenience includes its own installation code that's redundant with that in pip. These 4 lines of boilerplate make any flit-using source tree pip-installable, and let flit get out of the package installation business::
>
>     [build]
>     requirements = "flit"
>     build-wheels = "flit.pypackage_hooks:build_wheels"
>     build-in-place = "flit.pypackage_hooks:build_in_place"
>
>
> FAQ
> ===
>
> **Why is it version 1 instead of version 2?** Because the legacy sdist format is barely a format at all, and to `remind us to keep things simple <https://en.wikipedia.org/wiki/The_Mythical_Man-Month#The_second-system_effect>`_.
>
> **What about cross-compilation?** Standardizing an interface for cross-compilation seems premature given how complicated the configuration required can be, the lack of an existing de facto standard, and the authors of this PEP's inexperience with cross-compilation. This would be a great target for future extensions, though. In the meantime, there's no requirement that ``_pypackage/_pypackage.cfg`` contain the *only* entry points to a project's build system -- packages that want to support cross-compilation can still do so, they'll just need to include a README explaining how to do it.
>
> **PEP 426 says that the new sdist format will support automatically creating policy-compliant .deb/.rpm packages.
> What happened to that?** Step 1: enhance the wheel format as necessary so that a wheel can be automatically converted into a policy-compliant .deb/.rpm package (see PEP 491). Step 2: make it possible to automatically turn sdists into wheels (this PEP). Step 3: we're done.
>
> **What about automatically running tests?** Arguably this is another thing that should be pushed off to wheel metadata instead of sdist metadata: it's good practice to include tests inside your built distribution so that end-users can test their install (and see above re: our focus here being on stuff that end-users want to do, not dedicated package developers), there are lots of packages that have to be built before they can be tested anyway (e.g. because of binary extensions), and in any case it's good practice to test against an installed version in order to make sure your install code works properly. But even if we do want this in sdist, then it's hardly urgent (e.g. there is no ``pip test`` that people will miss), so we defer that for a future extension to avoid blocking the core functionality.
>
> --
> Nathaniel J. Smith -- http://vorpus.org
> _______________________________________________
> Distutils-SIG maillist - Distutils-SIG@python.org
> https://mail.python.org/mailman/listinfo/distutils-sig
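One last small thing: the option-mapping suggested for step 6 seems easy to pin down precisely, and doing so might head off divergent tool behavior. Something like the following (the helper name is hypothetical; only the resulting dict shape comes from the draft's example) reproduces the dict shown above:

```python
def collect_legacy_options(args):
    """Fold repeated --global-option / --install-option flags into
    the config dict passed to the build-wheels hook.

    Hypothetical helper -- only the output shape is from the draft.
    """
    options = {}
    for arg in args:
        flag, sep, value = arg.partition("=")
        if sep and flag in ("--global-option", "--install-option"):
            # Repeated flags accumulate in order, per the example.
            options.setdefault(flag, []).append(value)
    return options

# collect_legacy_options(
#     ["--global-option=a", "--install-option=b", "--install-option=c"])
# -> {"--global-option": ["a"], "--install-option": ["b", "c"]}
```

It would also be worth the spec saying whether unknown keys in this dict are reserved, or free-form for tools to define.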