Hi Donald,

Thanks for taking the time to make such detailed comments! Thoughts
below.
On Fri, Oct 2, 2015 at 4:04 PM, Donald Stufft <[email protected]> wrote:
> On October 2, 2015 at 12:54:03 AM, Nathaniel Smith ([email protected]) wrote:
>>
>> Distutils delenda est.
>>
>
> I think that you should drop (from this PEP) the handling of a
> VCS/arbitrary directories and focus solely on creating a format for
> source distributions. A source distribution can (and should) be fairly
> strict and well defined exactly where all of the files go, what files
> exist and don't exist, and things of that nature (more on this later).

Hmm. Okay, I think this really helps clarify our key point of
difference!

For me, an important requirement is that there continue to be a single
standard command that end-users can use to install a VCS checkout. This
is a really important usability property -- everyone knows "setup.py
install". Unfortunately we can't keep "setup.py install", given its
entanglement with distutils and the desire to split building and
installation, so the obvious answer is that this should become 'pip
install <directory>', and from that everything else follows.

Having a standard way to install from a VCS checkout is also useful for
things like requirements files... and in fact it's required by our
current standards. PEP 440 has this as an example of a valid dependency
specification:

    pip @ git+https://github.com/pypa/pip.git@7921be1537eac1e97bc40179a57f0349c2aee67d

So I'm extremely reluctant to give up on standardizing how to handle
VCS checkouts. And if we're going to have a standard for that, then it
would sure be nice if we could share the work between this standard and
the one for sdists, given how similar they are.

[...]

> I don't believe that Python should develop anything like the Debian
> ability to have a single source "package" create multiple binary
> packages. The metadata of the Wheel *must* strictly match the metadata
> of the sdist (except for things that are Wheel specific). This
> includes things like name, version, etc.
> Trying to go down this path I think will make things a lot more
> complicated since we have a segmented archive where people have to
> claim particular names, otherwise how do you prevent me from
> registering the name "foobar" on PyPI and saying it produces the
> "Django" wheel?

What prevents it in the current draft is that there's no way for foobar
to say any such thing :-). If you ask for Django, then the only sdist
it will look at is the one in the Django segment.

This is an intentionally limited solution, based on the intuition that
multiple wheels from a single sdist will tend to be a relatively rare
case, that when they do occur there will generally be one "main" wheel
that people will want to depend on, and that people should be uploading
wheels anyway rather than relying on sdists. (Part of the intuition for
the last part is that we also have a not-terribly-secret conspiracy
here for writing a PEP to get Linux wheels onto PyPI and at least
achieve feature parity with Windows / OS X. Obviously there will always
be weird platforms -- iOS and FreeBSD and Linux-without-glibc and ...
-- but this should dramatically reduce the frequency with which people
need sdist dependencies.)

If that proves inadequate, then the obvious extension would be to add
some metadata to the sdist similar to Debian's: an sdist would carry a
list of all the wheels that it (might) produce when built, PyPI would
grow an API by which pip-or-whoever could query for all sdists that
claim to be able to produce wheel X, and at the same time PyPI would
start enforcing the rule that if you want to upload an sdist that
claims to produce wheel X then you have to own the name X. (After all,
you need to own that name anyway so you can upload the wheels.) Or
alternatively people could just split up their packages, as would be
required by your proposal anyway :-). So I sorta doubt it will be a
problem in practice, but even if it becomes one then it won't be hard
to fix.
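To make that hypothetical extension concrete, the extra sdist metadata
might look something like the following (a sketch only -- the field
name "produces-wheels", the package names, and the surrounding shape
are all invented here for illustration, not taken from any draft):

```json
{
    "name": "mysource",
    "version": "1.0",
    "produces-wheels": ["mypackage", "mypackage-cli"]
}
```

PyPI could then refuse the upload unless the uploader also owns the
names "mypackage" and "mypackage-cli", and index the mapping so that
pip can ask "which sdists claim to be able to produce mypackage?".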
(And to be clear, the multiple-wheels-from-one-sdist thing is not a
primary goal of this proposal -- the main reason we put it in is that
once you've given up on having static wheel metadata inside the sdist,
supporting multiple-wheels-from-one-sdist is trivial, so you might as
well do it, especially since it's a case that does seem to come up with
some regularity in real life and you don't want to make people fight
with their tools when it's unnecessary.)

> Since I think this should only deal with source distributions, the
> primary thing we need is an operation that will take an unpacked
> source distribution that is currently sitting on the filesystem and
> turn it into a wheel located in a specific location.
>
> The layout for a source distribution should be specified; I think
> something like:
>
> .
> ├── meta
> │   ├── DESCRIPTION.rst
> │   ├── FORMAT-VERSION
> │   ├── LICENSE.txt
> │   └── METADATA.json
> └── src
>     ├── my-cool-build-tool.cfg
>     └── mypackage
>         └── __init__.py
>
> I don't particularly care about the exact names, but this layout
> gives us two top level directories (and only two): one is a place
> where all of the source distribution metadata goes, and one is a src
> directory where all of the files for the project should go, including
> any relevant configuration for the build tool in use by the project.
> Having two directories like this eliminates the need to worry about
> naming collisions between the metadata files and the project itself.
>
> We should probably give this a new name instead of "sdist" and give
> it a dedicated extension. Perhaps we should call them "source wheels"
> and have the extension be something like .swhl or .src.whl. This
> means we don't need to worry about making the same artifact
> compatible with both the legacy toolchain and a toolchain that
> supports "source wheels".
>
> We should also probably specify a particular container format to be
> used for a .whl/.src.whl.
> It probably makes sense to simply use zip, since that is what wheels
> use and it supports different compression algorithms internally. We
> probably want to at least suggest limiting the compression algorithms
> used to Deflate and None, if not mandate that one of those two be
> used.
>
> We should include absolutely as much metadata as part of the static
> metadata inside the sdist as we can. I don't think there is any case
> to be made that things like name, version, summary, description,
> classifiers, license, keywords, contact information
> (author/maintainers), project URLs, etc. are Wheel specific. I think
> there are other things which are arguably able to be specified in the
> sdist, but I'd need to fiddle with it to be sure. Basically any
> metadata that isn't included as static information will not be able
> to be displayed on PyPI.

I feel like this idea of "source wheels" makes some sense if we want
something that looks like a wheel, but without the ABI compatibility
issues of wheels. I'm uncertain how well it can be made to work in
practice, or how urgent it is once we have a 95% solution in place for
Linux wheels, but it's certainly an interesting idea. To me it feels
rather different from a traditional sdist, and obviously there's still
the problem of having a standard way to build from a VCS checkout. It
might even make sense to have standard methods to go:

    VCS checkout -> sdist -> (wheels and/or source wheels)

?

> The metadata should directly include the specifiers inside of it and
> shouldn't propagate the meme that pip's requirements.txt format is
> anything but a way to recreate a specific environment with pip.

Yeah, there's a big question mark next to the requirements.txt stuff in
the draft PEP, because something more standard and structured would
certainly be nice. But requirements.txt is wildly popular, and for a
good reason -- it provides a simple, terse syntax that does what people
want.
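For instance, a single line in this terse syntax already carries a
package name, an extra, a version specifier, and an environment marker.
A quick sketch using the `packaging` library to pull those pieces apart
(using `packaging` here is my illustration, not something the draft PEP
depends on):

```python
from packaging.requirements import Requirement

# One terse requirements-style line: name, extra, version specifier,
# and environment marker, all together.
req = Requirement('requests[security] >= 2.8.1; python_version < "3.5"')

print(req.name)            # requests
print(sorted(req.extras))  # ['security']
print(str(req.specifier))  # >=2.8.1
print(req.marker)          # python_version < "3.5"
```

The equivalent structured form has to spell each of those pieces out as
separate fields.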
(By comparison, the PEP 426 JSON syntax for requirements with extras
and environment markers is extremely cumbersome yet less featureful.)
And semantically, what we want here is a way to say "to build *this* I
need an environment that looks like *this*", which is pretty close to
what requirements.txt is actually designed for. So I dunno -- instead
of fighting the meme maybe we should embrace it :-). But obviously this
is a tangent to the main questions.

[...]

> I don't think there's ever going to be a world where pip depends on
> virtualenv or pyvenv.

Huh, really? Can you elaborate on why not?

The standard doesn't have to require the use of clean build
environments (I was thinking of the text in the standard as applying
under the "as if" rule -- a valid sdist is one that can be built the
way described; if you have some other way that works to build such
sdists, then your way is valid too). But using clean environments by
default is really the only way that we're going to get a world where
most packages have accurate build requirements.

-n

-- 
Nathaniel J. Smith -- http://vorpus.org
_______________________________________________
Distutils-SIG maillist  -  [email protected]
https://mail.python.org/mailman/listinfo/distutils-sig
