On October 12, 2015 at 12:54:07 AM, Nathaniel Smith (n...@pobox.com) wrote:
> I dunno -- I'm sure there must exist some other ways forward that
> don't require dropping the dream of static dependencies. At one
> extreme, we had a birds-of-a-feather at SciPy this year on "the future
> of numpy", and the most vocal audience contingent was in favor of
> numpy simply dropping upstream support for pip/wheels/pypi entirely
> and requiring all downstream packages/users to switch to conda or
> building by hand. It sounds like a terrible idea to me. But I would
> find it easier to believe in the pypi/pip ecosystem if there were some
> concrete plan for how this all-static world was actually going to
> work, and that it wasn't just chasing rainbows.
FTR, I plan on making some sort of auto builder for PyPI so it’s possible that
we can get pip to a point where almost all things it downloads are binary
wheels and we don’t need to worry too much about needing to optimize the sdist
case. I also think that part of the problem with egg-info and setup.py, and
pip’s attempted use of them, is just down to the setup.py interface being pretty
horrible and setup.py serving too many masters: it needs to be the VCS entry
point, sdist metadata entry point, installation entry point, and wheel
building entry point all at once.
I also think that it would be a terrible idea to have the science stack leave
the “standard” Python packaging ecosystem and go their own way and I think it’d
make the science packages essentially useless to the bulk of the non science
users. I think a lot of the “chasing rainbows” feeling comes mostly from this: we
have some desires from our experiences, but we haven’t yet taken the time to
fully flesh out what the impact of those desires is, nor are any of us science
stack users (or contributors) to my knowledge, so we don’t deal with the
complexities of that much [1].
One possible idea that I’ve thought of here, which may or may not be a good
idea:
Require packages to declare up front conditional dependencies (I’m assuming the
list of dependencies that a project *could* have is both finite and known ahead
of time) and let them give groups of these dependencies a name. I’m thinking
something similar to setuptools extras where you might be able to assign a list
of dependencies to a named group. The build interface could include a way for the
thing that’s calling the build tool to say “I require the feature represented
by this named group of dependencies”[2], and then the build tool can hard fail
if it detects it can’t be built in a way that requires those dependencies at
runtime. When the final build is done, it could put into the Wheel a list of
all of the additional named groups of dependencies it built with. The runtime
dependencies of a wheel would then be the combination of all of those named
groups of dependencies + the typical install_requires dependencies. This could
possibly even be presented nicely on PyPI as a sort of “here are the extra
features you can get with this library, and what that does to its
dependencies”.
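As a rough illustration of the named-group idea above (a sketch only: nothing here is an existing pip, setuptools, or wheel API, and the group names, package names, and `resolve_build` function are all hypothetical):

```python
# Hypothetical sketch: these names are not part of pip, setuptools, or the
# wheel format, and the package names are placeholders.

# Declared statically in the sdist: the unconditional dependencies plus every
# named group of dependencies the project *could* end up needing.
INSTALL_REQUIRES = ["example-core>=1.0"]
CONDITIONAL_GROUPS = {
    "fast-math": ["example-blas"],
    "tls": ["example-crypto>=2.0"],
}


def resolve_build(required, refused, detected):
    """Decide which groups a wheel is built with and what it then requires.

    required: groups the caller insists on
    refused:  groups the caller does not want, even if they are available
    detected: groups the build tool found it is able to enable
    """
    required, refused, detected = set(required), set(refused), set(detected)
    undeclared = (required | refused | detected) - set(CONDITIONAL_GROUPS)
    if undeclared:
        raise ValueError("groups not declared up front: %s" % sorted(undeclared))
    if required & refused:
        raise ValueError("groups both required and refused: %s"
                         % sorted(required & refused))
    unavailable = required - detected
    if unavailable:
        # Hard fail: the caller asked for a feature the build cannot provide.
        raise RuntimeError("cannot build with groups: %s" % sorted(unavailable))
    built_with = sorted(detected - refused)
    # Runtime dependencies = install_requires + every group actually built in.
    requires = list(INSTALL_REQUIRES)
    for group in built_with:
        requires.extend(CONDITIONAL_GROUPS[group])
    return {"built_with": built_with, "requires": requires}


meta = resolve_build(required=["fast-math"], refused=["tls"],
                     detected=["fast-math", "tls"])
print(meta["built_with"], meta["requires"])
```

The point of the sketch is that everything in `CONDITIONAL_GROUPS` is knowable from the sdist alone, while only the `built_with` list has to wait until a wheel actually exists.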
Would something like that solve Numpy’s dependency needs? Or is it the case
that you don’t know ahead of time what the entire list of the dependency
specifiers could be (or that it would be unreasonable to have to declare them
all up front)? I think I recall someone saying that something might depend on
something like “Numpy >= 1.0” in their sdist, but once it’s been built then
they’ll need to depend on something like “Numpy >=
$VERSION_IT_WAS_BUILT_AGAINST”. If this is something that’s needed, then we
might not be able to satisfy this particular thing.
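To make the harder case concrete, here is a minimal sketch of a requirement that is only known after the build. The `wheel_requirement` helper and the version numbers are made up for illustration and do not correspond to any existing tool:

```python
def wheel_requirement(sdist_requirement, built_against):
    """Tighten a 'name>=minimum' sdist requirement to the version built against.

    Hypothetical helper: it only handles the simple 'name>=version' form.
    """
    name, sep, _minimum = sdist_requirement.partition(">=")
    if not sep:
        raise ValueError("expected a 'name>=version' requirement")
    return "%s>=%s" % (name.strip(), built_against)


# The same sdist yields different wheel requirements on different build machines.
print(wheel_requirement("Numpy >= 1.0", "1.9.3"))
print(wheel_requirement("Numpy >= 1.0", "1.10.0"))
```

Because the result depends on what happened to be installed at build time, a finite list of specifiers declared up front could not cover it.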
I think I should point out too, that I’m not dead set on having static
dependency information inside of a sdist/source wheel/whatever. What I am dead
set on having is that all of the metadata *inside* of the sdist/source
wheel/whatever should be static and it should include as much as possible which
isn’t specific to a particular wheel. Things like name, version, summary,
description, project URLs, etc. are obviously (to me anyways) not going to be
specific to a wheel, so they should be kept static inside of the sdist (and then
copied over to the resulting wheel as static as well [3]). Conversely, you simply
can’t get information out of a sdist that is inherent to wheels.
Obvious (again, to me) examples of data like that are things like build number,
ABI the wheel was compiled against, etc. Is the list of runtime dependencies
one of the things that are specific to a particular wheel? I don’t know, it’s
not obvious to me but maybe it is. [4]
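The split between static and wheel-specific metadata, together with the kind of assertion footnote [3] asks for, could be sketched roughly as follows. The field names, the dict representation, and `changed_static_fields` are assumptions for illustration, not an existing metadata API:

```python
# Hypothetical field names; metadata is modelled as plain dicts for the sketch.
STATIC_FIELDS = ("name", "version", "summary", "description", "project_urls")


def changed_static_fields(sdist_metadata, wheel_metadata):
    """Return the static fields whose values differ between sdist and wheel."""
    return [field for field in STATIC_FIELDS
            if sdist_metadata.get(field) != wheel_metadata.get(field)]


sdist = {
    "name": "example",
    "version": "1.0",
    "summary": "An example project",
    "description": "Long description",
    "project_urls": {"Home": "https://example.invalid/"},
}
# A wheel copies the static fields and adds data that only exists after a build.
wheel = dict(sdist, build="1", abi="cp27mu")

assert changed_static_fields(sdist, wheel) == []
```

Wheel-only keys such as `build` and `abi` are ignored by the check, since they are never expected to come from the sdist.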
[1] That being said, I’d love it if someone who does deal with these things
would become more involved with the “standard” ecosystem so we can have experts
who deal with that side of things as well, because I do think it’s an important
use case.
[2] And probably: “Even if you can do the feature suggested by this named group
of dependencies, I don’t want it”
[3] This rule should be set up so that we can have an assertion put into place
that this data will remain the exact same when building a wheel from a sdist.
[4] I think there’s possibly some confusion in what is causing problems. I
think that the entirety of ``setup.py`` is full of problems with executing in
different environments and isn’t very well designed to enable robust packages.
A better designed interface would likely resolve a good number of the problems
that pip currently has either way. It is true however that needing to download
the sdist prior to resolving dependencies makes things a lot slower, but I also
think that doing it correctly is more important than doing it quickly. We could
possibly resolve some of this by pushing more people to publish Wheels,
expanding the cases where Wheels can be uploaded, and creating a
wheel builder service. These things could make it possible to reduce the
situations where you *need* to download + build locally to do a dependency
resolution. I think that something like the pip wheel building cache can reduce
the need for this as well, since we’ll already have built wheels available
locally in the wheel cache that we can query for dependency information.
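For reference, an already-built wheel can be queried for its dependencies without running any project code, because a wheel is a zip archive whose `*.dist-info/METADATA` file uses an email-header style format. The sketch below uses only the standard library; `wheel_requires` is a hypothetical helper and is not how pip implements its cache lookup:

```python
import zipfile
from email.parser import Parser


def wheel_requires(wheel_path):
    """Return the Requires-Dist entries recorded in a wheel's METADATA file."""
    with zipfile.ZipFile(wheel_path) as whl:
        for name in whl.namelist():
            # The metadata lives at <name>-<version>.dist-info/METADATA.
            if name.endswith(".dist-info/METADATA") and name.count("/") == 1:
                text = whl.read(name).decode("utf-8")
                return Parser().parsestr(text).get_all("Requires-Dist") or []
    raise ValueError("no METADATA file found in %s" % wheel_path)
```

Calling `wheel_requires` on a wheel file returns its dependency specifiers as a list of strings, or an empty list if it declares none.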
-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
_______________________________________________
Distutils-SIG maillist - Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig