On October 12, 2015 at 12:54:07 AM, Nathaniel Smith (n...@pobox.com) wrote:
> I dunno -- I'm sure there must exist some other ways forward that
> don't require dropping the dream of static dependencies. At one
> extreme, we had a birds-of-a-feather at SciPy this year on "the future
> of numpy", and the most vocal audience contingent was in favor of
> numpy simply dropping upstream support for pip/wheels/pypi entirely
> and requiring all downstream packages/users to switch to conda or
> building by hand. It sounds like a terrible idea to me. But I would
> find it easier to believe in the pypi/pip ecosystem if there were some
> concrete plan for how this all-static world was actually going to
> work, and that it wasn't just chasing rainbows.

FTR, I plan on making some sort of auto builder for PyPI so it’s possible that 
we can get pip to a point where almost all things it downloads are binary 
wheels and we don’t need to worry too much about needing to optimize the sdist 
case. I also think that part of the problem with egg-info and setup.py’s, and 
pip’s attempted use of them, is just down to the setup.py interface being 
pretty horrible and setup.py serving too many masters: it needs to be the VCS 
entry point, sdist metadata entry point, installation entry point, and wheel 
building entry point all at once.

I also think that it would be a terrible idea to have the science stack leave 
the “standard” Python packaging ecosystem and go their own way; I think it’d 
make the science packages essentially useless to the bulk of the non-science 
users. I think a lot of the chasing-rainbows-like stuff comes mostly from this: 
we have some desires from our experiences but we haven’t yet taken the time to 
fully flesh out what the impact of those desires is, nor are any of us science 
stack users (or contributors) to my knowledge, so we don’t deal with the 
complexities of that much [1].

One possible idea that I’ve thought of here, which may or may not be a good 
idea:

Require packages to declare up front conditional dependencies (I’m assuming the 
list of dependencies that a project *could* have is both finite and known ahead 
of time) and let them give groups of these dependencies a name. I’m thinking 
something similar to setuptools extras where you might be able to assign a list 
of dependencies to a named group. The build interface could include a way for 
the thing that’s calling the build tool to say “I require the feature 
represented by this named group of dependencies”[2], and then the build tool 
can hard fail if it detects it can’t be built in a way that requires those 
dependencies at runtime. When the final build is done, it could put into the 
Wheel a list of all of the additional named groups of dependencies it built 
with. The runtime dependencies of a wheel would then be the combination of all 
of those named groups of dependencies + the typical install_requires 
dependencies. This could possibly even be presented nicely on PyPI as a sort of 
“here are the extra features you can get with this library, and what that does 
to its dependencies”.
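
To make the idea above a bit more concrete, here is a minimal sketch of how a 
build tool might combine statically declared groups with what it detects at 
build time. Everything here is hypothetical: the function name, the 
``BuildError`` exception, and the ``buildable`` argument (standing in for 
whatever detection the build tool does) are assumptions for illustration, not 
an existing pip or setuptools interface.

```python
class BuildError(Exception):
    """Raised when a requested dependency group cannot be honoured."""


def resolve_runtime_requires(install_requires, declared_groups, buildable,
                             required=(), excluded=()):
    """Return (built_groups, runtime_requires) for one hypothetical build.

    declared_groups: dict mapping group name -> list of requirement strings,
        declared statically up front by the project.
    buildable: set of group names the build tool detected it can build with.
    required / excluded: group names the caller insists on / refuses.
    """
    unknown = (set(required) | set(excluded)) - set(declared_groups)
    if unknown:
        raise BuildError("undeclared groups: " + ", ".join(sorted(unknown)))
    conflicting = set(required) & set(excluded)
    if conflicting:
        raise BuildError("both required and excluded: "
                         + ", ".join(sorted(conflicting)))

    # Hard fail if a required feature cannot actually be built.
    missing = [g for g in required if g not in buildable]
    if missing:
        raise BuildError("cannot build required groups: " + ", ".join(missing))

    # Build with every detected group the caller did not opt out of.
    built = [g for g in declared_groups
             if g in buildable and g not in excluded]

    # Runtime dependencies = install_requires + deps of every built group.
    runtime = list(install_requires)
    for group in built:
        for req in declared_groups[group]:
            if req not in runtime:
                runtime.append(req)
    return built, runtime
```

The ``built`` list is what would be recorded in the Wheel; the sdist only ever 
carries ``install_requires`` and ``declared_groups``, both of which are static.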

Would something like that solve NumPy’s dependency needs? Or is it the case 
that you don’t know ahead of time what the entire list of dependency 
specifiers could be (or that it would be unreasonable to have to declare them 
all up front)? I think I recall someone saying that something might depend on 
something like “Numpy >= 1.0” in their sdist, but once it’s been built then 
they’ll need to depend on something like “Numpy >= 
$VERSION_IT_WAS_BUILT_AGAINST”. If this is something that’s needed, then we 
might not be able to satisfy this particular thing.

I think I should point out too, that I’m not dead set on having static 
dependency information inside of a sdist/source wheel/whatever. What I am dead 
set on having is that all of the metadata *inside* of the sdist/source 
wheel/whatever should be static and it should include as much as possible which 
isn’t specific to a particular wheel. This means that things like name, 
version, summary, description, project URLs, etc. are obviously (to me anyways) 
not going to be specific to a wheel and should be kept static inside of the 
sdist (and then copied over to the resulting wheel as static as well [3]). 
Conversely, you simply can’t get information out of a sdist that is inherent to 
wheels. 
Obvious (again, to me) examples of data like that are things like build number, 
ABI the wheel was compiled against, etc. Is the list of runtime dependencies 
one of the things that are specific to a particular wheel? I don’t know, it’s 
not obvious to me but maybe it is. [4]

[1] That being said, I’d love it if someone who does deal with these things 
would become more involved with the “standard” ecosystem so we can have experts 
who deal with that side of things as well, because I do think it’s an important 
use case.

[2] And probably: “Even if you can do the feature suggested by this named group 
of dependencies, I don’t want it”

[3] This rule should be set up so that we can have an assertion put into place 
that this data will remain exactly the same when building a wheel from a sdist.
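
As a rough sketch of what such an assertion could look like, assuming both the 
sdist and the wheel carry their metadata as RFC 822 style key/value text (as 
PKG-INFO and METADATA do today). The function name and the choice of 
``STATIC_FIELDS`` are my own assumptions for illustration, not an agreed list:

```python
from email import message_from_string

# Hypothetical set of fields that must never change between sdist and wheel.
STATIC_FIELDS = ("Name", "Version", "Summary", "Home-page")


def assert_static_metadata_unchanged(sdist_text, wheel_text,
                                     fields=STATIC_FIELDS):
    """Raise AssertionError if any static field differs after the build."""
    sdist = message_from_string(sdist_text)
    wheel = message_from_string(wheel_text)
    # get_all() returns None for an absent field, so "absent in both" passes.
    changed = [f for f in fields if sdist.get_all(f) != wheel.get_all(f)]
    if changed:
        raise AssertionError(
            "static metadata changed during build: " + ", ".join(changed))
```

Wheel-only fields (dependencies added at build time, for example) are simply 
not listed in ``fields``, so they are free to differ.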

[4] I think there’s possibly some confusion about what is causing problems. I 
think that the entirety of ``setup.py`` is full of problems with executing in 
different environments and isn’t very well designed to enable robust packages. 
A better designed interface would likely resolve a good number of the problems 
that pip currently has either way. It is true however that needing to download 
the sdist prior to resolving dependencies makes things a lot slower, but I also 
think that doing it correctly is more important than doing it quickly. We could 
possibly resolve some of this by pushing more people to publish Wheels, 
expanding the cases where Wheels can be uploaded, and creating a wheel 
builder service. These things could make it possible to reduce the 
situations where you *need* to download + build locally to do a dependency 
resolution. I think that something like the pip wheel building cache can reduce 
the need for this as well, since we’ll already have built wheels available 
locally in the wheel cache that we can query for dependency information. 
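
For illustration, reading dependency information out of an already built wheel 
needs no code execution at all, since a wheel is a zip file with a static 
``*.dist-info/METADATA`` file inside. The sketch below is not how pip does it; 
``wheel_requires`` is a hypothetical helper name, and it assumes the 
dependencies are stored as ``Requires-Dist`` lines:

```python
import zipfile
from email import message_from_string


def wheel_requires(wheel):
    """Return the Requires-Dist entries of a wheel (path or file object)."""
    with zipfile.ZipFile(wheel) as zf:
        # The metadata lives in the single top-level <name>-<ver>.dist-info/.
        names = [n for n in zf.namelist()
                 if n.endswith(".dist-info/METADATA") and n.count("/") == 1]
        if len(names) != 1:
            raise ValueError("expected exactly one METADATA file, found %d"
                             % len(names))
        text = zf.read(names[0]).decode("utf-8")
    # An absent Requires-Dist field means "no dependencies".
    return message_from_string(text).get_all("Requires-Dist") or []
```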

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
_______________________________________________
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig
