On 03/12/2013 01:21 PM, PJ Eby wrote:
>> - In some way, migrate to a situation where the popular installer tools
>> install only release files from PyPI by default, but are capable of
>> installing from other locations if the user provides an option.
>
> Perhaps I'm confused, but ISTM that every time I've said this, Donald
> and Lennart argue that it should not be possible to provide such an
> option -- or to be more specific, that PyPI should not publish the
> information that makes that option possible.
>
> If that's *not* the position they're taking, it'd be good to know,
> because we could totally stop arguing about it in that case.

I think there's been misunderstanding on this point. Donald and Lennart
can confirm for themselves, but I don't believe _anyone_ thinks that
tools should not be able to install from non-PyPI sources when
explicitly requested to do so. And IIUC from your previous message,
you've "already agreed to change setuptools to default this option to
only allow downloads from the same host as its index URL, in a future
release".

So I think everyone is roughly on the same page about where we should be
headed. There is disagreement about how to make that work. My point is
that I don't think PyPI publishing scraped-from-metadata external links
on the simple/ index specifically, in perpetuity, is necessary or even
beneficial to that future state.

>> A) Leave external links in the PyPI simple index, but migrate the major
>> tools to not use external links by default (i.e. Philip's plan to make
>> allow-hosts=pypi the default in a future setuptools), with an option to
>> turn them back on.
>
> I don't know who has proposed this option, but it's not me. You seem
> to be confusing external links and HTML-scraped links (rel=""
> attributed links in /simple).

No, I'm not confusing those. All I'm referring to here is where you said
you've "already agreed to change setuptools to default [allow-hosts] to
only allow downloads from the same host as its index URL, in a future
release." Did I not characterize that accurately?

> I was the first person to propose disabling HTML-scraped links from
> PyPI *ASAP*. I still want them gone. That won't require tool
> changes, it just requires a rollout plan. Holger has one, let's work
> on that.

Fully agreed. I understand from Holger that he would like his PEP to
also discuss the rough plan beyond just disabling rel-link HTML
scraping, for how to get to a point where the tools don't follow
off-PyPI links at all by default. This second stage is what I'm talking
about.

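To make the "only allow downloads from the same host as its index URL"
default concrete, here is a minimal sketch of the kind of host filtering
I have in mind. This is not setuptools' actual code; the function name
`allowed_links` and its parameters are hypothetical, and the glob-style
host patterns are only an assumption about how an allow-hosts value
might be matched.

```python
from fnmatch import fnmatch
from urllib.parse import urlparse


def allowed_links(links, index_url, allow_hosts=None):
    """Return only the links whose host matches an allowed pattern.

    Hypothetical helper: when allow_hosts is None, only the host of the
    index URL is permitted, which models the proposed safer default.
    """
    if allow_hosts is None:
        index_host = urlparse(index_url).hostname
        # If the index URL has no host, nothing is allowed by default.
        allow_hosts = [index_host] if index_host else []
    kept = []
    for link in links:
        host = urlparse(link).hostname or ""
        # Patterns are shell-style globs, e.g. "*.python.org".
        if any(fnmatch(host, pattern) for pattern in allow_hosts):
            kept.append(link)
    return kept
```

With no explicit `allow_hosts`, a link to an external tarball is simply
dropped; a user who wants it has to name the external host themselves,
which is the "explicitly requested" case above.
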
> The second thing I proposed is that new tools be developed to *assist*
> package authors in moving their files onto PyPI, so that future tool
> changes wouldn't result in widespread instances of people needing to
> set their tools to insecure settings just to get anything done. We
> need to get people's files moving onto PyPI *first*, in order to make
> changing the tool defaults practical.

Totally agreed that such tools could be useful; I should have included
that point explicitly in my summary.

> The *only* thing I object to is the part where some people want to ban
> external links from /simple, always and forever, regardless of the
> package authors' choice in the matter.

I think the question of external links in /simple is causing far more
heat than it's worth (from all sides), because it's fundamentally an
implementation detail, not an end in itself. Discussing the pros and
cons of this implementation detail is more or less what the rest of this
message is about.

>> B) Do a second PyPI migration, again with a per-package toggle and
>> package owners in control, to a "no external links in simple index" setting.
>>
>> Consider for a moment how similar the end state here is with either A or
>> B. In either case, by default users install only from PyPI, but by
>> providing a special option they can install from some external source.
>> (In B, that special option would be something like --find-links with a
>> URL). In either case, we can continue to allow packages to register
>> themselves on PyPI, be found in searches, etc, without uploading release
>> files to PyPI if they prefer not to; they'll just have to provide
>> special installation instructions to their users in that case.
>
> Not true: approach B means that you won't know what values to pass to
> the option.

You say below that "nobody has proposed a 'trust everything' flag."
If there is no "trust everything" flag, then it seems to me that with
either option A or option B the user needs to specify what they intend
to trust. I.e. if you make the default value of allow-hosts the index
url host, as you said you plan to do at some point, users would need to
override it with the hosts they want to allow.

It seems like maybe what you are wanting is automatically-discoverable
installation from externally-hosted files? I.e. that I could say
"easy_install Foo --allow-external", without needing to know any
specific external url for Foo? This is what I was characterizing as a
"trust everything" flag, but on reflection I don't think I have any
problem with that. I do think that:

1) external release-file URLs should be explicitly nominated by the
package owner, not automatically sucked out of text metadata.

2) (After a suitable package-owner-controlled migration) those external
links should live at a new separate (machine-readable) endpoint, not the
existing /simple index. This has two benefits:

   a) even tools that exist today eventually gain the benefit of
   safer-by-default installations, and

   b) it's simpler and more reliable for future tools to distinguish
   between internal and external release file links.

> It's also confused about an important point. All the links that
> appear in /simple are *already* completely under the package author's
> control. No new switches are required to remove external links - you
> can simply remove them from your releases' descriptions. This process
> could be made more transparent or easy, sure -- but it's a mistake to
> say that this is granting the package owners control that they don't
> already have.

This is partly true. An explicit flag grants package owners more control
in that right now they don't have a choice about whether external links
to tarballs in their long_description automatically get sucked into the
simple index.
This is not hypothetical; even if there were no rel-link scraping, I've
had cases where package owners have complained to me about pip
installing an RC tarball they had linked directly from their
long-description, not intending it to be auto-installable.

I think it would be preferable if in the future package owners wouldn't
need to be careful what release-file links they might place in their
long_description, and release files would be only explicitly nominated.
I think the current "automatically suck in links to simple/" behavior is
only useful as a backwards-compatibility hack, which is why I think an
explicit switch to disable it (on by default for newly-registered
projects, slowly, gently, carefully migrated to on for existing
projects) is better than keeping this link-scraping behavior
indefinitely for all projects and asking package owners to clean up
their long-descriptions.

> What they lack control over is the rel="" attributes, short of
> removing those links entirely. That's why I've proposed having a
> switch for that, as reflected in Holger's pre-PEP.

I agree with this switch, but I think there is more benefit than cost in
extending the concept to all automatically-sucked-in external links.

>> 1) With B, we can provide a gentler migration for package owners, where
>> they are in control of when the switch happens.
>>
>> 2) With B, all end users benefit from the new defaults, not only end
>> users who update to the latest and greatest tools.
>>
>> 3) With B (and probably some forms of A as well), end users clearly
>> state which external sources they would like to trust and install from,
>> rather than having a global "trust everything!" flag, which is less
>> secure and less sensible.
>
> These 3 statements all mischaracterize things substantially, because
> none of those benefits are exclusive to B, and nobody has proposed a
> "trust everything" flag.

You're right that item 1 is not technically exclusive to B, although I
think B makes it much easier and simpler for package owners. "Just flip
a switch and done" rather than "Go clean up all your package metadata
including all past releases, or trust this tool we built to go editing
all your release metadata for you." I'm not even sure how that
hypothetical tool would work - what exactly would it do to automatically
clean up a link to an external tarball that it finds in the
long_description of a release from three years ago? Just remove it? What
if the package owner actually wants that link there for human use?

> Removing rel="" attributes also benefits
> everyone right away, *without* new tools.

Sure, and I'm fully in support of that being the first stage.

Carl
_______________________________________________
Catalog-SIG mailing list
Catalog-SIG@python.org
http://mail.python.org/mailman/listinfo/catalog-sig