On Wednesday, February 27, 2013 at 7:08 PM, PJ Eby wrote:

> On Wed, Feb 27, 2013 at 6:16 PM, Aaron Meurer <asmeu...@gmail.com> wrote:
> > As far as I'm concerned, this is all about helping package
> > maintainers. The way pip works now, every time I do a release
> > candidate, pip automatically installs it, even though I only upload it
> > to Google Code. I don't want it to do this, but the only ways around
> > it would be either 1. give it some weird name so that pip doesn't
> > think it is newer, 2. upload it somewhere else, or 3. go into PyPI and
> > remove all mentions of Google Code from the index.
>
> There's also a *fourth* way, which I asked the PyPI developers many
> years ago to do, which is to stop including download links on the
> /simple index for "hidden" (i.e., non-current) releases.
>
> (Something I am still in favor of, btw. Jim Fulton argued against it,
> IIRC, and it ended in a stalemate. However, I don't think we
> discussed distinguishing PyPI downloads from other downloads, just
> getting rid of old links in general.)
>
> Frankly, just dropping /simple links for hidden releases would also
> fix a good chunk of the expired-domain, stale-release, and
> too-many-downloads problems. In addition, if a project migrates to
> using PyPI uploads, its older versions will no longer be subject to
> external download crawling.
>
> So, if we must do away with the links, I would suggest that the phases be:
>
> 1. Remove homepage/download URLs for "hidden" versions from the
> /simple index altogether (leaving PyPI download links available).
> 2. Remove the rel="..." attributes from the remaining download and
> home page links (this will stop off-site crawling, but not off-site
> downloading).
> 3. Re-evaluate whether anything else actually needs to be removed.
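For readers unfamiliar with the spidering being discussed: /simple index pages of that era carried anchors with rel="homepage" and rel="download" attributes, which installers would follow off-site, alongside links to files hosted on PyPI itself. The snippet below is a minimal sketch of that distinction, assuming a hypothetical /simple/<project>/ page (the project name, URLs, and HTML are illustrative, not a real PyPI page):

```python
# Sketch: how an installer of that era could separate PyPI-hosted file
# links from the rel="homepage"/rel="download" links it would spider.
# The HTML below is a made-up example of a /simple/<project>/ page.
from html.parser import HTMLParser

SIMPLE_PAGE = """
<html><body>
<a href="../../packages/source/e/example/example-1.0.tar.gz#md5=abc">example-1.0.tar.gz</a>
<a href="http://example.googlecode.com/files/example-1.1rc1.tar.gz" rel="download">1.1rc1 download_url</a>
<a href="http://code.google.com/p/example/" rel="homepage">1.1rc1 home_page</a>
</body></html>
"""

class LinkCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.hosted = []    # files served by PyPI itself
        self.spidered = []  # off-site pages an installer would crawl

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        if attrs.get("rel") in ("homepage", "download"):
            self.spidered.append(attrs["href"])
        else:
            self.hosted.append(attrs["href"])

collector = LinkCollector()
collector.feed(SIMPLE_PAGE)
print(collector.hosted)    # the PyPI-hosted sdist (relative link)
print(collector.spidered)  # the Google Code links that get crawled
```

PJ's phase 2 amounts to dropping the rel attributes so the second bucket becomes empty for crawlers, while the hrefs themselves remain clickable.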
This seems a bit complicated; people in general don't even know the external link spidering exists, much less understand the intricacies of which types of links get spidered when. A simple "after X date no new URLs will be added, and after Y date all existing URLs will be removed" removes ambiguity from the process. Saying "this kind of link will get removed at Y, and that matters under Z conditions" leads to a lot of confusion about what does and doesn't work.

> Basically, 99% of the complaints here are lumping together all of
> these different kinds of links -- stale links, spidered links, and
> plain external download links -- even though they don't create the
> same sorts of problems. Taking it in stages will give authors time to
> change processes, while still getting rid of the biggest problem
> sources right away (stale homepage/download URLs).

My complaint is with external URLs at all, for a myriad of reasons, some specific to particular kinds of them, some not.

> The first of these changes could be done now, though I'd check with
> Jim about the buildout use case; IIRC it was to allow pinned
> versions. But if the main use cases also had eggs on PyPI rather than
> downloading them from elsewhere, then removing *just* the
> homepage/download links would clean things up nicely, including your
> runaway Google Code downloads, without needing to change any installer
> code that's out in the field right now.
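The "distinguishing PyPI downloads from other downloads" idea mentioned above can be sketched as a hostname check. This is an illustrative assumption about how such a rule could look, not the actual logic PyPI or pip used; the host name reflects PyPI's canonical host at the time:

```python
# Minimal sketch: classify a /simple index link as internal (served by
# PyPI) or external (a third-party server that gets hit on installs).
# PYPI_HOSTS and the example URLs are assumptions for illustration.
from urllib.parse import urlparse

PYPI_HOSTS = {"pypi.python.org"}  # assumed canonical PyPI host circa 2013

def is_external(url):
    """Return True if the link points off PyPI, i.e. downloads would
    go to a third-party server such as Google Code."""
    host = urlparse(url).netloc
    # Relative links on the /simple page resolve to PyPI itself,
    # so an empty netloc counts as internal.
    return bool(host) and host not in PYPI_HOSTS

print(is_external("../../packages/source/e/example/example-1.0.tar.gz"))          # False
print(is_external("http://example.googlecode.com/files/example-1.1rc1.tar.gz"))   # True
```

A rule like this is what makes a phased removal possible at all: PyPI-hosted links can be left alone while only the external ones are subject to the X/Y deadlines being proposed.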
_______________________________________________ Catalog-SIG mailing list Catalog-SIG@python.org http://mail.python.org/mailman/listinfo/catalog-sig