Thanks for the feedback, Holger and Phillip. I'll bake this into a version 0.2 of the proposal over the weekend.
On 01.03.2013 17:29, PJ Eby wrote:
> On Fri, Mar 1, 2013 at 6:17 AM, holger krekel <hol...@merlinux.eu> wrote:
>> On Fri, Mar 01, 2013 at 06:09 -0500, Donald Stufft wrote:
>>> On Friday, March 1, 2013 at 6:04 AM, M.-A. Lemburg wrote:
>>>> On 01.03.2013 11:19, holger krekel wrote:
>>>>> Hi Richard, all,
>>>>>
>>>>> somewhere deep in the threads i mentioned i wrote a little "cleanpypi.py"
>>>>> script which takes a project name as an argument and then goes to
>>>>> pypi.python.org (http://pypi.python.org) and removes all
>>>>> homepage/download metadata entries for
>>>>> this project. This sanitizes/speeds up installation because
>>>>> pip/easy_install don't need to crawl them anymore. I just did this for
>>>>> three of my projects, (pytest, tox and py) and it seems to work fine.
>>>>>
>>>>
>>>> Does it also clean up the links that PyPI adds to the /simple/ by
>>>> parsing the project description for links?
>>>>
>>>> I think those are far nastier than the homepage and download links,
>>>> which can be put to some good use to limit the external lookups
>>>> (see http://wiki.python.org/moin/PyPI/DownloadMetaDataProposal)
>>>>
>>>> See e.g. https://pypi.python.org/simple/zc.buildout/
>>>> for a good example of the mess this generates... even mailto links
>>>> get listed and "file:///" links open up the installers for all
>>>> kinds of nasty things (unless they explicitly protect against
>>>> following these).
>>>>
>>>
>>> pip at least, and I assume the other tools don't spider those links, but
>>> they do consider them for download (e.g. if the link looks installable
>>> it will be a candidate for installing, but it won't fetch it, and look for
>>> more links like it will for download_url/home_page).
>>>
>>> I believe that's the way it's structured atm.
>>
>> That's right. Even though the long-description extracted links
>> look ugly on a simple/PKGNAME page, neither pip nor easy_install do anything
>> with them except if the "href" ends in "#egg=PKGNAME-" in which case they are
>> taken as pointing to a development tarball (e.g. at github or bitbucket).
>> AFAIK a link like "PKGNAME-VER.tar.gz" will not be treated as
>> an installation candidate, just the "#egg=PKGNAME" one.
>
> Both are considered "primary links". A primary link is a link whose
> filename portion matches one of the supported distutils or setuptools
> file formats, or is marked with an #egg tag. Primary links are
> indexed as to project name and version, so that if that version/format
> is chosen as the best candidate, it will be downloaded and installed.
>
> Links marked with rel="homepage" or rel="download" are "secondary
> links". Secondary links are actively retrieved and scanned to look
> for more primary links. No further secondary links are scanned or
> followed. (Details of all of this can be found at:
> http://peak.telecommunity.com/DevCenter/setuptools#making-your-package-available-for-easyinstall
> )
>
> This basically means that MAL's proposal for a download.html file is
> actually a bit moot: you can just stick direct "primary" download URLs
> in your PyPI description field, and the tools will pick them up. They
> can even include #md5 info. (See
> http://peak.telecommunity.com/DevCenter/EasyInstall#package-index-api
> - item 4 mentions the description part.)
>
> This means, by the way, that you could make an external link cleaner
> which spiders the external pages and pulls the candidates onto the
> description for that release, thereby keeping useful primary links and
> getting rid of the secondary links used to fetch them.
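
To make the primary/secondary distinction quoted above concrete, here is a minimal Python sketch of how a tool might sort the links found on a /simple/PKGNAME page. It is an illustration only, not the code pip or easy_install actually run: the names classify_link, LinkCollector and classify_page are hypothetical, and ARCHIVE_SUFFIXES is an assumed stand-in for "the supported distutils or setuptools file formats".

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumption: a stand-in for the file formats the installers recognise;
# the real tools maintain their own list.
ARCHIVE_SUFFIXES = (".tar.gz", ".tgz", ".tar.bz2", ".zip", ".egg")


def classify_link(href, rel=None):
    """Return 'primary', 'secondary' or 'ignored' for a single link."""
    parts = urlparse(href)
    filename = parts.path.rsplit("/", 1)[-1].lower()
    # Primary: the filename looks like a distribution, or there is an #egg tag.
    if parts.fragment.startswith("egg=") or filename.endswith(ARCHIVE_SUFFIXES):
        return "primary"
    # Secondary: the index marked the link as a homepage or download page.
    if rel in ("homepage", "download"):
        return "secondary"
    return "ignored"


class LinkCollector(HTMLParser):
    """Collect (href, rel) pairs from the <a> tags of an index page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attrs = dict(attrs)
            if attrs.get("href"):
                self.links.append((attrs["href"], attrs.get("rel")))


def classify_page(html):
    """Map every href on the page to its classification."""
    collector = LinkCollector()
    collector.feed(html)
    return {href: classify_link(href, rel) for href, rel in collector.links}
```

Under these assumptions, a direct "PKGNAME-VER.tar.gz" URL in the description (with or without #md5 info) and an "#egg=" link both come out as primary, while a mailto link is ignored. Note that the sketch does not filter by URL scheme, so a "file:///" link ending in an archive suffix would still count as primary; a real tool would want to guard against that.
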
> _______________________________________________
> Catalog-SIG mailing list
> Catalog-SIG@python.org
> http://mail.python.org/mailman/listinfo/catalog-sig
>

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Mar 01 2013)