On Friday, March 1, 2013 at 2:31 PM, M.-A. Lemburg wrote:
> On 01.03.2013 12:17, holger krekel wrote:
> > On Fri, Mar 01, 2013 at 06:09 -0500, Donald Stufft wrote:
> > > On Friday, March 1, 2013 at 6:04 AM, M.-A. Lemburg wrote:
> > > > On 01.03.2013 11:19, holger krekel wrote:
> > > > > Hi Richard, all,
> > > > >
> > > > > somewhere deep in the threads I mentioned I wrote a little
> > > > > "cleanpypi.py" script which takes a project name as an argument
> > > > > and then goes to pypi.python.org (http://pypi.python.org) and
> > > > > removes all homepage/download metadata entries for this project.
> > > > > This sanitizes/speeds up installation because pip/easy_install
> > > > > don't need to crawl them anymore. I just did this for three of
> > > > > my projects (pytest, tox and py) and it seems to work fine.
> > > >
> > > > Does it also clean up the links that PyPI adds to the /simple/
> > > > pages by parsing the project description for links?
> > > >
> > > > I think those are far nastier than the homepage and download links,
> > > > which can be put to some good use to limit the external lookups
> > > > (see http://wiki.python.org/moin/PyPI/DownloadMetaDataProposal)
> > > >
> > > > See e.g. https://pypi.python.org/simple/zc.buildout/
> > > > for a good example of the mess this generates... even mailto links
> > > > get listed and "file:///" links open up the installers for all
> > > > kinds of nasty things (unless they explicitly protect against
> > > > following these).
> > >
> > > pip at least, and I assume the other tools, don't spider those links,
> > > but they do consider them for download (e.g. if the link looks
> > > installable it will be a candidate for installing, but it won't fetch
> > > it and look for more links, like it will for download_url/home_page).
> > >
> > > I believe that's the way it's structured atm.
> >
> > That's right. Even though the long-description extracted links
> > look ugly on a simple/PKGNAME page, neither pip nor easy_install do
> > anything with them except if the "href" ends in "#egg=PKGNAME-", in
> > which case they are taken as pointing to a development tarball (e.g.
> > at github or bitbucket). AFAIK a link like "PKGNAME-VER.tar.gz" will
> > not be treated as an installation candidate, just the "#egg=PKGNAME"
> > one.
>
> Hmm, then why not remove links that don't match the above from
> the /simple/ index pages?
>
> Note that it's easily possible to make e.g. file:/// links
> have a fragment that matches what you described, so I guess the
> filters would have to be more careful about what to allow
> (e.g. only http/ftp schemes, perhaps even only https schemes)
> and what not.
>
> BTW: Are those links also shown as-is on the description page?
> People could do nasty stuff by adding "javascript:" links which look
> like normal links to the descriptions.

The descriptions don't allow javascript: URLs anymore (I reported that ages ago and Richard fixed it). home_page and probably download_url do, though.

> --
> Marc-Andre Lemburg
> eGenix.com

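To make the filtering idea from the quoted discussion concrete, here is a minimal sketch in Python of how description-derived links could be whittled down to "#egg=PKGNAME" links on an allowed scheme. This is only an illustration under assumptions: the names is_allowed_egg_link, filter_links and ALLOWED_SCHEMES are hypothetical, the scheme list is just the http/https/ftp set floated above, and the name comparison is a plain case-insensitive match. It is not how PyPI, pip or easy_install actually implement anything.

```python
from urllib.parse import urlsplit

# Assumption: the scheme whitelist suggested in the thread. Tightening it
# to {"https"} would implement the stricter variant that was mentioned.
ALLOWED_SCHEMES = {"http", "https", "ftp"}


def is_allowed_egg_link(href, project):
    """Return True if href uses an allowed scheme and carries an
    '#egg=PROJECT' or '#egg=PROJECT-...' fragment (hypothetical helper)."""
    parts = urlsplit(href)
    # Rejects file:, mailto:, javascript: and scheme-less links.
    if parts.scheme not in ALLOWED_SCHEMES:
        return False
    fragment = parts.fragment.lower()
    marker = "egg=" + project.lower()
    # Require an exact name or a '-' separator, so that a fragment for
    # "pytestfoo" is not accepted on the "pytest" page.
    return fragment == marker or fragment.startswith(marker + "-")


def filter_links(hrefs, project):
    """Keep only the links that pass is_allowed_egg_link, in order."""
    return [href for href in hrefs if is_allowed_egg_link(href, project)]
```

With this sketch, a "file:///...#egg=pytest-dev" link is dropped on the scheme check even though its fragment matches, which is the case the scheme restriction was meant to cover. A plain "PKGNAME-VER.tar.gz" link without an egg fragment is dropped as well.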
_______________________________________________
Catalog-SIG mailing list
Catalog-SIG@python.org
http://mail.python.org/mailman/listinfo/catalog-sig