Re: [Catalog-sig] simple index and urls extracted from metadata text fields

P.J. Eby Fri, 11 Sep 2009 06:33:09 -0700

At 03:13 PM 9/11/2009 +0200, Tarek Ziadé wrote:

This leads to some problems when scripts like easy_install scan the index page: it might try to visit urls the author just put there in his description text with no particular intent of making it viewable.


Easy_install only visits pages marked as "home page" links or "download" links.

Plus, old urls that don't work anymore are not removed, leading to easy_install timeouts.

1. what's the purpose of having them in there?

To allow easy_install to find "dev" links and other identifiable direct-download links.

2. if there's a purpose, what about adding an attribute to each <a> tag to identify which metadata field it was extracted from?

The attribute already exists: rel="download" and rel="homepage"; if there's no 'rel' it's from the description.
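As a rough illustration of that rule, here is a minimal sketch that sorts the links of an index page by their 'rel' attribute. It is not easy_install's actual code: the class name, the function name, and the sample page with its example.org URLs are all made up for the example, and it only assumes the page marks links the way described above.

```python
from html.parser import HTMLParser


class RelLinkParser(HTMLParser):
    """Collect (href, rel) pairs from the <a> tags of an index page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href:
            # rel is None when the <a> tag carries no rel attribute
            self.links.append((href, attr_map.get("rel")))


def classify_links(html):
    """Group hrefs by origin: 'homepage', 'download', or 'description'."""
    parser = RelLinkParser()
    parser.feed(html)
    parser.close()
    groups = {"homepage": [], "download": [], "description": []}
    for href, rel in parser.links:
        # Any link without a recognized rel is treated as description text
        key = rel if rel in ("homepage", "download") else "description"
        groups[key].append(href)
    return groups


# Hypothetical index page; the URLs are placeholders, not real projects.
SAMPLE_PAGE = """
<html><body>
<a href="http://example.org/project" rel="homepage">Home Page</a>
<a href="http://example.org/downloads" rel="download">Download</a>
<a href="http://example.org/blog/old-post">http://example.org/blog/old-post</a>
</body></html>
"""

if __name__ == "__main__":
    for origin, hrefs in classify_links(SAMPLE_PAGE).items():
        print(origin, hrefs)
```

With the links grouped this way, a script that wanted to follow only "home page" and "download" links could simply skip the 'description' group.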

I'm rather surprised you don't know these things already, since they're all rather prominently documented as part of easy_install's "index API" here:


   http://peak.telecommunity.com/DevCenter/EasyInstall#package-index-api

