At 03:13 PM 9/11/2009 +0200, Tarek Ziadé wrote:
> This leads to some problems when scripts like easy_install scan the index page: they might try to visit URLs the author just put in the description text with no particular intent of making them followable.
Easy_install only visits pages marked as "home page" links or "download" links.
> Plus, old URLs that don't work anymore are not removed, leading to easy_install timeouts.
> 1. What's the purpose of having them in there?
To allow easy_install to find "dev" links and other identifiable direct-download links.
> 2. If there's a purpose, what about adding an attribute to each <a> tag to identify which metadata field it was extracted from?
The attribute already exists: rel="download" and rel="homepage"; if there's no 'rel' it's from the description.
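
To make the rule above concrete, here is a minimal sketch of how a client could sort the links on an index page by their 'rel' attribute. It is not easy_install's actual code: the names IndexLinkParser, classify_links and links_to_visit, and the sample page with its example.com URLs, are hypothetical and exist only for illustration.

```python
from html.parser import HTMLParser


class IndexLinkParser(HTMLParser):
    """Collect (href, source) pairs from an index page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if href is None:
            return
        # rel may hold several space-separated tokens, or be absent entirely.
        rel = (attrs.get("rel") or "").split()
        if "download" in rel:
            source = "download"
        elif "homepage" in rel:
            source = "homepage"
        else:
            # No recognised rel: the link came from the description text.
            source = "description"
        self.links.append((href, source))


def classify_links(html):
    """Return every link on the page with the field it was taken from."""
    parser = IndexLinkParser()
    parser.feed(html)
    parser.close()
    return parser.links


def links_to_visit(html):
    """Keep only the links marked as home page or download links."""
    return [href for href, source in classify_links(html)
            if source in ("homepage", "download")]


# Made-up page for illustration; not real index output.
SAMPLE_PAGE = """
<a href="http://example.com/" rel="homepage">Home Page</a>
<a href="http://example.com/pkg-1.0.tar.gz" rel="download">Download</a>
<p>See also <a href="http://example.com/old-notes">my old notes</a>.</p>
"""

if __name__ == "__main__":
    for href, source in classify_links(SAMPLE_PAGE):
        print(source, href)
```

With a filter like links_to_visit, a stale URL that appears only in the description text is never treated as a page to fetch, so it cannot cause a timeout on its own.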
I'm rather surprised you don't know these things already, since they're all rather prominently documented as part of easy_install's "index API" here:
http://peak.telecommunity.com/DevCenter/EasyInstall#package-index-api
