At 03:13 PM 9/11/2009 +0200, Tarek Ziadé wrote:
This leads to some problems when scripts like easy_install scan the index page: they might try to visit URLs the author just put in his description text with no particular intent of making them viewable.

Easy_install only visits pages marked as "home page" links or "download" links.


Plus, old URLs that don't work anymore are not removed, leading to easy_install timeouts.

1. What's the purpose of having them in there?

To allow easy_install to find "dev" links and other identifiable direct-download links.


2. If there's a purpose, what about adding an attribute to each <a> tag to identify which metadata field it was extracted from?

The attribute already exists: rel="download" and rel="homepage"; if there's no 'rel' it's from the description.
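
To illustrate the rule above, here is a minimal sketch of how a client could sort the links on an index page by their 'rel' attribute. It is not easy_install's actual code: the class and function names (RelLinkParser, classify_links) and the sample HTML are hypothetical, and it uses only Python's standard html.parser module.

```python
from html.parser import HTMLParser


class RelLinkParser(HTMLParser):
    """Collect (href, origin) pairs from the <a> tags of an index page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if href is None:
            return
        # rel may hold several space-separated tokens, or be absent.
        rel = (attrs.get("rel") or "").lower().split()
        if "download" in rel:
            origin = "download"
        elif "homepage" in rel:
            origin = "homepage"
        else:
            # No recognized rel: treat the link as coming from the description.
            origin = "description"
        self.links.append((href, origin))


def classify_links(html):
    """Return a list of (href, origin) tuples in document order."""
    parser = RelLinkParser()
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up sample markup, only for demonstration.
sample = (
    '<a href="http://example.com/" rel="homepage">Home Page</a>'
    '<a href="http://example.com/pkg-1.0.tar.gz" rel="download">Download</a>'
    '<a href="http://example.com/blog">my blog</a>'
)
for href, origin in classify_links(sample):
    print(origin, href)
```

A scanner that wants to avoid stray description links could then skip every entry whose origin is "description", at the cost of missing any direct-download links placed there.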

I'm rather surprised you don't know these things already, since they're all rather prominently documented as part of easy_install's "index API" here:

   http://peak.telecommunity.com/DevCenter/EasyInstall#package-index-api

_______________________________________________
Catalog-SIG mailing list
[email protected]
http://mail.python.org/mailman/listinfo/catalog-sig