On Fri, Oct 5, 2012 at 11:39 AM, Vinay Sajip <vinay_sa...@yahoo.co.uk> wrote:
> Paul Moore <p.f.moore <at> gmail.com> writes:
>
>> The first ones are fine, as they point to files. The second is often a
>> file, and seems to frequently duplicate the first. I'm not sure how
>> useful it is. The final one often points to a further webpage - I
>> presume that's what you plan to scrape. That's where the issue lies,
>> though, as at least some of those links time out (lxml's does, IIRC)
>> and as I say, I don't think I know of a case where it's actually worth
>> doing.
>>
>> But this is based on a very superficial and limited experience. I'll
>> happily bow to better information.
>>
>> On the other hand, is manually parsing the static page any faster in a
>> practical sense than using XMLRPC?
>
> Well, XML-RPC is of course preferable; the current code in distlib is just
> whatever I copied across from packaging, but the next step will be to look
> at the releases which are available from the different sources (XML-RPC,
> PyPI metadata URLs, dependency_links etc.) to see what sorts of things
> wouldn't be accessible if we restricted to say, just using XML-RPC. Since
> all the information in the static pages seems to be available via XML-RPC,
> what is the point of the simple interface, other than for occasional
> viewing by a human?
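
To make the comparison concrete, here is a rough sketch of the two access
paths being discussed: scraping the links off a simple index page, and asking
the XML-RPC interface for the same release files. This is not the code in
distlib or packaging. The names SimplePageLinks, links_from_simple_page and
files_via_xmlrpc are made up for illustration, and I am assuming the index
exposes the package_releases and release_urls XML-RPC methods, and that the
simple page marks its non-file links with a rel attribute.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class SimplePageLinks(HTMLParser):
    """Collect (absolute_url, rel) pairs from the anchors of a simple page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if href:
            # rel is None for plain file links; assumed to be set
            # (e.g. "homepage", "download") on links to further pages.
            self.links.append((urljoin(self.base_url, href), attrs.get("rel")))


def links_from_simple_page(html, base_url):
    """Static-page route: parse already-fetched HTML, no network access."""
    parser = SimplePageLinks(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


def files_via_xmlrpc(client, name):
    """XML-RPC route: `client` is any object exposing package_releases()
    and release_urls(), for example an xmlrpc.client.ServerProxy.
    Assumes each release_urls() entry is a dict with a "url" key."""
    urls = []
    for version in client.package_releases(name):
        for info in client.release_urls(name, version):
            urls.append(info["url"])
    return urls
```

Passing the client in keeps the sketch independent of any particular server,
so the same caller could use XML-RPC where it is offered and fall back to
parsing the simple page where it is not.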
IIRC the most practical limitation is that the XML-RPC interface doesn't
exist on the mirrors.

_______________________________________________
Distutils-SIG maillist  -  Distutils-SIG@python.org
http://mail.python.org/mailman/listinfo/distutils-sig