On 21-Dec-10 12:22 PM, Jon Clements wrote:
import lxml.html  # a bare "import lxml" does not load the html submodule
from urlparse import urlsplit

doc = lxml.html.parse('http://www.google.com')
print map(urlsplit, doc.xpath('//a/@href'))

[SplitResult(scheme='http', netloc='www.google.co.uk', path='/imghp',
query='hl=en&tab=wi', fragment=''), SplitResult(scheme='http',
netloc='video.google.co.uk', path='/', query='hl=en&tab=wv',
fragment=''), SplitResult(scheme='http', netloc='maps.google.co.uk',
path='/maps', query='hl=en&tab=wl', fragment=''),
SplitResult(scheme='http', netloc='news.google.co.uk', path='/nwshp',
query='hl=en&tab=wn', fragment=''), ...]

Jon,

What version of Python was used to run this?

Colin W.
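
For reference: the print statement and the urlparse module in the snippet above point to Python 2. Below is a rough Python 3 sketch of the same idea, not Jon's code. To keep it runnable without third-party packages or network access, it swaps lxml for the standard library's html.parser, and it parses a small made-up page instead of fetching google.com. The names HrefCollector, split_links and SAMPLE_HTML are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit  # Python 3 location of urlsplit


class HrefCollector(HTMLParser):
    """Collect the href attribute of every <a> start tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def split_links(html_text):
    """Return a SplitResult for each link found in html_text."""
    collector = HrefCollector()
    collector.feed(html_text)
    collector.close()
    return [urlsplit(href) for href in collector.hrefs]


# Hypothetical sample markup, standing in for a fetched page.
SAMPLE_HTML = (
    "<html><body>"
    '<a href="http://www.google.co.uk/imghp?hl=en&amp;tab=wi">Images</a>'
    '<a href="http://maps.google.co.uk/maps?hl=en&amp;tab=wl">Maps</a>'
    "</body></html>"
)

if __name__ == "__main__":
    for parts in split_links(SAMPLE_HTML):
        print(parts)
```

With lxml installed, the original doc.xpath('//a/@href') approach should carry over to Python 3 as well, provided print is called as a function and urlsplit is imported from urllib.parse.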
--
http://mail.python.org/mailman/listinfo/python-list
