On 13/09/2007, sacha rook <[EMAIL PROTECTED]> wrote:

>  [CODE]
>
> from BeautifulSoup import BeautifulSoup
> doc = ['<html><head><title>Page title</title></head>',
>        '<body><p id="firstpara" align="center">This is paragraph <b>one</b>.',
>        '<p id="secondpara" align="blah">This is paragraph <b>two</b>.',
>        '<a href="http://www.google.co.uk"></a>',
>        '<a href="http://www.bbc.co.uk"></a>',
>        '<a href="http://www.amazon.co.uk"></a>',
>        '<a href="http://www.redhat.co.uk"></a>',
>        '</html>']
> soup = BeautifulSoup(''.join(doc))
> blist = soup.findAll('a')
> print blist
> import urlparse
> for a in blist:
>     href = a['href']
>     print urlparse.urlparse(href)[1]
>
>  [/CODE]

Works fine for me:

>>> ## working on region in file python-tmp-371673F...
[<a href="http://www.google.co.uk"></a>, <a
href="http://www.bbc.co.uk"></a>, <a
href="http://www.amazon.co.uk"></a>, <a
href="http://www.redhat.co.uk"></a>]
www.google.co.uk
www.bbc.co.uk
www.amazon.co.uk
www.redhat.co.uk
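
For anyone trying this on Python 3, where the `urlparse` module lives at `urllib.parse`, here is a rough sketch of the same idea using only the standard library. Note that it swaps BeautifulSoup for `html.parser.HTMLParser`, so it is not the code quoted above; the names `LinkCollector` and `hostnames` are made up for illustration, and the document is shortened to two links.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag (hypothetical helper, not BeautifulSoup)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def hostnames(html):
    """Return the network location (host part) of each link in the HTML string."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    # .netloc is the same field as index [1] of the parsed result
    return [urlparse(href).netloc for href in collector.hrefs]


doc = ''.join([
    '<html><head><title>Page title</title></head>',
    '<body><p id="firstpara" align="center">This is paragraph <b>one</b>.',
    '<a href="http://www.google.co.uk"></a>',
    '<a href="http://www.bbc.co.uk"></a>',
    '</html>',
])

for host in hostnames(doc):
    print(host)
```

The `.netloc` attribute is just a named way of reaching the element that `urlparse(href)[1]` picks out in the original loop.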

But as Kent wrote, show the whole traceback, not just the last line.
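
If the error happens somewhere you cannot easily copy the console output from, one option is to capture the full traceback text with the standard `traceback` module and paste that. A small sketch, written for Python 3; `run_and_report` and the deliberately failing `broken` function are invented for the example and are not from the original script.

```python
import traceback


def run_and_report(func):
    """Call func(); on failure return the full traceback text instead of raising."""
    try:
        func()
    except Exception:
        # format_exc() returns the multi-line traceback text as a string
        return traceback.format_exc()
    return None


def broken():
    # Deliberate error for the demo: a dict lookup with a missing key
    attrs = {}
    return attrs['href']


report = run_and_report(broken)
print(report)
```

The returned text includes every frame, from the top-level call down to the line that raised, which is usually what people on the list need to see.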


-- 
- Rikard - http://bos.hack.org/cv/
_______________________________________________
Tutor maillist  -  [email protected]
http://mail.python.org/mailman/listinfo/tutor
