Laszlo Zsolt Nagy wrote:
> [...]
> For example this malformed link:
>
> http://samplesite.current_location/page.html','Samle link']
Your options AFAIK are:
* Beautiful Soup (http://www.crummy.com/software/BeautifulSoup/)
* Various implementations of tidy (uTidyLib, mxTidy)
* XIST (http://www.liv
Fredrik Lundh wrote:
>Laszlo Zsolt Nagy wrote:
>
>>The question: is there a good library for Python for extracting links and
>>images
>>out of (possibly malformed) HTML source code?
>
>http://www.crummy.com/software/BeautifulSoup/
>
Thanks a lot! This is just what I wanted. W
Laszlo Zsolt Nagy wrote:
> The question: is there a good library for Python for extracting links and
> images
> out of (possibly malformed) HTML source code?
http://www.crummy.com/software/BeautifulSoup/
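
For readers who want to see the general idea without installing anything, here is a minimal sketch using only the standard library's html.parser module. It is a stand-in for illustration, not Beautiful Soup's API. The class name LinkImageExtractor, the helper extract_links_and_images, and the sample markup are all made up for this example. The parser is lenient, so unclosed tags and unquoted attribute values do not stop it.

```python
from html.parser import HTMLParser


class LinkImageExtractor(HTMLParser):
    """Collect href values from <a> tags and src values from <img> tags.

    Hypothetical helper for illustration; not part of Beautiful Soup.
    """

    def __init__(self):
        super().__init__()
        self.links = []
        self.images = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs; tag names are
        # already lower-cased by the parser.
        attr_map = dict(attrs)
        if tag == "a" and attr_map.get("href"):
            self.links.append(attr_map["href"])
        elif tag == "img" and attr_map.get("src"):
            self.images.append(attr_map["src"])


def extract_links_and_images(html):
    """Return (links, images) found in a possibly malformed HTML string."""
    parser = LinkImageExtractor()
    parser.feed(html)
    parser.close()
    return parser.links, parser.images


# Sample input with unclosed <a> tags and an unquoted attribute value.
sample = "<p><a href=\"a.html\">unclosed <img src=pic.png> <a href='b.html'>x</p>"
links, images = extract_links_and_images(sample)
```

This only gathers raw attribute values. Resolving relative URLs (for instance with urllib.parse.urljoin) and cleaning up links that are themselves malformed, like the one quoted earlier in the thread, would still be up to the caller.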
--
http://mail.python.org/mailman/listinfo/python-list
Hi All,
I'm writing a spider program. I need to go to several URLs and extract
information from the HTML source, including links.
I was using FancyURLOpener and my own function that extracts the links
from an HTML page. The problem is that I always
need to change it. This is because some sit