On Apr 2, 12:54 am, "Dotan Cohen" <[EMAIL PROTECTED]> wrote:
> On 1 Apr 2007 07:56:04 -0700, Ulysse <[EMAIL PROTECTED]> wrote:
>
> > I have seen the Beautiful Soup online help and tried to apply that to
> > my problem. But it seems to be a little bit hard. I would rather try to
> > do this with regular expressions...
>
> If you think that Beautiful Soup is difficult, then wait till you try
> to do this with regexes. Granted, knowing the exact format of the HTML
> you are scraping will help, but if you ever need to parse HTML from an
> unknown source, then Beautiful Soup is the only way to go. Not all HTML
> authors close their td and tr tags, and sometimes there are attributes
> on those tags. If you plan on ever reusing the code, or the format of
> the HTML may change, then you are best off sticking with Beautiful
> Soup.
>
> Dotan Cohen
>
> http://lyricslist.com/
> http://what-is-what.com/

Have you tried HTMLParser? It can do the task you want to perform:

http://docs.python.org/lib/module-HTMLParser.html

-anjesh
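
For what it's worth, here is a rough sketch of how HTMLParser could handle the kind of markup mentioned above (unclosed td and tr tags, attributes on the tags). It assumes the goal is to pull the cell text out of a table, which the thread does not spell out, and it uses the Python 3 module name `html.parser` rather than the older `HTMLParser` module from the link. The class name `TableExtractor` and the sample HTML are made up for illustration.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr> row.

    Hypothetical helper: a new <td> or <tr> implicitly closes the
    previous one, so missing end tags do not lose data.
    """

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        # attrs (e.g. class="name") are ignored here; only the tag matters.
        if tag == "tr":
            self._end_row()
            self._row = []
        elif tag in ("td", "th"):
            self._end_cell()
            if self._row is None:   # cell appeared without an enclosing <tr>
                self._row = []
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._end_cell()
        elif tag in ("tr", "table"):
            self._end_row()

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def _end_cell(self):
        if self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None

    def _end_row(self):
        self._end_cell()
        if self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def close(self):
        super().close()
        self._end_row()     # flush a row left open at end of input


# Sample input with unclosed tags and an attribute (invented for this sketch).
sample = '<table><tr><td class="name">Alice<td>30<tr><td>Bob</td><td>25</td></tr></table>'
parser = TableExtractor()
parser.feed(sample)
parser.close()
print(parser.rows)
```

This only covers simple, non-nested tables; for HTML from an unknown source, the advice in the quoted message about Beautiful Soup still applies.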