In <[EMAIL PROTECTED]>, irstas wrote:
> I'd like to see how this transformation can be done with
> BeautifulSoup. Well, the last two regexps can be replaced with this:
>
> unicode(BeautifulStoneSoup(s,convertEntities=BeautifulStoneSoup.HTML_ENTITIES).contents[0])
Completely without regular expressions:
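
from BeautifulSoup import BeautifulSoup

def main():
    # `source` is assumed to hold the HTML text being cleaned up.
    soup = BeautifulSoup(source, convertEntities=BeautifulSoup.HTML_ENTITIES)
    print ' '.join(''.join(soup(text=True)).split())
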
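For anyone without BeautifulSoup 3 at hand, here is a rough sketch of the same idea using only the Python 3 standard library. It is a stand-in, not the BeautifulSoup code above: `TextCollector`, `extract_text` and the `sample` string are names and data I made up for illustration, and `html.parser` may treat malformed markup differently than BeautifulSoup does. The two steps are the same, though: gather every text node with entities decoded, then collapse all whitespace runs to single spaces.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect text nodes, roughly what soup(text=True) yields (hypothetical helper)."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; before
        # handle_data() is called.
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called for every run of text between tags.
        self.chunks.append(data)


def extract_text(source):
    parser = TextCollector()
    parser.feed(source)
    parser.close()
    # Join all text nodes, then collapse any run of whitespace
    # (spaces, tabs, newlines) into a single space.
    return ' '.join(''.join(parser.chunks).split())


# Made-up input, only to show the behaviour.
sample = '<p>Fish &amp;  chips,\n <b>twice</b></p>'
print(extract_text(sample))  # Fish & chips, twice
```

The `' '.join(... .split())` idiom is unchanged from the one-liner above; only the part that walks the markup is swapped out.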
Ciao,
Marc 'BlackJack' Rintsch