Kent Johnson writes:
> [EMAIL PROTECTED] wrote:
>> Is possible deleting all tags from a text and how?
>>
>> i.e.:
>>
>> s='<td><a href="..." title="...">foo bar</a>;<br />
>> <a href="..." title="...">foo2</a> <a href="..."
>> title="...">bar2</a></td>'
>>
>> so, I would get only: foo bar, foo2, bar2
>
> How about this?
>
> In [1]: import BeautifulSoup
>
> In [2]: s=BeautifulSoup.BeautifulSoup('''<td><a href="..." title="...">foo
> bar</a>;<br />
>    ...: <a href="..." title="...">foo2</a> <a href="..."
> title="...">bar2</a></td>''')
>
> In [4]: ' '.join(i.string for i in s.fetch() if i.string)
> Out[4]: 'foo bar foo2 bar2'
>
>
> Here are a couple of tag strippers that don't use BS:
> http://www.aminus.org/rbre/python/cleanhtml.py
> http://www.oluyede.org/blog/2006/02/13/html-stripper/
>
> Kent
>
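On the "tag strippers that don't use BS" point: I have not checked what the two linked scripts do, but here is a minimal sketch of the same idea using only the standard library's html.parser (Python 3). The names TagStripper and strip_tags are made up for this example and are not taken from either link.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Hypothetical helper: collects the text between tags, drops the tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called for every run of text between tags; skip whitespace-only runs.
        text = data.strip()
        if text:
            self.chunks.append(text)


def strip_tags(html):
    """Return the non-empty text pieces of an HTML fragment, in document order."""
    parser = TagStripper()
    parser.feed(html)
    parser.close()  # flush any text still buffered at the end of the input
    return parser.chunks


s = '''<td><a href="..." title="...">foo bar</a>;<br />
<a href="..." title="...">foo2</a> <a href="..." title="...">bar2</a></td>'''
print(strip_tags(s))
```

Unlike the BeautifulSoup one-liner quoted above, this keeps every piece of text, so the stray ";" that sits between </a> and <br /> comes back as well. Filter it out afterwards if you only want the link texts.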
Another way (valid only for this case, because every piece of text you want is inside an <a> tag):

    for i in s.fetch('a'):
        print i.string

_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor