Re: html 2 plain text

2006-05-29 Thread Fredrik Lundh
[EMAIL PROTECTED] wrote:
> text=re.sub(r'(?s)\<.+?\>', '', html_text)
> (this will keep html entities, though)
here's a variation that handles that too:
http://effbot.org/zone/re-sub.htm#strip-html
-- http://mail.python.org/mailman/listinfo/python-list
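A minimal sketch of the idea discussed in this reply: strip tags with the quoted regular expression, then decode the entities it leaves behind. This is not necessarily what the linked effbot page does; the helper name `strip_tags` is hypothetical, and `html.unescape` assumes Python 3.4 or later (the thread itself predates it).

```python
import html
import re

# Same pattern as in the quoted message, minus the redundant backslashes.
# (?s) lets "." match newlines, so tags that span several lines are removed too.
TAG_RE = re.compile(r"(?s)<.+?>")

def strip_tags(html_text):
    """Remove anything that looks like a tag, then decode HTML entities."""
    text = TAG_RE.sub("", html_text)
    # Turns &amp;, &lt;, &#233; and similar references into real characters.
    return html.unescape(text)

print(strip_tags("<p>Fish &amp; <b>chips</b></p>"))  # Fish & chips
```

A regex like this knows nothing about HTML structure, so the contents of script and style elements, and the text of comments that contain a ">", will still end up in the output.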

Re: html 2 plain text

2006-05-29 Thread garabik-news-2005-05
robin <[EMAIL PROTECTED]> wrote:
> hi,
> i remember seeing this simple python function which would take raw html
> and output the content (body?) of the page as plain text (no <..> tags
> etc)
> i have been looking at htmllib and htmlparser but this all seems too
> complicated for what i'm looking f

Re: html 2 plain text

2006-05-28 Thread Ravi Teja
> i remember seeing this simple python function which would take raw html
> and output the content (body?) of the page as plain text (no <..> tags
> etc)
http://www.aaronsw.com/2002/html2text/

Re: html 2 plain text

2006-05-28 Thread robin
looks yummy. merci beaucoup. robin

Re: html 2 plain text

2006-05-28 Thread Faber
robin wrote:
> i remember seeing this simple python function which would take raw html
> and output the content (body?) of the page as plain text (no <..> tags
> etc)
> i have been looking at htmllib and htmlparser but this all seems too
> complicated for what i'm looking for. i just need the main

html 2 plain text

2006-05-28 Thread robin
hi, i remember seeing this simple python function which would take raw html and output the content (body?) of the page as plain text (no <..> tags etc) i have been looking at htmllib and htmlparser but this all seems too complicated for what i'm looking for. i just need the main text in the body of
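For the kind of function asked about here, a parser-based version can stay fairly small. The sketch below assumes Python 3, where the parser lives in `html.parser` (the `htmllib` module mentioned above no longer exists there). The names `TextExtractor` and `html_to_text` are hypothetical, and the choice to skip script and style content is an assumption about what "main text" means.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect text content, ignoring tags and script/style bodies."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; before
        # the text reaches handle_data.
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Collapse runs of whitespace left over from the markup layout.
        return " ".join("".join(self._chunks).split())

def html_to_text(html_text):
    parser = TextExtractor()
    parser.feed(html_text)
    parser.close()  # flush any text still buffered by the parser
    return parser.text()

page = "<html><body><p>Hello &amp; <b>world</b></p><script>var x = 1;</script></body></html>"
print(html_to_text(page))  # Hello & world
```

This keeps text from the whole document, including the title in the head; restricting it to the body would need an extra flag toggled on the body start and end tags.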