On Apr 2, 10:08 pm, Michael Hoffman <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] wrote:
> > But it could be that he just wants all HTML tags to disappear, like in
> > his example. A code like this might be sufficient then:
> > re.sub(r'<[^>]+>', '', s).
>
> Won't work for, say, this:
>
> <img src="src" alt="<text>">
> --
> Michael Hoffman

True, but is that legal? I think the alt attribute needs to use &lt; and &gt; there. Although I know what you're going to reply: that BeautifulSoup probably parses it even if it's invalid HTML. And I'd say that I agree, using BeautifulSoup is a better solution than custom regexps.
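
To make the failure concrete, here is a minimal sketch that runs the regexp from the quoted message against Michael's example. For comparison it also strips tags with the standard library's html.parser, used here only as a stand-in for BeautifulSoup so the snippet has no third-party dependency. The names strip_tags_regex, strip_tags_parser and _TextExtractor are made up for illustration and do not come from the thread.

```python
import re
from html.parser import HTMLParser


def strip_tags_regex(s):
    # The regexp from the quoted message: remove anything that looks like <...>.
    return re.sub(r'<[^>]+>', '', s)


class _TextExtractor(HTMLParser):
    # Hypothetical helper: collects only the text found between tags.
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_tags_parser(s):
    # Stand-in for a real HTML parser such as BeautifulSoup.
    parser = _TextExtractor()
    parser.feed(s)
    parser.close()  # flush any text still buffered by the parser
    return ''.join(parser.chunks)


s = 'before <img src="src" alt="<text>"> after'

# The regexp stops at the first '>', which is inside the alt value,
# so the tail of the tag is left behind.
print(repr(strip_tags_regex(s)))   # 'before "> after'

# The parser treats the quoted attribute value as part of the tag.
print(repr(strip_tags_parser(s)))
```

The regexp cannot tell a '>' inside a quoted attribute value from the one that closes the tag, which is why a parser that understands quoting is the safer choice, whether or not the input is strictly valid HTML.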