Michael Spencer <[EMAIL PROTECTED]> writes:
> Mike Meyer wrote:
>
>> It also fails on tags with a ">" in a string in the tag. That's
>> well-formed but ill-used HTML.
>>         <mike
> True enough...however, it doesn't fail too horribly:
>
> >>> striptags("""<sometag attribute = '>'>the text</sometag>""")
> "'>the text"
> >>>

Depends on your example:

<sometag attribute='>' otherattribute='otherstuff' moreattribute='yet more stuff'>

and so on. Then again, early browsers actually did the same kind of
parsing as you do, so this type of thing is discouraged.

> and I think that case could be rectified rather easily, by stripping
> any content up to '>' in the result without breaking anything else.

Yes, but then what happens with:

<sometag>>text</sometag>

?

        <mike
-- 
Mike Meyer <[EMAIL PROTECTED]>  http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
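
For what it's worth, both awkward cases above (a ">" inside a quoted attribute value, and a literal ">" at the start of the text) can be sidestepped by letting a tokenizer that understands quoted attributes do the splitting. A minimal sketch, assuming the standard library's html.parser is acceptable here; the names TextOnly and strip_tags are made up for illustration and are not the striptags function under discussion:

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Hypothetical helper: keep character data, ignore every tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called only for text between tags; attribute values never
        # arrive here, so a quoted ">" cannot leak into the result.
        self.chunks.append(data)


def strip_tags(markup):
    """Return the text content of markup with all tags removed."""
    parser = TextOnly()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.chunks)
```

With this approach the tag boundaries come from the parser's attribute handling rather than from the first ">" seen, so post-processing the result to strip "content up to '>'" should not be needed, and a ">" that really is part of the text is left alone.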