ID:          36112
 Updated by:  [EMAIL PROTECTED]
 Reported By: pornel at despammed dot com
-Status:      Assigned
+Status:      Open
 Bug Type:    Documentation problem
 PHP Version: Irrelevant
 Assigned To: gavinfo


Previous Comments:
------------------------------------------------------------------------

[2006-03-12 17:06:18] [EMAIL PROTECTED]

There are a lot of inconsistencies in this example:

1) About @<script[^>]*?>.*?</script>@si :
   a) the first ? is useless.

2) About @<[\/\!]*?[^<>]*?>@si :
   a) / and ! don't have to be escaped. 
   b) [\/\!]*? is useless, as it's already matched by [^<>]*?. 
   c) the ? of [^<>]*? is useless.
   d) the PCRE_DOTALL modifier is useless, there is no dot.
   e) the PCRE_CASELESS modifier is useless.
   f) what is the point of avoiding "<" in a tag?

3) About @([\r\n])[\s]+@ :
   a) no need to put \s in a char class.
   b) every \r\n will be changed to \r, as \s matches \n.
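
Point 3b can be checked with a short sketch. The '\1' replacement
string is an assumption here, since only the pattern is quoted in
this report:

```php
<?php
// Pattern 3 from the example under discussion.
// The '\1' replacement is assumed, not quoted in this report.
$pattern = '@([\r\n])[\s]+@';

$crlf = preg_replace($pattern, '\1', "line one\r\nline two");
$lf   = preg_replace($pattern, '\1', "line one\nline two");

// The group captures \r, then [\s]+ consumes the following \n.
var_dump($crlf === "line one\rline two"); // bool(true)

// A lone \n followed by a non-space character does not match at all.
var_dump($lf === "line one\nline two");   // bool(true)
```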

I think the whole example has to be reconsidered, because there are
already functions to do some of the job, like strip_tags() and
html_entity_decode().
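
As a rough sketch of what that could look like (the sample input and
variable names are made up for illustration, and this is not a
proposed replacement for the manual's example):

```php
<?php
// Hypothetical input: one tag pair plus a few character entities.
$html = '<p>Fish &amp; chips &lt;today&gt;</p>';

// strip_tags() removes the markup but leaves entities untouched.
$stripped = strip_tags($html);

// html_entity_decode() then turns the entities back into characters.
$text = html_entity_decode($stripped, ENT_QUOTES, 'UTF-8');

echo $stripped, PHP_EOL; // Fish &amp; chips &lt;today&gt;
echo $text, PHP_EOL;     // Fish & chips <today>
```

Note that the decoded text can contain "<" and ">" again, so it would
need escaping before being written back into an HTML page.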

------------------------------------------------------------------------

[2006-01-20 23:54:03] pornel at despammed dot com

Description:
------------
The code on http://uk.php.net/preg_replace:

$search = array ('@<script[^>]*?>.*?</script>@si', // Strip out javascript
                 '@<[\/\!]*?[^<>]*?>@si',          // Strip out HTML tags

doesn't work as advertised. For example, it will leave the 
contents of:
<script>xxx</script       >
and worse, it will output valid script tags if given:
<<>script>evil<<>/script>

If these patterns were used on some website (for stripping 
markup from users' comments, for example), they would allow 
an XSS attack.
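
Both cases can be reproduced with a minimal sketch. It uses only the
two patterns quoted above; the empty replacement strings are an
assumption:

```php
<?php
// The two patterns quoted in this report.
$search = array(
    '@<script[^>]*?>.*?</script>@si', // meant to strip out javascript
    '@<[\/\!]*?[^<>]*?>@si',          // meant to strip out HTML tags
);
// Assumed replacements: both patterns replace with an empty string.
$replace = array('', '');

$spaced = preg_replace($search, $replace, '<script>xxx</script       >');
$nested = preg_replace($search, $replace, '<<>script>evil<<>/script>');

// The first pattern needs a literal "</script>", so the script body survives.
echo $spaced, PHP_EOL; // xxx

// Only the two "<>" pairs are removed, which reassembles a script element.
echo $nested, PHP_EOL; // <script>evil</script>
```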


Since it's nearly impossible to properly parse HTML with 
regular expressions, I suggest:
* renaming the example from 'Convert HTML to text' to 'Remove 
HTML markup'
* adding a replacement of '<' with '&lt;'
* suggesting the use of more robust methods, like strip_tags, 
nl2br, htmlspecialchars or the DOM interface.
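
For instance, escaping on output instead of trying to strip markup
could be sketched as follows (the variable names are illustrative,
and the flags and charset are assumptions about the target page):

```php
<?php
// The input that defeats the regular expressions in the example.
$input = '<<>script>evil<<>/script>';

// Escape every special character instead of trying to remove tags.
$escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

echo $escaped, PHP_EOL;
// &lt;&lt;&gt;script&gt;evil&lt;&lt;&gt;/script&gt;
```

No "<" or ">" survives escaping, so nothing can be reassembled into a
tag, whatever the input looks like.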




------------------------------------------------------------------------

