Edit report at https://bugs.php.net/bug.php?id=62032&edit=1
ID: 62032
Comment by: iamcraigcampbell at gmail dot com
Reported by: iamcraigcampbell at gmail dot com
Summary: filter_var incorrectly strips characters from strings after "<"
Status: Open
Type: Bug
Package: Filter related
Operating System: Mac OS X
PHP Version: 5.4.3
Block user comment: N
Private report: N
New Comment:
@pajoye I agree with you, but there is a use case that encoding will not solve.
Let's say there is a forum where users are posting. If a user posts
"This is <strong>NOT</strong> good!" and the tags get encoded, then the HTML
tags will be displayed in the forum as plain text. I think the more expected
behavior is to display this string as "This is NOT good!".
So one option would be encoding the < only if it is not followed by a >, but
that is a lot of extra work to figure out.
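A minimal sketch of the option described above: encode a "<" only when no ">" follows it, then strip what is left as tags. The function name strip_closed_tags_only is hypothetical, and the regular expression is one assumed reading of "not followed by a >". It is an illustration of the idea, not a safe HTML sanitizer.

```php
<?php
// Hypothetical helper (not part of PHP): keeps a stray '<' visible while
// still removing things that look like complete tags.
function strip_closed_tags_only(string $input): string
{
    // Encode every '<' that is NOT followed by a '>' before the next '<'
    // or the end of the string. Assumption: such a '<' is not a tag.
    $encoded = preg_replace('/<(?![^<>]*>)/', '&lt;', $input);

    // Every remaining '<' has a closing '>', so strip_tags() removes
    // those spans as tags.
    return strip_tags($encoded);
}

echo strip_closed_tags_only('This is <strong>NOT</strong> good!'), "\n";
// prints: This is NOT good!
echo strip_closed_tags_only('This is true: 2<5'), "\n";
// prints: This is true: 2&lt;5
```

The second result is entity-encoded, so a browser would render it as "2<5". A string such as "2<5 and 7>3" would still be treated as containing a tag, which is the ambiguity this approach cannot resolve.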
At the end of the day, the point is that no matter how you look at it, I still
think this is a bug.
$string = 'This is true: 2<5';
strip_tags($string);
filter_var($string, FILTER_SANITIZE_STRING);
Neither call should strip out "<5", since that is not an HTML tag.
Previous Comments:
------------------------------------------------------------------------
[2012-05-15 14:51:09] aleksey dot v dot korzun at gmail dot com
How is stripping anything after < with a space a valid operation? That seems
like a lazy man's HTML stripper.
Let's just blindly strip everything that can possibly be made into an HTML tag
of any sort. Not.
------------------------------------------------------------------------
[2012-05-15 14:49:02] [email protected]
> or < should be encoded then, see
http://www.php.net/manual/en/filter.filters.sanitize.php
btw, any option should be added using the option array or defaults, as is
already the case.
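As a rough illustration of the encoding approach suggested above, the sketch below uses one of the encoding filters from the sanitize filter list and passes a flag through the options array. The wrapper name encode_for_display is hypothetical, and the choice of FILTER_SANITIZE_FULL_SPECIAL_CHARS with FILTER_FLAG_NO_ENCODE_QUOTES is only an assumption about which filter fits this case.

```php
<?php
// Hypothetical wrapper: encode special characters instead of stripping them.
function encode_for_display(string $input): string
{
    // FILTER_SANITIZE_FULL_SPECIAL_CHARS behaves like htmlspecialchars()
    // with ENT_QUOTES; flags are passed through the options array.
    return filter_var($input, FILTER_SANITIZE_FULL_SPECIAL_CHARS, [
        'flags' => FILTER_FLAG_NO_ENCODE_QUOTES,
    ]);
}

echo encode_for_display('This is true: 2<5'), "\n";
// prints: This is true: 2&lt;5
```

Nothing is removed here: the "<" survives as an entity, so the comparison is displayed intact, at the cost of real tags also being shown as text.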
------------------------------------------------------------------------
[2012-05-15 14:45:27] iamcraigcampbell at gmail dot com
So in that case I think strip_tags and filter_var are both broken. In this
context:
"It is true that 5<10"
"It is true that 5 < 10"
Neither of these contains an HTML tag, so the string should not be touched
regardless of whether there is a space or not.
------------------------------------------------------------------------
[2012-05-15 14:42:47] reeze dot xia at gmail dot com
PS: the reason why strip_tags() didn't strip it is that '<' is followed by a
space char, not that the ending '>' is missing; this is the key point.
Looking deeper into the source code, the difference is a switch on whether or
not to treat '<' followed by one or more spaces as a tag.
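The space rule described above can be checked with the two strings from this thread. This is only a small demonstration, assuming strip_tags() behaves as the comments here describe; the variable names are made up for the example.

```php
<?php
$withoutSpace = 'It is true that 5<10';
$withSpace    = 'It is true that 5 < 10';

// '<' followed directly by a non-space character starts a "tag" that never
// closes, so the rest of the string is dropped.
echo '"', strip_tags($withoutSpace), '"', "\n";
// prints: "It is true that 5"

// '<' followed by a space is kept as a literal character.
echo '"', strip_tags($withSpace), '"', "\n";
// prints: "It is true that 5 < 10"
```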
------------------------------------------------------------------------
[2012-05-15 14:36:26] reeze dot xia at gmail dot com
strip_tags will strip it even without the ending '>' if '<' is followed by a
non-space char.
If we need to check whether it is a closed tag, that is a feature request to
change its behavior. It will break BC.
------------------------------------------------------------------------
The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at
https://bugs.php.net/bug.php?id=62032