ID:               45599
 Updated by:       [email protected]
 Reported By:      david at grudl dot com
-Status:           Open
+Status:           Closed
 Bug Type:         Strings related
 Operating System: *
 PHP Version:      5.*, 6
 New Comment:

This bug has been fixed in SVN.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.
 
Thank you for the report, and for helping us make PHP better.




Previous Comments:
------------------------------------------------------------------------

[2009-12-22 02:04:13] [email protected]

Automatic comment from SVN on behalf of iliaa
Revision: http://svn.php.net/viewvc/?view=revision&revision=292465
Log: Fixed bug #45599 (strip_tags() truncates rest of string with
invalid attribute).

------------------------------------------------------------------------

[2009-08-24 15:53:52] [email protected]

PHP 5.x patch:
Index: ext/standard/string.c
===================================================================
--- ext/standard/string.c       (revision 284189)
+++ ext/standard/string.c       (working copy)
@@ -4367,7 +4367,7 @@
                                        tp = ((tp-tbuf) >= PHP_TAG_BUF_SIZE ? tbuf: tp);
                                        *(tp++) = c;
                                }
-                               if (state && p != buf && *(p-1) != '\\' && (!in_q || *p == in_q)) {
+                               if (state && p != buf && (state == 1 || *(p-1) != '\\') && (!in_q || *p == in_q)) {
                                        if (in_q) {
                                                in_q = 0;
                                        } else {

Trunk patch:
Index: ext/standard/string.c
===================================================================
--- ext/standard/string.c       (revision 284189)
+++ ext/standard/string.c       (working copy)
@@ -6519,7 +6519,7 @@
                                tp = ((tp-tbuf) >= UBYTES(PHP_TAG_BUF_SIZE) ? tbuf: tp);
                                *(tp++) = ch;
                        }
-                       if (state && prev1 != 0x5C /*'\\'*/ && (!in_q || ch == in_q)) {
+                       if (state && (state ==1 || prev1 != 0x5C /*'\\'*/) && (!in_q || ch == in_q)) {
                                if (in_q) {
                                        in_q = 0;
                                } else {
@@ -6763,7 +6763,7 @@
                                        tp = ((tp-tbuf) >= PHP_TAG_BUF_SIZE ? tbuf: tp);
                                        *(tp++) = c;
                                }
-                               if (state && p != buf && *(p-1) != '\\' && (!in_q || *p == in_q)) {
+                               if (state && p != buf && (state ==1 || *(p-1) != '\\') && (!in_q || *p == in_q)) {
                                        if (in_q) {
                                                in_q = 0;
                                        } else {


Test case:
--TEST--
Bug #45599 (strip_tags() ignore backslash (\) character inside html tags)
--FILE--
<?php
echo strip_tags('Hello <a href="any\"> World') . "\n";
echo strip_tags('Hello <a href="any\\"> World') . "\n";
echo strip_tags('Hello <a href=\"any"> World');
?>
--EXPECT--
Hello  World
Hello  World
Hello  World


------------------------------------------------------------------------

[2008-08-06 16:52:29] david at grudl dot com

The \ character is allowed in a tag attribute, so strip_tags('Hello <a
href="any\"> World') returning "Hello" (without "World") is a bug.

------------------------------------------------------------------------

[2008-08-06 16:30:17] [email protected]

Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

The parser continues until it finds the end of the tag, which cannot
be inside an attribute value (XML allows all characters except [%&'] in
attribute values).

In the given examples the attribute value never terminates and the end
of the tag is never found, which causes the rest of the string to be
truncated.

This change has been made to fix the following bug:
http://bugs.php.net/bug.php?id=40432
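
The truncation described above can be reproduced with a small C model.
This is only a sketch under stated assumptions, not the code in
ext/standard/string.c: the function name strip_tags_model and the
honor_backslash flag are hypothetical, and only the quote tracking
inside a tag is modeled (no allowed-tags list, comments, or PHP tags).
With honor_backslash set to 1, a quote preceded by a backslash neither
opens nor closes an attribute value, so a value ending in \" never
terminates and the closing '>' is never seen. With 0, the backslash is
an ordinary character inside the tag.

```c
#include <stddef.h>

/*
 * Simplified model (an assumption for illustration only): copy `in` to
 * `out`, dropping everything from '<' up to the '>' that closes the tag.
 * A '>' inside a quoted attribute value does not close the tag.
 * `out` must have room for strlen(in) + 1 bytes.
 */
void strip_tags_model(const char *in, char *out, int honor_backslash)
{
    int in_tag = 0;  /* nonzero while inside <...> */
    char in_q = 0;   /* active quote character, or 0 when unquoted */
    const char *p;

    for (p = in; *p != '\0'; p++) {
        char c = *p;

        if (!in_tag) {
            if (c == '<') {
                in_tag = 1;
            } else {
                *out++ = c;
            }
            continue;
        }

        if (c == '"' || c == '\'') {
            /* Backslash-escaped quotes are skipped only when the flag is set. */
            int escaped = honor_backslash && p != in && p[-1] == '\\';
            if (!escaped && (in_q == 0 || c == in_q)) {
                in_q = in_q ? 0 : c;  /* toggle the quoted state */
            }
        } else if (c == '>' && in_q == 0) {
            in_tag = 0;  /* tag ends only outside a quoted value */
        }
    }
    *out = '\0';
}
```

For the input Hello <a href="any\"> World, the model returns "Hello "
with honor_backslash = 1 and "Hello  World" with honor_backslash = 0.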


------------------------------------------------------------------------

[2008-07-30 04:42:40] jet at synth-tec dot com

I am having the same problem. If an attribute has an extra quote in
it, strip_tags() cuts off all the text after it.

Example Input:
----------------
strip_tags('
text before link
<a href="http://google.com"";>google.com</a>
text after link
test 1
test 2
')


Expected Output:
-----------------
text before link
text after link
test 1
test 2


Actual Output:
--------------
text before link



Note: I do not have this problem in PHP 5.0.4 or earlier versions.

------------------------------------------------------------------------

The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at
    http://bugs.php.net/45599

-- 
Edit this bug report at http://bugs.php.net/?id=45599&edit=1
