ID:               34863
Comment by:       lm at bible dot ch
Reported By:      backdream at gmail dot com
Status:           Open
Bug Type:         Strings related
Operating System: Windows
PHP Version:      4.3.11
New Comment:

OS: Windows XP
PHP: 5.0.3

Further investigation:
----------------------
strip_tags() does not handle a tag correctly if '<' occurs in an attribute value, as it does in the onload="if(screen.width*0.7<this(...)" attribute of the reported example.

strip_tags() does provide for nested tags (example 2), but it does not ignore quoted '<' and '>' inside tags (example 1):

<?php
//example 1
$txt = 'text1<tag attr="<"> text2<tag attr=">"> text3';
$txt = strip_tags( $txt );
print $txt;
// prints 'text1 text3'
// should print 'text1 text2 text3'

//example 2
$txt = 'text1<tag <nested tag>> text2';
$txt = strip_tags( $txt );
print $txt;
// prints 'text1 text2' as it should if strip_tags searches for nested tags
?>

Three questions arise:
1. Is the detection of nested tags reasonable?
2. Is an HTML tag malformed if it contains a quoted '<' or '>' sign?
3. If such an HTML tag is malformed, shouldn't strip_tags() still work correctly, since there is no ambiguity thanks to the quotes?

Answers:
1. I cannot think of any real-world example of nested tags other than HTML comments like <!--<br />-->
2. According to the w3c.org HTML validator, such a tag is valid.
3. strip_tags() should still work correctly.

No parsing of quotes:
---------------------
The following example proves that strip_tags() does not parse quotes between '<' and '>' at all:

<?php
//example 3
$txt = 'text1<tag attr="> text2 <tag attr="> text3';
$txt = strip_tags( $txt );
print $txt;
// prints: 'text1 text2 text3'
// should print: 'text1 text3', since there is only one tag, with the attribute 'attr="> text2 <tag attr="'
?>

Conclusion:
-----------
This IS a real bug. It is not related to the <img> tag, but to strip_tags() not parsing attributes and quotes inside tags. It would be important to fix this bug.

With best regards
Lorenz Meyer


Previous Comments:
------------------------------------------------------------------------

[2005-10-14 19:36:25] backdream at gmail dot com

I tested with 4.3.11; it also has the bug.
------
<?php
$txt = "Next: <a href=\"http://www.joelonsoftware.com/uibook/chapters/fog0000000063.html\" target=\"_blank\">Designing for People Who Have Better Things To Do With Their Lives, Part Two</a> <br />¡¡¡¡<br />¡¡¡¡<img src=\"http://86.0.190.20/test/ipb21/skin_acp/IPB2_Standard/images/users.png\" border=\"0\" align=\"absmiddle\" alt=\"Á´½ÓͼƬ\" onload=\"if(screen.width*0.7<this.width) {this.resized=true;this.width=screen.width*0.7;}\" /><br />¡¡¡¡<b>Adverment!</b> Do you need to control a computer remotely, even when firewalls get in the way? My company's latest product, <a href=\"https://www.copilot.com/\" target=\"_blank\">Fog Creek Copilot</a>, is a remote control system that requires no setup, no configuration, and works even if both users are behind firewalls. It's designed to make remote tech support easy.<br />¡¡¡¡<br />¡¡¡¡Enter your email address to receive a (very occasional) email whenever I write a major new article. You can unsubscribe at any time, of course.<br />¡¡¡¡</div>";
$txt = strip_tags( $txt );
print $txt;
?>
-----------
It prints "Next: Designing for People Who Have Better Things To Do With Their Lives, Part Two ¡¡¡¡¡¡¡¡", but not the whole string with the HTML tags stripped.

------------------------------------------------------------------------

[2005-10-14 13:18:53] [EMAIL PROTECTED]

Can't reproduce.
Both your scripts work fine with any PHP version I can find (4.3.11, 4.4, 5.0.x, 5.1).

------------------------------------------------------------------------

[2005-10-14 03:31:24] backdream at gmail dot com

Sorry, the reproduce code should be:
---------------
<?php
$txt = "Next: <a href=\"http://www.joelonsoftware.com/uibook/chapters/fog0000000063.html\" target=\"_blank\">Designing for People Who Have Better Things To Do With Their Lives, Part Two</a> <br />¡¡¡¡<br />¡¡¡¡<img src=\"http://86.0.190.20/test/ipb21/skin_acp/IPB2_Standard/images/users.png\" border=\"0\" align=\"absmiddle\" alt=\"Á´½ÓͼƬ\" onload=\"if(screen.width*0.7<this.width) {this.resized=true; this.width=screen.width*0.7;}\" /><br />¡¡¡¡<b>Adverment!</b> Do you need to control a computer remotely, even when firewalls get in the way? My company's latest product, <a href=\"https://www.copilot.com/\" target=\"_blank\">Fog Creek Copilot</a>, is a remote control system that requires no setup, no configuration, and works even if both users are behind firewalls. It's designed to make remote tech support easy. <br />¡¡¡¡<br />¡¡¡¡Enter your email address to receive a (very occasional) email whenever I write a major new article. You can unsubscribe at any time, of course.<br />¡¡¡¡</div>";
$txt = strip_tags( $txt );
print $txt;
?>

------------------------------------------------------------------------

[2005-10-14 03:24:18] backdream at gmail dot com

Description:
------------
strip_tags does not work when the string includes <img tags.

Reproduce code:
---------------
<?php
$txt = "Next: <a href=\"http://www.joelonsoftware.com/uibook/chapters/fog0000000063.html\" target=\"_blank\">Designing for People Who Have Better Things To Do With Their Lives, Part Two</a> <br />¡¡¡¡<br />¡¡¡¡<img src=\"http://86.0.190.20/test/ipb21/skin_acp/IPB2_Standard/images/users.png\" border=\"0\" align=\"absmiddle\" alt=\"Á´½ÓͼƬ\" onload=\"if(screen.width*0.7<this.width) {this.resized=true; this.width=screen.width*0.7;}\" /><br />¡¡¡¡<b>Adverment!</b> Do you need to control a computer remotely, even when firewalls get in the way? My company's latest product, <a href=\"https://www.copilot.com/\" target=\"_blank\">Fog Creek Copilot</a>, is a remote control system that requires no setup, no configuration, and works even if both users are behind firewalls. It's designed to make remote tech support easy. <br />¡¡¡¡<br />¡¡¡¡Enter your email address to receive a (very occasional) email whenever I write a major new article.
You can unsubscribe at any time, of course.<br />¡¡¡¡</div>";
$txt = preg_replace( "#<img[^>]*>#i", "", $txt );
$txt = strip_tags( $txt );
print $txt;
?>

Expected result:
----------------
All HTML tags are stripped.

Actual result:
--------------
The output breaks off when strip_tags reaches the <img tag.

------------------------------------------------------------------------

-- 
Edit this bug report at http://bugs.php.net/?id=34863&edit=1
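
Note on the quote handling requested in the new comment: the behaviour that examples 1 and 3 ask for can be sketched in userland as a small state machine that ignores '<' and '>' while a quoted attribute value is open. This is only an illustration under stated assumptions, not how strip_tags() is implemented and not a proposed patch. The function name strip_tags_quote_aware() is hypothetical, and the sketch deliberately does not handle nested tags (example 2) or HTML comments.

```php
<?php
// Hypothetical helper (not part of PHP): removes tags while treating
// quoted attribute values as opaque, so '<' and '>' inside quotes are ignored.
function strip_tags_quote_aware($html)
{
    $out   = '';
    $inTag = false;
    $quote = null;          // the quote character currently open, if any
    $len   = strlen($html);

    for ($i = 0; $i < $len; $i++) {
        $c = $html[$i];

        if (!$inTag) {
            $next = ($i + 1 < $len) ? $html[$i + 1] : '';
            // Assumption: '<' opens a tag only when followed by a letter, '/', '!' or '?'.
            if ($c === '<' && ($next === '/' || $next === '!' || $next === '?' || ctype_alpha($next))) {
                $inTag = true;
            } else {
                $out .= $c;
            }
            continue;
        }

        if ($quote !== null) {
            // Inside a quoted value: only the matching quote ends it.
            if ($c === $quote) {
                $quote = null;
            }
        } elseif ($c === '"' || $c === "'") {
            $quote = $c;
        } elseif ($c === '>') {
            // An unquoted '>' closes the tag.
            $inTag = false;
        }
    }

    return $out;
}

print strip_tags_quote_aware('text1<tag attr="<"> text2<tag attr=">"> text3');
// prints 'text1 text2 text3'
```

With the input of example 3, the same function returns 'text1 text3', because everything between the two quotes is treated as one attribute value. An unterminated quote or tag makes the sketch drop the rest of the input, so it is not suitable for untrusted HTML as it stands.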