ID:               45996
 Comment by:       phpbugs at colin dot guthr dot ie
 Reported By:      phpbugs at colin dot guthr dot ie
 Status:           Assigned
 Bug Type:         XML related
 Operating System: Mandriva Linux
 PHP Version:      5.2.6
 Assigned To:      rrichards
 New Comment:

Yes, I suspect that the comments left by ptn at post dot cz are
incorrect when they say it is fixed in libxml. rrichards has given a
very complete explanation of the problem and it is more fundamental than
a simple bug.

Compiling PHP with libexpat is the correct workaround for now.
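
For anyone unsure whether a given build needs the workaround, a quick
runtime check along these lines may help. This is only a sketch: the
function name xml_parser_drops_entities() is made up for illustration,
the test document is an arbitrary stand-in, and it assumes the xml
extension is loaded.

```php
<?php
// Hypothetical helper (not part of PHP): returns true when the xml
// extension loses predefined entities such as &lt; in character data.
function xml_parser_drops_entities()
{
    $parser = xml_parser_create('UTF-8');
    $vals = array();
    $index = array();
    xml_parse_into_struct($parser, '<d>&lt;x&gt;</d>', $vals, $index);
    // The parser is released when $parser goes out of scope on return.

    // An unaffected build reports "<x>" as the value of <d>.
    $value = isset($vals[0]['value']) ? $vals[0]['value'] : '';
    return $value !== '<x>';
}

echo xml_parser_drops_entities() ? "affected\n" : "not affected\n";
```

A build that prints "affected" is one where compiling against libexpat
is worth considering.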


Previous Comments:
------------------------------------------------------------------------

[2008-10-08 09:18:54] uraes at hot dot ee

Just tried libxml2-2.7.2 with PHP 5.2.6-pl7-gentoo and it is still broken:

Example PHP code:
<?php
$data="<?xml version = '1.0' encoding = 'UTF-8'?>
<rss version=\"2.0\" >
  <channel>
    <item>
      <description>&lt;a href=&quot;http://www.google.com&quot;>Google&lt;/a></description>
    </item>
  </channel>
</rss>
";

$parser = xml_parser_create('UTF-8');
xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);
xml_parse_into_struct($parser, $data, $vals, $index);
xml_parser_free($parser);

echo "<pre>";

echo "<b>Original XML:</b><br>".htmlentities($data);

echo "<br><br><b>Parsed struct:</b><br>";
print_r($vals);
?>

The parsed description value is "a href=http://www.google.com>Google/a>",
i.e. every &lt; and &quot; has been dropped.

------------------------------------------------------------------------

[2008-10-07 11:19:33] ptn at post dot cz

This bug seems to be fixed in libxml2-2.7.2

http://svn.gnome.org/viewvc/libxml2?view=revision&revision=3798

------------------------------------------------------------------------

[2008-09-09 23:06:00] phpbugs at colin dot guthr dot ie

Comments by Daniel Veillard on the libxml ML:

  The only thing I can think of is that libxml2 doesn't anymore ask
through a SAX callback when looking for entity references if they
are in the predefined set. This comes in essence from an old decision
by the XML working group stating that user definitions for those 5
entities could not override the default predefined ones. So I guess
that change is logical. Now what is done on top of SAX to result
in that bug, I don't really know  :-\
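
For reference, the "predefined set" mentioned above is the five entities
that XML 1.0 defines for every document. The sketch below just lists
them as a lookup table and expands them in a string; the helper name
expand_predefined() is invented for illustration, and this is not meant
as a replacement for a real parser or as a fix for this bug.

```php
<?php
// The five entities predefined by XML 1.0 and the characters they denote.
$predefined = array(
    '&lt;'   => '<',
    '&gt;'   => '>',
    '&amp;'  => '&',
    '&apos;' => "'",
    '&quot;' => '"',
);

// Hypothetical helper. strtr() makes a single pass over the input, so
// "&amp;lt;" becomes "&lt;" and is not expanded a second time.
function expand_predefined($text, array $map)
{
    return strtr($text, $map);
}

echo expand_predefined('&lt;a href=&quot;x&quot;&gt;', $predefined), "\n";
```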

------------------------------------------------------------------------

[2008-09-06 15:43:29] [EMAIL PROTECTED]

Assigned to the maintainer (Rob, don't forget to change status too when
you assign something to yourself :)

------------------------------------------------------------------------

[2008-09-04 17:29:21] phpbugs at colin dot guthr dot ie

Description:
------------
With libxml2 2.7.1, when using the expat-style XML parsing routines in
PHP, the character data silently loses any entity-encoded text, e.g.
&gt; &lt; and friends.

Please see Mandriva bug for details:
https://qa.mandriva.com/show_bug.cgi?id=43486

And also please note the thread on the libxml mailing list:
http://thread.gmane.org/gmane.comp.gnome.lib.xml.general/14610

And most notably the reply to the above thread:
<quote>
Can you report this as a PHP bug? It looks like some really old hack
code in the PHP extension in order to mimic some specific expat
functionality. The behavior change you see, though resulting from a code
change in libxml2, is really due to the hackish code in the extension
doing things it wasn't meant to be doing.
</quote>

Reproduce code:
---------------
Please see this code:
https://qa.mandriva.com/attachment.cgi?id=10757

Expected result:
----------------
<
foo
>
wibble
<
/foo
>


Actual result:
--------------
foo
wibble
/foo
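
The attachment linked above is not reproduced here. As a rough
stand-in, a character data handler that records each chunk it receives
shows the same kind of difference. The input document and the handler
name collect_cdata() are assumptions made for this sketch, not taken
from the original attachment.

```php
<?php
// Collects every chunk of character data the parser hands over.
$chunks = array();

function collect_cdata($parser, $data)
{
    global $chunks;
    $chunks[] = $data;
}

// Stand-in input (assumption): entity-encoded markup inside an element.
$xml = '<doc>&lt;foo&gt;wibble&lt;/foo&gt;</doc>';

$parser = xml_parser_create('UTF-8');
xml_set_character_data_handler($parser, 'collect_cdata');
xml_parse($parser, $xml, true);

// One chunk per line. How the text is split into chunks is up to the
// parser, so only the concatenation is meaningful to compare.
echo implode("\n", $chunks), "\n";
echo 'joined: ', implode('', $chunks), "\n";
```

On an unaffected build the joined text should read <foo>wibble</foo>;
on an affected one the angle brackets go missing, as in the actual
result above.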



------------------------------------------------------------------------

