ID:               45996
 Comment by:       mike at kogan dot org
 Reported By:      phpbugs at colin dot guthr dot ie
 Status:           Assigned
 Bug Type:         XML related
 Operating System: Mandriva Linux
 PHP Version:      5.2.6
 Assigned To:      rrichards
 New Comment:

Never mind, we got it: libexpat is in and the workaround is fine. Thanks
again and happy coding!


Previous Comments:
------------------------------------------------------------------------

[2008-10-16 02:01:00] mike at Kogan dot org

Thanks Col. Unfortunately, after thrashing on this for a day, either I
have gotten libexpat built in and it hasn't helped, or my efforts to
build it into Apache2 have failed. How can I tell, once I've rebuilt
Apache, whether it's in or not? Will it show up in phpinfo()? I see
libexpat.so on the system, and I configured Apache --with-expat=builtin
and also tried using the expat on the system, but I'm not sure it's
actually getting there. Sorry in advance if this is not the place to ask
such a question, but googling libexpat has not been fruitful.
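
One way to check this from PHP itself, as a rough sketch rather than an
official method: capture the "xml" section that phpinfo() would print and
look for the library name. The helper name xml_backend_from_info() is
made up for illustration, and the two label strings it searches for
("EXPAT Version" and "libxml2 Version") are assumptions about what that
section prints, so check them against your own phpinfo() output.

```php
<?php
// Hypothetical helper: guess the XML backend from the text of the
// "xml" section of phpinfo(). Both labels are assumptions about what
// that section prints on a given build.
function xml_backend_from_info($info)
{
    if (stripos($info, 'EXPAT Version') !== false) {
        return 'expat';
    }
    if (stripos($info, 'libxml2 Version') !== false) {
        return 'libxml2';
    }
    return 'unknown';
}

if (extension_loaded('xml')) {
    // ReflectionExtension::info() prints the module's phpinfo() block,
    // so capture it with output buffering instead of displaying it.
    $ext = new ReflectionExtension('xml');
    ob_start();
    $ext->info();
    $info = ob_get_clean();
    echo 'xml extension backend: ', xml_backend_from_info($info), "\n";
} else {
    echo "xml extension is not loaded\n";
}
```

Run it through the web server rather than the command line, since the
CLI binary and the Apache module can come from different builds.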

------------------------------------------------------------------------

[2008-10-15 09:02:54] phpbugs at colin dot guthr dot ie

Mike, it's fairly easy to recompile PHP with the libexpat library for
the legacy XML parsing functions while keeping libxml2 for the more
modern ones.

We did that in the Mandriva package for our 2009.0 release after I
reported the bug.

See the SPEC file here:
http://svn.mandriva.com/cgi-bin/viewvc.cgi/packages/updates/2009.0/php/current/SPECS/php.spec?revision=291141&view=markup

The particular change that worked around it is here:
http://svn.mandriva.com/cgi-bin/viewvc.cgi/packages/updates/2009.0/php/current/SPECS/php.spec?r1=278891&r2=281822

I'm sure you can work out how to get the needed patch that is mentioned
by navigating the webcvs :) You should be able to use this to recompile
the CentOS PHP package accordingly.

Hope this helps.

Col

------------------------------------------------------------------------

[2008-10-15 00:04:01] mike at kogan dot org

I have also run into this. We had some legacy PHP code using the XML
parser functions that was fine on some CentOS 4 servers with PHP 4 and 5
running Apache 1.3. We've been debugging this failure for a day now on
our new CentOS 5 server running PHP 5 and libxml2 2.7.2, and we confirm
the same problem. The character data handler is not called for the
known entities, so scripts depending on this (RSS feed converters etc.)
emit flawed HTML. I agree there are much better ways to parse XML, but
this is legacy stuff that's somewhat pervasive, and we didn't choose
what these folks used for their apps.

I'd love to rebuild their server with an older libxml2 but am not sure
how to go backwards without causing some other problem. The customer has
cPanel/WHM and all that hooey, and I'd rather not create a mess on their
new server.

Hope y'all fix this soon, as it is an issue for the cPanel folks, who
have 2.7.2 in their stable branch for CentOS 5, and it is being spread
by their updater.

If someone can confirm that a straight-up build and install of the old
release code won't make things worse, I'll take a crack at moving their
server back.

------------------------------------------------------------------------

[2008-10-08 09:50:16] phpbugs at colin dot guthr dot ie

Yes, I suspect that the comments left by ptn at post dot cz are
incorrect when they say it is fixed in libxml. rrichards has given a
very complete explanation of the problem and it is more fundamental than
a simple bug.

Compiling PHP with libexpat is the correct workaround for now.

------------------------------------------------------------------------

[2008-10-08 09:18:54] uraes at hot dot ee

Just tried libxml2-2.7.2 with PHP 5.2.6-pl7-gentoo, and it is still broken:

Example PHP code:
<?php
// Sample RSS item whose description holds entity-encoded HTML.
$data="<?xml version = '1.0' encoding = 'UTF-8'?>
<rss version=\"2.0\" >
  <channel>
    <item>
      <description>&lt;a href=&quot;http://www.google.com&quot;>Google&lt;/a></description>
    </item>
  </channel>
</rss>
";

// Parse the document into a flat struct with the legacy xml_* functions.
$parser = xml_parser_create('UTF-8');
xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);
xml_parse_into_struct($parser, $data, $vals, $index);
xml_parser_free($parser);

echo "<pre>";

// Show the input next to the parsed values for comparison.
echo "<b>Original XML:</b><br>".htmlentities($data);

echo "<br><br><b>Parsed struct:</b><br>";
print_r($vals);
?>

The parsed description value is "a href=http://www.google.com>Google/a>":
the characters encoded as &lt; and &quot; are dropped.
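
The handler behaviour reported in this thread (character data not being
delivered for the predefined entities) can also be checked directly. The
following is a minimal sketch, not the official test case: the class
TextCollector and the function collect_character_data() are hypothetical
names, and the sample string is made up. It registers a character data
handler, concatenates everything the handler receives, and compares that
with the decoded text.

```php
<?php
// Hypothetical collector: concatenates every chunk the parser passes
// to the character data handler.
class TextCollector
{
    public $text = '';

    public function onText($parser, $chunk)
    {
        $this->text .= $chunk;
    }
}

// Hypothetical helper: returns all character data delivered for $xml.
function collect_character_data($xml)
{
    $collector = new TextCollector();
    $parser = xml_parser_create('UTF-8');
    xml_set_character_data_handler($parser, array($collector, 'onText'));
    xml_parse($parser, $xml, true);
    // The parser is released when $parser goes out of scope.
    return $collector->text;
}

// Made-up sample: predefined entities inside element content.
$sample   = '<d>&lt;b&gt; &amp; &quot;q&quot;</d>';
$expected = '<b> & "q"';
$got      = collect_character_data($sample);

if ($got === $expected) {
    echo "entities delivered to the handler\n";
} else {
    echo "entities lost, handler received: ", $got, "\n";
}
```

On an affected build the second branch should fire, because the handler
never receives the decoded entity characters.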

------------------------------------------------------------------------

The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at
    http://bugs.php.net/45996
