ID:               35447
 Updated by:       [EMAIL PROTECTED]
 Reported By:      saramaca at libertysurf dot fr
-Status:           Assigned
+Status:           Open
 Bug Type:         XML related
 Operating System: *
 PHP Version:      5.1.1
 Assigned To:      rrichards


Previous Comments:
------------------------------------------------------------------------

[2005-11-28 20:28:32] [EMAIL PROTECTED]

As for the default attribute values, I have to check the expat
behavior.

The other issue is fixed in libxml2 2.6.18. I have a patch
(http://www.ctindustries.net/patches/xml.compat.diff.txt) that looks
like it should work around the issue with older libxml2 libs, but it
needs more testing with different encoding/BOM schemes to make sure it
doesn't break anything, as we're playing with the libxml encoding
handling here.

------------------------------------------------------------------------

[2005-11-28 18:03:18] [EMAIL PROTECTED]

expat vs libxml2 incompatibility?

------------------------------------------------------------------------

[2005-11-28 14:55:33] saramaca at libertysurf dot fr

Description:
------------
In PHP 4, xml_parse_into_struct() can parse a UTF-8-encoded XML file
with or without a UTF-8 BOM (\xEF\xBB\xBF). In PHP 5 this is no longer
the case: when the BOM is present, it raises an error saying the string
doesn't contain any XML data (Empty document).

Additionally, PHP 5's xml_parse_into_struct() does *NOT* place default
attribute values into the struct (e.g. despite the DTD provided,
$content[1]['attributes']['type'] isn't set to "literal" in the actual
result section below; please compare it to the expected result). This
used to work under PHP 4.1.x and above (where the parser is based on
expat, AFAIK).

PS: I guess "manually" stripping this magic number (if present) before
calling the function would yield the expected result. However, I found
an acceptable work-around that seems to work equally well across
versions 4 and 5 of PHP:

<?php
...
$parser = xml_parser_create('');
xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, $encoding);
...
?>

Rather than:

<?php
...
$parser = xml_parser_create($encoding);
...
?>
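
For completeness, the "manual" stripping mentioned in the PS could look
roughly like the sketch below. strip_utf8_bom() is a hypothetical helper
name, not a PHP built-in and not part of the reproduce script, and the
sample string is made up for illustration:

```php
<?php
// Hypothetical helper (an assumption, not a PHP built-in): remove a
// leading UTF-8 BOM (\xEF\xBB\xBF) if the data starts with one.
function strip_utf8_bom($data)
{
    // strncmp() compares only the first 3 bytes of both strings.
    if (strncmp($data, "\xEF\xBB\xBF", 3) === 0) {
        return substr($data, 3);
    }
    return $data;
}

// Made-up sample input: a BOM followed by a minimal document.
$xml   = "\xEF\xBB\xBF<bundle/>";
$clean = strip_utf8_bom($xml);
?>
```

$clean would then be passed to xml_parse_into_struct() in place of the
raw data. Input without a BOM is returned unchanged, so the helper is
safe to call unconditionally.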

Reproduce code:
---------------
http://www.diptyque.net/bugs/utf8_bom.php
; running PHP 4 --> outputs expected result

http://www.diptyque.net/bugs/utf8_bom.phps
; source code

Expected result:
----------------
w/ autodetect -->
Array
(
    [0] => Array
        (
            [tag] => bundle
            [type] => open
            [level] => 1
            [value] =>

        )

    [1] => Array
        (
            [tag] => resource
            [type] => complete
            [level] => 2
            [attributes] => Array
                (
                    [key] => rSeeYou
                    [type] => literal
                )

            [value] => A bient&244;t
        )

    [2] => Array
        (
            [tag] => bundle
            [value] =>

            [type] => cdata
            [level] => 1
        )

    [3] => Array
        (
            [tag] => bundle
            [type] => close
            [level] => 1
        )

)
w/o autodetect -->
Array
(
    [0] => Array
        (
            [tag] => bundle
            [type] => open
            [level] => 1
            [value] =>

        )

    [1] => Array
        (
            [tag] => resource
            [type] => complete
            [level] => 2
            [attributes] => Array
                (
                    [key] => rSeeYou
                    [type] => literal
                )

            [value] => A bient&244;t
        )

    [2] => Array
        (
            [tag] => bundle
            [value] =>

            [type] => cdata
            [level] => 1
        )

    [3] => Array
        (
            [tag] => bundle
            [type] => close
            [level] => 1
        )

)

Actual result:
--------------
w/ autodetect -->
Array
(
    [0] => Array
        (
            [tag] => bundle
            [type] => open
            [level] => 1
            [value] =>

        )

    [1] => Array
        (
            [tag] => resource
            [type] => complete
            [level] => 2
            [attributes] => Array
                (
                    [key] => rSeeYou
                )

            [value] => A bient&244;t
        )

    [2] => Array
        (
            [tag] => bundle
            [value] =>

            [type] => cdata
            [level] => 1
        )

    [3] => Array
        (
            [tag] => bundle
            [type] => close
            [level] => 1
        )

)
w/o autodetect -->
Empty document


------------------------------------------------------------------------

