 ID:               27808
 Updated by:       [EMAIL PROTECTED]
 Reported By:      jcalvert at gmx dot net
-Status:           Open
+Status:           Closed
 Bug Type:         Documentation problem
 Operating System: Debian Sid
 PHP Version:      5.0.0RC1
 New Comment:

This bug has been fixed in the documentation's XML sources. Since the
online and downloadable versions of the documentation need some time
to get updated, we would like to ask you to be a bit patient.

Thank you for the report, and for helping us make our documentation
better.

"If empty string is passed, the parser attempts to identify which
encoding the document is encoded in by looking at the heading 3 or 4
bytes."


Previous Comments:
------------------------------------------------------------------------

[2004-03-31 20:23:04] [EMAIL PROTECTED]

Corrected summary.

1. For the sake of backwards compatibility, xml_parser_create() with
   no arguments generates a parser that only recognises ISO-8859-1.

2. If one passes "UTF-8" to it as the "encoding" argument, the parser
   backed by libxml assumes any given XML document to be encoded in
   plain UTF-8, where no BOM (byte order mark) is allowed.

3. If one passes "" (an empty string) to it, the parser attempts to
   identify which encoding the document is encoded in by looking at
   the leading 3 or 4 bytes. In this case a BOM must be there.

This might fix your problem. It seems the third feature is not
documented yet, so I'm marking this as a documentation problem.

------------------------------------------------------------------------

[2004-03-31 13:00:42] jcalvert at gmx dot net

Description:
------------
In PHP4, parsing a UTF-8 file that starts with the magic string
(\xEF\xBB\xBF) works just fine. In PHP5.0.0RC1 the function returns
with an error message saying the string didn't contain any XML data.
Stripping the magic string before calling the function yields the
expected result.

libxml2* version 2.6.7-1

------------------------------------------------------------------------


-- 
Edit this bug report at http://bugs.php.net/?id=27808&edit=1
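
For illustration only, here is a minimal sketch of the workaround mentioned in the description: strip a leading UTF-8 BOM (\xEF\xBB\xBF) and then parse with an explicit "UTF-8" encoding. The helper names strip_utf8_bom() and parse_utf8_xml() and the sample document are hypothetical and not part of the report. Whether the BOM actually has to be stripped depends on the PHP and libxml2 versions in use, so treat this as an assumption-laden example rather than a definitive fix.

```php
<?php
// Hypothetical helper: remove a leading UTF-8 BOM, if one is present.
function strip_utf8_bom($data)
{
    if (substr($data, 0, 3) === "\xEF\xBB\xBF") {
        return substr($data, 3);
    }
    return $data;
}

// Hypothetical helper: parse a complete document with an explicit
// "UTF-8" parser after stripping the BOM. Returns true on success.
function parse_utf8_xml($data)
{
    $parser = xml_parser_create('UTF-8');
    $ok = xml_parse($parser, strip_utf8_bom($data), true) === 1;
    if (!$ok) {
        // Report the parser's own error message for debugging.
        echo xml_error_string(xml_get_error_code($parser)), "\n";
    }
    return $ok;
}

// Made-up sample document that starts with the BOM from the report.
$xml = "\xEF\xBB\xBF" . '<?xml version="1.0" encoding="UTF-8"?><root>ok</root>';
var_dump(parse_utf8_xml($xml));
```

According to the summary above, the alternative is to leave the BOM in place and call xml_parser_create("") so that the parser detects the encoding itself.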