ID: 15092
Updated by: chregu
Reported By: [EMAIL PROTECTED]
Old Status: Open
Status: Feedback
Bug Type: XML related
Operating System: Win 2k (all I guess)
PHP Version: 4.1.0
New Comment:

This doesn't work, because the default entities are only:

<!ENTITY lt     "&#38;#60;">
<!ENTITY gt     "&#62;">
<!ENTITY amp    "&#38;#38;">
<!ENTITY apos   "&#39;">
<!ENTITY quot   "&#34;">

For the latin1 entities to work, you have to set an external entity to
http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent (or some local file with
that content) and then set up a handler with
xml_set_external_entity_ref_handler(). See the details about that in the
manual.

(Set to feedback because I didn't really test it; if someone can verify
that, we can close it.)


Previous Comments:
------------------------------------------------------------------------

[2002-01-22 09:49:09] [EMAIL PROTECTED]

After some more tests I found that the only literal entities that work
are: &amp;, &lt;, &gt; and &quot;. *ALL* others (like &nbsp; &copy; and so on)
cause an XML_ERROR_UNDEFINED_ENTITY error.

The best workaround for this problem is to translate the entities found
in the XML source to their numeric equivalents, e.g. &nbsp; to &#160; /
&copy; to &#169; and so on.

The following function will do the job:

/**
 * Translate literal entities to their numeric equivalents and vice versa.
 *
 * PHP's XML parser (in V 4.1.0) has problems with entities! The only ones that are recognized
 * are &amp;, &lt;, &gt; and &quot;. *ALL* others (like &nbsp; &copy; and so on) cause an
 * XML_ERROR_UNDEFINED_ENTITY error. I reported this as a bug at http://bugs.php.net/bug.php?id=15092
 * The workaround is to translate the entities found in the XML source to their numeric equivalents,
 * e.g. &nbsp; to &#160; / &copy; to &#169; and so on.
 *
 * NOTE: Entities &amp;, &lt;, &gt; and &quot; are left 'as is'.
 *
 * @author Sam Blum [EMAIL PROTECTED]
 * @param string $xmlSource The XML string
 * @param bool   $reverse   (default=FALSE) Translate numeric entities to literal entities.
 * @return string The XML string with translated entities.
 */
function _translateLiteral2NumericEntities($xmlSource, $reverse = FALSE) {
  static $literal2NumericEntity;

  // Build the translation table only once; later calls reuse it.
  if (empty($literal2NumericEntity)) {
    $transTbl = get_html_translation_table(HTML_ENTITIES);
    foreach ($transTbl as $char => $entity) {
      // Skip the characters whose entities the parser already recognizes.
      if (strpos('&"<>', $char) !== FALSE) continue;
      $literal2NumericEntity[$entity] = '&#'.ord($char).';';
    }
  }
  if ($reverse) {
    return strtr($xmlSource, array_flip($literal2NumericEntity));
  } else {
    return strtr($xmlSource, $literal2NumericEntity);
  }
}

------------------------------------------------------------------------

[2002-01-17 21:03:07] [EMAIL PROTECTED]

The PHP XML parser has problems with the full ISO 8859-1 character set
when entity names are used. E.g. the parser will fail with "undefined
entity" if the XML data you parse contains &nbsp; or &copy; and so on
(there are many more). Some entities do work, like &lt; &gt; &amp;, as well
as the alternative notation using the ISO code number, e.g. for the
non-breaking space: &nbsp; === &#160;

For a full ISO 8859-1 list and its entities see:
http://www.ramsch.org/martin/uni/fmi-hp/iso8859-1.html

Here's the test script you can use to check the error:

<?php
$xmlString[0] = "<AAA>&#160;</AAA>";  // numeric reference
$xmlString[1] = "<AAA>&nbsp;</AAA>";  // named entity, triggers the error

function startElement($xml_parser, $name, $attrs) {}
function endElement($xml_parser, $name) {}
function characterData($xml_parser, $text) {
  echo "Handling character data: '".htmlspecialchars($text)."'<br>";
}

$xml_parser = xml_parser_create();
xml_set_element_handler($xml_parser, "startElement", "endElement");
xml_set_character_data_handler($xml_parser, "characterData");

// Parse the XML data.
if (!xml_parse($xml_parser, $xmlString[1], TRUE)) {
  echo "XML error on line ".
    xml_get_current_line_number($xml_parser) .
    ' column ' . xml_get_current_column_number($xml_parser) .
    '. Reason:' .
    xml_error_string(xml_get_error_code($xml_parser));
}
?>

------------------------------------------------------------------------

Edit this bug report at http://bugs.php.net/?id=15092&edit=1
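
The workaround described in this report (translating named entities to numeric references before parsing) can also be sketched for current PHP versions. This is only an illustration, not the function posted above: on current PHP, get_html_translation_table() defaults to UTF-8, so ord() would see only the first byte of a multi-byte character. The function name namedToNumericEntities is hypothetical, and the sketch assumes PHP 7.2 or later with the mbstring extension (for mb_ord()).

```php
<?php
// Hypothetical helper: rewrite named HTML 4.01 entities as numeric
// character references so that a plain XML parser accepts them.
// Assumes PHP >= 7.2 with mbstring available (mb_ord()).
function namedToNumericEntities(string $xml): string
{
    // The five entities predefined by XML itself are left untouched.
    $predefined = ['amp', 'lt', 'gt', 'quot', 'apos'];

    return preg_replace_callback(
        '/&([A-Za-z][A-Za-z0-9]*);/',
        function (array $m) use ($predefined): string {
            if (in_array($m[1], $predefined, true)) {
                return $m[0];
            }
            // Decode the named entity to a single UTF-8 character.
            $char = html_entity_decode($m[0], ENT_QUOTES | ENT_HTML401, 'UTF-8');
            if ($char === $m[0]) {
                // Unknown name: leave it so the parser still reports it.
                return $m[0];
            }
            return '&#' . mb_ord($char, 'UTF-8') . ';';
        },
        $xml
    );
}

echo namedToNumericEntities('<AAA>&nbsp;&copy; &amp; &bogus;</AAA>'), "\n";
// prints: <AAA>&#160;&#169; &amp; &bogus;</AAA>
```

Note that this sketch rewrites every match in the string, including text inside CDATA sections, so it is only suitable for sources where that is acceptable.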