ID:               30846
 User updated by:  migmam at ya dot coom
 Reported By:      migmam at ya dot coom
 Status:           Bogus
 Bug Type:         XML related
 Operating System: Windows 2000
-PHP Version:      5.0.1
+PHP Version:      5.0.2
 New Comment:

Hi Derick,

I've read the manual again, and I've tried the examples after changing
one character to a non-ASCII character; they don't work.
I have changed all the non-ASCII characters to their corresponding
&#xxx; codes, and that only works with SimpleXML (you were right in this
case) but not with SAX (using exactly the same XML file). With SAX, it
still splits the string.

Please help!

Best regards,

Miguel Angel.
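
One possible explanation for the split, offered as an assumption rather
than a confirmed diagnosis of this report: a SAX-style parser may hand
the text of a single element to the character data handler in several
calls, for example at a character reference or a non-ASCII character.
A minimal sketch of buffering those chunks and reading the complete text
only in the end-element handler is shown here. The handler names, the
$buffer and $values variables, and the sample document are illustrative
and are not taken from the report.

```php
<?php
// Sketch only: all names below are hypothetical, not from the report.
$buffer = '';        // text collected for the element currently open
$values = array();   // element name => complete text

function startElement($parser, $name, $attrs)
{
    global $buffer;
    $buffer = '';    // a new element starts with an empty buffer
}

function characterData($parser, $data)
{
    global $buffer;
    $buffer .= $data;   // append: this handler may fire more than once per text node
}

function endElement($parser, $name)
{
    global $buffer, $values;
    if (trim($buffer) !== '') {
        $values[$name] = $buffer;   // the full text is only known here
    }
    $buffer = '';
}

$xml = '<?xml version="1.0" encoding="ISO-8859-1"?>'
     . '<doc><titulo>Informaci&#243;n</titulo></doc>';

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, 'ISO-8859-1');
xml_set_element_handler($parser, 'startElement', 'endElement');
xml_set_character_data_handler($parser, 'characterData');
xml_parse($parser, $xml, true);
```

This sketch assumes elements contain either text or child elements, not
both; mixed content would need a stack of buffers instead of a single
one.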


Previous Comments:
------------------------------------------------------------------------

[2004-11-22 10:14:46] [EMAIL PROTECTED]

No, the default_charset setting has nothing to do with it. It only
affects the generated HTTP headers, as the documentation describes:
http://no.php.net/manual/en/ini.sect.data-handling.php#ini.default-charset

------------------------------------------------------------------------

[2004-11-22 10:08:52] migmam at ya dot coom

Hello again.
Thank you very much for your answer.
So if I change the default encoding to "ISO-8859-1" in php.ini
(default_charset = "ISO-8859-1"), it should work. But it doesn't.

Note: Sorry for not including all of the code earlier. It is the
standard SAX parser code copied from the PHP manual.
Here it is:

<?php

$elementoActual = "";
$elementos      = array();
$identificador  = "";
$xml_idioma     = 0;
    
function comienzaElemento($parser, $name, $attr)
{
    global $elementoActual;
    $elementoActual = $name;
}

function finElemento($parser, $name)
{
}

    
function DatoCaracter($parser, $data)
{
    global $elementos;
    global $elementoActual;
    global $identificador;
    global $nodos;
    global $xmlprimernodo;

    // Skip chunks that start with a line feed, tab, or carriage return.
    if (ord($data) != 10 && ord($data) != 9 && ord($data) != 13) {
        $data = htmlentities($data);
        if ($elementoActual == $xmlprimernodo) {
            $identificador = $data;
        }

        // $nodos = array('descripcion', 'dato', 'id'); // The nodes defined in the XML file
        foreach ($nodos as $elemento_array) {
            // echo "<p>!!->$elemento_array</p>";
            if ($elementoActual == $elemento_array) {
                $elementos[$identificador][$elemento_array] = $data;
                // echo "<p>$elementos[$identificador][$elemento_array]</p>";
            }
        }
    }
}

    
    
function examinaFichero($xmlSource, $xmlNodes, $xmlFirstNode, $idioma)
{
    global $elementos;
    global $elementoActual;   // declared global so the reset below takes effect
    global $identificador;    // declared global so the reset below takes effect
    global $nodos;
    global $xmlprimernodo;
    global $xml_idioma;

    // Clear any state left over from a previous call
    $elementoActual = null;
    $elementos      = null;
    $identificador  = null;

    $nodos         = $xmlNodes;
    $xmlprimernodo = $xmlFirstNode;

    $xml_parser = xml_parser_create();

    xml_parser_set_option($xml_parser, XML_OPTION_TARGET_ENCODING, "ISO-8859-1");
    xml_parser_set_option($xml_parser, XML_OPTION_SKIP_WHITE, 0);
    xml_parser_set_option($xml_parser, XML_OPTION_SKIP_TAGSTART, 1);
    xml_set_element_handler($xml_parser, "comienzaElemento", "finElemento");
    xml_set_character_data_handler($xml_parser, "DatoCaracter");
    xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, FALSE);

    if (!($fp = fopen($xmlSource, "r"))) {
        die("Cannot open $xmlSource.");
    }

    while (($data = fread($fp, filesize(str_replace("\\", "/", $xmlSource))))) {
        if (!xml_parse($xml_parser, $data, feof($fp))) {
            die(sprintf("XML error at line %d column %d file %s",
                xml_get_current_line_number($xml_parser),
                xml_get_current_column_number($xml_parser),
                $xmlSource));
        }
    }

    xml_parser_free($xml_parser);

    return $elementos;
}

------------------------------------------------------------------------

[2004-11-20 01:51:25] [EMAIL PROTECTED]

Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

The default encoding is now UTF-8 (this is a change), but it is expected
and documented.
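
To illustrate the encoding change mentioned above, here is a minimal
sketch that parses the same document twice: once with the parser's
default output encoding and once with XML_OPTION_TARGET_ENCODING set to
ISO-8859-1. The helper collectText(), the handler appendText(), and the
sample document are hypothetical names introduced for this example. It
assumes PHP 5.0.2 or later, where the default output is expected to be
UTF-8.

```php
<?php
// Sketch only: collectText(), appendText() and $collected are hypothetical names.
$collected = '';

function appendText($parser, $data)
{
    global $collected;
    $collected .= $data;   // concatenate every chunk of character data
}

// Returns all character data in $xml, emitted in $targetEncoding,
// or in the parser's default output encoding when $targetEncoding is null.
function collectText($xml, $targetEncoding = null)
{
    global $collected;
    $collected = '';

    $parser = xml_parser_create();
    if ($targetEncoding !== null) {
        xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, $targetEncoding);
    }
    xml_set_character_data_handler($parser, 'appendText');
    xml_parse($parser, $xml, true);

    return $collected;
}

$xml = '<?xml version="1.0" encoding="ISO-8859-1"?><t>Informaci&#243;n</t>';

$utf8   = collectText($xml);                 // parser default output encoding
$latin1 = collectText($xml, 'ISO-8859-1');   // explicit target encoding

echo strlen($utf8), " bytes vs ", strlen($latin1), " bytes\n";
```

If the two byte counts differ, the handlers are receiving multi-byte
UTF-8 in the default case, so code that assumes one byte per character
(or passes the data to functions expecting ISO-8859-1) needs the
explicit target encoding.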

------------------------------------------------------------------------

[2004-11-20 01:08:34] migmam at ya dot coom

Description:
------------
Hi,

I have upgraded to PHP 5, and an old module that reads an XML file
no longer works.
I use SAX to read the file, and everything works fine until a non-ASCII
character is found.
When the parser finds a non-ASCII character (Spanish characters in my
case: áéíóú), it splits the element text into two separate pieces. For
example, the word "Información" is divided into "Informaci" and "ón".
In the XML document declaration I have specified
encoding="ISO-8859-1".
I have saved the file as ISO-8859-1 encoded text.
In the parser options I have specified ISO-8859-1
(XML_OPTION_TARGET_ENCODING).

And it doesn't work.

Best regards,

Miguel Angel




------------------------------------------------------------------------

