ID:               30846
 User updated by:  migmam at ya dot coom
 Reported By:      migmam at ya dot coom
 Status:           Bogus
 Bug Type:         XML related
 Operating System: Windows 2000
 PHP Version:      5.0.2
 New Comment:

It never happened to me in PHP 4. Same code, same XML file. The
only change was the PHP version, and it is splitting the string exactly
before non-ASCII characters.
I can live with it by rewriting my code, but I do not think this is
good behaviour.

Anyway, thanks for your time.


Previous Comments:
------------------------------------------------------------------------

[2004-12-17 11:35:26] [EMAIL PROTECTED]

There's absolutely nothing wrong with SAX splitting the string. Change
your code. It has always been the case that SAX can split the string,
in PHP 4 as well. If it didn't happen for you, good for you. But it's
clearly stated that SAX *can* split the string if it thinks it has to.
Get over it.
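
A minimal sketch of the kind of code change meant here, assuming the
goal is to get each text node in one piece: append to a buffer in the
character data handler and only use the text once the end tag arrives.
The names $buffer and $texts are hypothetical, and the inline document
is modelled on prueba.xml from the report.

```php
<?php
// The parser may call the character data handler several times for one
// text node, so the handler only appends; the end-element handler is
// where the complete text is available.
$buffer = '';        // hypothetical accumulator for the current text node
$texts  = array();   // hypothetical list of completed text nodes

function startElement($parser, $name, $attrs)
{
    global $buffer;
    $buffer = '';    // start collecting afresh for every element
}

function characterData($parser, $data)
{
    global $buffer;
    $buffer .= $data;   // append, never assign
}

function endElement($parser, $name)
{
    global $buffer, $texts;
    if (trim($buffer) !== '') {
        $texts[] = $buffer;   // whole string, however it was split
    }
    $buffer = '';
}

// "\xF3" is the ISO-8859-1 byte for "ó".
$xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n"
     . "<elements><element><data>bot\xF3n cancelar</data></element></elements>";

$xml_parser = xml_parser_create("ISO-8859-1");
xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, false);
xml_set_element_handler($xml_parser, "startElement", "endElement");
xml_set_character_data_handler($xml_parser, "characterData");

if (!xml_parse($xml_parser, $xml, true)) {
    die(sprintf("XML error: %s at line %d",
                xml_error_string(xml_get_error_code($xml_parser)),
                xml_get_current_line_number($xml_parser)));
}

foreach ($texts as $text) {
    echo "->" . $text . "<-\n";
}
```

With this pattern it no longer matters where, or how often, the parser
splits the character data.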

For your other problem, use

xml_parser_create("ISO-8859-1") and you should be able to parse
ISO-8859-1 encoded files.

See http://ch.php.net/manual/en/function.xml-parser-create.php and read
also the user comments.
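
A rough illustration of what the encoding argument changes, assuming
PHP 5.0.2 or later, where the default output encoding is UTF-8. The
helper names collectText and appendText are hypothetical; only the
xml_* calls are real API.

```php
<?php
$collected = '';   // hypothetical accumulator shared with the handler

function appendText($parser, $data)
{
    global $collected;
    $collected .= $data;
}

// Hypothetical helper: parse $xml and return all character data joined.
// With no $encoding the parser's default output encoding is used.
function collectText($xml, $encoding = null)
{
    global $collected;
    $collected = '';
    $parser = ($encoding === null)
        ? xml_parser_create()
        : xml_parser_create($encoding);
    xml_set_character_data_handler($parser, "appendText");
    if (!xml_parse($parser, $xml, true)) {
        die(sprintf("XML error: %s at line %d",
                    xml_error_string(xml_get_error_code($parser)),
                    xml_get_current_line_number($parser)));
    }
    return $collected;
}

// "\xF3" is the ISO-8859-1 byte for "ó".
$xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n"
     . "<data>bot\xF3n cancelar</data>";

$latin1  = collectText($xml, "ISO-8859-1");   // handler receives ISO-8859-1
$default = collectText($xml);                 // handler receives the default

echo strlen($latin1), " bytes with ISO-8859-1\n";
echo strlen($default), " bytes with the default encoding\n";
```

If the default is UTF-8, the second string should be one byte longer,
because "ó" takes two bytes there instead of one.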

Please do not reopen this bug. It isn't a bug.

------------------------------------------------------------------------

[2004-12-17 11:20:06] migmam at ya dot coom

Hi Derick,

Please take a look at this source code:

//-------------------------------------------------

$file = "prueba.xml";

function startElement($parser, $name, $attrs)
{

}

function endElement($parser, $name)
{

}

function characterData($parser, $data)
{
        
   echo "->".$data."<-";
}

$xml_parser = xml_parser_create();
xml_parser_set_option($xml_parser,XML_OPTION_TARGET_ENCODING,"ISO-8859-1");
xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, false);
xml_set_element_handler($xml_parser, "startElement", "endElement");
xml_set_character_data_handler($xml_parser, "characterData");
if (!($fp = fopen($file, "r"))) {
   die("could not open XML input");
}

while ($data = fread($fp, 4096)) {
   if (!xml_parse($xml_parser, $data, feof($fp))) {
       die(sprintf("XML error: %s at line %d",
                   xml_error_string(xml_get_error_code($xml_parser)),
                   xml_get_current_line_number($xml_parser)));
   }
}
xml_parser_free($xml_parser);

// XML FILE prueba.xml-------------------------//
<?xml version="1.0" encoding="ISO-8859-1" ?>
<!DOCTYPE elements[
                                        <!ELEMENT element (data)*>
                                        <!ELEMENT data (#PCDATA)*>
                                        
]>
<elements>
        <element>
                 <data>botón cancelar</data>
        </element>
</elements>

//---------------EXPECTED RESULT-------------------
->  <-->  <-->botón cancelar<-->  <--> <-
//---------------ACTUAL OUTPUT---------------------
->  <-->  <-->bot<-->ón cancelar<-->  <--> <-


Best regards,

Miguel Angel.

------------------------------------------------------------------------

[2004-11-26 10:56:14] migmam at ya dot coom

Hi Derick,

I've read the manual again, and I've tried the examples after changing
one character to a non-ASCII character. They don't work.
I have changed all the non-ASCII characters to their corresponding
&#xxx; codes, and that only works with SimpleXML (you were right in
this case) but not with SAX (using exactly the same XML file). With SAX
it still splits the string.

Please, help!

Best regards,

Miguel Angel.

------------------------------------------------------------------------

[2004-11-22 10:14:46] [EMAIL PROTECTED]

No, the default charset setting has nothing to do with it. It is used
only for generating the HTTP headers, as the documentation describes:
http://no.php.net/manual/en/ini.sect.data-handling.php#ini.default-charset

------------------------------------------------------------------------

[2004-11-22 10:08:52] migmam at ya dot coom

Hello again.
Thank you very much for your answer.
So if I change the default encoding to "ISO-8859-1" in php.ini
(default_charset = "ISO-8859-1") it should work. But it doesn't.

Note: Sorry for not copying all the code earlier. It is the standard
code copied from the PHP manual for the SAX parser.
Here it is:

<?php

$elementoActual = "";
$elementos      = array();
$identificador  = "";
$xml_idioma     = 0;

function comienzaElemento($parser, $name, $attr)
{
    global $elementoActual;
    $elementoActual = $name;
}

function finElemento($parser, $name)
{
}

function DatoCaracter($parser, $data)
{
    global $elementos;
    global $elementoActual;
    global $identificador;
    global $nodos;
    global $xmlprimernodo;

    if (ord($data) != 10 && ord($data) != 9 && ord($data) != 13) {
        $data = htmlentities($data);
        if ($elementoActual == $xmlprimernodo) {
            $identificador = $data;
        }

        //$nodos = array('descripcion', 'dato', 'id'); // The nodes defined in the XML file
        foreach ($nodos as $elemento_array) {
            //echo "<p>!!->$elemento_array</p>";
            if ($elementoActual == $elemento_array) {
                $elementos[$identificador][$elemento_array] = $data;
                //echo "<p>$elementos[$identificador][$elemento_array]</p>";
            }
        }
    }
}

    
    
function examinaFichero($xmlSource, $xmlNodes, $xmlFirstNode, $idioma)
{
    global $elementos;
    global $nodos;
    global $xmlprimernodo;

    // Clear any previous references
    $elementoActual = null;
    $elementos      = null;
    $identificador  = null;
    //----------------
    $nodos         = $xmlNodes;
    $xmlprimernodo = $xmlFirstNode;
    //------------
    global $xml_idioma;

    $xml_parser = xml_parser_create();

    xml_parser_set_option($xml_parser, XML_OPTION_TARGET_ENCODING, "ISO-8859-1");
    xml_parser_set_option($xml_parser, XML_OPTION_SKIP_WHITE, 0);
    xml_parser_set_option($xml_parser, XML_OPTION_SKIP_TAGSTART, 1);
    xml_set_element_handler($xml_parser, "comienzaElemento", "finElemento");
    xml_set_character_data_handler($xml_parser, "DatoCaracter");
    xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, FALSE);

    if (!($fp = fopen($xmlSource, "r"))) {
        die("Cannot open $xmlSource.");
    }

    while (($data = fread($fp, filesize(str_replace("\\", "/", $xmlSource))))) {
        if (!xml_parse($xml_parser, $data, feof($fp))) {
            die(sprintf("XML error at line %d column %d file %s",
                xml_get_current_line_number($xml_parser),
                xml_get_current_column_number($xml_parser), $xmlSource));
        }
    }

    xml_parser_free($xml_parser);

    return $elementos;
}

------------------------------------------------------------------------

The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at
    http://bugs.php.net/30846
