#40762 [Bgs]: xml_parser failed to parse mixed coding file

2007-03-09 Thread forward at hongyu dot org
 ID:   40762
 User updated by:  forward at hongyu dot org
 Reported By:  forward at hongyu dot org
 Status:   Bogus
 Bug Type: *XML functions
 Operating System: Linux and Windows
 PHP Version:  5.2.1
 New Comment:

Exactly what you said. Thanks a lot!


Previous Comments:


[2007-03-09 06:11:13] [EMAIL PROTECTED]

You have to change this line in the XML, too:

<?xml version="1.0" encoding="gb2312" ?>
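
The advice above amounts to keeping the XML declaration in step with the bytes after iconv() has run. A minimal sketch of one way to do that follows; the helper name to_utf8_xml() and the regular expression are assumptions for illustration, not something taken from the report or from MagpieRSS.

```php
<?php
// Hypothetical helper (not from the original report): convert an XML
// string to UTF-8 and make its XML declaration agree with the new encoding.
function to_utf8_xml($xml, $from)
{
    $utf = iconv($from, 'UTF-8', $xml);
    if ($utf === false) {
        throw new RuntimeException("iconv() could not convert from $from");
    }

    // Rewrite only the encoding pseudo-attribute of a leading XML
    // declaration, keeping whichever quote character was used.
    // A document without a declaration is returned unchanged.
    return preg_replace(
        '/^(<\?xml\b[^>]*?\bencoding\s*=\s*)(["\'])[^"\']*\2/i',
        '${1}${2}UTF-8${2}',
        $utf,
        1
    );
}
```

The returned string can then be passed to xml_parse() in place of the plain iconv() output. An alternative would be to skip iconv() entirely and let the parser work from the declared encoding, but whether that works depends on the encodings the underlying XML library supports.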





[2007-03-08 21:56:57] forward at hongyu dot org

Description:

My RSS parser failed after I upgraded PHP on my server from
4.x to 5.2. When I debugged the code, I found that the error was caused
by xml_parse() failing to parse the UTF-8 encoded RSS
message, which had been converted from a GB18030 string.

The error message looks like: 
Warning: xml_parse() [function.xml-parse]: input conversion failed due
to input error, bytes 0x9B 0xE6 ...

The original GB-encoded string consists of Chinese characters, but I
converted it to UTF-8 using iconv(). I can view the
converted string correctly in web browsers, which suggests that the
conversion itself is correct. So I believe the failure comes from
xml_parse().
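
A browser rendering the text correctly shows only that the bytes are valid UTF-8; it says nothing about whether the XML declaration still names the old encoding. The following sketch checks both. The function names are hypothetical, and it assumes the declaration, if present, is at the very start of the string.

```php
<?php
// Hypothetical helpers (not part of the original report).

// True when $s is a well-formed UTF-8 byte sequence.
function is_valid_utf8($s)
{
    // With the /u modifier, PCRE refuses subjects that are not valid UTF-8.
    return preg_match('//u', $s) === 1;
}

// Returns the encoding named in a leading XML declaration, or null.
function declared_xml_encoding($xml)
{
    $pattern = '/^<\?xml\b[^>]*?\bencoding\s*=\s*(["\'])([^"\']*)\1/i';
    if (preg_match($pattern, $xml, $m)) {
        return $m[2];
    }
    return null;
}

// True when the bytes are valid UTF-8 but the declaration names
// some other encoding.
function encoding_mismatch($xml)
{
    $declared = declared_xml_encoding($xml);
    return is_valid_utf8($xml)
        && $declared !== null
        && strcasecmp($declared, 'UTF-8') !== 0;
}
```

If encoding_mismatch($utf) returns true for the converted feed, the string is fine as UTF-8 and the declaration is the part that is out of date.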

For testing purposes, an example of the original GB18030 string can
be downloaded at http://www.la-chinese.com/forum/rss.php?f=19

Thanks!



Reproduce code:
---
// $gb contains the GB-encoded string, e.g., fetched from
// http://www.la-chinese.com/forum/rss.php?f=19

// $utf contains the UTF-8 string converted from
// the original GB-encoded string
$utf = iconv('GB18030', 'UTF-8', $gb);

// feed_start_element, feed_end_element, etc. are from
// the MagpieRSS package: http://magpierss.sourceforge.net/

// Parser creation is assumed here; it was not shown in the original report.
$parser = xml_parser_create();

// Handlers are methods on the current object.
xml_set_object($parser, $this);
xml_set_element_handler($parser,
    'feed_start_element', 'feed_end_element');

xml_set_character_data_handler($parser, 'feed_cdata');

$status = xml_parse($parser, $utf);


Expected result:

No error message.

Actual result:
--
Warning: xml_parse() [function.xml-parse]: input conversion failed due
to input error, bytes 0x9B 0xE6 ...








#40762 [NEW]: xml_parser failed to parse mixed coding file

2007-03-08 Thread forward at hongyu dot org
From: forward at hongyu dot org
Operating system: Linux and Windows
PHP version:  5.2.1
PHP Bug Type: *XML functions
Bug description:  xml_parser failed to parse mixed coding file

Description:

My RSS parser failed after I upgraded PHP on my server from 4.x
to 5.2. When I debugged the code, I found that the error was caused by
xml_parse() failing to parse the UTF-8 encoded RSS message,
which had been converted from a GB18030 string.

The error message looks like: 
Warning: xml_parse() [function.xml-parse]: input conversion failed due to
input error, bytes 0x9B 0xE6 ...

The original GB-encoded string consists of Chinese characters, but I
converted it to UTF-8 using iconv(). I can view the
converted string correctly in web browsers, which suggests that the
conversion itself is correct. So I believe the failure comes from
xml_parse().

For testing purposes, an example of the original GB18030 string can be
downloaded at http://www.la-chinese.com/forum/rss.php?f=19

Thanks!



Reproduce code:
---
// $gb contains the GB-encoded string, e.g., fetched from
// http://www.la-chinese.com/forum/rss.php?f=19

// $utf contains the UTF-8 string converted from
// the original GB-encoded string
$utf = iconv('GB18030', 'UTF-8', $gb);

// feed_start_element, feed_end_element, etc. are from
// the MagpieRSS package: http://magpierss.sourceforge.net/

// Parser creation is assumed here; it was not shown in the original report.
$parser = xml_parser_create();

// Handlers are methods on the current object.
xml_set_object($parser, $this);
xml_set_element_handler($parser,
    'feed_start_element', 'feed_end_element');

xml_set_character_data_handler($parser, 'feed_cdata');

$status = xml_parse($parser, $utf);


Expected result:

No error message.

Actual result:
--
Warning: xml_parse() [function.xml-parse]: input conversion failed due to
input error, bytes 0x9B 0xE6 ...

