 ID:               35711
 Updated by:       [EMAIL PROTECTED]
 Reported By:      matteo at beccati dot com
-Status:           Open
+Status:           Assigned
 Bug Type:         mbstring related
 Operating System: Debian GNU/Linux
 PHP Version:      5.1.1
-Assigned To:      moriyoshi
+Assigned To:      hirokawa
 New Comment:

Rui, can you check this out please?


Previous Comments:
------------------------------------------------------------------------

[2005-12-19 09:00:50] matteo at beccati dot com

Oops, I just realized that I forgot the -u flag :)

Here is the downloadable patch:

http://beccati.com/download/mbstring-patch-20051219.txt

------------------------------------------------------------------------

[2005-12-19 08:48:47] [EMAIL PROTECTED]

Please provide any patches in unified diff format (like the first one),
and downloadable somewhere.

------------------------------------------------------------------------

[2005-12-16 23:50:13] matteo at beccati dot com

I've made a patch which seems to fix the issue. It basically checks the
filter status during judgement. The status seems to be != 0 only while a
filter is in the middle of matching a multibyte character. I also added
a fallback to the old judgement routine, just in case no matching
encoding is found.

Index: ext/mbstring/libmbfl/mbfl/mbfilter.c
===================================================================
RCS file: /repository/php-src/ext/mbstring/libmbfl/mbfl/mbfilter.c,v
retrieving revision 1.7.2.1
diff -u -r1.7.2.1 mbfilter.c
--- ext/mbstring/libmbfl/mbfl/mbfilter.c	5 Nov 2005 04:49:57 -0000	1.7.2.1
+++ ext/mbstring/libmbfl/mbfl/mbfilter.c	16 Dec 2005 22:46:26 -0000
@@ -575,12 +575,22 @@
 	for (i = 0; i < num; i++) {
 		filter = &flist[i];
-		if (!filter->flag) {
+		if (!filter->flag && !filter->status) {
 			encoding = filter->encoding;
 			break;
 		}
 	}
 
+	if (!encoding) {
+		for (i = 0; i < num; i++) {
+			filter = &flist[i];
+			if (!filter->flag) {
+				encoding = filter->encoding;
+				break;
+			}
+		}
+	}
+
 	/* cleanup */
 	/* dtors should be called in reverse order */
 	i = num; while (--i >= 0) {

------------------------------------------------------------------------

[2005-12-16 17:51:13] [EMAIL PROTECTED]

Moriyoshi, if ext/mbstring is not maintained anymore, please let us
know.

------------------------------------------------------------------------

[2005-12-16 17:18:27] matteo at beccati dot com

Description:
------------
I was evaluating the mbstring extension because of its ability to
filter and convert input parameters to the correct encoding.

During my tests I found out that an ISO-8859-1 string which ends with
an accented character is wrongly detected as UTF-8, even though as
UTF-8 it would end with an incomplete multibyte character (using iconv
to convert the string raises a notice saying so).

Also reproduced with PHP 4.3.11 on FreeBSD 4 and 5.0.2 on Win32.

Reproduce code:
---------------
<?php

error_reporting(E_ALL);
mb_detect_order('ASCII,UTF-8,ISO-8859-1');

// \xE0 is ISO-8859-1 small a grave char
test_bug("Test: \xE0");
test_bug("Test: \xE0a");

function test_bug($s) {
    echo "Trying: ";
    var_dump($s);
    iconv('UTF8', 'UCS2', $s);
    echo "Detected encoding: ".mb_detect_encoding($s)."\n";
    echo "Converted string:";
    var_dump(mb_convert_encoding($s, 'UTF-8', 'ASCII,UTF-8,ISO-8859-1'));
    echo "\n";
}

?>

Expected result:
----------------
Trying: string(7) "Test: à"

Notice: iconv(): Detected an incomplete multibyte character in input string in test.php on line 13
Detected encoding: ISO-8859-1
Converted string:string(8) "Test: Ã "

Trying: string(8) "Test: àa"

Notice: iconv(): Detected an illegal character in input string in /var/www/mbstring/test.php on line 13
Detected encoding: ISO-8859-1
Converted string:string(9) "Test: Ã a"

Actual result:
--------------
Trying: string(7) "Test: à"

Notice: iconv(): Detected an incomplete multibyte character in input string in test.php on line 13
Detected encoding: UTF-8
Converted string:string(6) "Test: "

Trying: string(8) "Test: àa"

Notice: iconv(): Detected an illegal character in input string in /var/www/mbstring/test.php on line 13
Detected encoding: ISO-8859-1
Converted string:string(9) "Test: Ã a"

------------------------------------------------------------------------

-- 
Edit this bug report at
http://bugs.php.net/?id=35711&edit=1
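
The judgement rule proposed in the 2005-12-16 patch can be tried out in isolation with a small stand-alone sketch. This is not libmbfl code: only the field names `flag` and `status` are taken from the patch, while `struct candidate`, `feed()`, `detect()` and the `check_status` switch are hypothetical names made up for illustration. The UTF-8 check is deliberately minimal (it counts expected continuation bytes and ignores overlong forms and surrogates), on the assumption that `status` behaves the way the patch comment describes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of an identify filter (NOT libmbfl). */
enum kind { KIND_ASCII, KIND_UTF8, KIND_LATIN1 };

struct candidate {
    const char *encoding; /* name reported on a match */
    enum kind kind;
    int flag;             /* 1 once a byte is invalid for this encoding */
    int status;           /* UTF-8 only: continuation bytes still expected */
};

/* Feed one byte to a candidate and update its flag/status. */
void feed(struct candidate *c, unsigned char b)
{
    switch (c->kind) {
    case KIND_ASCII:
        if (b >= 0x80) c->flag = 1;
        break;
    case KIND_LATIN1:
        break; /* this sketch accepts every byte value */
    case KIND_UTF8:
        if (c->status > 0) {
            if ((b & 0xC0) == 0x80) c->status--;     /* continuation byte */
            else { c->flag = 1; c->status = 0; }     /* broken sequence */
        } else if (b < 0x80) {
            /* single-byte character, nothing pending */
        } else if ((b & 0xE0) == 0xC0) c->status = 1; /* 2-byte lead */
        else if ((b & 0xF0) == 0xE0) c->status = 2;   /* 3-byte lead */
        else if ((b & 0xF8) == 0xF0) c->status = 3;   /* 4-byte lead */
        else c->flag = 1;                             /* stray byte */
        break;
    }
}

/* check_status == 0: old rule (first candidate with flag == 0).
 * check_status != 0: patched rule (also require status == 0),
 * falling back to the old rule if nothing matches. */
const char *detect(const char *s, size_t len, int check_status)
{
    struct candidate list[] = {
        { "ASCII",      KIND_ASCII,  0, 0 },
        { "UTF-8",      KIND_UTF8,   0, 0 },
        { "ISO-8859-1", KIND_LATIN1, 0, 0 },
    };
    const size_t num = sizeof list / sizeof list[0];
    const char *encoding = NULL;
    size_t i, n;

    assert(s != NULL);
    for (n = 0; n < len; n++)
        for (i = 0; i < num; i++)
            feed(&list[i], (unsigned char)s[n]);

    if (check_status) {
        for (i = 0; i < num; i++) {
            if (!list[i].flag && !list[i].status) {
                encoding = list[i].encoding;
                break;
            }
        }
    }
    if (!encoding) { /* fallback: old judgement routine */
        for (i = 0; i < num; i++) {
            if (!list[i].flag) {
                encoding = list[i].encoding;
                break;
            }
        }
    }
    return encoding;
}
```

In this model, detect("Test: \xE0", 7, 0) returns "UTF-8", because the UTF-8 candidate is still waiting for continuation bytes but has not been flagged, whereas detect("Test: \xE0", 7, 1) returns "ISO-8859-1". Note that in C the second test string has to be written as "Test: \xE0" "a", since a C hex escape would otherwise swallow the following `a`.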