Edit report at http://bugs.php.net/bug.php?id=50189&edit=1
ID:                 50189
Updated by:         fel...@php.net
Reported by:        yoarvi at gmail dot com
Summary:            [PATCH] - unicode byte order difference between SPARC and x86
-Status:            Open
+Status:            Bogus
Type:               Bug
Package:            Unicode Engine related
Operating System:   Solaris 10 (SPARC)
PHP Version:        6SVN-2009-11-16 (SVN)
Block user comment: N
Private report:     N

New Comment:

php6 code is dead.

Previous Comments:
------------------------------------------------------------------------
[2009-11-17 10:51:24] yoarvi at gmail dot com

Updated patch using WORDS_BIGENDIAN (suggested by christopher dot jones
at oracle dot com):

Index: ext/standard/php_smart_str.h
===================================================================
--- ext/standard/php_smart_str.h (revision 290471)
+++ ext/standard/php_smart_str.h (working copy)
@@ -86,10 +86,17 @@
 	smart_str_appendc_ex((dest), (c), 0)
 
 /* appending of a single UTF-16 code unit (2 byte)*/
+#ifndef WORDS_BIGENDIAN
 #define smart_str_append2c(dest, c) do { \
 	smart_str_appendc_ex((dest), (c&0xFF), 0); \
 	smart_str_appendc_ex((dest), (c&0xFF00 ? c>>8 : '\0'), 0); \
 } while (0)
+#else
+#define smart_str_append2c(dest, c) do { \
+	smart_str_appendc_ex((dest), (c&0xFF00 ? c>>8 : '\0'), 0); \
+	smart_str_appendc_ex((dest), (c&0xFF), 0); \
+} while (0)
+#endif
 
 #define smart_str_free(s) \
 	smart_str_free_ex((s), 0)

------------------------------------------------------------------------
[2009-11-16 16:35:08] yoarvi at gmail dot com

ext/sqlite3/libsqlite/sqlite3.c uses

#if defined(i386) || defined(__i386__) || defined(_M_IX86)\
 || defined(__x86_64) || defined(__x86_64__)

Is that better?

------------------------------------------------------------------------
[2009-11-16 13:07:50] tokul at users dot sourceforge dot net

It is not "#if (defined(i386) || defined(__i386__) || defined(_X86_))"
vs others. It is little endian vs big endian. I suspect that the code
should not assume that all other archs are big endian.
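
The point about not assuming a byte order from an architecture list can be illustrated with a small sketch. It is not taken from the PHP tree: host_is_big_endian() and append_utf16_unit() are hypothetical names, and the caller-supplied buffer stands in for a smart_str. Copying the 16-bit code unit with memcpy() writes it in whatever order the host uses, so no preprocessor test is needed in this simplified setting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of PHP): returns 1 on a big-endian host,
 * 0 on a little-endian one, by inspecting the lowest-addressed byte of a
 * known 16-bit value. */
static int host_is_big_endian(void)
{
    const uint16_t probe = 0x0102;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 0x01;
}

/* Hypothetical stand-in for smart_str_append2c: copies one UTF-16 code unit
 * into buf in host byte order and returns the new length in bytes. */
static size_t append_utf16_unit(unsigned char *buf, size_t len, size_t cap,
                                uint16_t unit)
{
    assert(len + sizeof unit <= cap);      /* caller must provide room */
    memcpy(buf + len, &unit, sizeof unit); /* host order, no #if needed */
    return len + sizeof unit;
}
```

Appending 0x002F ('/') this way should give the bytes '/' '\0' on a little-endian host and '\0' '/' on a big-endian one, which matches the layouts described in the original report.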
------------------------------------------------------------------------
[2009-11-16 12:20:44] yoarvi at gmail dot com

Description:
------------
zspprintf() incorrectly represents strings/chars as Unicode characters
on Solaris (SPARC).

There are byte ordering differences for Unicode representations between
x86 and SPARC. For example, the Unicode representation (I've grouped
them in sets of 2 chars) of '/tmp' on x86 is

'/''\0' 't''\0' 'm''\0' 'p''\0'

and on SPARC it is

'\0''/' '\0''t' '\0''m' '\0''p'

http://marc.info/?l=php-internals&m=125811990106419&w=2 has some more
details.

The problem seems to be in the smart_str_append2c macro that
zspprintf()/xbuf_format_converter end up using.

The following patch fixes the problem:

Index: ext/standard/php_smart_str.h
===================================================================
--- ext/standard/php_smart_str.h (revision 290471)
+++ ext/standard/php_smart_str.h (working copy)
@@ -86,10 +86,17 @@
 	smart_str_appendc_ex((dest), (c), 0)
 
 /* appending of a single UTF-16 code unit (2 byte)*/
+#if (defined(i386) || defined(__i386__) || defined(_X86_))
 #define smart_str_append2c(dest, c) do { \
 	smart_str_appendc_ex((dest), (c&0xFF), 0); \
 	smart_str_appendc_ex((dest), (c&0xFF00 ? c>>8 : '\0'), 0); \
 } while (0)
+#else
+#define smart_str_append2c(dest, c) do { \
+	smart_str_appendc_ex((dest), (c&0xFF00 ? c>>8 : '\0'), 0); \
+	smart_str_appendc_ex((dest), (c&0xFF), 0); \
+} while (0)
+#endif
 
 #define smart_str_free(s) \
 	smart_str_free_ex((s), 0)

Reproduce code:
---------------
% sapi/cli/php ext/spl/tests/DirectoryIterator_getBasename_basic_test.php

Expected result:
----------------
getBasename_test

Actual result:
--------------
php goes into an infinite loop

------------------------------------------------------------------------
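
The two byte layouts of '/tmp' given in the description can be reproduced with a short sketch. It assumes ASCII-only input, and ascii_to_utf16() is a hypothetical helper written for illustration, not a function from the PHP source. The big_endian flag selects the SPARC-style order; 0 selects the x86-style order.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of PHP): writes the ASCII string s into out
 * as UTF-16 code units, two bytes per character. big_endian != 0 puts the
 * high byte first (SPARC-style); 0 puts the low byte first (x86-style).
 * Returns the number of bytes written; stops early if out is full. */
static size_t ascii_to_utf16(const char *s, int big_endian,
                             unsigned char *out, size_t cap)
{
    size_t n = 0;

    for (; *s != '\0' && n + 2 <= cap; s++) {
        uint16_t unit = (unsigned char)*s;
        unsigned char hi = (unsigned char)(unit >> 8);   /* 0 for ASCII */
        unsigned char lo = (unsigned char)(unit & 0xFF);

        out[n++] = big_endian ? hi : lo;
        out[n++] = big_endian ? lo : hi;
    }
    return n;
}
```

Calling it on "/tmp" with big_endian set to 0 fills an 8-byte buffer with '/' '\0' 't' '\0' 'm' '\0' 'p' '\0'; with big_endian set to 1 each pair is swapped.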