#40749 [Com]: pack and unpack erroneous behavior on 64bits hosts
ID:               40749
Comment by:       martin at netimage dot dk
Reported By:      ben at ateor dot com
Status:           Open
Bug Type:         Unknown/Other Function
Operating System: OpenBSD amd64 and sparc64
PHP Version:      5.2.1
New Comment:

It appears that the sign bit is taken from LSB instead of MSB

php -r 'print_r( unpack('N',pack('N',127)));'
Array ( [1] => 127 )

php -r 'print_r( unpack('N',pack('N',128)));'
Array ( [1] => -2147483520 )

The last number is 2's complement of -128 for 32 bit integers

Cheers

Previous Comments:

[2007-03-14 20:57:41] pz at mysqlperformanceblog dot com

In any case, whether you call it a bug or a feature, this is a serious behavior change for something which a lot of people could be depending on.

It breaks in MySQL 5.2.0 - 5.2.1 which is a minor version upgrade.

[2007-03-09 09:06:47] windeler at mediafinanz dot de

Here is another example of a problem with unpack on 64bit systems. It worked in 5.1.6, but with 5.2.1 the results are bogus. The expected value from the file content is 200, but PHP says -2147483448 when I echo $a['i'].

<?php
$f = fopen('test.pdf','rb');
//Read a 4-byte integer from file
$a = unpack('Ni',fread($f,4));
echo $a['i'];
fclose($f);
?>

[2007-03-07 17:12:58] ben at ateor dot com

Description:
This is a follow-up on #40543 (http://bugs.php.net/bug.php?id=40543, since that bug is closed, I can't add comments). Please note : it's not identical to #4053 (other weird behaviors are demonstrated).

Iliaa,

Not sure why you suggest using little endian or host conversion routines, but from my standpoint, if you reverse a number's byte ordering twice then you should get the original number back (assuming the number doesn't overflow php's internals). At least, that's the standard behavior for perl and python.

Besides, I can't see why php should handle those endianness conversions differently on an i386 (32 bits) and on an x86_64 (64 bits), both having the same byte order.

The following on a 64bit, little endian host :

x86_64$ uname -mprsv
OpenBSD 4.0 GENERIC#0 amd64 AMD Sempron(tm) Processor 3400+

x86_64$ perl -e 'print unpack("N", pack("N", 41445)) . "\n"'
41445

x86_64$ python
Python 2.4.3 (#1, Sep 6 2006, 20:33:08)
[GCC 3.3.5 (propolice)] on openbsd4
Type "help", "copyright", "credits" or "license" for more information.
>>> from struct import *
>>> unpack('L', pack('L', 41445))
(41445L,)

And, indeed :

#include <stdio.h>
#include <sys/types.h>

int main(void)
{
    u_int32_t x, y, z; /* 32 bits unsigned longs */

    x = 41445;
    /* conv host (little) to network (big endian) long : pack(N, 41445) */
    y = htonl(x);
    /* conv network (big endian) to host (little) long : unpack(N, ...) */
    z = ntohl(y);
    printf("Host : %li\nBig : %li\nHost : %li\n", x, y, z);
    return 0;
}

x86_64$ gcc conv.c -o conv ; ./conv
Host : 41445
Big : -442433536
Host : 41445

But still (PHP 5.2.2-dev (cli) (built: Feb 27 2007 22:10:11)) :

x86_64$ php -r 'print_r(unpack("N", pack("N", 41445)));'
Array ( [1] => -2147442203 )

While on a plain old x86 little endian host (PHP 4.4.0), we get a different result :

i386_32$ uname -mprsv
OpenBSD 3.9 GENERIC#0 i386 Intel(R) Pentium(R) 4 CPU 1.70GHz (GenuineIntel 686-class)

i386_32$ php -r 'print_r(unpack("N", pack("N", 41445)));'
Array ( [1] => 41445 )

Still on the 64 bits little endian host :

x86_64$ php -r '$a = unpack("N", pack("N", 65536)); printf("$a[1]\n");'
65536
# Ok

x86_64$ php -r '$a = unpack("N", pack("N", 65535)); printf("$a[1]\n");'
-2147418113
# Weird

x86_64$ php -r '$a = unpack("N", pack("N", 0x1234)); printf("0x%x\n", $a[1]);'
0x1234
# Ok

x86_64$ php -r '$a = unpack("L", pack("N", 0x1234)); printf("0x%x\n", $a[1]);'
0x3412
# Ok

x86_64$ php -r '$a = unpack("L", pack("L", 0x)); printf("0x%x\n", $a[1]);'
0x
# Ok

x86_64$ php -r '$a = unpack("N", pack("N", 0x)); printf("0x%x\n", $a[1]);'
0x8000
# The doc says N gives you always 32 bit, and we get
# 8 bytes. No wonder why we overflow.

x86_64$ php -r '$a = unpack("N", pack("N", 0xff )); printf("0x%x\n", $a[1]);'
0x80ff
# Same. Don't tell me 0xff is too large.

And now, all the following tests are on a 64 bits _big endian_ host (sparc64, running php-5.2.1) :

sparc64$ uname -mprsv
OpenBSD 3.8 GENERIC#607 sparc64 SUNW,UltraSPARC-IIi @ 440 MHz, version 0 FPU

sparc64$ php -r '$a = unpack("N", pack("N", 0x)); printf("0x%x\n", $a[1]);'
0x
# Ok

# The same, but prefixing to the argument :
sparc64$ php -r '$a = unpack("N", pack("N", 0x)); printf("0x%x\n", $a[1]);'
0x8000
# Weird (and with N, we stayed on the host byte order this time).

# Shouldn't 0x == 0x, even on big endian ? Apparently, yes :
sparc64$ php -r 'printf("0x%x\n", 0x);'
0x
#
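
The decimal results quoted in these comments fit the pattern martin describes: whenever bit 7 of the least significant byte is set, the result comes back with bit 31 set and read as a signed 32-bit integer. The following Python sketch only models that symptom; it is not taken from the PHP source, and the helper name `buggy_unpack_N` is made up for illustration. The sparc64 results that vary with the trailing line feed are not covered by this model.

```python
import struct

def buggy_unpack_N(value):
    """Model of the reported symptom (hypothetical helper, not PHP code)."""
    data = struct.pack(">I", value)            # the 4 bytes pack("N", value) should emit
    correct = struct.unpack(">I", data)[0]     # the value a correct unpack("N") returns
    if data[3] & 0x80:                         # bit 7 of the least significant byte
        # Set bit 31, then reinterpret the 32-bit pattern as signed.
        return (correct | 0x80000000) - (1 << 32)
    return correct

# Inputs and results as reported in the comments (x86_64 and the file-read case).
reported = {
    127: 127,
    128: -2147483520,
    200: -2147483448,
    41445: -2147442203,
    65535: -2147418113,
    65536: 65536,
    0x1234: 0x1234,
}

for value, seen in reported.items():
    print(value, buggy_unpack_N(value), buggy_unpack_N(value) == seen)
```

Every row prints True, which supports the "sign taken from the wrong byte" reading without saying anything about where in PHP the mistake happens.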
#40749 [Com]: pack and unpack erroneous behavior on 64bits hosts
ID:               40749
Comment by:       windeler at mediafinanz dot de
Reported By:      ben at ateor dot com
Status:           Open
Bug Type:         Unknown/Other Function
Operating System: OpenBSD amd64 and sparc64
PHP Version:      5.2.1
New Comment:

Here is another example of a problem with unpack on 64bit systems. It worked in 5.1.6, but with 5.2.1 the results are bogus. The expected value from the file content is 200, but PHP says -2147483448 when I echo $a['i'].

<?php
$f = fopen('test.pdf','rb');
//Read a 4-byte integer from file
$a = unpack('Ni',fread($f,4));
echo $a['i'];
fclose($f);
?>

Previous Comments:

[2007-03-07 17:12:58] ben at ateor dot com

Description:
This is a follow-up on #40543 (http://bugs.php.net/bug.php?id=40543, since that bug is closed, I can't add comments). Please note : it's not identical to #4053 (other weird behaviors are demonstrated).

Iliaa,

Not sure why you suggest using little endian or host conversion routines, but from my standpoint, if you reverse a number's byte ordering twice then you should get the original number back (assuming the number doesn't overflow php's internals). At least, that's the standard behavior for perl and python.

Besides, I can't see why php should handle those endianness conversions differently on an i386 (32 bits) and on an x86_64 (64 bits), both having the same byte order.

The following on a 64bit, little endian host :

x86_64$ uname -mprsv
OpenBSD 4.0 GENERIC#0 amd64 AMD Sempron(tm) Processor 3400+

x86_64$ perl -e 'print unpack("N", pack("N", 41445)) . "\n"'
41445

x86_64$ python
Python 2.4.3 (#1, Sep 6 2006, 20:33:08)
[GCC 3.3.5 (propolice)] on openbsd4
Type "help", "copyright", "credits" or "license" for more information.
>>> from struct import *
>>> unpack('L', pack('L', 41445))
(41445L,)

And, indeed :

#include <stdio.h>
#include <sys/types.h>

int main(void)
{
    u_int32_t x, y, z; /* 32 bits unsigned longs */

    x = 41445;
    /* conv host (little) to network (big endian) long : pack(N, 41445) */
    y = htonl(x);
    /* conv network (big endian) to host (little) long : unpack(N, ...) */
    z = ntohl(y);
    printf("Host : %li\nBig : %li\nHost : %li\n", x, y, z);
    return 0;
}

x86_64$ gcc conv.c -o conv ; ./conv
Host : 41445
Big : -442433536
Host : 41445

But still (PHP 5.2.2-dev (cli) (built: Feb 27 2007 22:10:11)) :

x86_64$ php -r 'print_r(unpack("N", pack("N", 41445)));'
Array ( [1] => -2147442203 )

While on a plain old x86 little endian host (PHP 4.4.0), we get a different result :

i386_32$ uname -mprsv
OpenBSD 3.9 GENERIC#0 i386 Intel(R) Pentium(R) 4 CPU 1.70GHz (GenuineIntel 686-class)

i386_32$ php -r 'print_r(unpack("N", pack("N", 41445)));'
Array ( [1] => 41445 )

Still on the 64 bits little endian host :

x86_64$ php -r '$a = unpack("N", pack("N", 65536)); printf("$a[1]\n");'
65536
# Ok

x86_64$ php -r '$a = unpack("N", pack("N", 65535)); printf("$a[1]\n");'
-2147418113
# Weird

x86_64$ php -r '$a = unpack("N", pack("N", 0x1234)); printf("0x%x\n", $a[1]);'
0x1234
# Ok

x86_64$ php -r '$a = unpack("L", pack("N", 0x1234)); printf("0x%x\n", $a[1]);'
0x3412
# Ok

x86_64$ php -r '$a = unpack("L", pack("L", 0x)); printf("0x%x\n", $a[1]);'
0x
# Ok

x86_64$ php -r '$a = unpack("N", pack("N", 0x)); printf("0x%x\n", $a[1]);'
0x8000
# The doc says N gives you always 32 bit, and we get
# 8 bytes. No wonder why we overflow.

x86_64$ php -r '$a = unpack("N", pack("N", 0xff )); printf("0x%x\n", $a[1]);'
0x80ff
# Same. Don't tell me 0xff is too large.

And now, all the following tests are on a 64 bits _big endian_ host (sparc64, running php-5.2.1) :

sparc64$ uname -mprsv
OpenBSD 3.8 GENERIC#607 sparc64 SUNW,UltraSPARC-IIi @ 440 MHz, version 0 FPU

sparc64$ php -r '$a = unpack("N", pack("N", 0x)); printf("0x%x\n", $a[1]);'
0x
# Ok

# The same, but prefixing to the argument :
sparc64$ php -r '$a = unpack("N", pack("N", 0x)); printf("0x%x\n", $a[1]);'
0x8000
# Weird (and with N, we stayed on the host byte order this time).

# Shouldn't 0x == 0x, even on big endian ? Apparently, yes :
sparc64$ php -r 'printf("0x%x\n", 0x);'
0x

# Also, look at this :
sparc64$ php -r '$a = unpack("N", pack("N", 41445)); printf("$a[1]\n");'
41445

# And now let's just remove the line feed (\n) from the above printf :
sparc64$ php -r '$a = unpack("N", pack("N", 41445)); printf("$a[1]");'
-2147442203

# Same for 2^16 -1 / 65535 / 0xfff :
sparc64$ php -r '$a = unpack("N", pack("N", 65535)); printf("$a[1]\n");'
65535
sparc64$ php -r '$a = unpack("N", pack("N", 65535)); printf("$a[1]");'
-2147418113

# We get the opposite (bogus with \n, correct without) when converting
# to little endian and back to host :
sparc64$ php -r '$a = unpack("L", pack("L", 0x)); printf( $a[1]);'
65535
sparc64$ php -r '$a = unpack("L", pack("L", 0x)); printf( $a[1]."\n");'
-2147418113

This doesn't help :

SKIP Generic pack()/unpack()
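
For reference, here is a sketch of the round trip the report expects, written in Python's struct module. It assumes ">I" as the counterpart of the "N" format (big-endian, unsigned, always 4 bytes); the 'L' format used in the Python session quoted in the report is native-ordered and native-sized, so it does not exercise the byte swap. The helper name `roundtrip_N` is invented for this example.

```python
import struct

def roundtrip_N(value):
    """Pack as big-endian unsigned 32-bit and unpack again (illustrative helper)."""
    data = struct.pack(">I", value)
    assert len(data) == 4                      # ">I" is 4 bytes on every platform
    return struct.unpack(">I", data)[0]

# Values taken from the comments in this bug; each should survive the round trip.
for value in (127, 128, 200, 41445, 65535, 65536, 0x1234):
    assert roundtrip_N(value) == value

# The packed bytes are in network order regardless of the host.
print(struct.pack(">I", 0x1234).hex())
```

The same result is expected on 32-bit and 64-bit hosts of either byte order, which is the property the report says PHP 5.2.1 no longer has.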