#40749 [Com]: pack and unpack erroneous behavior on 64bits hosts

2007-03-16 Thread martin at netimage dot dk
 ID:   40749
 Comment by:   martin at netimage dot dk
 Reported By:  ben at ateor dot com
 Status:   Open
 Bug Type: Unknown/Other Function
 Operating System: OpenBSD amd64 and sparc64
 PHP Version:  5.2.1
 New Comment:

It appears that the sign bit is taken from the least significant byte (LSB) instead of the most significant byte (MSB):

$ php -r 'print_r(unpack("N", pack("N", 127)));'
Array
(
    [1] => 127
)

$ php -r 'print_r(unpack("N", pack("N", 128)));'
Array
(
    [1] => -2147483520
)

The last number is 128 with bit 31 set (0x80000080), read as a negative two's-complement integer.
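
The pattern can be checked against the other values reported in this bug with a small model. This is only a sketch of the hypothesis, not PHP's actual unpack() code: the function name buggy_unpack_n and the rule it applies (read the sign flag from bit 7 of the least significant byte, then set bit 31 and everything above it) are assumptions chosen to match the reported numbers.

```python
import struct

def buggy_unpack_n(data: bytes) -> int:
    """Hypothetical model of the observed behavior (not PHP's real code)."""
    value = int.from_bytes(data, "big")   # the correct unsigned decode
    # Assumption: the sign flag is read from the LEAST significant byte.
    if data[-1] & 0x80:
        value |= ~0x7FFFFFFF              # set bit 31 and every bit above it
    return value

# Inputs taken from the comments in this report.
for n in (127, 128, 200, 41445, 65535, 65536):
    print(n, buggy_unpack_n(struct.pack(">I", n)))
```

Under that assumption the model prints 127, -2147483520, -2147483448, -2147442203, -2147418113 and 65536, which are the values quoted in this report for the 64-bit little-endian host.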

Cheers



#40749 [Com]: pack and unpack erroneous behavior on 64bits hosts

2007-03-14 Thread pz at mysqlperformanceblog dot com
 ID:   40749
 Comment by:   pz at mysqlperformanceblog dot com
 Reported By:  ben at ateor dot com
 Status:   Open
 Bug Type: Unknown/Other Function
 Operating System: OpenBSD amd64 and sparc64
 PHP Version:  5.2.1
 New Comment:

Whether you call it a bug or a feature, this is a serious behavior
change for something that a lot of people could be depending on.

It breaks between PHP 5.2.0 and 5.2.1, which is a minor version upgrade.



#40749 [Com]: pack and unpack erroneous behavior on 64bits hosts

2007-03-09 Thread windeler at mediafinanz dot de
 ID:   40749
 Comment by:   windeler at mediafinanz dot de
 Reported By:  ben at ateor dot com
 Status:   Open
 Bug Type: Unknown/Other Function
 Operating System: OpenBSD amd64 and sparc64
 PHP Version:  5.2.1
 New Comment:

Here is another example of a problem with unpack on 64bit systems. It
worked in 5.1.6, but with 5.2.1 the results are bogus.

The expected value from the file content is 200, but PHP says
-2147483448 when I echo $a['i'].

<?php
$f = fopen('test.pdf', 'rb');
// Read a 4-byte integer from the file
$a = unpack('Ni', fread($f, 4));
echo $a['i'];
fclose($f);
?>
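
For reference, this is the decode that the N format is documented to perform (unsigned 32-bit, big-endian), written out by hand. It is a minimal sketch: the helper name read_u32_be is made up for illustration, and the sample bytes 00 00 00 C8 are an assumption, since the actual content of test.pdf is not shown here.

```python
def read_u32_be(data: bytes) -> int:
    """Decode exactly 4 bytes as an unsigned 32-bit big-endian integer."""
    if len(data) != 4:
        raise ValueError("expected exactly 4 bytes")
    b0, b1, b2, b3 = data
    # Most significant byte first; the result is never negative.
    return (b0 << 24) | (b1 << 16) | (b2 << 8) | b3

# Hypothetical file content that encodes the expected value 200.
sample = bytes([0x00, 0x00, 0x00, 0xC8])
print(read_u32_be(sample))  # 200
```

Note that the low byte 0xC8 has its top bit set while the high byte is zero, so a correct decode has no reason to produce a negative number.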


Previous Comments:


[2007-03-07 17:12:58] ben at ateor dot com

Description:

This is a follow-up on #40543 (http://bugs.php.net/bug.php?id=40543);
since that bug is closed, I can't add comments there.
Please note: it's not identical to #40543 (other weird behaviors
are demonstrated).

Iliaa,
I'm not sure why you suggest using little-endian or host conversion
routines, but from my standpoint, if you reverse a number's byte order
twice then you should get the original number back (assuming the number
doesn't overflow PHP's internals).

At least, that's the standard behavior for Perl and Python.

Besides, I can't see why PHP should handle those endianness conversions
differently on an i386 (32 bits) and on an x86_64 (64 bits), both
having the same byte order.

The following is on a 64-bit, little-endian host:
x86_64$ uname -mprsv
OpenBSD 4.0 GENERIC#0 amd64 AMD Sempron(tm) Processor 3400+

x86_64$ perl -e 'print unpack("N", pack("N", 41445)) . "\n"'
41445

x86_64$ python
Python 2.4.3 (#1, Sep  6 2006, 20:33:08)
[GCC 3.3.5 (propolice)] on openbsd4
Type "help", "copyright", "credits" or "license" for more information.
>>> from struct import *
>>> unpack('L', pack('L', 41445))
(41445L,)
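
The session above uses Python's native 'L' format. The closest Python counterpart of PHP's N (unsigned 32-bit, big-endian) is "!I", so the round-trip expectation can also be sketched as follows. The helper name roundtrip_network is made up for illustration, and the sketch targets Python 3.

```python
import struct

def roundtrip_network(n: int) -> int:
    """Pack n as unsigned 32-bit network order, then unpack it again."""
    return struct.unpack("!I", struct.pack("!I", n))[0]

# Values used elsewhere in this report, plus the 32-bit extremes.
for n in (0, 127, 128, 41445, 65535, 65536, 0xFFFFFFFF):
    assert roundtrip_network(n) == n

print("all values survived the round trip")
```

Because "!I" has a fixed size and byte order, the result should not depend on whether the host is 32-bit or 64-bit, little-endian or big-endian.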

And, indeed:
#include <stdio.h>
#include <sys/types.h>
#include <arpa/inet.h> /* htonl(), ntohl() */
int main(void)
{
    u_int32_t x, y, z; /* 32-bit unsigned integers */
    x = 41445;
    /* conv host (little) to network (big endian) long: pack("N", 41445) */
    y = htonl(x);
    /* conv network (big endian) to host (little) long: unpack("N", ...) */
    z = ntohl(y);
    /* print each value as a signed 32-bit quantity so the casts match %li */
    printf("Host : %li\nBig : %li\nHost : %li\n",
           (long)(int32_t)x, (long)(int32_t)y, (long)(int32_t)z);
    return 0;
}

x86_64$ gcc conv.c -o conv ; ./conv
Host : 41445
Big : -442433536
Host : 41445
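
The negative "Big" value is just the byte-swapped number printed as a signed 32-bit integer. A short sketch makes the arithmetic explicit; the helper names swap32 and as_signed32 are made up for illustration, and swap32 stands in for what htonl()/ntohl() do on a little-endian host.

```python
def swap32(x: int) -> int:
    """Reverse the byte order of an unsigned 32-bit value."""
    return int.from_bytes(x.to_bytes(4, "little"), "big")

def as_signed32(x: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed one."""
    return x - (1 << 32) if x & 0x80000000 else x

x = 41445
y = swap32(x)   # host (little) to network (big endian)
z = swap32(y)   # network back to host
print(hex(y), as_signed32(y), z)  # 0xe5a10000 -442433536 41445
```

Swapping twice gives back 41445, which is the behavior the report expects from pack/unpack with N as well.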

But still (PHP 5.2.2-dev (cli) (built: Feb 27 2007 22:10:11)):
x86_64$ php -r 'print_r(unpack("N", pack("N", 41445)));'
Array
(
    [1] => -2147442203
)

While on a plain old x86 little endian host (PHP 4.4.0), we get
a different result :
i386_32$ uname -mprsv
OpenBSD 3.9 GENERIC#0 i386 Intel(R) Pentium(R) 4 CPU 1.70GHz
(GenuineIntel 686-class)
i386_32$ php -r 'print_r(unpack("N", pack("N", 41445)));'
Array
(
    [1] => 41445
)

Still on the 64-bit little-endian host:
x86_64$ php -r '$a = unpack("N", pack("N", 65536)); printf("$a[1]\n");'
65536   # Ok
x86_64$ php -r '$a = unpack("N", pack("N", 65535)); printf("$a[1]\n");'
-2147418113 # Weird

x86_64$ php -r '$a = unpack("N", pack("N", 0x1234)); printf("0x%x\n", $a[1]);'
0x1234 # Ok
x86_64$ php -r '$a = unpack("L", pack("N", 0x1234)); printf("0x%x\n", $a[1]);'
0x3412 # Ok
x86_64$ php -r '$a = unpack("L", pack("L", 0x)); printf("0x%x\n", $a[1]);'
0x # Ok
x86_64$ php -r '$a = unpack("N", pack("N", 0x)); printf("0x%x\n", $a[1]);'
0x8000 # The doc says N always gives you 32 bits, and we get
       # 8 bytes. No wonder we overflow.
x86_64$ php -r '$a = unpack("N", pack("N", 0xff )); printf("0x%x\n", $a[1]);'
0x80ff # Same. Don't tell me 0xff is too large.
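
The "8 bytes" remark can be illustrated with one of the decimal values reported above. This is a sketch under the assumption that the bogus result is held in a 64-bit two's-complement integer and that %x prints its unsigned bit pattern; the helper names hex64 and hex32 are made up for illustration.

```python
def hex64(v: int) -> str:
    """Hex digits of v viewed as a 64-bit two's-complement pattern."""
    return format(v & 0xFFFFFFFFFFFFFFFF, "x")

def hex32(v: int) -> str:
    """Hex digits of the low 32 bits of v."""
    return format(v & 0xFFFFFFFF, "x")

bogus = -2147418113   # reported result of unpacking a packed 65535
print(hex64(bogus))   # ffffffff8000ffff
print(hex32(bogus))   # 8000ffff
```

The 64-bit view needs sixteen hex digits (8 bytes), while the low 32 bits are 65535 with bit 31 set.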

And now, all the following tests are on a 64 bits _big endian_ host
(sparc64, running php-5.2.1) :
sparc64$ uname -mprsv
OpenBSD 3.8 GENERIC#607 sparc64 SUNW,UltraSPARC-IIi @ 440 MHz, version
0 FPU
sparc64$ php -r '$a = unpack("N", pack("N", 0x)); printf("0x%x\n", $a[1]);'
0x # Ok
# The same, but prefixing  to the argument :
sparc64$ php -r '$a = unpack("N", pack("N", 0x)); printf("0x%x\n", $a[1]);'
0x8000
# Weird (and with N, we stayed on the host byte order this time).
# Shouldn't 0x == 0x, even on big endian ? Apparently, yes :
sparc64$ php -r 'printf("0x%x\n", 0x);'
0x

# Also, look at this :
sparc64$ php -r '$a = unpack("N", pack("N", 41445)); printf("$a[1]\n");'
41445
# And now let's just remove the line feed (\n) from the above printf :
sparc64$ php -r '$a = unpack("N", pack("N", 41445)); printf($a[1]);'
-2147442203

# Same for 2^16 - 1 / 65535 / 0xffff :
sparc64$ php -r '$a = unpack("N", pack("N", 65535)); printf("$a[1]\n");'
65535
sparc64$ php -r '$a = unpack("N", pack("N", 65535)); printf($a[1]);'
-2147418113

# We get the opposite (bogus with \n, correct without) when converting
# to little endian and back to host :
sparc64$ php -r '$a = unpack("L", pack("L", 0x)); printf($a[1]);'
65535
sparc64$ php -r '$a = unpack("L", pack("L", 0x)); printf($a[1]."\n");'
-2147418113


This doesn't help :
SKIP Generic pack()/unpack()