ID:               47481
Updated by:       j...@php.net
Reported By:      carsten_sttgt at gmx dot de
-Status:          Open
+Status:          Closed
Bug Type:         Strings related
Operating System: *
PHP Version:      5.2.8
New Comment:

This bug has been fixed in CVS.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.

Thank you for the report, and for helping us make PHP better.


Previous Comments:
------------------------------------------------------------------------

[2009-03-31 03:25:29] hrad...@php.net

Hi Carsten,

I have no idea why I thought zend_parse_parameters was a problem. I see
no reason why strnatcmp_ex() couldn't use unsigned chars rather than
plain chars. I suspect the type casting is done to make sure the
character is properly promoted for the is*() calls.

Test case - http://www.hermanradtke.com/patches/bug47481.phpt
Patch - http://www.hermanradtke.com/patches/php-47481-natcasesort-extended-ascii.patch

------------------------------------------------------------------------

[2009-03-10 11:13:38] carsten_sttgt at gmx dot de

> The strnatcmp uses the zend_parse_parameters function to parse
> the function parameters.

Ah, OK, I'm not familiar with the PHP/Zend internals (or C...).

Just a question about the difference between natsort() and asort():
should they not work in the same way if you have an array without
numbers in the key values? And if I look into array.c,
PHP_FUNCTION(asort) is also using zend_parse_parameters.

For example, IMHO this script should produce the same output twice:

<?php
$datensort = $datennat = $daten = array(
    'Süden','spielen','Sonne','Wind','Regen','Meer'
);
natsort($datennat);
print_r($datennat);
asort($datensort);
print_r($datensort);
?>

Regards,
Carsten

------------------------------------------------------------------------

[2009-03-03 05:06:31] hrad...@php.net

The strnatcmp function uses the zend_parse_parameters function to parse
the function parameters. The zend_parse_parameters function converts
the string from PHP space into a char. Seeing as how this is a core
function, I doubt this will be fixed soon.
I may be completely off base, so I will leave this bug open in case
someone else wants to comment.

------------------------------------------------------------------------

[2009-02-23 13:40:21] carsten_sttgt at gmx dot de

Description:
------------
Hello,

why is nat_char defined as char instead of unsigned char?

char limits us (and correct sorting) to ASCII 0-127. With an unsigned
char (0-255), the sorting is "correct" for all single-byte charsets
like ISO-8859-1 (which is the default in PHP).

Internally the function already casts to unsigned char many times
(but not in the main comparison). In the original header (strnatcmp.h)
from the author, the typedef for nat_char is also only a hint.

Regards,
Carsten

Reproduce code:
---------------
<?php
$daten = array('Süden','spielen','Sonne','Wind','Regen','Meer');
natcasesort($daten);
print_r($daten);
?>

Expected result:
----------------
Array
(
    [5] => Meer
    [4] => Regen
    [2] => Sonne
    [1] => spielen
    [0] => Süden
    [3] => Wind
)

Actual result:
--------------
Array
(
    [5] => Meer
    [4] => Regen
    [0] => Süden
    [2] => Sonne
    [1] => spielen
    [3] => Wind
)

------------------------------------------------------------------------
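
To illustrate the char vs. unsigned char point from the original
description, here is a minimal C sketch. It is not the code from
strnatcmp.c: the helper names cmp_signed_bytes and cmp_unsigned_bytes
are hypothetical, the inputs are assumed to be ISO-8859-1 strings that
are already lower-cased, and signed char stands in for a platform where
plain char happens to be signed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from strnatcmp.c): byte-wise comparison that
 * reads every byte as signed char, which is what a plain char comparison
 * does on platforms where char is signed. */
int cmp_signed_bytes(const char *a, const char *b)
{
    const signed char *pa = (const signed char *)a;
    const signed char *pb = (const signed char *)b;

    assert(a != NULL && b != NULL);
    for (;; pa++, pb++) {
        if (*pa != *pb)
            return *pa < *pb ? -1 : 1;
        if (*pa == 0)          /* both strings ended at the same point */
            return 0;
    }
}

/* Hypothetical helper: the same loop, but every byte is read as
 * unsigned char, so bytes 0x80-0xFF sort after the ASCII range. */
int cmp_unsigned_bytes(const char *a, const char *b)
{
    const unsigned char *pa = (const unsigned char *)a;
    const unsigned char *pb = (const unsigned char *)b;

    assert(a != NULL && b != NULL);
    for (;; pa++, pb++) {
        if (*pa != *pb)
            return *pa < *pb ? -1 : 1;
        if (*pa == 0)          /* both strings ended at the same point */
            return 0;
    }
}
```

Called with "s\xFC" "den" (lower-cased "Süden" in ISO-8859-1, split so
the hex escape does not swallow the following letters) and "spielen",
cmp_signed_bytes returns a negative value, because byte 0xFC reads as
-4 on the usual two's-complement platforms, while cmp_unsigned_bytes
returns a positive value, because 252 is greater than 112 ('p'). That
is the same difference as between the actual and the expected result
in the report.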