ID:               21728
 Comment by:       [EMAIL PROTECTED]
 Reported By:      [EMAIL PROTECTED]
 Status:           Open
 Bug Type:         Documentation problem
 Operating System: All
 PHP Version:      4.4.0-dev
 New Comment:

Well, one of the problems here is that some of the array elements take
different values in an element-to-element comparison, depending on the
type of the other element. For example, "true" is just that when
compared to another string, but 0 when compared against an integer;
strings and integers are both converted to Boolean when compared to
true/false (with resulting loss of significant information).
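
To make the above concrete, here is a minimal sketch that evaluates a
few such pairings with the loose == operator. The $checks array and its
labels are only for illustration. The last entry is version-dependent:
on the PHP 4 build discussed here (and up to PHP 7) a non-numeric
string such as "true" becomes 0 when compared with an integer, whereas
PHP 8 changed that rule, so no output is shown.

```php
<?php
// Each pair is compared with the loose == operator, so the operand types
// decide which conversion PHP applies before comparing.
$checks = array(
    'true == 4'        => (true == 4),         // int is converted to bool
    'true == "b"'      => (true == "b"),       // string is converted to bool
    '"true" == "TRUE"' => ("true" == "TRUE"),  // two strings, compared as strings
    '"4" == 4'         => ("4" == 4),          // numeric string compared as a number
    '"true" == 0'      => ("true" == 0),       // version-dependent, see the note above
);

foreach ($checks as $label => $result) {
    echo $label, ' => ', var_export($result, true), "\n";
}
```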

Another problem is that if you're using a non-sequential sorting
algorithm (such as shellsort or quicksort), simply changing the length
of the array will probably change which element is compared to which,
and hence, because of the strangeness of "dual values" caused by
type-juggling, the final order of the array.  (This may be even worse
for an algorithm that is not guaranteed to maintain the order of equal
items.)

If you take a look at the sorted versions of each array cited, you will
find that all of the element-to-neighbour-element comparisons are
actually valid, thus:

array("a","b","c","d","4",5,4,"true","TRUE",true) --
  true   : 4       ==>  (bool)    true   == true
  4      : "4"     ==>  (int)     4      == 4
  "4"    : "TRUE"  ==>  (string)  "4"    <  "TRUE"
  "TRUE" : "a"     ==>  (string)  "TRUE" <  "a"
  "a"    : "b"     ==>  (string)  "a"    <  "b"
  "b"    : "c"     ==>  (string)  "b"    <  "c"
  "c"    : "d"     ==>  (string)  "c"    <  "d"
  "d"    : "true"  ==>  (string)  "d"    <  "true"
  "true" : 5       ==>  (int)     0      <  5

array("a","b","4",5,4,"true","TRUE",true, false, "c") --
  false  : "TRUE"  ==>  (bool)   false  <  true
  "TRUE" : "a"     ==>  (string) "TRUE" <  "a"
  "a"    : "true"  ==>  (string) "a"    <  "true"
  "true" : true    ==>  (bool)   true   == true
  true   : "b"     ==>  (bool)   true   == true
  "b"    : "c"     ==>  (string) "b"    <  "c"
  "c"    : 4       ==>  (int)    0      <  4
  4      : "4"     ==>  (int)    4      == 4
  "4"    : 5       ==>  (int)    4      <  5

array("a","b","4",5,4,"true","TRUE",true, false, "c", "d") --
  false  : "4"     ==>  (bool)    false  <  true
  "4"    : "TRUE"  ==>  (string)  "4"    <  "TRUE"
  "TRUE" : "a"     ==>  (string)  "TRUE" <  "a"
  "a"    : "b"     ==>  (string)  "a"    <  "b"
  "b"    : "c"     ==>  (string)  "b"    <  "c"
  "c"    : "d"     ==>  (string)  "c"    <  "d"
  "d"    : "true"  ==>  (string)  "d"    <  "true"
  "true" : true    ==>  (bool)    true   == true
  true   : 4       ==>  (bool)    true   == true
  4      : 5       ==>  (int)     4      <  5

So, in each case, we have a valid sort -- just a *different* valid
sort.  The prime determiners here seem to be the non-sequential order
in which the individual comparisons are performed, and, as has been
indicated, the automatic casting that takes place for each one.
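
The neighbour-by-neighbour check can be sketched in a few lines.
neighbours_in_order() is a hypothetical helper, not part of PHP; it
only tests each element against its right-hand neighbour, which is
exactly why it can accept several different orderings of the same
mixed-type data. The tables above suggest the first call should
succeed on the PHP version in this report, but newer versions compare
integers with non-numeric strings differently, so that result is
deliberately not shown.

```php
<?php
// Hypothetical helper (not part of PHP): returns true when every element
// is <= its right-hand neighbour under PHP's loose comparison.
function neighbours_in_order($arr)
{
    $arr = array_values($arr);
    for ($i = 0; $i + 1 < count($arr); $i++) {
        if (!($arr[$i] <= $arr[$i + 1])) {
            return false;
        }
    }
    return true;
}

// The first sorted result quoted in this report (result depends on PHP version).
$sorted = array(true, 4, "4", "TRUE", "a", "b", "c", "d", "true", 5);
var_dump(neighbours_in_order($sorted));

// Same-typed arrays behave the same way on every PHP version.
var_dump(neighbours_in_order(array(1, 2, 3)));  // bool(true)
var_dump(neighbours_in_order(array(3, 1, 2)));  // bool(false)
```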

(Incidentally, whilst putting the above together I was unable to find a
definitive listing of *exactly* what automatic type-conversions take
place in which contexts.  This is a definite oversight, as in contexts
like the above it's important to know, for example, that comparing an
int to a bool will cast the int to bool, and not the bool to int. 
Perhaps this needs to become a doc problem for the inclusion of such a
list or table?)

Hope this enlightens at least some souls reading this far!

Cheers!

Mike


Previous Comments:
------------------------------------------------------------------------

[2003-01-18 12:23:24] [EMAIL PROTECTED]

 Maybe it should not happen, but as I said, the comparisons done are
correct (extensive type juggling). Maybe the default sort type should
be SORT_STRING rather than SORT_REGULAR.
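
For comparison, here is a small sketch of the two arrays from this
report sorted with the SORT_STRING flag. Every element is compared as a
string, so true becomes "1" and false becomes "". The $shown1 and
$shown2 variables are only there to print the values in string form.

```php
<?php
$arr1 = array("a","b","4",5,4,"true","TRUE",true, false, "c");
$arr2 = array("a","b","4",5,4,"true","TRUE",true, false, "c", "d");

// SORT_STRING compares every pair as strings, whatever the stored types.
sort($arr1, SORT_STRING);
sort($arr2, SORT_STRING);

// Convert to strings for display: true => "1", false => "".
$shown1 = array_map('strval', $arr1);
$shown2 = array_map('strval', $arr2);

echo implode(',', $shown1), "\n";  // ,1,4,4,5,TRUE,a,b,c,true
echo implode(',', $shown2), "\n";  // ,1,4,4,5,TRUE,a,b,c,d,true
```

Here, adding "d" only inserts "d" into its place; the rest of the order
is unchanged.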

Comments from other people are welcome :)

------------------------------------------------------------------------

[2003-01-18 12:17:00] [EMAIL PROTECTED]

I swear I get different results by just adding a "d" to the end.  This
should not happen.

------------------------------------------------------------------------

[2003-01-18 12:05:10] [EMAIL PROTECTED]

 As I said, this is a very complicated case because of the type juggling.
I needed 30 minutes to realize that 21444 is not a bug but bogus (for
me and Derick). I agree that the result is weird. I modified the
compare function to see what comparisons are made. All of them look
ok.
On my PHP I get the same results for the script with "d" added at the
end. A little modification changes the order of comparisons and thus
the result is different. Maybe this is because the default sort type is
SORT_REGULAR. If SORT_STRING is used the result is as expected. I think
that the case I provided is a good one to show users that the results
are rather unexpected when the array contains values of various
datatypes and SORT_REGULAR is used. So users of such arrays should be
warned about the "unexpected" results.
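
A rough userland way to watch the comparisons, without touching PHP's
internal compare function, is usort() with a callback that records each
pair. This is only an approximation of the instrumentation described
above: logged_compare() and $comparisons are hypothetical names, and
the callback mimics SORT_REGULAR with the loose == and < operators. The
pairs and their order depend on the PHP version, so no output is shown.

```php
<?php
// Hypothetical userland stand-in for an instrumented compare function:
// it records each pair it is asked about, then compares loosely.
$comparisons = array();

function logged_compare($a, $b)
{
    global $comparisons;
    $comparisons[] = var_export($a, true) . ' : ' . var_export($b, true);
    if ($a == $b) {
        return 0;
    }
    return ($a < $b) ? -1 : 1;
}

$arr1 = array("a","b","4",5,4,"true","TRUE",true, false, "c");
usort($arr1, 'logged_compare');

// One line per comparison, in the order the sort algorithm made them.
echo implode("\n", $comparisons), "\n";
```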

------------------------------------------------------------------------

[2003-01-18 11:48:56] [EMAIL PROTECTED]

How about:

<?php
$arr1 = array("a","b","4",5,4,"true","TRUE",true, false, "c");
sort($arr1);
var_dump($arr1);
?>

Which gives:
array(10) {
  [0]=>
  bool(false)
  [1]=>
  string(4) "TRUE"
  [2]=>
  string(1) "a"
  [3]=>
  string(4) "true"
  [4]=>
  bool(true)
  [5]=>
  string(1) "b"
  [6]=>
  string(1) "c"
  [7]=>
  int(4)
  [8]=>
  string(1) "4"
  [9]=>
  int(5)
}

Which is weird as "4" looks misplaced.  For example in this:
<?php
$arr1 = array("a","b","4",5,4,"true","TRUE",true, false, "c", "d");
sort($arr1);
var_dump($arr1);
?>

We get different results (all I added was "d" to the end):

array(11) {
  [0]=>
  bool(false)
  [1]=>
  string(1) "4"
  [2]=>
  string(4) "TRUE"
  [3]=>
  string(1) "a"
  [4]=>
  string(1) "b"
  [5]=>
  string(1) "c"
  [6]=>
  string(1) "d"
  [7]=>
  string(4) "true"
  [8]=>
  bool(true)
  [9]=>
  int(4)
  [10]=>
  int(5)
}

Notice the different order. Is this a genuine bug?

------------------------------------------------------------------------

[2003-01-18 10:26:27] [EMAIL PROTECTED]

 Today I closed bug #21444. The user has to master type juggling to
know the expected output. I think it is a good idea to add its
example as a comprehensive one.
The script goes here (the explanation is after it):
<?php
$arr1 = array("a","b","c","d","4",5,4,"true","TRUE",true);
sort($arr1);
var_dump($arr1);
?>

The output is:
array(10) {
  [0]=>
  bool(true)
  [1]=>
  int(4)
  [2]=>
  string(1) "4"
  [3]=>
  string(4) "TRUE"
  [4]=>
  string(1) "a"
  [5]=>
  string(1) "b"
  [6]=>
  string(1) "c"
  [7]=>
  string(1) "d"
  [8]=>
  string(4) "true"
  [9]=>
  int(5)
}
It may look strange that (int) 5 comes after all the strings. This is
because "4" is lower than (int) 5, "4" is before "true" and "true" is
before 5. The first two are obvious; the third one is not, but it is
correct: compared with an integer, "true" is converted to 0, and 0 is
lower than 5. It's better not to mix types in the array. If 5 is
changed to "5" then "5" goes right after "4".
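
As a small illustration of the "do not mix types" advice, here is a
sketch with a strings-only variation of the array. The variation is an
assumption for the example: the integer 4 and the Boolean true are left
out and 5 is written as "5".

```php
<?php
// Every element is a string, so sort() never has to juggle between types.
$strings = array("a", "b", "c", "d", "4", "5", "true", "TRUE");
sort($strings);

echo implode(', ', $strings), "\n";  // 4, 5, TRUE, a, b, c, d, true
```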

Thanks

------------------------------------------------------------------------

