On 9/19/06, Tim Hochberg <[EMAIL PROTECTED]> wrote:
Except for heapsort those are doing insertion sorts because n is so small. Heapsort is trickier to understand because of the way the heap is formed and values pulled off. I'll look into that. I bet searchsorted gets messed up by nan's. Do you think it worthwhile to worry about it?
Chuck
A. M. Archibald wrote:
> On 19/09/06, Tim Hochberg <[EMAIL PROTECTED]> wrote:
>
>> Keith Goodman wrote:
>>
>>> In what order would you like argsort to sort the values -inf, nan, inf?
>>>
>>>
>> Ideally, -inf should sort first, inf should sort last and nan should
>> raise an exception if present.
>>
>> -tim
>>
>
> Mmm. Somebody who's working with NaNs has more or less already decided
> they don't want to be pestered with exceptions for invalid data.
Do you really think so? In my experience NaNs are nearly always just an
indication of a mistake somewhere that didn't get trapped for one reason
or another.
> I'd
> be happy if they wound up at either end, but I'm not sure it's worth
> hacking up the sort algorithm when a simple isnan() can pull them out.
>
Moving them to the end seems to be the worst choice to me. Leaving them
alone is fine with me. Or raising an exception would be fine. Or doing
one or the other depending on the error mode settings would be even
better if it is practical.
> What's happening now is that NaN<a, NaN==a, and NaN>a are all false,
> which rather confuses the sorting algorithm. But as long as it doesn't
> actually *break* (that is, leave some of the non-NaNs incorrectly
> sorted) I don't care.
>
Is that true? Are all of numpy's sorting algorithms robust against
nontransitive objects laying around? The answer to that appears to be
no. Try running this a couple of times to see what I mean:
n = 10
for i in range(10):
for kind in 'quicksort', 'heapsort', 'mergesort':
a = rand(n)
b = a.copy()
a[n//2] = nan
a.sort(kind=kind)
b.sort(kind=kind)
assert alltrue(a[:n//2] == b[:n//2]), (kind, a, b)
The values don't correctly cross the inserted NaN and the sort is incorrect.
Except for heapsort those are doing insertion sorts because n is so small. Heapsort is trickier to understand because of the way the heap is formed and values pulled off. I'll look into that. I bet searchsorted gets messed up by nan's. Do you think it worthwhile to worry about it?
Chuck
------------------------------------------------------------------------- Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT & business topics through brief surveys -- and earn cash http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
_______________________________________________ Numpy-discussion mailing list Numpy-discussion@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/numpy-discussion