Re: Sorting NaNs

2018-06-11 Thread Anders Munch
Steven D'Aprano: It is not a guess if the user explicitly specifies that as the behaviour. If that was the context, sure, no problem. - Anders -- https://mail.python.org/mailman/listinfo/python-list

Re: Sorting NaNs

2018-06-10 Thread Steven D'Aprano
On Sun, 10 Jun 2018 21:28:02 +0200, Anders Munch wrote: > Richard Damon wrote: > >> The two behaviors that I have heard suggested are: >> >> 1) If any of the inputs are a NaN, the median should be a NaN. >> (Propagating the NaN as indicator of a numeric error) >> >> 2) Remove the NaNs from the

Re: Sorting NaNs

2018-06-10 Thread Richard Damon
On 6/10/18 3:28 PM, Anders Munch wrote: > Richard Damon wrote: > >> The two behaviors that I have heard suggested are: >> >> 1) If any of the inputs are a NaN, the median should be a NaN. >> (Propagating the NaN as indicator of a numeric error) >> >> 2) Remove the NaNs from the input set and

Re: Sorting NaNs

2018-06-10 Thread Anders Munch
Richard Damon wrote: The two behaviors that I have heard suggested are: 1) If any of the inputs are a NaN, the median should be a NaN. (Propagating the NaN as indicator of a numeric error) 2) Remove the NaNs from the input set and process what is left. If nothing, then return a NaN (treating

Re: Sorting NaNs

2018-06-08 Thread Gregory Ewing
Peter Pearson wrote: What applications would have to worry about colliding floats? I don't know. I'm coming from cryptology, where worrying about such things becomes a reflex. If collisions are something to be feared, then you're into hash territory, where you're probably going to want

Re: Sorting NaNs

2018-06-08 Thread Steven D'Aprano
On Fri, 08 Jun 2018 17:45:58 +, Peter Pearson wrote: > What bothered me was my feeling that a "reasonable observer" > would expect the random-float population to be much larger than 2**32, > and the probably-collision-free sample size to be accordingly much > larger than 2**16, which is,

Re: Sorting NaNs

2018-06-08 Thread Peter Pearson
On Fri, 8 Jun 2018 02:15:02 + (UTC), Steven D'Aprano wrote: > On Thu, 07 Jun 2018 20:43:10 +, Peter Pearson wrote: [snip] >> >> But gosh, if there are only 2**32 different "random" floats, then you'd >> have about a 50% chance of finding a collision among any set of 2**16 >> samples. Is

Re: Sorting NaNs

2018-06-08 Thread Chris Angelico
On Fri, Jun 8, 2018 at 9:37 PM, Steven D'Aprano wrote: > On Fri, 08 Jun 2018 12:23:40 +1000, Chris Angelico wrote: > >> On Fri, Jun 8, 2018 at 12:15 PM, Steven D'Aprano >> wrote: >>> If you truly were limited to 2**32 different values (we're not), then >>> it would be exactly right and proper to

Re: Sorting NaNs

2018-06-08 Thread Steven D'Aprano
On Fri, 08 Jun 2018 12:23:40 +1000, Chris Angelico wrote: > On Fri, Jun 8, 2018 at 12:15 PM, Steven D'Aprano > wrote: >> If you truly were limited to 2**32 different values (we're not), then >> it would be exactly right and proper to expect a collision in 2**16 >> samples. Actually, a lot less

Re: Sorting NaNs

2018-06-08 Thread Chris Angelico
On Fri, Jun 8, 2018 at 12:15 PM, Steven D'Aprano wrote: > If you truly were limited to 2**32 different values (we're not), then it > would be exactly right and proper to expect a collision in 2**16 samples. > Actually, a lot less than that: more like 78000. > >

Re: Sorting NaNs

2018-06-08 Thread Gregory Ewing
Michael Lamparski wrote: In any case, it's verifiably not true for CPython. Yes, CPython uses a particularly good PRNG. You may not be as lucky using libraries that come with other languages. A great many PRNG algorithms have been proposed, and a good proportion of them produce 32-bit ints as

Re: Sorting NaNs

2018-06-07 Thread Steven D'Aprano
On Thu, 07 Jun 2018 20:43:10 +, Peter Pearson wrote: > On Thu, 07 Jun 2018 19:02:42 +1200, Gregory Ewing wrote: >> Steven D'Aprano wrote: >>> But if it were (let's say) 1 ULP greater or less than one half, would >>> we even know? >> >> In practice it's probably somewhat bigger than 1 ULP. A

Re: Sorting NaNs

2018-06-07 Thread Michael Lamparski
On Thu, Jun 7, 2018 at 4:43 PM, Peter Pearson wrote: > But gosh, if there are only 2**32 different "random" floats, then > you'd have about a 50% chance of finding a collision among any > set of 2**16 samples. Is that really tolerable? > -- > https://mail.python.org/mailman/listinfo/python-list

Re: Sorting NaNs

2018-06-07 Thread Peter Pearson
On Thu, 07 Jun 2018 19:02:42 +1200, Gregory Ewing wrote: > Steven D'Aprano wrote: >> But if it were (let's say) 1 ULP greater or less >> than one half, would we even know? > > In practice it's probably somewhat bigger than 1 ULP. > A typical PRNG will first generate a 32-bit integer and > then

Re: Sorting NaNs

2018-06-07 Thread Chris Angelico
On Thu, Jun 7, 2018 at 2:14 PM, Steven D'Aprano wrote: > On Sat, 02 Jun 2018 21:02:14 +1000, Chris Angelico wrote: > >> Point of curiosity: Why "> 0.5"? > > No particular reason, I just happened to hit that key and then copied and > pasted the line into the next one. Hah! The simplicity of it.

Re: Sorting NaNs

2018-06-07 Thread Gregory Ewing
Steven D'Aprano wrote: But if it were (let's say) 1 ULP greater or less than one half, would we even know? In practice it's probably somewhat bigger than 1 ULP. A typical PRNG will first generate a 32-bit integer and then map it to a float, giving a resolution coarser than the 52 bits of an

Re: Sorting NaNs

2018-06-06 Thread Steven D'Aprano
On Sat, 02 Jun 2018 21:02:14 +1000, Chris Angelico wrote: > Point of curiosity: Why "> 0.5"? No particular reason, I just happened to hit that key and then copied and pasted the line into the next one. > Normally when I want a fractional > chance, I write the comparison the other way:

Re: Sorting NaNs

2018-06-02 Thread Richard Damon
On 6/2/18 7:35 PM, Steven D'Aprano wrote: > On Sat, 02 Jun 2018 17:28:28 +0200, Peter J. Holzer wrote: > >> On 2018-06-02 10:40:48 +, Steven D'Aprano wrote: >>> On Sat, 02 Jun 2018 09:32:05 +0200, Peter J. Holzer wrote: Also nope. It looks like NaNs just mess up sorting in an

Re: Sorting NaNs

2018-06-02 Thread Richard Damon
On 6/2/18 7:50 PM, Ned Batchelder wrote: > Careful, "same input" is vague.  In Python2, object() instances are > compared based on their id, in other words, their memory location. It > would be easy to overlook the idea that the layout in memory is part > of whether two inputs are "the same". > >

Re: Sorting NaNs

2018-06-02 Thread Steven D'Aprano
On Sat, 02 Jun 2018 21:51:16 +0100, Ben Bacarisse wrote: > Paul Rubin writes: > >> Steven D'Aprano writes: >>> it too will mess up sorting in unpredictable ways. So don't do that. >> >> Hmm. GHCi 7.4.2: >> >> Prelude> let x = 0.0 / 0.0 >> Prelude> x >> NaN >> Prelude> x==x >>

Re: Sorting NaNs

2018-06-02 Thread Ned Batchelder
On 6/2/18 6:16 PM, Richard Damon wrote: On 6/2/18 4:51 PM, Ben Bacarisse wrote: Paul Rubin writes: Steven D'Aprano writes: it too will mess up sorting in unpredictable ways. So don't do that. Hmm. GHCi 7.4.2: Prelude> let x = 0.0 / 0.0 Prelude> x NaN Prelude> x==x

Re: Sorting NaNs

2018-06-02 Thread Gregory Ewing
Peter J. Holzer wrote: If it was a deliberate decicion I would say it was intentional. It was a deliberate decision not to define an ordering for NaNs, but the particular behaviour of sorting them is accidental. -- Greg -- https://mail.python.org/mailman/listinfo/python-list

Re: Sorting NaNs

2018-06-02 Thread Steven D'Aprano
On Sat, 02 Jun 2018 17:28:28 +0200, Peter J. Holzer wrote: > On 2018-06-02 10:40:48 +, Steven D'Aprano wrote: >> On Sat, 02 Jun 2018 09:32:05 +0200, Peter J. Holzer wrote: >> > Also nope. It looks like NaNs just mess up sorting in an >> > unpredictable way. Is this the intended behaviour or

Re: Sorting NaNs

2018-06-02 Thread Richard Damon
On 6/2/18 4:51 PM, Ben Bacarisse wrote: > Paul Rubin writes: > >> Steven D'Aprano writes: >>> it too will mess up sorting in unpredictable ways. So don't do that. >> Hmm. GHCi 7.4.2: >> >> Prelude> let x = 0.0 / 0.0 >> Prelude> x >> NaN >> Prelude> x==x >> False >>

Re: Sorting NaNs

2018-06-02 Thread Ben Bacarisse
Paul Rubin writes: > Steven D'Aprano writes: >> it too will mess up sorting in unpredictable ways. So don't do that. > > Hmm. GHCi 7.4.2: > > Prelude> let x = 0.0 / 0.0 > Prelude> x > NaN > Prelude> x==x > False > Prelude> :m Data.List > Prelude Data.List> sort

Re: Sorting NaNs

2018-06-02 Thread Peter J. Holzer
On 2018-06-02 10:40:48 +, Steven D'Aprano wrote: > On Sat, 02 Jun 2018 09:32:05 +0200, Peter J. Holzer wrote: > > Also nope. It looks like NaNs just mess up sorting in an unpredictable > > way. Is this the intended behaviour or just an accident of > > implementation? (I think it's the latter:

Re: Sorting NaNs

2018-06-02 Thread Chris Angelico
On Sat, Jun 2, 2018 at 9:02 PM, Steven D'Aprano wrote: > The violation of reflexivity is weird though :-) > > x = float(NAN) > x == x # returns False > > Don't argue, just accept it :-) The way I learned it was: there are (of course) a finite number of bit patterns that can represent

Re: Sorting NaNs

2018-06-02 Thread Steven D'Aprano
hat list includes NaN, Correct. [...] > So call it an accident of implementation of you like. Or "sorting a list > with NaNs in it is meaningless" if you prefer. Or "undefined behaviour" > if you're a fan of the language in the C standard. While sorting NANs will return i

Re: Sorting NaNs

2018-06-02 Thread Chris Angelico
On Sat, Jun 2, 2018 at 8:40 PM, Steven D'Aprano wrote: > Python's sort algorithm assumes that objects have consistent, sensible > comparisons. If you write an object like this: > > > class Rubbish: > def __eq__(self, other): > return random.random() > 0.5 > def

Re: Sorting NaNs

2018-06-02 Thread Steven D'Aprano
On Sat, 02 Jun 2018 09:32:05 +0200, Peter J. Holzer wrote: > Also nope. It looks like NaNs just mess up sorting in an unpredictable > way. Is this the intended behaviour or just an accident of > implementation? (I think it's the latter: I can see how a sort algorithm > which doesn't treat NaN

Re: Sorting NaNs

2018-06-02 Thread Chris Angelico
On Sat, Jun 2, 2018 at 7:31 PM, Paul Moore wrote: > On 2 June 2018 at 08:32, Peter J. Holzer wrote: >> Browsing through older messages I came upon a thread discussing the >> treatment of NaNs by median(). Since you have to (partially) sort the >> values to compute the median, I played around

Re: Sorting NaNs

2018-06-02 Thread Paul Moore
On 2 June 2018 at 08:32, Peter J. Holzer wrote: > Browsing through older messages I came upon a thread discussing the > treatment of NaNs by median(). Since you have to (partially) sort the > values to compute the median, I played around with sorted(): > > Python 3.5.3 (default, Jan 19 2017,

Sorting NaNs

2018-06-02 Thread Peter J. Holzer
Browsing through older messages I came upon a thread discussing the treatment of NaNs by median(). Since you have to (partially) sort the values to compute the median, I played around with sorted(): Python 3.5.3 (default, Jan 19 2017, 14:11:04) [GCC 6.3.0 20170118] on linux Type "help",