Re: [Rd] identical(0, -0)

Duncan Murdoch Mon, 10 Aug 2009 07:21:35 -0700

On 8/10/2009 9:55 AM, Simon Urbanek wrote:

On Aug 10, 2009, at 5:47 , Duncan Murdoch wrote:
Petr Savicky wrote:
On Sat, Aug 08, 2009 at 10:39:04AM -0400, Prof. John C Nash wrote:
I'll save space and not include previous messages.
My 2 cents: At the very least the documentation needs a fix. If it is easy to do, then Ted Harding's suggestion of a switch (default OFF) to check for sign difference would be sensible.
I would urge inclusion in the documentation of the +0, -0 example(s) if there is NOT a way in R to distinguish these.
It is possible to distinguish 0 and -0 in R, since 1/0 == Inf and
1/(-0) == -Inf.
I do not know whether there are also other such situations. In particular
 (0)^(-1) == (-0)^(-1) # [1] TRUE
 log(0) == log(-0) # [1] TRUE
There are occasions where it is useful to be able to detect things like this (and NaN and Inf and -Inf etc.). They are usually not of interest to users, but sometimes are needed for developers to check edge effects. For those cases it may be time to consider a package FPIEEE754 or some similar name to allow testing and possibly setting of flags for some of the fancier features. Likely used by just a few of us in extreme situations.
I think that distinguishing 0 and -0 may be useful even for nonexpert
users for debugging purposes. Mainly, because x == y does not imply
that x and y behave equally as demonstrated above or by
 x <- 0
 y <-  - 0
 x == y # [1] TRUE
 1/x == 1/y # [1] FALSE
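
The check above can be wrapped in a small function. This is only a sketch: the name is_neg_zero is hypothetical (it is not part of base R), and it relies solely on the fact noted above that 1/(-0) is -Inf.

```r
# Hypothetical helper (not in base R): TRUE for elements that are negative zero.
# x == 0 holds for both zeros, but 1/x is negative only for -0.
is_neg_zero <- function(x) {
  is.numeric(x) & !is.na(x) & x == 0 & 1 / x < 0
}

x <- 0
y <- -0
x == y               # TRUE: '==' does not see the sign of zero
is_neg_zero(x)       # FALSE
is_neg_zero(y)       # TRUE
is_neg_zero(-1 * 0)  # TRUE: negative zero also arises from arithmetic
```

The helper is vectorised, so it could be applied to a whole numeric vector when hunting for sign-of-zero effects.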

I would like to recall the suggestion
 On Sat, Aug 08, 2009 at 03:04:07PM +0200, Martin Maechler wrote:
 > Maybe we should introduce a function that's basically
 > isTRUE(all.equal(..., tol=0))  {but faster},  or
 > do you want a 3rd argument to identical, say 'method'
 > with default  c("oneNaN", "use.==", "strict")
 >
 > oneNaN: my proposal of using memcmp() on doubles as it's used for
 >        other types already  (and hence distinguishing +0 and -0;
 >      otherwise keeping the feature that there's just one NaN
 >      which differs from 'NA' (and there's just one 'NA').
 >
 > use.==: the previous R behaviour, using '==' on doubles
 >        (and the "oneNaN" behavior)
 >
 > strict: be even stricter than oneNaN:  Use  memcmp()
 >   unconditionally for doubles.  This would be the fastest
 >   version of all three.
In my opinion, for debugging purposes, the option identical(x, y, method="strict"),
which implies that x and y behave equally, could be useful if it is available
in R base.
At the R interactive level, negative zero as the value of -0 could possibly be avoided. However, negative zero may also occur in numerical calculations, since it may be obtained as x * 0, where x is negative. So, I think, negative zero cannot be eliminated from consideration as something too infrequent.
I wouldn't mind a "strict" option. It would compare bit patterns, so would distinguish +0 from -0, and different NaN values. But having the value of identical(x-y, -(y-x)) depend on whether x and y are equal or not would just lead to confusion.
... but so do other things routinely, such as floating point arithmetic, so I don't think this is a strong argument here. IMHO identical(0, -0) should return FALSE, because they are simply not the same objects and that's what identical is supposed to test for. If you want to test equality of elements there are other means you should be using that were mentioned in this thread.
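
For concreteness, the identical(x-y, -(y-x)) case can be sketched as follows. The functions same_value and strict_same are hypothetical names used only for illustration. The num.eq argument is an assumption about the R version in use: it exists in later releases of identical(), not in the R that was current when this thread was written.

```r
# Compare x - y with -(y - x) using the default identical().
same_value <- function(x, y) identical(x - y, -(y - x))

same_value(2, 1)   # TRUE: both sides are 1
same_value(1, 1)   # TRUE, although x - y is +0 and -(y - x) is -0

# The two zeros remain distinguishable by division:
1 / (1 - 1)        # Inf
1 / (-(1 - 1))     # -Inf

# Assumption: identical() has the 'num.eq' argument (later R releases).
# With num.eq = FALSE doubles are compared bitwise rather than with '=='.
strict_same <- function(x, y) identical(x - y, -(y - x), num.eq = FALSE)

strict_same(2, 1)  # TRUE
strict_same(1, 1)  # FALSE: the answer now depends on whether x equals y
```

Under a bitwise comparison the result flips exactly when x and y are equal, which is the behaviour being argued about here.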

+0 and -0 are exactly equal, which is what identical is documented to be testing. They are not indistinguishable, and not identical in the English meaning of the word, but they are identical in the sense of what the identical() function is documented to test.

The cases where you want to distinguish between them are rare. They should not be distinguished in the default identical() test, any more than different values of NaN should be distinguished (and identical() is explicitly documented *not* to distinguish those).

Of the 1600 uses of identical() in the R base plus recommended packages, there are lots of cases where equality of elements is clearly the intention. There are almost no uses of the all.equal(..., tol=0) idiom in base R, and among the recommended packages, only Matrix uses it (but uses identical() for values as well, I think.)
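
As a reminder of what that idiom does, here is a minimal sketch. The wrapper name exactly_equal is hypothetical, and tolerance is the full name of the argument abbreviated as tol above.

```r
# Hypothetical wrapper around the all.equal(..., tol=0) idiom.
exactly_equal <- function(x, y) isTRUE(all.equal(x, y, tolerance = 0))

exactly_equal(0, -0)         # TRUE: values compared with '==', sign of zero ignored
exactly_equal(1, 1 + 1e-10)  # FALSE: no tolerance for small differences
identical(0, -0)             # TRUE under the documented default behaviour
```

Both calls treat +0 and -0 as the same value, which is why identical() can currently stand in for the idiom in value-based comparisons.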

Distinguishing between different NaN values might be harmless, because we probably only generate one. (I'm not sure about that, the literal NaN might be different from sqrt(-1) or 0/0. But I'd guess only one comes up in normal usage.) But we definitely generate both +0 and -0 all the time, and distinguishing between them would mean identical() would be useless for value-based comparison. Do you want to evaluate all 1600 uses in the base and recommended package, and who knows how many on CRAN, to figure out which ones should be changed to all.equal(..., tol=0)? I don't.
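
The guess about NaN values could be checked with something like the sketch below. The helper name bits is hypothetical; it uses writeBin() to expose the raw bytes of a double. Whether the NaN byte patterns agree is platform dependent, so no result is claimed for those lines.

```r
# Hypothetical helper: raw bytes of a double, used only to inspect bit patterns.
bits <- function(x) writeBin(x, raw())

nan_literal <- NaN
nan_sqrt <- suppressWarnings(sqrt(-1))  # sqrt(-1) warns that NaNs were produced
nan_div <- 0 / 0

# identical() is documented not to distinguish NaN values by default.
identical(nan_literal, nan_div)         # TRUE

# Platform dependent: do the underlying bytes agree?
identical(bits(nan_literal), bits(nan_sqrt))
identical(bits(nan_literal), bits(nan_div))

# The two zeros, by contrast, always differ in the sign bit.
identical(bits(0), bits(-0))            # FALSE
```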


Duncan Murdoch

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
