On 8/10/2009 9:55 AM, Simon Urbanek wrote:
On Aug 10, 2009, at 5:47 , Duncan Murdoch wrote:

Petr Savicky wrote:
On Sat, Aug 08, 2009 at 10:39:04AM -0400, Prof. John C Nash wrote:

I'll save space and not include previous messages.

My 2 cents: At the very least the documentation needs a fix. If it is easy to do, then Ted Harding's suggestion of a switch (default OFF) to check for sign difference would be sensible.

I would urge inclusion in the documentation of the +0, -0 example(s) if there is NOT a way in R to distinguish these.


It is possible to distinguish 0 and -0 in R, since 1/0 == Inf and
1/(-0) == -Inf.

I do not know whether there are other such situations. In particular, the following do not distinguish them:
 (0)^(-1) == (-0)^(-1) # [1] TRUE
 log(0) == log(-0) # [1] TRUE
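
A minimal sketch of the 1/x test described above, wrapped in a helper. The name is_negative_zero is hypothetical and not part of base R; it only illustrates that the sign of a zero can be recovered from the sign of its reciprocal.

```r
# Hypothetical helper (not in base R): TRUE where x is a negative zero.
# It relies on 1/0 being Inf and 1/(-0) being -Inf.
is_negative_zero <- function(x) {
  x == 0 & 1 / x == -Inf
}

pos <- 0
neg <- -0

pos == neg              # TRUE: '==' treats the two zeros as equal
is_negative_zero(pos)   # FALSE
is_negative_zero(neg)   # TRUE
log(pos) == log(neg)    # TRUE: both are -Inf, so log() does not separate them
```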


There are occasions where it is useful to be able to detect things like this (and NaN and Inf and -Inf etc.). They are usually not of interest to users, but sometimes are needed for developers to check edge effects. For those cases it may be time to consider a package FPIEEE754 or some similar name to allow testing and possibly setting of flags for some of the fancier features. Likely used by just a few of us in extreme situations.


I think that distinguishing 0 and -0 may be useful even for non-expert
users for debugging purposes, mainly because x == y does not imply
that x and y behave equally, as demonstrated above or by
 x <- 0
 y <-  - 0
 x == y # [1] TRUE
 1/x == 1/y # [1] FALSE

I would like to recall the suggestion
 On Sat, Aug 08, 2009 at 03:04:07PM +0200, Martin Maechler wrote:
 > Maybe we should introduce a function that's basically
 > isTRUE(all.equal(..., tol=0))  {but faster},  or
 > do you want a 3rd argument to identical, say 'method'
 > with default  c("oneNaN", "use.==", "strict")
 >
 > oneNaN: my proposal of using memcmp() on doubles as it's used for
 >        other types already  (and hence distinguishing +0 and -0;
 >        otherwise keeping the feature that there's just one NaN
 >        which differs from 'NA' (and there's just one 'NA')).
 >
 > use.==: the previous R behaviour, using '==' on doubles
 >        (and the "oneNaN" behavior)
 >
 > strict: be even stricter than oneNaN:  Use  memcmp()
 >        unconditionally for doubles.  This would be the fastest
 >        version of all three.
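
To make the "strict" idea concrete, here is a rough R-level sketch of a bitwise comparison. It is not the proposed implementation: identical_bits is a hypothetical helper that imitates memcmp() by comparing the raw bytes produced by writeBin(), and it ignores attributes.

```r
# Hypothetical helper (not in base R): compare two double vectors by their
# raw byte representation, roughly what memcmp() would do at the C level.
# Attributes are ignored in this sketch.
identical_bits <- function(x, y) {
  stopifnot(is.double(x), is.double(y))
  length(x) == length(y) &&
    identical(writeBin(x, raw()), writeBin(y, raw()))
}

identical_bits(0, 0)     # TRUE
identical_bits(0, -0)    # FALSE: only the sign bit differs
0 == -0                  # TRUE: '==' cannot see the difference
```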

In my opinion, for debugging purposes, the option identical(x,y,method="strict"), which implies that x and y behave equally, could be useful if it is available in base R.

At the R interactive level, negative zero as the value of -0 could possibly be avoided. However, negative zero may also occur in numerical calculations, since it may be obtained as x * 0, where x is negative. So, I think, negative zero cannot be dismissed as too infrequent to matter.
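
A short illustration of the point that negative zero can arise without anyone typing -0. The variable names are arbitrary; the only assumption is IEEE 754 double arithmetic, which R uses on common platforms.

```r
x <- -3
z <- x * 0      # a negative number times zero gives negative zero

z == 0          # TRUE: compares equal to ordinary zero
1 / z           # -Inf: but it behaves differently as a divisor
1 / (z + 0)     # Inf: adding +0 to -0 gives +0 under round-to-nearest
```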

I wouldn't mind a "strict" option. It would compare bit patterns, so would distinguish +0 from -0, and different NaN values. But having the value of identical(x-y, -(y-x)) depend on whether x and y are equal or not would just lead to confusion.

... but so do other things routinely, such as floating-point arithmetic, so I don't think this is a strong argument here. IMHO identical(0, -0) should return FALSE, because they are simply not the same objects and that's what identical is supposed to test for. If you want to test equality of elements there are other means you should be using that were mentioned in this thread.

+0 and -0 are exactly equal, which is what identical is documented to be testing. They are not indistinguishable, and not identical in the English meaning of the word, but they are identical in the sense of what the identical() function is documented to test.

The cases where you want to distinguish between them are rare. They should not be distinguished in the default identical() test, any more than different values of NaN should be distinguished (and identical() is explicitly documented *not* to distinguish those).

Of the 1600 uses of identical() in the R base plus recommended packages, there are lots of cases where equality of elements is clearly the intention. There are almost no uses of the all.equal(..., tol=0) idiom in base R, and among the recommended packages, only Matrix uses it (but uses identical() for values as well, I think.)

Distinguishing between different NaN values might be harmless, because we probably only generate one. (I'm not sure about that; the literal NaN might be different from sqrt(-1) or 0/0. But I'd guess only one comes up in normal usage.) But we definitely generate both +0 and -0 all the time, and distinguishing between them would mean identical() would be useless for value-based comparison. Do you want to evaluate all 1600 uses in the base and recommended packages, and who knows how many on CRAN, to figure out which ones should be changed to all.equal(..., tol=0)? I don't.
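
For anyone who wants to check these claims on their own machine, here is a hedged sketch. bits_of is a hypothetical helper that prints the bytes of a double in native byte order. The zero results follow from IEEE 754 arithmetic; the NaN bit patterns are platform-dependent, so no output is claimed for them.

```r
# Hypothetical helper (not in base R): hex bytes of a double, native order.
bits_of <- function(v) paste(writeBin(v, raw()), collapse = " ")

# Two algebraically equivalent expressions that yield different zeros
# when x equals y.
x <- 2
y <- 2
a <- x - y        # +0
b <- -(y - x)     # -0

a == b                      # TRUE
bits_of(a) == bits_of(b)    # FALSE: a bitwise test would separate them

# NaN payloads may differ by platform and compiler; inspect, don't assume.
bits_of(NaN)
bits_of(0 / 0)
bits_of(suppressWarnings(sqrt(-1)))
```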

Duncan Murdoch

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
