Hi R-devel,

I have a question about the differentiation between NA and NaN values as implemented in R. In arithmetic.c, we have:

    int R_IsNA(double x)
    {
        if (isnan(x)) {
            ieee_double y;
            y.value = x;
            return (y.word[lw] == 1954);
        }
        return 0;
    }

ieee_double is just used for type punning so we can check the final bits (the low word) and see if they're equal to 1954; if they are, x is NA, and if they're not, x is NaN (as defined for R_IsNaN).

My question: I can see a substantial increase in speed (on my computer, in certain cases) if I replace this check with

    int R_IsNA(double x)
    {
        return memcmp(
            (char*)(&x),
            (char*)(&NA_REAL),
            sizeof(double)
        ) == 0;
    }

IIUC, there is only one bit pattern used to encode R NA values, so this should be safe. But I would like to be sure: is there any guarantee that the different functions in R return NA with a bit pattern identical to the one defined for NA_REAL, for a given architecture? Similarly for NaN value(s) and R_NaN?

My guess is that some functions used internally by R might encode NaN values differently, i.e., setting the lower word to a value other than 1954 (hence being NaN, but potentially not identical to R_NaN), or perhaps this is architecture-dependent. However, NA should be one specific bit pattern (?). And I wonder whether there is any guarantee that the different functions used in R return a NaN value identical to R_NaN (which appears to be the 'IEEE NaN')?

(Interested parties can see and run a simple benchmark from the gist at https://gist.github.com/kevinushey/8911432)

Thanks,
Kevin

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
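
As a standalone illustration of the question (a sketch, not R's source), the code below puts the low-word check next to a whole-value memcmp. It assumes a 64-bit IEEE 754 double and assumes NA_REAL has high word 0x7ff00000 and low word 1954; only the low word is stated above, so the high word is an assumption. The helper names (from_bits, to_bits, is_na_lowword, is_na_memcmp, quieted_na) are hypothetical. The quiet bit (bit 51) is set by hand to mimic what some hardware may do when a NaN passes through arithmetic, so the outcome does not depend on the platform's NaN handling.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Assumed bit pattern for NA_REAL: high word 0x7ff00000, low word 1954 (0x7A2). */
#define NA_BITS UINT64_C(0x7FF00000000007A2)

/* Build a double from raw bits; memcpy avoids undefined type punning. */
double from_bits(uint64_t bits)
{
    double x;
    assert(sizeof x == sizeof bits);
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Read the raw bits of a double. */
uint64_t to_bits(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Low-word check: any NaN whose low 32 bits equal 1954 counts as NA. */
int is_na_lowword(double x)
{
    return isnan(x) && (uint32_t)(to_bits(x) & UINT64_C(0xFFFFFFFF)) == 1954u;
}

/* Whole-value check: NA only if all 64 bits match the assumed pattern. */
int is_na_memcmp(double x)
{
    double na = from_bits(NA_BITS);
    return memcmp(&x, &na, sizeof x) == 0;
}

/* The assumed NA pattern with the quiet bit (bit 51) set by hand. */
double quieted_na(void)
{
    return from_bits(NA_BITS | (UINT64_C(1) << 51));
}
```

Under these assumptions, is_na_lowword(quieted_na()) is true while is_na_memcmp(quieted_na()) is false: the two checks agree only if every NA that reaches them carries exactly the NA_REAL bits, which is the guarantee being asked about.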