I wrote about this once over here: http://www.markvanderloo.eu/yaRb/2012/07/08/representation-of-numerical-nas-in-r-and-the-1954-enigma/
-M

On Sun, 23 May 2021 at 15:33, brodie gaslam via R-devel <r-devel@r-project.org> wrote:

> I should add, I don't know that you can rely on this
> particular encoding of R's NA. If I were trying to restore
> an NA from some external format, I would just generate an
> R NA via e.g. NA_real_ in the R session I'm restoring the
> external data into, and not try to hand assemble one.
>
> Best,
>
> B.
>
> On Sunday, May 23, 2021, 9:23:54 AM EDT, brodie gaslam via R-devel
> <r-devel@r-project.org> wrote:
>
> This is because the NA in question is NA_real_, which
> is encoded in double precision IEEE-754, which uses
> 64 bits. The "1954" is just part of the NA. The NA
> must also conform to the NaN encoding for double precision
> numbers, which requires that the "beginning" portion of
> the number be "0x7ff0" (well, I think it should be "0x7ff8"
> but that's a different story), as you can see here:
>
>     x.word[hw] = 0x7ff00000;
>     x.word[lw] = 1954;
>
> Both those components are part of the same double precision
> value. They are just accessed this way to make it easy to
> set the high bits (63-32) and the low bits (31-0).
>
> So NA is not just 1954, it's 0x7ff0 0000 & 1954 (note I'm
> mixing hex and decimals here).
>
> In IEEE 754 double precision encoding, numbers that start
> with 0x7ff are all NaNs (or infinity, when all the remaining
> bits are zero). The rest of the number, except for the first
> mantissa bit, which designates "quiet" vs "signaling" NaNs, can
> be anything. R has taken advantage of that to designate the
> R NA by setting the lower bits to be 1954.
>
> Note I'm being pretty loose about endianness, etc. here, but
> hopefully this conveys the problem.
>
> In terms of your proposal, I'm not entirely sure what you gain.
> You're still attempting to generate a 64 bit representation
> in the end. If all you need is to encode the fact that there
> was an NA, and restore it later as a 64 bit NA, then you can do
> whatever you want so long as the end result conforms to the
> expected encoding.
>
> In terms of using 'short' here (which again, I don't see the
> need for as you're using it to generate the final 64 bit encoding),
> I see two possible problems. You're adding the dependency that
> short will be 16 bits. We already have the (implicit) assumption
> in R that double is 64 bits, and explicit that int is 32 bits.
> But I think you'd be going a bit out on a limb assuming that short
> is 16 bits (not sure). More important, if short is indeed 16 bits,
> I think in:
>
>     x.word[hw] = 0x7ff0;
>
> You overflow short.
>
> Best,
>
> B.
>
> On Sunday, May 23, 2021, 8:56:18 AM EDT, Adrian Dușa
> <dusa.adr...@unibuc.ro> wrote:
>
> Dear R devs,
>
> I am probably missing something obvious, but still trying to understand why
> the 1954 from the definition of an NA has to fill 32 bits when it normally
> doesn't need more than 16.
>
> Wouldn't the code below achieve exactly the same thing?
>
> typedef union
> {
>     double value;
>     unsigned short word[4];
> } ieee_double;
>
> #ifdef WORDS_BIGENDIAN
> static CONST int hw = 0;
> static CONST int lw = 3;
> #else /* !WORDS_BIGENDIAN */
> static CONST int hw = 3;
> static CONST int lw = 0;
> #endif /* WORDS_BIGENDIAN */
>
> static double R_ValueOfNA(void)
> {
>     volatile ieee_double x;
>     x.word[hw] = 0x7ff0;
>     x.word[lw] = 1954;
>     return x.value;
> }
>
> This question has to do with the tagged NA values from package haven, on
> which I want to improve. Every available bit counts, especially if
> multi-byte characters are going to be involved.
>
> Best wishes,
> --
> Adrian Dusa
> University of Bucharest
> Romanian Social Data Archive
> Soseaua Panduri nr. 90-92
> 050663 Bucharest sector 5
> Romania
> https://adriandusa.eu
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel