On Sun, 4 Jan 2015, Duncan Murdoch wrote:

On 04/01/2015 5:13 PM, Mike Miller wrote:
The help page for readBin/writeBin tells me this:

Handling R's missing and special (Inf, -Inf and NaN) values is discussed
in the ‘R Data Import/Export’ manual.

So I go here:

http://cran.r-project.org/doc/manuals/r-release/R-data.html#Special-values

Unfortunately, I don't really understand that.  Suppose I am using
single-byte integers and I want 255 (binary 11111111) to be translated to
NA.  Is it possible to do that?  Of course I could always do something
like this:

X[ X==255 ] <- NA

The problem with that is that I want to process the data on the fly,
dividing the integer to produce a double in the range from 0 to 2:

X <- readBin( file, what="integer", n=N, size=1, signed=FALSE)/127

Why?  Why not do it in three steps, i.e.

X <- readBin( file, what="integer", n=N, size=1, signed=FALSE)
X[ X==255 ] <- NA
X <- X/127

If you are worried about the extra typing, then write a function to handle all three steps.
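
A minimal sketch of such a wrapper, for illustration only. The name read_scaled and its sentinel and divisor arguments are hypothetical, not part of base R. Because the comparison is done while the values are still integers, the == test is exact.

```r
# Hypothetical helper: read unsigned single-byte integers, turn a sentinel
# value into NA, then rescale to double.
read_scaled <- function(con, n, sentinel = 255L, divisor = 127) {
  x <- readBin(con, what = "integer", n = n, size = 1, signed = FALSE)
  x[x == sentinel] <- NA   # x is still integer here, so == is exact
  x / divisor              # NA stays NA; the result is a double vector
}

# Small usage example with a temporary file holding four bytes
tmp <- tempfile()
writeBin(as.raw(c(0, 127, 254, 255)), tmp)
X <- read_scaled(tmp, n = 4)
print(X)
unlink(tmp)
```

The last byte (255) comes back as NA and the others are divided by 127.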

The thing I was concerned about is the memory usage, not the typing, because everything will be scripted. But maybe memory isn't an issue and I never have to hold two copies in memory simultaneously. There will be about 50 million elements, typically.
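
For scale: R stores integers in 4 bytes and doubles in 8, so 50 million elements come to roughly 200 MB as an integer vector and 400 MB as a double vector. If holding both at once does turn out to be a problem, one possible approach is to read from an open connection in chunks and fill a preallocated double vector. This is only a sketch: the function name, the chunk size, and the sentinel and divisor arguments are all assumptions.

```r
# Hypothetical chunked reader: only one chunk of integers is held in memory
# alongside the preallocated double result.
read_scaled_chunked <- function(path, n, chunk = 1000000L,
                                sentinel = 255L, divisor = 127) {
  out <- numeric(n)              # preallocate the final double vector
  con <- file(path, "rb")
  on.exit(close(con))
  pos <- 0
  while (pos < n) {
    x <- readBin(con, what = "integer", n = min(chunk, n - pos),
                 size = 1, signed = FALSE)
    if (length(x) == 0) break    # file ended before n values were read
    x[x == sentinel] <- NA       # exact integer comparison
    out[pos + seq_along(x)] <- x / divisor
    pos <- pos + length(x)
  }
  if (pos < n) out <- out[seq_len(pos)]  # trim if the file was short
  out
}

# Usage with a tiny file and a deliberately small chunk size
tmp2 <- tempfile()
writeBin(as.raw(c(0, 127, 254, 255, 127)), tmp2)
Y <- read_scaled_chunked(tmp2, n = 5, chunk = 2L)
print(Y)
unlink(tmp2)
```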

I think in terms of processing numbers as they stream into memory, but that might not be what R is doing. For example, with scan() and na.strings="NA", I picture it changing strings to NA as they are read, but it might instead load the whole file as character and then do all the work with things like what=numeric() and na.strings="NA" after the fact. Maybe that doesn't impose an extra memory burden.


It looks like this still works:

X[ X==255/127 ] <- NA

I suspect that would work on all current platforms, but I wouldn't trust it. Don't use == on floating point values unless you know they are fractions with 2^n in the denominator.
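
A short illustration of why == is fragile on doubles, followed by one possible tolerance-based alternative for data that have already been divided. The tolerance value is an assumption chosen for this example. It is far smaller than the 1/127 spacing between adjacent scaled values, so it cannot catch a neighbouring value by mistake.

```r
# Decimal fractions are generally not exact in binary floating point
print(0.1 + 0.2 == 0.3)                    # FALSE with IEEE 754 doubles
print(isTRUE(all.equal(0.1 + 0.2, 0.3)))   # TRUE: compares with a tolerance

# Fractions with a power of two in the denominator are exact
print(0.25 + 0.5 == 0.75)                  # TRUE

# Tolerance-based replacement on already-scaled values
X <- c(0, 127, 254, 255) / 127
tol <- 1e-9                                # assumed tolerance
X[abs(X - 255/127) < tol] <- NA
print(X)
```

Replacing the sentinel while the data are still integers avoids the question entirely.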

Good point about platforms. I was concerned about the use of ==, and you've convinced me it is not trustworthy.

Thanks very much.

Mike
______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.