The OP may be interested in using the low-level readBin() and writeBin() instead. One can then assign dimension attributes to the object to access the data as a matrix/array. Note that assigning dimension attributes will probably(?) allocate a copy. If that is not wanted, it is not that hard to translate matrix/array indices into 1D-vector indices and vice versa, cf. arrayIndex() of R.utils. readBin()/writeBin(), together with seek() on a file connection, would also allow you to use random access to any part of the data [which read.table() won't].
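A minimal sketch of the idea, not a tested solution for multi-Gb files: it assumes the matrix is stored as one byte per element in column-major order with no header. The file name, the dimensions and the helper read_element() are made up for illustration.

```r
## Illustrative only: 'pathname', 'nr', 'nc' and read_element() are hypothetical.
nr <- 4L
nc <- 3L
x <- matrix(as.raw(1:(nr * nc)), nrow = nr, ncol = nc)

pathname <- tempfile(fileext = ".bin")

## Write the raw matrix as nr*nc bytes (column-major, no separators).
con <- file(pathname, open = "wb")
writeBin(as.vector(x), con)
close(con)

## Read everything back as a raw vector and restore the dimensions.
con <- file(pathname, open = "rb")
y <- readBin(con, what = "raw", n = nr * nc)
close(con)
dim(y) <- c(nr, nc)
stopifnot(identical(x, y))

## Random access: translate (i, j) into a zero-based byte offset
## and read a single element without loading the rest of the file.
read_element <- function(pathname, i, j, nr) {
  ## Use doubles so the offset does not overflow for large files.
  offset <- (as.numeric(j) - 1) * nr + (as.numeric(i) - 1)
  con <- file(pathname, open = "rb")
  on.exit(close(con))
  seek(con, where = offset, origin = "start")
  readBin(con, what = "raw", n = 1L)
}

stopifnot(identical(read_element(pathname, 2, 3, nr), x[2, 3]))
```

Since the file is nothing but the bytes themselves, other programs can read it too, as long as they are told the dimensions and the column-major layout separately.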
To me it sounds a bit odd to use read.table() for "raw" data; it is intended to be used with (ASCII only?) text files that do not have a fixed number of symbols per entry and row, but instead rely on separators and newlines to identify the rectangular structure. readBin()/writeBin() would certainly provide a more compact file format. Also, I do not think read.table() would allow you to use NUL (\000), because that is reserved as the end-of-string symbol.

My $.02

/Henrik

On Fri, Feb 12, 2010 at 1:33 AM, jim holtman <jholt...@gmail.com> wrote:
> What you might consider is to use save/load for storing the data in a
> format that is easily accessible in R, and then using write.table for
> creating a character based output for other external programs. For
> the size files you are working with, this is the easiest and fastest
> way of doing it.
>
> On Thu, Feb 11, 2010 at 4:08 PM, Johan Jackson
> <johan.h.jack...@gmail.com> wrote:
>> Apologies for my sarcastic/defensive reply email Peter.
>>
>> The issue is that I need this matrix to be read into other programs - not
>> just R, so save() won't work. I like 'raw' mode because it saves so much
>> space, but it's difficult to work with. This read/write issue is but one
>> example; another is that R will try to convert the raw matrix to, e.g.,
>> double, if you forget and assign any element of it to be double (personally,
>> I'd prefer there to be an option, set in options(), for R to downcast the
>> variable to raw and give you a warning).
>>
>> Anyway, I've been working with R a bit, but I've come to the conclusion that
>> it is just not user-friendly when it comes to large datasets. I've tried
>> some of the large data packages but at least all that I've tried have their
>> own sets of issues. As much as it pains me to say it, I may go back to SAS
>> when working on such projects...
>>
>> Best,
>>
>> JJ
>>
>> On Thu, Feb 11, 2010 at 1:19 PM, Peter Ehlers <ehl...@ucalgary.ca> wrote:
>>
>>> Johan,
>>>
>>> My apologies if you took my comments to be sarcastic; they were
>>> certainly not meant to be. I have no desire to put you or anyone
>>> down.
>>>
>>> I see now that you want to somehow store data more 'efficiently',
>>> presumably in order to be able to handle larger objects in RAM.
>>>
>>> I doubt that storage.mode raw will help. Your post implied that
>>> you had saved an object and couldn't read it back into the same
>>> format in which you think it was saved. So, did you have 16Gb
>>> object to save? And why wouldn't you use save()? It's just a
>>> guess, but I think you may have a file of _character_ data that
>>> you want to read into R where its storage mode should be 'raw'.
>>> I don't know how to do that.
>>>
>>> If the main purpose is to circumvent R's memory requirements,
>>> then there have been plenty of posts on that issue.
>>>
>>> -Peter Ehlers
>>>
>>> Johan Jackson wrote:
>>>
>>>> "I suspect that you really don't know what 'raw' type means and haven't
>>>> bothered to check ?raw. It's also pretty clear that you haven't read the
>>>> colClasses description in ?read.table very carefully."
>>>>
>>>> Gee, thanks Peter (this is what I love about the R help boards: people whose
>>>> sole goal is to put others down as wittily as possible for asking *stupid
>>>> stupid* questions). Gives me warm fuzzies :)
>>>>
>>>> Although I admit to not being the brightest of folks around, or knowing R
>>>> backwards and forwards, I did read ?read.table and ?raw. But your suggestion
>>>> is not at all helpful Peter:
>>>>
>>>> dat <- read.table(file="data", header=TRUE, colClasses="character") #wow! it works on a 5x3 matrix! amazing!! (sarcasm)
>>>>
>>>> dat2 <- as.matrix(dat)
>>>> storage.mode(dat2) <- 'raw'
>>>>
>>>> if I had wanted 'character' data, I would have put that into my question.
>>>> Any newbie can do what you did; the issue is that object.size(dat) is about
>>>> 8 times larger than object.size(dat2) with any large dataset. That's why I
>>>> want to store it as 'raw' - because the raw one takes about 2 Gb RAM and the
>>>> other about 16Gb! Perhaps you need to understand the raw mode a bit better,
>>>> Peter, because I thought the reason for wanting the data in 'raw' was quite
>>>> obvious, but I guess not.
>>>>
>>>> Peter, here's what I want you to do. Use R to make a vector with 2^31 - 5
>>>> elements in it. Hey, make it of mode 'character' while you're at it! Write
>>>> it out. Read it back in. Having problems? Then come talk to me...
>>>>
>>>> JJ
>>>>
>>>> [....]
>>>
>>> --
>>> Peter Ehlers
>>> University of Calgary
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem that you are trying to solve?