Re: [R] bigmemory: Using backing file as alternate to write.big.matrix

2013-03-21 Thread Shraddha Pai
OK, I ran a test where I did both: wrote a ~6M x 58 double matrix as a .txt file
(write.big.matrix), and also left the backing file + descriptor file in place
(rather than deleting them as I usually do). I then opened a different R session
and compared the first 100 rows of both; they appear identical.
Size-wise, the .bin file is over twice the size of the .txt file (here .bin
was 2,641 MB and .txt was 1,184 MB).
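
The size gap is roughly what one would expect if the backing file holds raw 8-byte doubles with no compression. That is an assumption, not something checked against the bigmemory internals, and the row count below is the rounded ~6M figure rather than the exact one. A back-of-envelope check:

```r
# Rough size estimate for a file-backed double matrix.
# Assumptions: 8 bytes per element, no header or compression in the .bin file,
# and ~6e6 rows (the exact row count is not given in the message).
n_rows <- 6e6
n_cols <- 58
bytes_per_double <- 8

bin_mb <- n_rows * n_cols * bytes_per_double / 2^20
round(bin_mb)   # 2655, close to the observed 2,641 MB

# Average characters per value (separator included) implied by the 1,184 MB
# text file: small, which is why the text version ends up smaller here.
txt_chars_per_value <- 1184 * 2^20 / (n_rows * n_cols)
round(txt_chars_per_value, 1)
```

So the binary file costs a fixed 8 bytes per value, while the text file's size depends on how many digits each value prints with.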

So my conclusion is this: if the matrix will be read often by downstream
programs, keep the .bin backing file and its descriptor file. Code that reads
the matrix can just attach it, which is super fast (0.002s elapsed; in
contrast, using read.big.matrix to read the .txt version took 76s on my
machine).
If space is a constraint and the matrix isn't expected to be read very
often, then save it as a text file and read it using read.big.matrix.
-
library(bigmemory)
m <- attach.big.matrix("rawXpr.desc")  # attach via descriptor file -- super fast
# same content saved as txt; read only the first 100 rows for the test
n <- read.table("rawXpr.txt", sep="\t", header=FALSE, as.is=TRUE, nrows=100)
n <- as.matrix(n)  # was a data.frame before
# compare row by row; values only, ignoring any row/column names
sapply(seq_len(nrow(n)), function(i) {
  isTRUE(all.equal(n[i,], m[i,], check.attributes=FALSE))
})

  [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
TRUE
 [16] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
TRUE
 [31] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
TRUE
 [46] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
TRUE
 [61] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
TRUE
 [76] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
TRUE
 [91] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
-
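
For anyone who wants to repeat the attach-versus-read comparison on their own machine, here is a small self-contained sketch. The matrix size, directory, and file names (demo.bin, demo.desc, demo.txt) are made up for illustration, and a toy matrix like this will not reproduce the 0.002s vs 76s gap; it only shows how the two paths could be timed and sized.

```r
library(bigmemory)

# Hypothetical scratch directory and file names, for illustration only.
dir <- tempfile("bm_demo_")
dir.create(dir)

# Small file-backed matrix of whole numbers stored as doubles.
x <- filebacked.big.matrix(nrow = 1000, ncol = 58, type = "double",
                           backingfile = "demo.bin",
                           descriptorfile = "demo.desc",
                           backingpath = dir)
x[, ] <- matrix(round(runif(1000 * 58) * 100), nrow = 1000)
flush(x)  # push pending changes to the backing file

# Text copy of the same data.
txt <- file.path(dir, "demo.txt")
write.big.matrix(x, txt, sep = "\t")

# Time both ways of getting the matrix back.
t_attach <- system.time(m <- attach.big.matrix(file.path(dir, "demo.desc")))
t_read   <- system.time(n <- read.big.matrix(txt, sep = "\t", type = "double"))
t_attach["elapsed"]
t_read["elapsed"]

# Compare contents and on-disk sizes (bytes).
all.equal(m[, ], n[, ])
file.size(file.path(dir, "demo.bin"))
file.size(txt)
```

Passing the descriptor as a full path assumes a reasonably recent bigmemory; older versions may need the directory supplied separately.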






--
View this message in context: 
http://r.789695.n4.nabble.com/bigmemory-Using-backing-file-as-alternate-to-write-big-matrix-tp4661958p4662055.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] bigmemory: Using backing file as alternate to write.big.matrix

2013-03-20 Thread Shraddha Pai
Hi,
Does the backing file of a big.matrix store the contents of the entire matrix?
Or does it store only the portion that is not held in RAM? In other
words, can the backing file be treated as a file containing the matrix's
full data?

I have been writing my big.matrix objects to disk (write.big.matrix), and
other programs that want to access this matrix then just read it in
(read.big.matrix). A colleague pointed out that the explicit write
operation is unnecessary as the contents of the matrix are already in the
backing file. If the backing file is stored in the appropriate directory,
then future reads of the matrix can proceed directly from the matrix's
descriptor file (attach.big.matrix). Seems to make sense; no need to write
out in text format.
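
A minimal sketch of the workflow described above, assuming the matrix is created with filebacked.big.matrix and later attached from its descriptor file. The directory and file names (expr.bin, expr.desc) are hypothetical, and both the "writer" and "reader" steps run in one session here only to keep the example self-contained:

```r
library(bigmemory)

# Hypothetical storage location and file names, chosen only for this example.
backing_dir <- tempfile("bm_store_")
dir.create(backing_dir)

# "Writer" side: create the matrix directly on disk and fill it.
x <- filebacked.big.matrix(nrow = 5, ncol = 3, type = "double",
                           backingfile = "expr.bin",
                           descriptorfile = "expr.desc",
                           backingpath = backing_dir)
x[, ] <- matrix(as.numeric(1:15), nrow = 5)
flush(x)  # ask bigmemory to push pending changes to the backing file

# "Reader" side (normally a separate R session): attach via the descriptor.
y <- attach.big.matrix(file.path(backing_dir, "expr.desc"))
dim(y)
y[2, ]  # second row, read through the attached matrix
```

Note that the reader needs both the descriptor file and the backing file to stay where they are; the descriptor alone does not contain the data.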

I just want to confirm with people more familiar with the internals of
big.matrix that the backing file is indeed a safe way to persistently store big
matrices. Looking at the documentation gave me pause because the backing
file is described as a cache.

Thanks in advance,
Shraddha
-
Shraddha Pai
Post-doctoral fellow
Krembil Family Epigenetic Research Laboratory (Lab head: Dr. Art Petronis)
Centre for Addiction and Mental Health, Toronto



 
