It is not very surprising that the R process might crash once the maximum memory limit is reached. View anything done in a session after that as suspect. (The Unix equivalent is often to crash without even telling you that you are out of memory.)

On Mon, 7 Jun 2004 [EMAIL PROTECTED] wrote:

> I'm consistently seeing R crash with a particular large data set. What's
> strange is that although the crash seems related to running out of memory,
> I'm unable to construct a pseudo-random data set of the same size that also
> causes the crash. Further adding to the strangeness is that the crash only
> happens if the dataset goes through a save()/load() cycle -- without that,
> the command in question just gives an out-of-memory error, but does not crash.
>
> To make this clear, three different versions of the same data consistently
> produce very different behavior:
>
> (1) original data read with read.table: memory error; fail to allocate
>     164062 Kb
> (2) original data through save()/load() cycle: memory error; fail to
>     allocate 82031 Kb, followed by crash
> (3) pseudo-random data of same size and similar characteristics: works
>     without problem
>
> This is with R-1.9.0 under Windows 2000. I'm not loading any optional
> packages. I get the same crash behavior with R-1.9.0 patched, and R-2.0.0
> alpha, but I didn't test success with the pseudo-random data under those
> programs. (In case it matters, I got R-1.9.0 patched and R-2.0.0 alpha as
> pre-compiled Windows binaries from http://cran.us.r-project.org/ at 9:30am
> MDT on Jun 7, 2004.) Unfortunately, I don't have sufficient knowledge of
> how to debug memory problems in R to make further progress than I've made
> here, but maybe the following will provide some clues for someone else.
>
> All the following transcripts are from Rgui.exe, with new runs at each
> comment beginning with "###"
>
> ### Read in the data and get an out-of-memory error (but no crash)
> > # ClassifyTrain.txt is from http://mill.ucsd.edu/data/ClassifyTrain.zip
> > X <- read.table("ClassifyTrain.txt", skip=2)
> > X1 <- as.matrix(X)
> > hist(log(X1[,-(1:2)]+1))
> Error: cannot allocate vector of size 164062 Kb
> In addition: Warning message:
> Reached total allocation of 1024Mb: see help(memory.size)
> >
>
> ### Read in the data and save it as a .RData file for faster runs
> ### (I initially did this for speed, but this seems to be essential
> ### to causing the crash)
> > # ClassifyTrain.txt is from http://mill.ucsd.edu/data/ClassifyTrain.zip
> > X <- read.table("ClassifyTrain.txt", skip=2)
> > X1 <- as.matrix(X)
> > c(class(X1), storage.mode(X1), dim(X1))
> [1] "matrix" "double" "30000"  "702"
> > save(list="X1", file="X1.RData")
>
> ### Produce the crash
> > version
>          _
> platform i386-pc-mingw32
> arch     i386
> os       mingw32
> system   i386, mingw32
> status
> major    1
> minor    9.0
> year     2004
> month    04
> day      12
> language R
> >
> > load("X1.RData")
> > c(class(X1), storage.mode(X1), dim(X1))
> [1] "matrix" "double" "30000"  "702"
> > # all of the following 3 commands consistently cause a crash
> > hist(log(X1[,-(1:2)]+1))
> > hist(log(X1[,-(1:2)]+1), breaks=seq(0,13,0.5))
> > hist(log(X1[,-(1:2)]+1), breaks=seq(0,13,0.5), plot=F)
> Error: cannot allocate vector of size 82031 Kb
> In addition: Warning message:
> Reached total allocation of 1024Mb: see help(memory.size)
>
> [message that comes in a Windows dialog box after a wait of many seconds:]
>
> R Console: Rgui.exe - Application Error
> The exception unknown software exception (0xc00000fd) occured in the
> application at location 0x6b5b0a53
>
> #### The following is a failed attempt to reproduce the crash with
> #### pseudo-random data, i.e., R functions correctly (even when X1 is
> #### in memory)
> >
> > # Look at some characteristics of the original data in
> > # order to produce a matrix of similar pseudo-random numbers.
> > load("X1.RData")
> > dim(X1)
> [1] 30000   702
> > class(X1)
> [1] "matrix"
> > storage.mode(X1)
> [1] "double"
> > table(is.na(X1))
>
>    FALSE
> 21060000
> > table(X1==0)
>
>    FALSE     TRUE
>  2284455 18775545
> > exp(diff(log(table(X1==0))))
>     TRUE
> 8.218829
> > table(X1>=0)
>
>     TRUE
> 21060000
> > range(X1)
> [1]      0 326022
> > memory.limit()
> [1] 1073741824
> > memory.limit()/2^20
> [1] 1024
> > object.size(X1)/2^20
> [1] 161.0267
> >
> > set.seed(1)
> > X <- matrix(rexp(30000 * 702, 5e-5) * rbinom(30000 * 702, 1, 1/8), ncol=702)
> > range(X)
> [1] 3.615044e-04 3.249415e+05
> >
> > # Both of these commands seem to work without problems
> > hist(log(X[,-(1:2)]+1))
> > hist(log(X[,-(1:2)]+1), breaks=seq(0,13,0.5))
>
> ______________________________________________
> [EMAIL PROTECTED] mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-devel
>

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
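
The two failed allocation sizes in the transcript can be checked against the reported matrix dimensions. The following is only a rough sketch in R, assuming 8 bytes per double and 4 bytes per integer or logical element. The variable names are made up for illustration. Reading the 82031 Kb request as a 4-byte-per-element vector is an inference from the arithmetic, not something stated in the thread.

```r
# Sketch: compare the failed allocation sizes with the matrix dimensions.
# Assumption: 8 bytes per double, 4 bytes per integer/logical element.
n_row <- 30000
n_col <- 702

# X1[, -(1:2)] drops the first two columns, leaving 700.
n_elem <- n_row * (n_col - 2)

# Size in Kb of one vector of that length, truncated to a whole number.
kb_double <- floor(n_elem * 8 / 1024)  # double vector
kb_int    <- floor(n_elem * 4 / 1024)  # integer or logical vector

print(c(kb_double = kb_double, kb_int = kb_int))
```

Under these assumptions the two values come out as 164062 and 82031, the same figures as in the two error messages. That would suggest the first run failed while allocating a double copy of the 700-column subset, and the second while allocating a 4-byte-per-element vector of the same length. Either request is small next to the 1024 Mb limit, so it is the running total of allocations, as the warning says, that hits the limit.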
