As an example of how one might do this sort of thing in SparseM, ignoring the rounding aspect:

library(SparseM)
library(msm)  # for rtnorm

# rnd is unused here, since rounding is ignored
sm <- function(dim, rnd, q) {
    # number of nonzero entries: each cell is nonzero with probability P(|Z| < q)
    n <- rbinom(1, dim * dim, 2 * pnorm(q) - 1)
    # random row and column indices; duplicate (i, j) pairs are possible
    ia <- sample(dim, n, replace = TRUE)
    ja <- sample(dim, n, replace = TRUE)
    # nonzero values: standard normals truncated to (-q, q)
    ra <- rtnorm(n, lower = -q, upper = q)
    A <- new("matrix.coo", ia = as.integer(ia), ja = as.integer(ja),
             ra = ra, dimension = as.integer(c(dim, dim)))
    # return the matrix in compressed sparse row form
    as.matrix.csr(A)
}

For dim = 5000 and q = .03, which exceeds Gavin's suggested 1 percent
density, this takes about 30 seconds on my iMac, and according to Rprof
about 95 percent of that (total) time is spent generating the truncated
normals. Word of warning: pushing this much further gets tedious, since
the number of random numbers grows like dim^2. For example, dim = 20,000
and q = .02 takes 432 seconds, with again 93% of the total time spent in
rnorm and rtnorm...

url:    www.econ.uiuc.edu/~roger    Roger Koenker
email   [EMAIL PROTECTED]           Department of Economics
vox:    217-333-4558                University of Illinois
fax:    217-244-6678                Champaign, IL 61820

On Jun 10, 2006, at 12:53 PM, g l wrote:

> Hi,
>
> I'm Sorry for any cross-posting. I've reviewed the archives and could
> not find an exact answer to my question below.
>
> I'm trying to generate very large sparse matrices (< 1% non-zero
> entries per row). I have a sparse matrix function below which works
> well until the row/col count exceeds 10,000.
> This is being run on a machine with 32G memory:
>
> sparse_matrix <- function(dims,rnd,p) {
>   ptm <- proc.time()
>   x <- round(rnorm(dims*dims),rnd)
>   x[((abs(x) - p) < 0)] <- 0
>   y <- matrix(x,nrow=dims,ncol=dims)
>   proc.time() - ptm
> }
>
> When trying to generate the matrix around 20,000 rows/cols on a
> machine with 32G of memory, the error message I receive is:
>
> R(335) malloc: *** vm_allocate(size=3200004096) failed (error code=3)
> R(335) malloc: *** error: can't allocate region
> R(335) malloc: *** set a breakpoint in szone_error to debug
> R(335) malloc: *** vm_allocate(size=3200004096) failed (error code=3)
> R(335) malloc: *** error: can't allocate region
> R(335) malloc: *** set a breakpoint in szone_error to debug
> Error: cannot allocate vector of size 3125000 Kb
> Error in round(rnorm(dims * dims), rnd) : unable to find the argument
> 'x' in selecting a method for function 'round'
>
> * Last error line is obvious. Question: on machine w/32G memory, why
> can't it allocate a vector of size 3125000 Kb?
>
> When trying to generate the matrix around 30,000 rows/cols, the error
> message I receive is:
>
> Error in rnorm(dims * dims) : cannot allocate vector of length
> 900000000
> Error in round(rnorm(dims * dims), rnd) : unable to find the argument
> 'x' in selecting a method for function 'round'
>
> * Last error line is obvious. Question: is this 900000000 bytes?
> kilobytes? This error seems to be specific now to rnorm, but it
> doesn't indicate the length metric (b/Kb/Mb) as it did for 20,000
> rows/cols. Even if this Mb, why can't this be allocated on a machine
> with 32G free memory?
>
> When trying to generate the matrix with over 50,000 rows/cols, the
> error message I receive is:
>
> Error in rnorm(n, mean, sd) : invalid arguments
> In addition: Warning message:
> NAs introduced by coercion
> Error in round(rnorm(dims * dims), rnd) : unable to find the argument
> 'x' in selecting a method for function 'round'
>
> * Same.
>
> Why would it generate different errors in each case? Code fixes? Any
> simple ways to generate sparse matrices which would avoid above
> problems?
>
> Thanks in advance,
>
> Gavin
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
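
The sizes in the three error messages quoted above are consistent with simple arithmetic on rnorm(dims * dims). A double takes 8 bytes, so 20,000^2 values need 3,200,000,000 bytes, which is the 3125000 Kb in the first error. The 900000000 in the second error is an element count (30,000^2), not a byte count. And 50,000^2 is larger than .Machine$integer.max, which would fit the "NAs introduced by coercion" warning. A minimal sketch of that check follows. The variable names are made up for illustration, and the arithmetic does not settle why a 32G machine refuses a request of about 3 Gb; a per-process address-space limit is one guess, not something the thread confirms.

```r
# Illustrative variable names; only base R is used.
dims     <- c(20000, 30000, 50000)
elements <- dims^2                 # length of the vector rnorm(dims * dims) must return
kb       <- elements * 8 / 1024    # 8 bytes per double, 1024 bytes per Kb
too_long <- elements > .Machine$integer.max  # TRUE if the length does not fit in an R integer

print(data.frame(dims, elements, kb, too_long))
```

The first row gives 3125000 Kb, matching the 20,000 case, and only the 50,000 case is flagged as too long for an integer length.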