Re: [R] Memory hungry routines
Thanks to Duncan, Hadley and Henrik.

Duncan, I used Rprof and could pinpoint the critical routine that was causing the memory crash.

Henrik, you got it right: the culprit was a big matrix of integers in which some entries are filled with -Inf and Inf. This matrix is global, it's used only once, it does not consume too much memory, and it should be harmless, but...

Hadley, your link to memory allocation and management helped to identify the problem. I did a very stupid thing: I added some debugging code to the critical routine that duplicated the matrix at each iteration of a loop. So that big matrix with integers and Infs and -Infs was being copied several times, killing memory needlessly.

Thanks for all the help.

I got 99 problems but you won't be one

Alberto Monteiro

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] Memory hungry routines
Is there any way to detect which calls are consuming memory?

I run a program whose global variables take up about 50 Megabytes of memory, but when I monitor the progress of the program it seems to be allocating 150 Megabytes of memory, with peaks of up to 2 Gigabytes.

I know that the global variables aren't copied many times by the routines, but I suspect something weird must be happening.

Alberto Monteiro

PS: the lines below count the memory allocated to all global variables; they could probably be adapted to track local variables:

    y <- ls(pat="")  # get all names of the variables
    z <- rep(0, length(y))  # create array of sizes
    for (i in 1:length(y)) z[i] <- object.size(get(y[i]))  # loop: get all sizes (in bytes) of the variables
    # BTW, is there any way to vectorize the above loop?
    xix <- sort.int(z, index.return = TRUE)  # sort the sizes
    y <- y[xix$ix]  # apply the sort to the variables
    z <- z[xix$ix]  # apply the sort to the sizes
    y <- c(y, "total")  # add a totalizator
    z <- c(z, sum(z))  # sum them all
    cbind(y, z)  # ugly way to list them
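On the side question in the code comments (whether the size loop can be vectorized), one possible sketch uses vapply() over the variable names. The function name object_sizes and its env argument are hypothetical, introduced only for this illustration; nobody in the thread posted this code.

```r
# Hypothetical helper (not from the thread): sizes in bytes of every object
# in an environment, sorted from smallest to largest.
object_sizes <- function(env = globalenv()) {
  nms <- ls(envir = env)
  # vapply() replaces the explicit for loop; as.numeric() drops the
  # "object_size" class so that plain doubles are returned.
  sizes <- vapply(
    nms,
    function(nm) as.numeric(object.size(get(nm, envir = env))),
    numeric(1)
  )
  sort(sizes)
}

# Usage on a small throwaway environment
e <- new.env()
assign("small", 1:10, envir = e)
assign("big", numeric(1e5), envir = e)
sizes <- object_sizes(e)
print(sizes)
print(sum(sizes))  # total, in bytes
```

Passing the environment explicitly is what would make this adaptable to local variables: calling object_sizes(environment()) inside a function should list that function's locals.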
Re: [R] Memory hungry routines
On 29/12/2014 1:52 PM, ALBERTO VIEIRA FERREIRA MONTEIRO wrote:
> Is there any way to detect which calls are consuming memory?

The Rprofmem() function can do this, but R must be built with memory profiling enabled (configure option --enable-memory-profiling) for it to work. Rprof() does a more limited version of the same thing if run with memory.profiling = TRUE.

Duncan Murdoch
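As a rough illustration of the Rprof() route mentioned above, the sketch below profiles a deliberately wasteful function and summarises time and memory per call. The function grow_vector and the chosen interval are assumptions made for this example only; they are not from the thread.

```r
# Hypothetical workload: grows a vector one element at a time, so the
# whole vector is reallocated and copied on every iteration.
grow_vector <- function(n) {
  x <- numeric(0)
  for (i in seq_len(n)) x <- c(x, i)
  x
}

prof_file <- tempfile(fileext = ".out")

# Start sampling, recording memory use alongside the call stack
Rprof(prof_file, memory.profiling = TRUE, interval = 0.01)
result <- grow_vector(4e4)
Rprof(NULL)  # stop profiling

# Summarise time and memory by function
prof <- summaryRprof(prof_file, memory = "both")
print(head(prof$by.total))
```

Rprof() is a sampling profiler, so a very short run may record no samples at all; the workload here is sized to run long enough to be picked up.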
Re: [R] Memory hungry routines
You might find the advice at http://adv-r.had.co.nz/memory.html helpful.

Hadley

On Tue, Dec 30, 2014 at 7:52 AM, ALBERTO VIEIRA FERREIRA MONTEIRO albm...@centroin.com.br wrote:
> Is there any way to detect which calls are consuming memory?

-- 
http://had.co.nz/
Re: [R] Memory hungry routines
On Mon, Dec 29, 2014 at 10:52 AM, ALBERTO VIEIRA FERREIRA MONTEIRO albm...@centroin.com.br wrote:
> Is there any way to detect which calls are consuming memory?

Duncan already suggested Rprofmem(). For a neat interface to that, see also the lineprof package.

Common memory hogs are cbind(), rbind() and other ways of incrementally building up objects. These can often be avoided by pre-allocating the final object up front and populating it as you go.

Another source of unnecessary memory duplication is coercion of data types, e.g. allocating an integer matrix but populating it with doubles.

A related mistake is to use matrix(nrow = nrow, ncol = ncol) to allocate matrices that will hold numeric values. That is actually doing matrix(NA, nrow, ncol), which creates a *logical* matrix. It will be coerced (involving copying and a large memory allocation) as soon as it gets populated with a numeric value. One should have used matrix(NA_real_, nrow, ncol) here.
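The coercion and pre-allocation points above can be checked with a small sketch. All variable names and sizes here are made up for illustration; typeof() is used only to show when the storage type changes.

```r
n_row <- 3
n_col <- 2

# matrix() with no data is filled with NA, which is logical
m <- matrix(nrow = n_row, ncol = n_col)
type_before <- typeof(m)   # "logical"
m[1, 1] <- 1.5             # the whole matrix is coerced to double
type_after <- typeof(m)    # "double"

# Allocating with NA_real_ gives a double matrix from the start
m_real <- matrix(NA_real_, nrow = n_row, ncol = n_col)
type_real <- typeof(m_real)  # "double"

# Inf and -Inf are doubles, so storing one in an integer matrix
# also converts the whole matrix
m_int <- matrix(0L, nrow = n_row, ncol = n_col)
m_int[1, 1] <- Inf
type_int <- typeof(m_int)    # "double"

# Growing with rbind() reallocates on every iteration ...
n <- 1000
grown <- NULL
for (i in seq_len(n)) grown <- rbind(grown, c(i, i^2))

# ... whereas a pre-allocated matrix is filled in place
prealloc <- matrix(NA_real_, nrow = n, ncol = 2)
for (i in seq_len(n)) prealloc[i, ] <- c(i, i^2)
```

Both loops produce the same values; the difference is only in how much intermediate allocation and copying happens along the way.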
For listing objects, their sizes and more, you can use ll() in the R.oo package, which returns a data.frame, e.g.

    example(iris)
    a <- 1:1e6
    R.oo::ll()
      member data.class dimension objectSize
    1      a    numeric   1000000    4000040
    2   dni3       list         3        600
    3     ii data.frame  c(150,5)       7088
    4   iris data.frame  c(150,5)       7088

    R.oo::ll(sortBy="objectSize")
      member data.class dimension objectSize
    2   dni3       list         3        600
    3     ii data.frame  c(150,5)       7088
    4   iris data.frame  c(150,5)       7088
    1      a    numeric   1000000    4000040

    tbl <- R.oo::ll()
    tbl <- tbl[order(tbl$objectSize, decreasing=TRUE),]
    tbl
      member data.class dimension objectSize
    1      a    numeric   1000000    4000040
    3     ii data.frame  c(150,5)       7088
    4   iris data.frame  c(150,5)       7088
    5   objs data.frame    c(4,4)       2760
    2   dni3       list         3        600

    sum(tbl$objectSize)
    [1] 4017576

/Henrik