Hello all,

I have a triple-nested loop in R like this:

library(foreach)   # provides foreach() and %dopar%
library(mgcv)      # provides gam() and nb()

all <- list()
for(a in A){
    all[[a]] <- list()
    for(b in B){
        all[[a]][[b]] <- foreach(c=C, .combine=rbind, .packages="mgcv") %dopar% {
            ## I'm leaving out some preprocessing here
            this_GAM <- gam(formula, data=data, family=nb(link="log", theta=THETA))
            predict(this_GAM, newdata=newdata)
        }
    }
}
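
I register the parallel backend once, before the loops. The doParallel setup below is only a sketch of that step, and the worker count is illustrative:

library(doParallel)
cl <- makeCluster(4)       # worker count is illustrative; adjust to your machine
registerDoParallel(cl)     # make %dopar% dispatch to this cluster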

The problem I have is that, over time, the individual R worker processes spawned by %dopar% use more and more RAM. When I start the triple loop, each process requires about 2 GB of RAM, but after around eight hours they use more than 4 GB each. Here is a 'top' snapshot showing the header and the two R worker processes:

  PID USER      PR  NI  VIRT  RES   SHR S %CPU %MEM    TIME+ COMMAND
20880 engelhar  20   0 7042m  4.0g 2436 R 59.2  6.4 14:30.15 R
20878 engelhar  20   0 7042m  4.3g 2436 D 53.5  6.8 14:07.18 R

I don't understand how this can happen. To my understanding, as soon as a foreach call is done, i.e. as soon as the second loop moves on to the next 'b' in 'B', the parallel R processes should terminate and release their memory, so memory consumption should not grow over time.
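
The only workaround I can think of is to tear the workers down and recreate them around each inner foreach call, sketched below under the assumption of a doParallel backend, but that seems wasteful:

for(b in B){
    cl <- makeCluster(4)       # fresh workers for each inner foreach
    registerDoParallel(cl)
    all[[a]][[b]] <- foreach(c=C, .combine=rbind, .packages="mgcv") %dopar% {
        ## fitting and prediction as above
    }
    stopCluster(cl)            # terminate the workers, releasing their memory
}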

Does anyone know what is going on and how I can avoid this behavior?

Thanks in advance,
 Alex Engelhardt
