Hi,

as far as I know, there is no limit on object size that is specific to foreach. 
You should, however, reserve enough memory for your application on the cluster 
(e.g. via ulimit -s unlimited and ulimit -v unlimited). 
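A minimal sketch of what that can look like in an LSF submission script (queue, core count, and file names are placeholders, not taken from your setup):

```shell
#!/bin/sh
# Hypothetical LSF job script: request 8 cores and lift the soft limits
#BSUB -n 8
#BSUB -o job.%J.out
ulimit -s unlimited   # stack size
ulimit -v unlimited   # virtual memory
R CMD BATCH --no-save myscript.R
```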

Furthermore, I would check the following: 
Check whether there are two versions of R on the cluster or in your home 
directory on the frontend (LSF loads this frontend environment and uses the R 
version installed there). If you have two R executables (R and R64), make sure 
you use the 64-bit version.
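To see which R binary the frontend environment actually picks up, and whether it is a 64-bit build, something like this can be run in the shell (guarded so the lines do not fail on machines without R):

```shell
# List every R executable found on the PATH, in lookup order
which -a R R64 2>/dev/null || true
# Version banner and word size of the binary that would be used;
# the output should mention 64-bit / x86-64
command -v R >/dev/null 2>&1 && R --version | head -n 1 || true
command -v R >/dev/null 2>&1 && file "$(command -v R)" || true
```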

Run R and call memory.limit() to see the memory limits on your system. (Note 
that memory.limit() is only meaningful on Windows; on Unix-alike systems it 
reports no limit, and memory is instead capped by ulimit and the batch system.) 

If this is lower than the sizes you need, increase it by starting R in the LSF 
script with the option --max-mem-size=YourSize (this option exists only on 
Windows), and if you get errors of the kind "cannot allocate vector of size 
..." you should also try --max-vsize=YourVSize. 
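Before fighting the limits, it can also help to estimate how much memory the objects themselves need; remember that temporaries created while evaluating an expression can easily double that. A small sketch (the sizes are illustrative, not from your data):

```r
# A numeric (double) matrix costs 8 bytes per cell:
n <- 16000
n * n * 8 / 2^30                     # ~1.9 GiB for one such matrix

# object.size() reports the memory an existing object actually uses
x <- matrix(0, 1000, 1000)
print(object.size(x), units = "MB")  # roughly 8 MB
```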

Then, check whether there is a memory leak in your application: if R was 
compiled with --enable-memory-profiling you can use Rprof(memory.profiling = 
TRUE) or Rprofmem() for this; otherwise you must rely on the profiling tools 
provided by the cluster environment (I think you also work with environment 
modules there, so type 'module avail' in the shell to list the available 
modules). 
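A small R-level sketch of what such a check can look like; as far as I know, Rprof(memory.profiling = TRUE) works in standard builds, while Rprofmem() is the part that needs the --enable-memory-profiling build (the workload below is a stand-in, not your code):

```r
Rprof("mem.out", memory.profiling = TRUE)
x <- 0
for (i in 1:50) x <- x + sum(rnorm(1e6))  # stand-in workload
Rprof(NULL)
# Memory use per function; the workload must run long enough to be sampled
head(summaryRprof("mem.out", memory = "both")$by.total)
```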

If you detect a memory leak, or if you see that at certain points in your 
algorithm some objects are no longer needed, call rm(ObjectName) and then gc() 
to trigger garbage collection. 
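For example (a toy sketch; big and result are placeholder names):

```r
big <- matrix(rnorm(1e6), 1000, 1000)  # ~8 MB intermediate object
result <- colSums(big)
rm(big)   # drop the reference once the object is no longer needed
gc()      # return the freed memory; also prints current usage
```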


On your nested loop using foreach: nesting is a delicate issue in parallel 
computing, and for the foreach syntax I refer you to the must-read vignette 
http://cran.r-project.org/web/packages/foreach/vignettes/nested.pdf. 

Using nested loops should be considered carefully with regard to how the 
nesting is organized. In C++ (e.g. with OpenMP) you can decide explicitly how 
many cores work on which loop. With foreach and doMC the closest equivalent is 
the %:% nesting operator described in that vignette, which merges the loops 
into a single stream of tasks distributed over the registered workers. 
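A sketch of that operator, under the assumption that 4 cores were requested from LSF (replace registerDoMC(4) with your n; the body is a stand-in for the real computation):

```r
library(foreach)
# On the cluster: library(doMC); registerDoMC(n)  # n = cores given by LSF
# Here we fall back to sequential execution if doMC is unavailable
if (requireNamespace("doMC", quietly = TRUE)) {
  doMC::registerDoMC(4)
} else {
  registerDoSEQ()
}
# %:% merges the two loops into one stream of 16 * 10 tasks that the
# registered workers share, instead of parallelizing only the outer loop
res <- foreach(i = 1:16, .combine = rbind) %:%
  foreach(j = 1:10, .combine = c) %dopar% {
    i * j   # stand-in for the real computation
  }
dim(res)    # 16 x 10
```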


And please keep the discussion on the r-help mailing list, so that others can 
learn from it and more experienced researchers can comment as well. 


Best

Simon


On Sep 19, 2013, at 9:24 PM, pko...@bgc-jena.mpg.de wrote:

> Hi again,
> 
> if you have some time, I would like to bother you again with 2 more questions. 
> After your response the parallel code is working perfectly, but when I apply it 
> to the real case (big matrices) I get an error about a non-numeric dimension, 
> and I guess that again it returns NULL or something. Are you aware whether a 
> foreach loop can only handle objects up to a certain size? The equation that I 
> am using involves 3 objects of 2 GB each. 
> 
> The second question concerns the cores that foreach uses. Although I 
> am asking our cluster (LSF) to give me a certain number of CPUs, and I am 
> also specifying that with
> library(doMC)
> registerDoMC(n) 
> 
> it seems from the top command that I am using all the cores. I am using 2 
> foreach loops as a nest:  foreach(i in 1:16){
>         foreach(j in 1:10)  etc etc..
> Maybe I should do something with this kind of nest? I am not aware of that.
> 
> I am sorry for the long text , and thank you for your nice solution
> 
> _____________________________________
> Sent from http://r.789695.n4.nabble.com
> 

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
