Dear OpenMPI developers,

One of our users was running a benchmark of a 1032-core simulation. He had a 
successful run at 900 cores, but when he stepped up to 1032 cores the job 
stalled and his logs contained many occurrences of the following line:

[d6copt368.crc.nd.edu][[25621,1],0][btl_tcp_component.c:885:mca_btl_tcp_component_accept_handler] accept() failed: Too many open files (24)

The simulation has a single master task that communicates with all the other 
tasks and writes out their I/O on their behalf. We assume the error is 
related to this bottleneck. Is there, for instance, a limit of 1024 on the 
number of open files/connections?
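
For anyone wanting to check the limit in question, here is a minimal sketch 
that could be run on a compute node. It assumes a Linux/POSIX system and 
assumes that the master holds roughly one TCP socket per peer task; the 
function name fd_headroom and the "reserved" allowance of 16 descriptors are 
hypothetical and only for illustration. It reports the per-process 
open-descriptor limit (RLIMIT_NOFILE) and whether 900 or 1032 ranks would fit 
under it.

```python
import errno
import resource


def fd_headroom(peer_count, reserved=16):
    """Return (soft_limit, needed, fits) for a process that opens one
    socket per peer. `reserved` is a guessed allowance for stdio, log
    files, and libraries; it is an assumption, not a measured value."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    needed = peer_count + reserved
    # RLIM_INFINITY means there is no soft limit at all.
    fits = soft == resource.RLIM_INFINITY or needed <= soft
    return soft, needed, fits


if __name__ == "__main__":
    # The "(24)" in the log line is the errno value; on Linux that is EMFILE.
    print("EMFILE =", errno.EMFILE)
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("soft limit:", soft, "hard limit:", hard)
    for ranks in (900, 1032):
        # The master talks to every other rank, hence ranks - 1 peers.
        _, needed, fits = fd_headroom(ranks - 1)
        print(ranks, "ranks ->", needed, "descriptors needed, fits:", fits)
```

If the soft limit on the node turns out to be 1024, that would be consistent 
with the run succeeding at 900 cores and failing at 1032, though this is only 
a rough estimate under the assumptions stated above.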

Can anyone confirm the meaning of this error and, secondly, suggest a 
resolution that hopefully doesn't involve a code rewrite?

Thanks in advance,

Tim.

Tim Stitt PhD (User Support Manager).
Center for Research Computing | University of Notre Dame |
P.O. Box 539, Notre Dame, IN 46556 | Phone: 574-631-5287 | Email: tst...@nd.edu
