OK, but then on this other machine it hangs. This one uses SLURM, so I'm not
entirely sure, but I think this tells me the OpenMPI version:

siegel@cisc372:~$ mpiexec.openmpi --version
mpiexec.openmpi (OpenRTE) 2.1.1

Report bugs to http://www.open-mpi.org/community/help/
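
(For what it's worth, I believe running ompi_info on that machine, assuming
it's installed, would confirm this; its output includes an "Open MPI:" line
with the version number.)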


siegel@cisc372:~/372/code/mpi/io$ mpicc io_byte_shared.c
siegel@cisc372:~/372/code/mpi/io$ srun -n 4 ./a.out
srun: job 143344 queued and waiting for resources
srun: job 143344 has been allocated resources
Proc 0: file has been opened.
Proc 0: About to write to file.
Proc 1: file has been opened.
Proc 2: file has been opened.
Proc 3: file has been opened.
^Csrun: interrupt (one more within 1 sec to abort)
srun: step:143344.0 tasks 0-3: running
^Csrun: sending Ctrl-C to job 143344.0
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
slurmstepd-beowulf: error: *** STEP 143344.0 ON beowulf CANCELLED AT 2020-06-05T19:03:59 ***
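
For context, here is a minimal sketch of the pattern io_byte_shared.c appears
to follow, judging from the output above (the actual source isn't included in
this message, so the file name and other details are assumptions): every rank
opens a common file and writes one byte through the shared file pointer, and
the run above stalls once rank 0 announces the write.

/* Minimal sketch approximating io_byte_shared.c from its output; the real
   source isn't shown above, so the output file name and details are
   assumptions. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
  int rank;
  char c;
  MPI_File fh;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_File_open(MPI_COMM_WORLD, "data.out",
                MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
  printf("Proc %d: file has been opened.\n", rank);
  fflush(stdout);
  if (rank == 0) {
    printf("Proc %d: About to write to file.\n", rank);
    fflush(stdout);
  }
  c = (char)('0' + rank);
  /* One byte per rank at the shared file pointer; in the run above, nothing
     comes back after this point. */
  MPI_File_write_shared(fh, &c, 1, MPI_CHAR, MPI_STATUS_IGNORE);
  MPI_File_close(&fh);
  MPI_Finalize();
  return 0;
}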



> On Jun 5, 2020, at 6:55 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> 
> wrote:
> 
> On Jun 5, 2020, at 6:35 PM, Stephen Siegel via users 
> <users@lists.open-mpi.org> wrote:
>> 
>> [ilyich:12946] 3 more processes have sent help message help-mpi-btl-base.txt 
>> / btl:no-nics
>> [ilyich:12946] Set MCA parameter "orte_base_help_aggregate" to 0 to see all 
>> help / error messages
> 
> It looks like your output somehow doesn't include the actual error message.  
> That error message was sent to stderr, so you may not have captured it if you 
> only did "mpirun ... > foo.txt".  The actual error message template is this:
> 
> -----
> %s: A high-performance Open MPI point-to-point messaging module
> was unable to find any relevant network interfaces:
> 
> Module: %s
>  Host: %s
> 
> Another transport will be used instead, although this may result in
> lower performance.
> 
> NOTE: You can disable this warning by setting the MCA parameter
> btl_base_warn_component_unused to 0.
> -----
> 
> This is not actually an error -- just a warning.  It typically means that 
> your Open MPI has support for HPC-class networking, Open MPI saw some 
> evidence of HPC-class networking on the nodes on which your job ran, but 
> ultimately didn't use any of those HPC-class networking interfaces for some 
> reason and therefore fell back to TCP.
> 
> I.e., your program ran correctly, but it may have run slower than it could 
> have if it were able to use HPC-class networks.
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> 
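
(Noting for my own reference: per the template Jeff quotes above, the warning
from the first machine could presumably be silenced with something like
"mpirun --mca btl_base_warn_component_unused 0 ./a.out", or simply left alone
since it's only a warning.)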
