Re: [OMPI users] Deadlock in netcdf tests

2019-10-25 Thread Orion Poplawski via users
Thanks for the response; the workaround helps. With that out of the way I see:
+ mpiexec -n 4 ./tst_parallel4
Error in ompi_io_ompio_calcl_aggregator():rank_index(-2) >= num_aggregators(1)fd_size=461172966257152 off=4156705856
Error in ompi_io_ompio_calcl_aggregator():rank_index(-2) >= num

[OMPI users] MPI_Comm_spawn: no allocated resources for the application ...

2019-10-25 Thread Mccall, Kurt E. (MSFC-EV41) via users
I am trying to launch a number of manager processes, one per node, and then have each manager spawn a number of workers on its own node. For this example, I have 2 managers and 2 workers per manager. I'm following the instructions at this link https://stackoverflow.com/questi

Re: [OMPI users] Deadlock in netcdf tests

2019-10-25 Thread Gabriel, Edgar via users
Never mind, I see it in the backtrace :-) I will look into it, but I am currently traveling. Until then, Gilles' suggestion is probably the right approach. Thanks, Edgar > -----Original Message----- > From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of Gabriel, > Edgar via users > Sent:

Re: [OMPI users] Deadlock in netcdf tests

2019-10-25 Thread Gabriel, Edgar via users
Orion, I will look into this problem. Is there a specific code or test case that triggers it? Thanks, Edgar > -----Original Message----- > From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of Orion > Poplawski via users > Sent: Thursday, October 24, 2019 11:56 PM > To: Open

Re: [OMPI users] Deadlock in netcdf tests

2019-10-25 Thread Gilles Gouaillardet via users
Orion, thanks for the report. I can confirm this is indeed an Open MPI bug. FWIW, a workaround is to disable the fcoll/vulcan component. That can be achieved by
mpirun --mca fcoll ^vulcan ...
or
OMPI_MCA_fcoll=^vulcan mpirun ...
I also noted the tst_parallel3 program crashes with the RO
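
For anyone scripting the netcdf test runs, the fcoll/vulcan workaround quoted above could be wrapped roughly as follows. This is only a sketch: the helper names `without_vulcan_env` and `without_vulcan_argv` are hypothetical, and the `-n 4 ./tst_parallel4` arguments are borrowed from the test invocation reported earlier in the thread. The only Open MPI specifics used are the two forms given in the message itself (`--mca fcoll ^vulcan` and `OMPI_MCA_fcoll=^vulcan`).

```python
import os
import shlex


def without_vulcan_env(base_env=None):
    """Return a copy of the environment that excludes the fcoll/vulcan component.

    Mirrors the `OMPI_MCA_fcoll=^vulcan mpirun ...` form of the workaround.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["OMPI_MCA_fcoll"] = "^vulcan"
    return env


def without_vulcan_argv(launcher_args, program_args):
    """Build an mpirun argument list that passes the exclusion on the command line.

    Mirrors the `mpirun --mca fcoll ^vulcan ...` form of the workaround.
    """
    return ["mpirun", "--mca", "fcoll", "^vulcan", *launcher_args, *program_args]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without MPI installed.
    argv = without_vulcan_argv(["-n", "4"], ["./tst_parallel4"])
    print(shlex.join(argv))
```

Passing the argument list (or the modified environment) to `subprocess.run` avoids shell quoting concerns around the `^` character; either form alone should be sufficient, since both set the same MCA parameter.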