Thanks for the response, the workaround helps.
With that out of the way, I see:
+ mpiexec -n 4 ./tst_parallel4
Error in ompi_io_ompio_calcl_aggregator(): rank_index(-2) >= num_aggregators(1) fd_size=461172966257152 off=4156705856
Error in ompi_io_ompio_calcl_aggregator(): rank_index(-2) >= num_aggregators(1) ...
I am trying to launch a number of manager processes, one per node, and then
have each of those managers spawn, on the same node, a number of workers.
For this example, I have 2 managers and 2 workers per manager. I'm following
the instructions at this link:
https://stackoverflow.com/questi
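In case it helps to see code, here is a minimal sketch of what each manager
does. The "host" info key (to pin the workers to the manager's node) and the
./worker binary name are my assumptions about the setup; error handling is
omitted:

/* manager.c - each manager spawns its own workers on its own node */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* Find out which node this manager is running on. */
    char host[MPI_MAX_PROCESSOR_NAME];
    int len;
    MPI_Get_processor_name(host, &len);

    /* Ask the runtime to place the spawned workers on this same node. */
    MPI_Info info;
    MPI_Info_create(&info);
    MPI_Info_set(info, "host", host);

    /* Each manager spawns 2 workers over MPI_COMM_SELF, so every manager
     * gets its own intercommunicator to its own set of workers. */
    MPI_Comm workers;
    MPI_Comm_spawn("./worker", MPI_ARGV_NULL, 2, info, 0,
                   MPI_COMM_SELF, &workers, MPI_ERRCODES_IGNORE);

    MPI_Info_free(&info);
    MPI_Comm_disconnect(&workers);
    MPI_Finalize();
    return 0;
}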
Never mind, I see it in the backtrace :-)
Will look into it, but I am currently traveling. Until then, Gilles'
suggestion is probably the right approach.
Thanks
Edgar
> -Original Message-
> From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of Gabriel,
> Edgar via users
> Sent:
Orion,
I will look into this problem. Is there a specific code or test case that
triggers it?
Thanks
Edgar
> -Original Message-
> From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of Orion
> Poplawski via users
> Sent: Thursday, October 24, 2019 11:56 PM
> To: Open
Orion,
thanks for the report.
I can confirm this is indeed an Open MPI bug.
FWIW, a workaround is to disable the fcoll/vulcan component.
That can be achieved by
mpirun --mca fcoll ^vulcan ...
or
OMPI_MCA_fcoll=^vulcan mpirun ...
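As a sanity check (assuming ompi_info comes from the same Open MPI install),
the fcoll components available in your build can be listed with
ompi_info | grep fcoll
and running with --mca fcoll_base_verbose 10 should show which fcoll
component actually gets selected at runtime.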
I also noted the tst_parallel3 program crashes with the RO