Hi Jose,
A number of things.
First, for recent versions of Open MPI, including the 4.1.x release stream,
MPI_THREAD_MULTIPLE is supported by default. However, some transport options
that are available when using MPI_Init may not be available when
MPI_THREAD_MULTIPLE is requested.
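As a minimal sketch of that point (illustrative only, not Jose's code): request the
thread level with MPI_Init_thread and check what the library actually granted,
for example:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int provided;
        /* Request full multi-threading support instead of plain MPI_Init(). */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
        if (provided < MPI_THREAD_MULTIPLE) {
            /* The library granted a lower level; some transports may have
               been deselected at the requested thread level. */
            printf("Requested MPI_THREAD_MULTIPLE, got level %d\n", provided);
        }
        MPI_Finalize();
        return 0;
    }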
You may want to let Ope
It seems I misunderstood something regarding attaching files. And sorry for the
footer; I used my company email so that I also get answers while I am at work.
Here is the valgrind output: https://pastebin.com/Wwvn8Pa7
Here is the ompi_info --all output: https://pastebin.com/FW0fazZH
Here is the gdb output: https://past
Thanks. The verbose output is:
[kahan01.upvnet.upv.es:29732] mca: base: components_register: registering
framework btl components
[kahan01.upvnet.upv.es:29732] mca: base: components_register: found loaded
component self
[kahan01.upvnet.upv.es:29732] mca: base: components_register: component self
Hello Jose,
I suspect the issue here is that the OpenIB BTL isn't finding a connection
module when you request MPI_THREAD_MULTIPLE.
The rdmacm connection module is deselected when the MPI_THREAD_MULTIPLE thread
support level is requested.
If you run the test in a shell with
export OMPI_MCA_btl
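(The exact parameter Howard was about to set is cut off, so I won't guess it;
in general, any BTL-related MCA parameter can be set through an OMPI_MCA_*
environment variable before launching. As a hedged illustration, with the
usual verbosity knob and a placeholder application name:

    export OMPI_MCA_btl_base_verbose=100
    mpirun -np 2 ./my_app

This is only the generic mechanism, not necessarily the setting Howard intended.)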
One disturbing thing in your note was:
I'm very sorry about that. That is just wrong. Somehow I overlooked it,
simply because it was not where I expected it to be. I apologize.
I'm still investigating what could've gone wrong and I'm also trying
Bernd's suggestion: that could indeed be an even
I'm sure nobody has looked at the rankfile docs in many a year - nor actually
tested the code for some time, especially with the newer complex chips. I can
try to take a look at it locally, but it may be a few days before I get around
to it.
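For reference, a rankfile of the kind being discussed is just a plain-text
mapping of ranks to hosts and slots. A hedged example (hostnames and slot
numbers are placeholders, using logical socket:core numbering):

    rank 0=nodeA slot=0:0-3
    rank 1=nodeA slot=1:0-3
    rank 2=nodeB slot=0:0-3

launched with something along the lines of
mpirun -np 3 --rankfile my_rankfile ./my_app.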
One disturbing thing in your note was:
Also, on the
Hello whoever reads this,
I am running my code using CUDA-aware Open MPI (see ompi_info --all attached).
First I will explain the problem; further down I will give additional info
about versions, hardware, and debugging.
The Problem:
My application solves multiple mathematical equations on GPU via
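One sanity check that is often useful with CUDA-aware builds (a hedged sketch,
not taken from this application): Open MPI exposes an MPIX extension for
querying CUDA-aware support at compile time and run time, roughly:

    #include <stdio.h>
    #include <mpi.h>
    #if defined(OPEN_MPI) && OPEN_MPI
    #include <mpi-ext.h>   /* Open MPI extensions; defines MPIX_CUDA_AWARE_SUPPORT */
    #endif

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
    #if defined(MPIX_CUDA_AWARE_SUPPORT) && MPIX_CUDA_AWARE_SUPPORT
        /* Run-time answer: 1 if CUDA-aware support is actually available. */
        printf("Run-time CUDA-aware support: %d\n", MPIX_Query_cuda_support());
    #else
        printf("This Open MPI was not built with CUDA-aware support.\n");
    #endif
        MPI_Finalize();
        return 0;
    }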
Hi David,
On 03/02/2022 00:03, David Perozzi wrote:
Hello,
I'm trying to run a code implemented with Open MPI and OpenMP (for
threading) on a large cluster that uses LSF for job scheduling and
dispatch. The problem with LSF is that it is not very straightforward to
allocate and bind the r
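For context, the kind of hybrid MPI+OpenMP skeleton being launched here might
look like the following minimal sketch (not David's actual code):

    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int provided, rank;
        /* FUNNELED is enough when only the main thread makes MPI calls. */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        #pragma omp parallel
        {
            printf("rank %d: thread %d of %d\n",
                   rank, omp_get_thread_num(), omp_get_num_threads());
        }

        MPI_Finalize();
        return 0;
    }

Process placement and binding then have to come from the launcher (for Open
MPI, options along the lines of --map-by, --bind-to, and --report-bindings),
which is the part that gets awkward under LSF.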
No problem, giving a detailed explanation is the least I can do! Thank
you for taking the time.
Yeah, to be honest I'm not completely sure I'm doing the right thing
with the IDs, as I had some trouble understanding the manpages.
Maybe you can help me and we'll end up seeing that that was i
Hmmm...okay, I found the code path that fails without an error - not one of the
ones I was citing. Thanks for that detailed explanation of what you were doing!
I'll add some code to the master branch to plug that hole along with the others
I identified.
Just an FYI: we stopped supporting "physic
Thanks for looking into that, and sorry that I only included the version in
use in the pastebin. I'll ask the cluster support whether they could install
OMPI master.
I really am unfamiliar with Open MPI's codebase, so I haven't looked into
it and am very thankful that you could already identify possibl