Hi Bernd,
Thanks for your valuable input! Your suggested approach does indeed seem
like the correct one, and it is actually what I've always wanted to do. In
the past, I also asked our cluster support whether this was possible,
but they always suggested the following approach:
export
One disturbing thing in your note was:
I'm very sorry about that. That is just wrong. Somehow I overlooked it,
simply because it was not where I expected it to be. I apologize.
I'm still investigating what could've gone wrong and I'm also trying
Bernd's suggestion: that could indeed be an even
I'm sure nobody has looked at the rankfile docs in many a year - nor actually
tested the code for some time, especially with the newer complex chips. I can
try to take a look at it locally, but it may be a few days before I get around
to it.
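Since rankfiles came up: for anyone following along, a minimal sketch of the rankfile format as documented for OMPI 4.x (the hostnames, socket and core numbers below are made up for illustration):

```
rank 0=node01 slot=0:0-3
rank 1=node01 slot=1:0-3
rank 2=node02 slot=0:0-3
```

Each line pins one rank to a host and, after `slot=`, a socket:core-range. It would then be passed to the launcher with something like `mpirun -np 3 --rankfile myrankfile ./app`.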
Also, on the
Hi David,
On 03/02/2022 00:03 , David Perozzi wrote:
Hello,
I'm trying to run a code implemented with OpenMPI and OpenMP (for
threading) on a large cluster that uses LSF for the job scheduling and
dispatch. The problem with LSF is that it is not very straightforward to
allocate and bind the
No problem, giving a detailed explanation is the least I can do! Thank
you for taking the time.
Yeah, to be honest I'm not completely sure I'm doing the right thing
with the IDs, as I had some trouble understanding the manpages.
Maybe you can help me and we'll end up seeing that that was
Hmmm...okay, I found the code path that fails without an error - not one of the
ones I was citing. Thanks for that detailed explanation of what you were doing!
I'll add some code to the master branch to plug that hole, along with the
others I identified.
Just an FYI: we stopped supporting
Thanks for looking into that, and sorry that I only included the version in
use in the pastebin. I'll ask the cluster support if they could install
OMPI master.
I really am unfamiliar with Open MPI's codebase, so I haven't looked into
it myself, and I am very thankful that you could already identify
Are you willing to try this with OMPI master? Asking because it would be hard
to push changes all the way back to 4.0.x every time we want to see if we fixed
something.
Also, few of us have any access to LSF, though I doubt that has much impact
here as it sounds like the issue is in the
The linked pastebin includes the following version information:
[1,0]:package:Open MPI spackapps@eu-c7-042-03 Distribution
[1,0]:ompi:version:full:4.0.2
[1,0]:ompi:version:repo:v4.0.2
[1,0]:ompi:version:release_date:Oct 07, 2019
[1,0]:orte:version:full:4.0.2
[1,0]:orte:version:repo:v4.0.2
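For reference, output in that `[job,rank]:key:value` form can be produced by running `ompi_info` under `mpirun` with tagged output; a sketch (the exact invocation used for the pastebin is an assumption on my part):

```shell
# Launch one rank of ompi_info; --tag-output prefixes each line with
# [job,rank], and --parsable emits colon-separated key:value records
# such as "ompi:version:full:4.0.2".
mpirun -np 1 --tag-output ompi_info --parsable | grep ':version:'
```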
Errr... what version of OMPI are you using?
> On Feb 2, 2022, at 3:03 PM, David Perozzi via users
> wrote:
>
> Hello,
>
> I'm trying to run a code implemented with OpenMPI and OpenMP (for threading)
> on a large cluster that uses LSF for the job scheduling and dispatch. The
> problem with LSF is
Hello,
I'm trying to run a code implemented with OpenMPI and OpenMP (for
threading) on a large cluster that uses LSF for the job scheduling and
dispatch. The problem with LSF is that it is not very straightforward to
allocate and bind the right amount of threads to an MPI rank inside a
single
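For what it's worth, the per-rank thread binding described above is usually expressed with plain mpirun (outside LSF's wrappers) along these lines; the binary name and counts below are made up, a sketch rather than a known-good recipe for this cluster:

```shell
# Hypothetical hybrid MPI+OpenMP launch: 2 ranks per node, 4 cores
# (processing elements, PE) bound to each rank, and OpenMP told to
# start one thread per bound core.
export OMP_NUM_THREADS=4
mpirun --map-by ppr:2:node:PE=4 --bind-to core ./hybrid_app
```

`--map-by ppr:N:node:PE=k` and `--bind-to core` are standard OMPI 4.x options; the difficulty under LSF is getting the allocation the scheduler grants to line up with this mapping.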