Hi,
Is there any reason why you do not build the latest 5.0.2 package?
Anyway, the issue could be related to an unknown filesystem.
Do you get a meaningful error if you manually run
/.../test/util/opal_path_nfs?
If not, can you share the output of
mount | cut -f3,5 -d' '
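For reference, that command keeps the third and fifth space-separated fields of each `mount` line, i.e. the mount point and the filesystem type. A minimal sketch on a sample line (the line itself is illustrative):

```shell
# A sample line in the format `mount` prints (illustrative):
line='proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)'
# Fields split on spaces: 3 = mount point, 5 = filesystem type
echo "$line" | cut -f3,5 -d' '
# prints: /proc proc
```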
Cheers,
Gilles
test"?
Though the warning might be ignored, SIGILL is definitely an issue.
I encourage you to have your app dump a core in order to figure out where
this is coming from
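To get that core, a minimal sketch (the `./app` name is a placeholder for your binary): enable core dumps in the launching shell, re-run, then open the core in gdb.

```shell
# Allow unlimited-size core files in this shell (and in the ranks mpirun spawns)
ulimit -c unlimited
ulimit -c    # should now report "unlimited"
# Then re-run the job and inspect the core, for example:
#   mpirun -np 4 ./app        # ./app is a placeholder for your binary
#   gdb ./app core -ex bt     # the backtrace shows where SIGILL was raised
```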
Cheers,
Gilles
On Tue, Apr 16, 2024 at 5:20 AM Greg Samonds via users <
users@lists.open-mpi.org> wrote:
> H
Christopher,
I do not think Open MPI explicitly asks SLURM which cores have been
assigned on each node.
So if you are planning to run multiple jobs on the same node, your best bet
is probably to have SLURM
use cpusets.
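One way to do that, sketched here as an assumption about your setup (check with your SLURM admin): enable the cgroup task plugin so each job is confined to the cores SLURM actually allocated to it.

```
# slurm.conf
TaskPlugin=task/affinity,task/cgroup

# cgroup.conf
ConstrainCores=yes
```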
Cheers,
Gilles
On Sat, Feb 24, 2024 at 7:25 AM Christopher Daley via users
, are you able to craft a reproducer that causes the crash?
How many nodes and MPI tasks are needed in order to evidence the crash?
Cheers,
Gilles
On Wed, Jan 31, 2024 at 10:09 PM afernandez via users <
users@lists.open-mpi.org> wrote:
> Hello Joseph,
> Sorry for the delay but I didn
Hi,
Please open a GitHub issue at https://github.com/open-mpi/ompi/issues and
provide the requested information.
Cheers,
Gilles
On Sat, Jan 27, 2024 at 12:04 PM Kook Jin Noh via users <
users@lists.open-mpi.org> wrote:
> Hi,
>
>
>
> I’m installing OpenMPI 5.0.1 on Archli
Luis,
you can pass the --bind-to hwthread option in order to bind on the first
thread of each core
Cheers,
Gilles
On Fri, Sep 8, 2023 at 8:30 PM Luis Cebamanos via users <
users@lists.open-mpi.org> wrote:
> Hello,
>
> Up to now, I have been using numerous ways of bindin
MPI was built with
SLURM support (e.g. configure --with-slurm ...)
and then
srun --mpi=pmi2 ...
Cheers,
Gilles
On Tue, Jul 25, 2023 at 5:07 PM Aziz Ogutlu via users <
users@lists.open-mpi.org> wrote:
> Hi there all,
> We're using Slurm 21.08 on Redhat 7.9 HPC cluster with OpenMPI
,
Gilles
On Wed, Jul 19, 2023 at 11:36 PM Luis Cebamanos via users <
users@lists.open-mpi.org> wrote:
> Hello,
>
> I was wondering if anyone has ever seen the following runtime error:
>
> mpirun -np 32 ./hello
> .
> [LOG_CAT_SBGP] libnuma.so: cannot open share
options.
Cheers,
Gilles
On Sat, Jun 17, 2023 at 2:53 AM Mccall, Kurt E. (MSFC-EV41) via users <
users@lists.open-mpi.org> wrote:
> Joachim,
>
>
>
> Sorry to make you resort to divination. My sbatch command is as follows:
>
>
>
> sbatch --ntasks-per-node=24 --
vendors
and/or very different versions.
Cheers,
Gilles
On Thu, Jun 1, 2023 at 10:27 AM 深空探测 via users
wrote:
> Hi all,
>
> I am writing to seek assistance regarding an issue I encountered while
> running an MPI job on CentOS 7 virtual machine .
>
> To provide some context, I
test -z
"$with_pmix" || test "$with_pmix" = "yes")
then :
if test "$opal_external_pmix_version" != "3x"
and replace the last line with
if test $opal_external_pmix_version_major -lt 3
Cheers,
Gilles
On Sat, May 20, 2023 at 6:13 PM christo
should see the expected ip of the
second node.
If not, there is NAT somewhere and that does not fly well with Open MPI
Cheers,
Gilles
On 3/28/2023 8:53 AM, Todd Spencer via users wrote:
OpenMPI Users,
I hope this email finds you all well. I am writing to bring to your
attention an issue
node, you might be able
to get the best performance by forcing pml/ob1.
Bottom line, I think it is best for you to benchmark your application and
pick the combination that leads to the best performance,
and you are more than welcome to share your conclusions.
Cheers,
Gilles
On Mon, Mar 6
instead of
mpirun ...
Cheers,
Gilles
On 3/1/2023 12:44 PM, Rob Kudyba via users wrote:
Singularity 3.5.3 on RHEL 7 cluster w/ OpenMPI 4.0.3 lives inside a
SimpleFOAM version 10 container. I've confirmed the OpenMPI versions
are the same. Perhaps this is a question for Singularity users
You can pick one test, make it standalone, and open an issue on GitHub.
How does (vanilla) Open MPI compare to your vendor Open MPI based library?
Cheers,
Gilles
On Wed, Jan 11, 2023 at 10:20 PM Dave Love via users <
users@lists.open-mpi.org> wrote:
> Gilles Gouaillardet via user
Dave,
If there is a bug you would like to report, please open an issue at
https://github.com/open-mpi/ompi/issues and provide all the required
information
(in this case, it should also include the UCX library you are using and how
it was obtained or built).
Cheers,
Gilles
On Fri, Jan 6, 2023
Hi,
Simply add
btl = tcp,self
If the openib error message persists, try also adding
osc_rdma_btls = ugni,uct,ucp
or simply
osc = ^rdma
Cheers,
Gilles
On 11/29/2022 5:16 PM, Gestió Servidors via users wrote:
Hi,
If I run “mpirun --mca btl tcp,self --mca allow_ib 0 -n 12
omponent can be used as an inspiration.
I cannot commit to do this, but if you are willing to take a crack at
it, I can create such a component
so you can go directly to implementing the callback without spending too
much time on some Open MPI internals
(e.g. component creation).
Cheers,
Gilles
it is bound.
Cheers,
Gilles
On Sat, Nov 26, 2022 at 5:38 PM Arham Amouei via users <
users@lists.open-mpi.org> wrote:
> Hi
>
> If I run a code with
>
> mpirun -n 28 ./code
>
> Is it guaranteed that Open MPI and/or OS give equal number of processes to
> each socket
FWIW, I cannot reproduce this error.
What is the netmask on both hosts?
Cheers,
Gilles
On 11/15/2022 1:32 PM, timesir via users wrote:
*(py3.9) ➜ /share mpirun -n 2 --machinefile hosts --mca
rmaps_base_verbose 100 --mca ras_base_verbose 100 which mpirun*
[computer01:39342] mca
There is a typo in your command line.
You should use --mca (minus minus) instead of -mca
Also, you can try --machinefile instead of -machinefile
Cheers,
Gilles
There are not enough slots available in the system to satisfy the 2
slots that were requested by the application:
–mca
On Mon, Nov
Chris,
Did you double check libopen-rte.so.40 and libopen-pal.so.40 are installed
in /mnt/software/o/openmpi/4.1.4-ct-test/lib?
If they are not present, it means your install is busted and you should try
to reinstall it.
Cheers,
Gilles
On Sat, Nov 5, 2022 at 3:42 AM Chris Taylor via users
to support Intel Fortran on OSX.
I am confident a Pull Request that does fix this issue will be considered
for inclusion in future Open MPI releases.
Cheers,
Gilles
On Fri, Sep 16, 2022 at 11:20 AM Volker Blum via users <
users@lists.open-mpi.org> wrote:
> Hi all,
>
> This issue
in a machine file (e.g. mpirun -machinefile
...) or the command line
(e.g. mpirun --hosts host0:96,host1:96 ...)
c) if none of the above is set, the number of detected cores on the
system
Cheers,
Gilles
On Tue, Sep 13, 2022 at 9:23 PM Lucas Chaloyard via users <
users@lists.o
slower.
Cheers,
Gilles
On Sun, Jul 24, 2022 at 8:11 PM Matias Ezequiel Vara Larsen via users <
users@lists.open-mpi.org> wrote:
> Hello everyone,
>
> I have started to play with MPI and unikernels and I have recently
> implemented a minimal set of MPI APIs on top of Toro Uni
to the wrong mailing list
Cheers,
Gilles
On Fri, Jun 24, 2022 at 9:06 PM Guillaume De Nayer via users <
users@lists.open-mpi.org> wrote:
> Hi Gilles,
>
> MPI_COMM_WORLD is positive (4400).
>
> In a short code I wrote I have something like that:
>
> MPI_Comm_dup(MPI_COMM
Sorry if I did not make my intent clear.
I was basically suggesting to hack the Open MPI and PMIx wrappers to
hostname() and remove the problematic underscores to make the regx
components a happy panda again.
Cheers,
Gilles
- Original Message -
> I think the files suggested by Gil
Patrick,
you will likely also need to apply the same hack to opal_net_get_hostname()
in opal/util/net.c
Cheers,
Gilles
On Thu, Jun 16, 2022 at 7:30 PM Gilles Gouaillardet <
gilles.gouaillar...@gmail.com> wrote:
> Patrick,
>
> I am not sure Open MPI can do that out of the
Patrick,
I am not sure Open MPI can do that out of the box.
Maybe hacking pmix_net_get_hostname() in
opal/mca/pmix/pmix3x/pmix/src/util/net.c
can do the trick.
Cheers,
Gilles
On Thu, Jun 16, 2022 at 4:24 PM Patrick Begou via users <
users@lists.open-mpi.org> wrote:
> Hi al
not match the hostname
Cheers,
Gilles
On Thu, May 5, 2022 at 5:06 AM Scott Sayres via users <
users@lists.open-mpi.org> wrote:
> foo.sh is executable, again hangs without output.
> I command c x2 to return to shell, then
>
> ps auxwww | egrep 'mpirun|foo.sh'
> output shown
command with
mpirun --mca pml ob1 --mca btl tcp,self ...
Cheers,
Gilles
On Tue, May 3, 2022 at 7:08 PM Alois Schlögl via users <
users@lists.open-mpi.org> wrote:
>
> Within our cluster (debian10/slurm16, debian11/slurm20), with
> infiniband, and we have several instances of op
You can first double check you call
MPI_Init_thread(..., MPI_THREAD_MULTIPLE, ...)
and that the provided level is MPI_THREAD_MULTIPLE as you requested.
Cheers,
Gilles
On Fri, Apr 22, 2022, 21:45 Angel de Vicente via users <
users@lists.open-mpi.org> wrote:
> Hello,
>
> I'm runni
to the Intel compiler, I strongly encourage
you to run on Intel/AMD processors.
Otherwise, use a native compiler for aarch64, and in this case, brew is not
a bad option.
Cheers,
Gilles
On Thu, Apr 21, 2022 at 6:36 PM Cici Feng via users <
users@lists.open-mpi.org> wrote:
> Hi there,
>
Ernesto,
Not directly.
But you can use MPI_Comm_split_type(..., MPI_COMM_TYPE_SHARED, ...) and then
MPI_Comm_size(...) on the "returned" communicator.
Cheers,
Gilles
On Sun, Apr 3, 2022 at 5:52 AM Ernesto Prudencio via users <
users@lists.open-mpi.org> wrote:
> Than
Ernesto,
MPI_Bcast() has no barrier semantic.
It means the root rank can return after the message is sent (kind of eager
send) and before it is received by other ranks.
Cheers,
Gilles
On Sat, Apr 2, 2022, 09:33 Ernesto Prudencio via users <
users@lists.open-mpi.org> wrote:
suggest you give UCX - built with multi threading support
- a try and see how it goes
Cheers,
Gilles
On Thu, Mar 24, 2022 at 5:43 PM Patrick Begou via users <
users@lists.open-mpi.org> wrote:
> Le 28/02/2022 à 17:56, Patrick Begou via users a écrit :
> > Hi,
> >
> > I
in order to exclude the coll/tuned component:
mpirun --mca coll ^tuned ...
Cheers,
Gilles
On Mon, Mar 14, 2022 at 5:37 PM Ernesto Prudencio via users <
users@lists.open-mpi.org> wrote:
> Thanks for the hint on “mpirun ldd”. I will try it. The problem is that I
> am running
under the hood with matching
but different signatures.
Cheers,
Gilles
On Mon, Mar 14, 2022 at 4:09 PM Ernesto Prudencio via users <
users@lists.open-mpi.org> wrote:
> Thanks, Gilles.
>
>
>
> In the case of the application I am working on, all ranks call MPI with
> th
PETSc doing
something different but legit that evidences a bug in Open MPI.
If you have time, you can also try
- Intel compilers
- MPICH (or a derivative such as Intel MPI)
- PETSc 3.16.5
=> a success would strongly point to Open MPI
Cheers,
Gilles
On Mon, Mar 14, 2022 at 2:56 PM Ernesto
Angel,
Infiniband detection likely fails before checking expanded verbs.
Please compress and post the full configure output
Cheers,
Gilles
On Fri, Feb 18, 2022 at 12:02 AM Angel de Vicente via users <
users@lists.open-mpi.org> wrote:
> Hi,
>
> I'm trying to compile the latest
Hari,
What does
ldd solver.exe
(or whatever your clever exe file is called) report?
Cheers,
Gilles
On Mon, Jan 31, 2022 at 2:09 PM Hari Devaraj via users <
users@lists.open-mpi.org> wrote:
> Hello,
>
> I am trying to run a FEA solver exe file.
> I get this error messag
xotic" interconnects (that
might not be supported natively by Open MPI or abstraction layers) and/or
with an uncommon topology (for which collective communications are not
fully optimized by Open MPI). In the latter case, using the system/vendor
MPI is the best option performance wise.
Cheers,
Gilles
Fair enough Ralph!
I was implicitly assuming a "build once / run everywhere" use case, my bad
for not making my assumption clear.
If the container is built to run on a specific host, there are indeed other
options to achieve near-native performance.
Cheers,
Gilles
On Thu, Jan 27,
" at run time with Intel MPI (MPICH
based and ABI compatible).
Cheers,
Gilles
On Thu, Jan 27, 2022 at 1:07 PM Brian Dobbins via users <
users@lists.open-mpi.org> wrote:
>
> Hi Ralph,
>
> Thanks for the explanation - in hindsight, that makes perfect sense,
> since e
so your custom launcher would "only" have to implement a limited number
of callbacks.
Cheers,
Gilles
- Original Message -
Any pointers?
On Tue, Jan 25, 2022 at 12:55 PM Ralph Castain via users wrote:
Short answer is yes, but it is a bit complicated to do.
Could this be the root cause?
What is the PMIx library version used by SLURM?
Ralph, do you see something wrong on why Open MPI and SLURM cannot
communicate via PMIx?
Cheers,
Gilles
On Tue, Jan 25, 2022 at 5:47 PM Matthias Leopold <
matthias.leop...@meduniwien.ac.at> wrote:
> Hi Gilles
:
srun --mpi=list
will list the PMI flavors provided by SLURM
a) if PMIx is not supported, contact your sysadmin and ask for it
b) if PMIx is supported but is not the default, ask for it, for example
with
srun --mpi=pmix_v3 ...
Cheers,
Gilles
On Tue, Jan 25, 2022 at 12:30 AM
rank, n, m, v_glob);
and also resize rtype so the second element starts at v_glob[3][0] => upper
bound = (3*sizeof(int))
By the way, since this question is not Open MPI specific, sites such as
Stack Overflow are a better fit.
Cheers,
Gilles
On Thu, Dec 16, 2021 at 6:46 PM Gilles
Jonas,
Assuming v_glob is what you expect, you will need to
`MPI_Type_create_resized()` the received type so the block received
from process 1 will be placed at the right position (v_glob[3][1] => upper
bound = ((4*3+1) * sizeof(int))
Cheers,
Gilles
On Thu, Dec 16, 2021 at 6:33 PM Jo
...
Cheers,
Gilles
On Wed, Nov 3, 2021 at 6:06 PM Mccall, Kurt E. (MSFC-EV41) via users <
users@lists.open-mpi.org> wrote:
> I’m using OpenMPI 4.1.1 compiled with Nvidia’s nvc++ 20.9, and compiled
> with Torque support.
>
>
>
> I want to reserve multiple slots on each node,
Hi Ben,
have you tried
export OMPI_MCA_common_ucx_opal_mem_hooks=1
Cheers,
Gilles
On Mon, Nov 1, 2021 at 9:22 PM bend linux4ms.net via users <
users@lists.open-mpi.org> wrote:
> Ok, I a am newbie supporting the a HPC project and learning about MPI.
>
> I have the fo
simply manually remove "-dynamiclib" here and see if it helps
Cheers,
Gilles
On Fri, Oct 29, 2021 at 12:30 AM Matt Thompson via users <
users@lists.open-mpi.org> wrote:
> Dear Open MPI Gurus,
>
> This is a...confusing one. For some reason, I cannot build a working
is not happy about it.
If you can have all these 3 macros defined by the nvhpc compilers, that
would be great!
Otherwise, I will let George decide if and how Open MPI addresses this issue
Cheers,
Gilles
On Thu, Sep 30, 2021 at 11:33 PM Carl Ponder via users <
users@lists.open-mpi.org>
Ray,
there is a typo, the configure option is
--enable-mca-no-build=op-avx
Cheers,
Gilles
- Original Message -
Added --enable-mca-no-build=op-avx to the configure line. Still dies in
the same place.
CCLD mca_op_avx.la
./.libs/liblocal_ops_avx512
defined by at least GNU and LLVM compilers),
I will take a look at it when I get some time (you won't face this issue if
you use GNU compilers for C/C++)
Cheers,
Gilles
On Thu, Sep 30, 2021 at 2:31 AM Ray Muno via users
wrote:
>
> Tried this
>
> configure CC='nvc -fPIC' CXX='nvc
Jorge,
I am not that familiar with UCX, but I hope that will help:
The changes I mentioned were introduced by
https://github.com/open-mpi/ompi/pull/8549
I suspect mpirun --mca pml_ucx_tls any --mca pml_ucx_devices any --mca pml ucx
...
will do what you expect
Cheers,
Gilles
On Mon, Sep 13
.
Then you can
grep ^ompi_cv_fortran_ config.cache
to generate the file you can pass to --with-cross when cross compiling
on your x86 system
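The idea, sketched on a toy cache file (the entries are illustrative of the `ompi_cv_fortran_` prefix, not real configure results):

```shell
# Simulate a config.cache produced by a native build (entries illustrative)
cat > config.cache <<'EOF'
ompi_cv_fortran_sizeof_INTEGER=${ompi_cv_fortran_sizeof_INTEGER=4}
ompi_cv_fortran_sizeof_REAL=${ompi_cv_fortran_sizeof_REAL=4}
ac_cv_header_stdio_h=${ac_cv_header_stdio_h=yes}
EOF
# Keep only the Fortran results for --with-cross
grep ^ompi_cv_fortran_ config.cache > cross_values
cat cross_values
```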
Cheers,
Gilles
On 9/7/2021 7:35 PM, Jeff Hammond via users wrote:
I am attempting to cross-compile Open-MPI for RISC-V on an x86
system. I get this error
.
Cheers,
Gilles
On 8/26/2021 2:42 PM, Broi, Franco via users wrote:
Thanks Gilles but no go...
/usr/lib64/openmpi/bin/mpirun -c 1 --mca pml ^ucx
/home/franco/spawn_example 47
I'm the parent on fsc07
Starting 47 children
Process 1 ([[48649,2],32]) is on host: fsc08
Process 2 ([[48649,1
...
and see if it helps
Cheers,
Gilles
On Thu, Aug 26, 2021 at 2:13 PM Broi, Franco via users <
users@lists.open-mpi.org> wrote:
> Hi,
>
> I have 2 example progs that I found on the internet (attached) that
> illustrate a problem we are having launching multiple node jobs wi
to evaluate them and
report the performance numbers.
On 7/20/2021 11:00 PM, Dave Love via users wrote:
Gilles Gouaillardet via users writes:
One motivation is packaging: a single Open MPI implementation has to be
built, that can run on older x86 processors (supporting only SSE) and the
latest
One motivation is packaging: a single Open MPI implementation has to be
built, that can run on older x86 processors (supporting only SSE) and the
latest ones (supporting AVX512). The op/avx component will select at
runtime the most efficient implementation for vectorized reductions.
On Mon, Jul
Hi Jeff,
Assuming you did **not** explicitly configure Open MPI with
--disable-dlopen, you can try
mpirun --mca pml ob1 --mca btl vader,self ...
Cheers,
Gilles
On Thu, Jun 24, 2021 at 5:08 AM Jeff Hammond via users <
users@lists.open-mpi.org> wrote:
> I am running on a single no
anymore.
if you really want to use pml/ucx on your notebook, you need to
manually re-enable these providers.
That being said, your best choice here is really not to force any pml,
and let Open MPI use pml/ob1
(that has support for both TCP and shared memory)
Cheers,
Gilles
On Sat, May 29, 2021
should not happen if building from an official tarball though.
Cheers,
Gilles
- Original Message -
Hi John,
I don’t think an external dependency is going to fix this.
In your build area, do you see any .lo files in
opal/util/keyval
?
Which compiler are you using
of a compiler bug.
Anyway, I issued https://github.com/open-mpi/ompi/pull/8789 and asked
for a review.
Cheers,
Gilles
- Original Message -
> Dear Gilles,
> As per your suggestion, I tried the inline patch
as discussed in
https://github.com/open-mpi/ompi/pull/8622#issuec
).
IIRC, there is also an option in the Intel compiler to statically link
to the runtime.
Cheers,
Gilles
On Wed, Apr 7, 2021 at 9:00 AM Heinz, Michael William via users
wrote:
>
> I’m having a heck of a time building OMPI with Intel C. Compilation goes
> fine, installation goes fine, compi
Luis,
this file is never compiled when an external hwloc is used.
Please open a github issue and include all the required information
Cheers,
Gilles
On Tue, Mar 23, 2021 at 5:44 PM Luis Cebamanos via users
wrote:
>
> Hello,
>
> Compiling OpenMPI 4.0.5 with Intel 2020 I
Matt,
you can either
mpirun --mca btl self,vader ...
or
export OMPI_MCA_btl=self,vader
mpirun ...
you may also add
btl = self,vader
in your /etc/openmpi-mca-params.conf
and then simply
mpirun ...
Cheers,
Gilles
On Fri, Mar 19, 2021 at 5:44 AM Matt Thompson via users
wrote:
>
> Pr
Anthony,
Did you make sure you can compile a simple fortran program with
gfortran? and gcc?
Please compress and attach both openmpi-config.out and config.log, so
we can diagnose the issue.
Cheers,
Gilles
On Mon, Mar 8, 2021 at 6:48 AM Anthony Rollett via users
wrote:
>
> I am
On top of XPMEM, try to also force btl/vader with
mpirun --mca pml ob1 --mca btl vader,self, ...
On Fri, Mar 5, 2021 at 8:37 AM Nathan Hjelm via users
wrote:
>
> I would run the v4.x series and install xpmem if you can
> (http://github.com/hjelmn/xpmem). You will need to build with
>
yes, you need to (re)build Open MPI from source in order to try this trick.
On 2/26/2021 3:55 PM, LINUS FERNANDES via users wrote:
No change.
What do you mean by running configure?
Are you expecting me to build OpenMPI from source?
On Fri, 26 Feb 2021, 11:16 Gilles Gouaillardet via users
>>>> Errno==13 is EACCESS, which generically translates to "permission denied".
>>>> Since you're running as root, this suggests that something outside of
>>>> your local environment (e.g., outside of that immediate layer of
s which I
> obviously can't on Termux since it doesn't support OpenJDK.
>
> On Thu, 25 Feb 2021, 13:37 Gilles Gouaillardet via users,
> wrote:
>>
>> Is SELinux running on ArchLinux under Termux?
>>
>> On 2/25/2021 4:36 PM, LINUS FERNANDES via users wrote:
Is SELinux running on ArchLinux under Termux?
On 2/25/2021 4:36 PM, LINUS FERNANDES via users wrote:
Yes, I did not receive this in my inbox since I set to receive digest.
ifconfig output:
dummy0: flags=195 mtu 1500
inet6 fe80::38a0:1bff:fe81:d4f5 prefixlen 64 scopeid 0x20
Can you run
ifconfig
or
ip addr
in both Termux and ArchLinux for Termux?
On 2/25/2021 2:00 PM, LINUS FERNANDES via users wrote:
Why do I see the following error messages when executing |mpirun| on
ArchLinux for Termux?
The same program executes on Termux without any glitches.
Diego,
IIRC, you now have to build your gfortran 10 apps with
-fallow-argument-mismatch
Cheers,
Gilles
- Original Message -
Dear OPENMPI users,
i'd like to notify you a strange issue that arised right after
installing a new up-to-date version of Linux (Kubuntu 20.10, with gcc
This is not an Open MPI question, and hence not a fit for this mailing
list.
But here we go:
first, try
cmake -DGMX_MPI=ON ...
if it fails, try
cmake -DGMX_MPI=ON -DCMAKE_C_COMPILER=mpicc -DCMAKE_CXX_COMPILER=mpicxx .
..
Cheers,
Gilles
- Original Message -
Hi, MPI
containing AVX512 code.
That being said, several changes were made to the op/avx component,
so if you are experiencing some crashes, I do invite you to give a try to the
latest nightly snapshot for the v4.1.x branch.
Cheers,
Gilles
On Wed, Feb 10, 2021 at 10:43 PM Max R. Dechantsreiter via users
wrote
of master and worker?
Cheers,
Gilles
On Fri, Feb 5, 2021 at 5:59 AM Martín Morales via users
wrote:
>
> Hello all,
>
>
>
> Gilles, unfortunately, the result is the same. Attached the log you ask me.
>
>
>
> Jeff, some time ago I tried with OMPI 4.1.0 (Lin
btl_tcp_base_verbose 20 ...
and then compress and post the logs so we can have a look
Cheers,
Gilles
On Thu, Feb 4, 2021 at 9:33 PM Martín Morales via users
wrote:
>
> Hi Marcos,
>
>
>
> Yes, I have a problem with spawning to a “worker” host (on localhost, works).
> There are just
(or
other filesystem if you use a non standard $TMPDIR
Cheers,
Gilles
On Fri, Jan 29, 2021 at 10:50 PM Diego Zuccato via users
wrote:
>
> Hello all.
>
> I'm having a problem with a job: if it gets scheduled on a specific node
> of our cluster, it f
/.../libmpi_mpifh.so | grep igatherv
and confirm the symbol does indeed exists
Cheers,
Gilles
On Tue, Jan 19, 2021 at 7:39 PM Passant A. Hafez via users
wrote:
>
> Hello Gus,
>
>
> Thanks for your reply.
>
> Yes I've read multiple threads for very old versions of OMPI and PGI before
>
hold
for this kind of BS. However, I have a much lower threshold for your
gross mischaracterization of the Open MPI community, its values, and how
the work gets done.
Cheers,
Gilles
Daniel,
the test works in my environment (1 node, 32 GB memory) with all the
mentioned parameters.
Did you check the memory usage on your nodes and made sure the oom
killer did not shoot any process?
Cheers,
Gilles
On Tue, Jan 12, 2021 at 1:48 AM Daniel Torres via users
wrote:
>
,
Gilles
On Sat, Jan 9, 2021 at 8:40 AM Sajid Ali via users
wrote:
>
> Hi OpenMPI-community,
>
> This is a cross post from the following spack issue :
> https://github.com/spack/spack/issues/20756
>
> In brief, when I install openmpi@4.1.0 with ucx and xpmem fabrics, the
> b
Daniel,
Can you please post the full error message and share a reproducer for
this issue?
Cheers,
Gilles
On Fri, Jan 8, 2021 at 10:25 PM Daniel Torres via users
wrote:
>
> Hi all.
>
> Actually I'm implementing an algorithm that creates a process grid and
> divides it into
Vineet,
probably *not* what you expect, but I guess you can try
$ cat host-file
host1 slots=3
host2 slots=3
host3 slots=3
$ mpirun -hostfile host-file -np 2 ./EXE1 : -np 1 ./EXE2 : -np 2
./EXE1 : -np 1 ./EXE2 : -np 2 ./EXE1 : -np 1 ./EXE2
Cheers,
Gilles
On Mon, Dec 21, 2020 at 10:26 PM
Hi Patrick,
Glad to hear you are now able to move forward.
Please keep in mind this is not a fix but a temporary workaround.
At first glance, I did not spot any issue in the current code.
It turned out that the memory leak disappeared when doing things differently
Cheers,
Gilles
On Mon, Dec
) but not with the v3.1.x branch
(this suggests there could be an error in the latest Open MPI ... or
in the code)
- the attached patch seems to have a positive effect, can you please
give it a try?
Cheers,
Gilles
On 12/7/2020 6:15 PM, Patrick Bégou via users wrote:
Hi,
I've written a small piece
10 --mca mtl_base_verbose 10 --mca
btl_base_verbose 10 ...
will point you to the component(s) used.
The output is pretty verbose, so feel free to compress and post it if
you cannot decipher it
Cheers,
Gilles
On 12/4/2020 4:32 PM, Patrick Bégou via users wrote:
Hi George and Gilles
will greatly help us debugging this issue.
Cheers,
Gilles
On 12/4/2020 7:20 AM, George Bosilca via users wrote:
Patrick,
I'm afraid there is no simple way to check this. The main reason being
that OMPI use handles for MPI objects, and these handles are not
tracked by the library
will be very much appreciated in order to improve ompio
Cheers,
Gilles
On Thu, Dec 3, 2020 at 6:05 PM Patrick Bégou via users
wrote:
>
> Thanks Gilles,
>
> this is the solution.
> I will set OMPI_MCA_io=^ompio automatically when loading the parallel
> hdf5 module on the cluster.
Patrick,
In recent Open MPI releases, the default component for MPI-IO is ompio
(and no more romio)
unless the file is on a Lustre filesystem.
You can force romio with
mpirun --mca io ^ompio ...
Cheers,
Gilles
On 12/3/2020 4:20 PM, Patrick Bégou via users wrote:
Hi,
I'm using
Dean,
That typically occurs when some nodes have multiple interfaces, and
several nodes have a similar IP on a private/unused interface.
I suggest you explicitly restrict the interface Open MPI should be using.
For example, you can
mpirun --mca btl_tcp_if_include eth0 ...
Cheers,
Gilles
but
you should first ask yourself if you really want to run 12 MPI tasks
on your machine.
Cheers,
Gilles
On Sun, Nov 8, 2020 at 11:14 AM Paul Cizmas via users
wrote:
>
> Hello:
>
> I just installed OpenMPI 4.0.5 on a Linux machine running Pop!_OS (made by
> System76). The worksta
ll Request to
get some help fixing the missing bits.
Cheers,
Gilles
On Sun, Nov 1, 2020 at 12:18 PM Ognen Duzlevski via users
wrote:
>
> Hello!
>
> If I wanted to support a specific filesystem in open mpi, how is this
> done? What code in the source tree does it?
>
> Thanks!
> Ognen
Alan,
thanks for the report, I addressed this issue in
https://github.com/open-mpi/ompi/pull/8116
As a temporary workaround, you can apply the attached patch.
FWIW, f18 (shipped with LLVM 11.0.0) is still in development and uses
gfortran under the hood.
Cheers,
Gilles
On Wed, Oct 21, 2020
Hi Jorge,
If a firewall is running on your nodes, I suggest you disable it and try again
Cheers,
Gilles
On Wed, Oct 21, 2020 at 5:50 AM Jorge SILVA via users
wrote:
>
> Hello,
>
> I installed kubuntu20.4.1 with openmpi 4.0.3-0ubuntu in two different
> computers in the standard
ucx but manually change the bcast algo
mpirun --mca coll_tuned_use_dynamic_rules 1 --mca
coll_tuned_bcast_algorithm 1 ...
/* you can replace the bcast algorithm with any value between 1 and 7
included */
Cheers,
Gilles
On Mon, Aug 24, 2020 at 10:58 PM Patrick McNally via users
wrote:
>
and
a multithreaded BLAS
(e.g. PxQ = 2x4 and 4 OpenMP threads per MPI task)
Cheers,
Gilles
On Mon, Aug 10, 2020 at 3:31 AM John Duffy via users
wrote:
>
> Hi
>
> I have generated this problem myself by tweaking the MTU of my 8 node
> Raspberry Pi 4 cluster to 9000 bytes, but I would be g