Gilles Gouaillardet, on ven. 21 juil. 2017 10:57:36 +0900, wrote:
> if you are fine with using more memory, and your application should not
> generate too many unexpected messages, then you can bump the eager_limit,
> for example:
>
> mpirun --mca btl_tcp_eager_limit $((8*1024*1024+128)) ...
Hello,
George Bosilca, on jeu. 20 juil. 2017 19:05:34 -0500, wrote:
> Can you reproduce the same behavior after the first batch of messages ?
Yes, putting a loop around the whole series of communications, even
with a 1-second pause in between, reproduces the same behavior.
> Assuming the
Sam,

this example uses 8 MB messages.

If you are fine with using more memory, and your application should not
generate too many unexpected messages, then you can bump the eager_limit,
for example:

mpirun --mca btl_tcp_eager_limit $((8*1024*1024+128)) ...

worked for me.
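In case editing the mpirun command line is inconvenient, Open MPI also reads MCA parameters from `OMPI_MCA_*` environment variables; a small sketch, reusing the 8 MB + 128 B value from the example above:

```shell
# Eager limit from the example above: 8 MB plus 128 bytes of headroom.
EAGER_LIMIT=$((8*1024*1024+128))
echo "$EAGER_LIMIT"   # prints 8388736

# Open MPI maps OMPI_MCA_<param> environment variables onto MCA
# parameters, so exporting this is equivalent to passing
# "--mca btl_tcp_eager_limit" on the mpirun command line:
export OMPI_MCA_btl_tcp_eager_limit="$EAGER_LIMIT"
```

The environment form is handy when mpirun is buried inside a job script or wrapper you would rather not modify.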
George,
in
Sam,
Open MPI aggregates messages only when network constraints prevent the
messages from being delivered in a timely manner. In this particular case
I think that our delayed business-card exchange and connection setup is
delaying the delivery of the first batch of messages (and the BTL will
aggregate them
Hello,
We are hitting a serious performance issue, which is due to a missing
pipelining behavior in Open MPI when running over TCP. I have attached
a test case. Basically, what it does is:

if (myrank == 0) {
    for (i = 0; i < N; i++)
        MPI_Isend(...);
} else {
    for (i = 0; i < N; i++)
        MPI_Irecv(...);
}
I see... Now it all makes sense. Since Cpus_allowed(_list) shows the effective
CPU mask, I expected Mems_allowed(_list) would do the same.
Thanks for the clarification.
Cheers,
Hristo
-----Original Message-----
From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of Brice Goglin
Hello
Mems_allowed_list is what your current cgroup/cpuset allows. It is
different from what mbind/numactl/hwloc/... change.
The former is a root-only restriction that cannot be ignored by
processes placed in that cgroup.
The latter is a user-changeable binding that must be inside the former.
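To see this distinction on a Linux box, the cgroup/cpuset view can be read directly from procfs (output is machine-dependent; the field names are standard):

```shell
# Cpus_allowed_list / Mems_allowed_list in /proc/<pid>/status report the
# cgroup/cpuset restriction on the process, not any mbind/numactl/hwloc
# binding the process has applied to itself:
grep -E '^(Cpus|Mems)_allowed_list' /proc/self/status
```

Running the same grep under `numactl --membind=0 ...` would still list all of the cgroup's allowed nodes in Mems_allowed_list, since mbind only narrows the binding within that mask, as described above.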