Hi Devendar,

Thank you again for your answer.

I searched a little and found that UD stands for "Unreliable Datagram",
while RC stands for the "Reliable Connected" transport mechanism. I also found
another one called DC, for "Dynamically Connected", which is not supported on
our HCA.

Do you know what the basic difference between them is?

I didn't find any information about this.

Which one is used by btl=openib (ibverbs)? Is it RC?

Also, are they all standard, or are some of them supported only by Mellanox?

I will try to convince the admin of the system I'm using to increase the
maximum shared memory segment size (SHMMAX). I guess what we have (32 MB) is
the default. But I didn't find any document suggesting that we should increase
SHMMAX to help MXM. This is a bit odd: if it's important, it should at least
be mentioned in the Mellanox documentation.
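
For reference, here is a minimal sketch of how the current limit can be
checked, assuming a Linux system where the value is exposed in bytes at
/proc/sys/kernel/shmmax. The helper names (read_shmmax, bytes_to_mib) are
hypothetical and only for illustration.

```python
from pathlib import Path

# Standard Linux location of the SHMMAX limit (value is in bytes).
SHMMAX_PATH = Path("/proc/sys/kernel/shmmax")


def bytes_to_mib(n_bytes: int) -> float:
    """Convert a byte count to mebibytes."""
    return n_bytes / (1024 * 1024)


def read_shmmax(path: Path = SHMMAX_PATH):
    """Return SHMMAX in bytes, or None if the file cannot be read or parsed."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    value = read_shmmax()
    if value is None:
        print(f"could not read {SHMMAX_PATH}")
    else:
        print(f"SHMMAX = {value} bytes ({bytes_to_mib(value):.1f} MiB)")
```

Raising the limit needs root (for example through sysctl kernel.shmmax). Which
value MXM actually needs is not something I have found documented, so the
target size remains an open question.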

I will certainly run the message rate benchmark osu_mbw_mr to see if its
results are improved by MXM.
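
A rough sketch of the two runs I have in mind, one per MXM_TLS setting
mentioned in your reply. It only prints the command lines. The choice of
"--mca pml cm --mca mtl mxm" to select MXM, the process count, and the
benchmark path are assumptions on my part and may need adjusting for our
installation; mbw_mr_command is a hypothetical helper name.

```python
import shlex


def mbw_mr_command(tls, nprocs=2, benchmark="./osu_mbw_mr"):
    """Build an mpirun command line for osu_mbw_mr with a given MXM transport.

    tls is the first entry of MXM_TLS ("ud" or "rc"); shm and self are
    appended as in the settings quoted in the reply.
    """
    return [
        "mpirun", "-np", str(nprocs),
        # Assumption: MXM is selected through the cm PML and the mxm MTL.
        "--mca", "pml", "cm",
        "--mca", "mtl", "mxm",
        # -x exports the environment variable to the launched processes.
        "-x", f"MXM_TLS={tls},shm,self",
        benchmark,
    ]


if __name__ == "__main__":
    for tls in ("ud", "rc"):
        print(shlex.join(mbw_mr_command(tls)))
```

Comparing the two outputs, together with a plain btl=openib run, should show
whether the transport choice matters at our scale.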

After looking at the MPI performance results published on your URL (e.g.
latencies around 1 us in native mode), I'm more and more convinced that our
results are suboptimal.

And after seeing the impact of SR-IOV reported at your URL, I increasingly
suspect that our mediocre latency is caused by this mechanism.

But our cluster is different: SR-IOV is not used in the context of Virtual
Machines running under a host VMM. SR-IOV is used with Linux LXC containers.


Martin Audet


> Hi Martin
>
> MXM default transport is UD (MXM_TLS=ud,shm,self), which is scalable when
> running with large applications. RC (MXM_TLS=rc,shm,self) is recommended
> for microbenchmarks and very small scale applications.
>
> yes, max seg size setting is too small.
>
> Did you check any message rate benchmarks(like osu_mbw_mr) with MXM?
>
> virtualization env will have some overhead.  see some perf comparison here
> with mvapich
> http://mvapich.cse.ohio-state.edu/performance/v-pt_to_pt/ .



_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
