> So, you mean that it guarantees the value received after the bcast call is
> consistent with value sent from root, but it doesn't have to wait till all
> the ranks have received it?
>
> this is what i believe, double checking the standard might not hurt though
> ...
>

No collective is required to have barrier semantics except the barrier
itself, although some collectives synchronize in practice because of
data dependencies for non-zero counts (allgather, alltoall, allreduce).
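
To make the data-dependency argument concrete, here is a toy sketch in
Python. It is not MPI code and not taken from the standard: the function
name earliest_exit and the zero-cost network are assumptions made purely
for illustration. Given the time each rank enters a collective, it
returns the earliest time each rank could leave if only data
dependencies held it back.

```python
def earliest_exit(collective, entry, root=0):
    """Lower bound on each rank's exit time from data dependencies alone.

    Hypothetical helper: `entry[r]` is the time rank r enters the call,
    and communication itself is assumed to cost nothing.
    """
    last = max(entry)
    if collective in ("allreduce", "allgather", "alltoall"):
        # Every rank's output needs every rank's input (non-zero counts),
        # so nobody can leave before the last rank has arrived.
        return [last for _ in entry]
    if collective in ("bcast", "scatter"):
        # A rank only needs the root's data; the root needs nothing.
        return [max(t, entry[root]) for t in entry]
    if collective in ("reduce", "gather"):
        # Only the root needs everyone's input; the others just send.
        return [last if r == root else t for r, t in enumerate(entry)]
    raise ValueError(f"unknown collective: {collective}")


if __name__ == "__main__":
    entry = [0.0, 5.0, 1.0, 9.0]  # made-up entry times for 4 ranks
    for name in ("allreduce", "bcast", "reduce"):
        print(name, earliest_exit(name, entry))
```

In this model only the all-to-all style collectives force every rank to
wait for the slowest one; for bcast and reduce the wait is limited to
the ranks that actually consume data.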

Reduce, bcast, gather, and scatter should never have barrier semantics and
should not synchronize more than the explicit data dependencies require. The
send-only ranks may return long before the recv-only ranks do, particularly
when the messages go via an eager protocol.
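
A rough way to see how far apart the return times can be is to model a
binomial-tree broadcast with eager sends. The sketch below is Python,
not MPI, and everything in it is an assumption for illustration: the
function name eager_bcast_exit_times, the tree shape, and the overhead
and latency values. Real implementations choose their own algorithms and
protocol thresholds.

```python
def eager_bcast_exit_times(nranks, overhead=1.0, latency=10.0):
    """Exit time of each rank in a toy binomial-tree broadcast from rank 0.

    Assumptions (hypothetical model): an eager send returns after
    `overhead` time units of local work, the message lands `latency`
    units later, and all ranks enter the call at time 0.
    """
    have = [0.0] * nranks   # time at which each rank holds the data
    exit_times = [0.0] * nranks
    for r in range(nranks):  # parents have lower ranks than children
        t = have[r]
        step = 1
        while step <= r:     # skip the rounds before r received the data
            step *= 2
        while r + step < nranks:
            t += overhead                  # eager send completes locally
            have[r + step] = t + latency   # child receives it later
            step *= 2
        exit_times[r] = t    # leave once the last local send is done
    return exit_times


if __name__ == "__main__":
    times = eager_bcast_exit_times(8)
    print("root exits at", times[0], "last rank exits at", max(times))
```

With these made-up numbers the root only pays for its own sends and is
gone well before the deepest leaf has received anything.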

One can imagine barrier as a 1-byte allreduce, but there are more efficient
implementations. Allreduce should never be faster than bcast, as Gilles
explained.

There's a nice paper on self-consistent performance of MPI implementations
that has lots of details.

Jeff


-- 
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/
