Matthew Knepley <knepley at gmail.com> writes:

> On Mon, May 13, 2013 at 7:47 AM, Anton Popov <popov at uni-mainz.de> wrote:
>
>> Hi guys,
>>
>> I need "GlobalToLocal" simultaneously about five vectors composed with
>> different DMs.
>> What do you think will do better?
>>
>> 1) post all Begins first, followed by all Ends
>> 2) post each Begin-End couple one after another
>>
>
> It does not really matter unless you have work that can be done in the
> middle.

And the fine-tuning depends on the network hardware. If the implementation is good, posting all the Begins first should be faster because it allows more overlap, but that causes more "out of order" network traffic, so it doesn't always work out that way. Some implementations do message coalescing (subject to a size threshold). In most cases, creating one fat vector that contains all five small ones will be faster on the network.
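
To make the "one fat vector" idea concrete, here is a rough sketch in plain C with no MPI or PETSc calls. The names pack_vectors and unpack_vectors are made up for illustration and are not part of any library. It only shows the bookkeeping: copy the small arrays back to back into one contiguous buffer, remember the offsets, and split the buffer again afterwards. The communication step itself is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers for illustration; not part of PETSc or MPI. */

/* Copy n small arrays back to back into the contiguous buffer `fat`.
 * offsets[i] receives the start of array i inside `fat`, and
 * offsets[n] receives the total number of entries packed.
 * `offsets` must have room for n + 1 entries. */
size_t pack_vectors(const double *const vecs[], const size_t lens[], size_t n,
                    double *fat, size_t fat_cap, size_t offsets[])
{
    size_t pos = 0;
    for (size_t i = 0; i < n; ++i) {
        assert(pos + lens[i] <= fat_cap); /* the fat buffer must be large enough */
        offsets[i] = pos;
        memcpy(fat + pos, vecs[i], lens[i] * sizeof(double));
        pos += lens[i];
    }
    offsets[n] = pos;
    return pos;
}

/* Split the fat buffer back into the n small arrays, using the
 * offsets recorded by pack_vectors. */
void unpack_vectors(const double *fat, const size_t offsets[], size_t n,
                    double *const vecs[])
{
    for (size_t i = 0; i < n; ++i) {
        size_t len = offsets[i + 1] - offsets[i];
        memcpy(vecs[i], fat + offsets[i], len * sizeof(double));
    }
}
```

With a layout like this, a single exchange on the fat buffer would stand in for five separate ones, at the price of the extra copies in and out. Whether that trade pays off depends on the vector sizes and the network, as noted above.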
