Your code looks correct. There are a few things I would change to improve it:
- There are too many calls to clock(). I would move the operations on
"time" (the variable) outside the outer loop.
- Replace the 2 non-scalable constructs used to gather the 2 times on the
root, either by an MPI_Reduce(+)
Coming back to this discussion after a long time, let me clarify a few
issues that you have addressed.
1. Yes, the list of communicators in G is ordered in the same way on all
processes.
2. I am now using "mcComm != MPI_COMM_NULL" for the participation check. I
have not seen much improvement, but it's
On Tue, Nov 7, 2017 at 6:09 PM, Konstantinos Konstantinidis <
kostas1...@gmail.com> wrote:
OK, I will try to explain a few more things about the shuffling and I have
attached only specific excerpts of the code to avoid confusion. I have
added many comments.
First, let me note that this project is an implementation of the Terasort
benchmark with a master node which assigns jobs to the
If each process sends a different amount of data, then the operation should
be an allgatherv. This also requires that you know the amount each process
will send, so you will need an allgather first. Schematically the code
should look like the following:
long bytes_send_count = endata.size * sizeof(long);
OK, I started implementing the above Allgather() idea without success
(segmentation fault), so I will post the problematic lines here:

  comm.Allgather(&(endata.size), 1, MPI::UNSIGNED_LONG_LONG,
                 &(endata_rcv.size), 1, MPI::UNSIGNED_LONG_LONG);
  endata_rcv.data = new unsigned
On Sun, Nov 5, 2017 at 10:23 PM, Konstantinos Konstantinidis <
kostas1...@gmail.com> wrote:
Hi George,
First, let me note that the cost of [q^(k-1)]*(q-1) communicators was fine
for the values of the parameters q,k I am working with. Also, the whole
point of speeding up the shuffling phase is trying to reduce this number
even more (compared to already known implementations), which is a major
It really depends on what you are trying to achieve. If the question is
rhetorical: "can I write code that does parallel broadcasts on
independent groups of processes?" then the answer is yes, this is
certainly possible. If, however, you add a hint of practicality to your
question, "can I write an
Let me clarify one thing.
When I said "there are q-1 groups that can communicate in parallel at the
same time" I meant that this is possible at any particular time. So at the
beginning we have q-1 groups that could communicate in parallel, then
another set of q-1 groups and so on until we exhaust
Assume that we have K = q*k nodes (slaves), where q,k are positive
integers >= 2.
Based on the scheme that I am currently using, I create [q^(k-1)]*(q-1)
groups (along with their communicators). Each group consists of k nodes,
and within each group exactly k broadcasts take place (each node broadcasts