[OMPI users] "self scheduled" work & mpi receive???

2010-09-23 Thread Lewis, Ambrose J.
Hi All:

I've written an openmpi program that "self schedules" the work.  

The master task is in a loop chunking up an input stream and handing off
jobs to worker tasks.  At first the master gives the next job to the
next highest rank.  After all ranks have their first job, the master
waits via an MPI receive call for the next free worker.  The master
parses out the rank from the MPI receive and sends the next job to this
node.  The jobs aren't all identical, so they run for slightly different
durations based on the input data.

 

When I plot a histogram of the number of jobs each worker performed, the
lower mpi ranks are doing much more work than the higher ranks.  For
example, in a 120 process run, rank 1 did 32 jobs while rank 119 only
did 2.  My guess is that openmpi returns the lowest rank from the MPI
Recv when I've got MPI_ANY_SOURCE set and multiple sends have happened
since the last call.

 

Is there a different Recv call to make that will spread out the data
better?
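
The imbalance described above can be reproduced with a toy discrete-event model in plain Python (no MPI; all names and numbers are illustrative, not taken from the thread): assume the master spends a fixed amount of time handing out each job, and compare a matching policy that always serves the lowest-ranked idle worker — Lewis's guess about how the MPI_ANY_SOURCE receive resolves — against one that picks an idle worker at random.

```python
import random

def simulate(num_workers, num_jobs, dispatch_cost, pick, seed=1):
    """Toy model of a self-scheduling master; pick(idle, rng) chooses
    which idle rank is matched next, mimicking how a receive with
    MPI_ANY_SOURCE might resolve among several pending senders."""
    rng = random.Random(seed)
    ready = [0.0] * num_workers       # time each worker next goes idle
    counts = [0] * num_workers
    t = 0.0                           # the master's clock
    for _ in range(num_jobs):
        idle = [r for r in range(num_workers) if ready[r] <= t]
        if not idle:                  # all workers busy: wait for one
            t = min(ready)
            idle = [r for r in range(num_workers) if ready[r] <= t]
        r = pick(idle, rng)
        counts[r] += 1
        t += dispatch_cost            # master busy handing out the job
        ready[r] = t + rng.uniform(0.9, 1.1)  # slightly varying runtimes
    return counts

# lowest-rank-first matching vs. a random choice among the idle set
lowest = simulate(12, 240, 0.3, lambda idle, rng: min(idle))
spread = simulate(12, 240, 0.3, lambda idle, rng: rng.choice(idle))
```

With these numbers the lowest-rank policy funnels nearly all jobs to the first few ranks while the random policy spreads them out, even though the master is the bottleneck either way, so total throughput is the same.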

 

THANXS!

amb

 



Re: [OMPI users] "self scheduled" work & mpi receive???

2010-09-23 Thread pooja varshneya
Hi Lewis,

On Thu, Sep 23, 2010 at 9:38 AM, Lewis, Ambrose J. wrote:
> Hi All:
>
> I’ve written an openmpi program that “self schedules” the work.
>
> The master task is in a loop chunking up an input stream and handing off
> jobs to worker tasks.  At first the master gives the next job to the next
> highest rank.  After all ranks have their first job, the master waits via an
> MPI receive call for the next free worker.  The master parses out the rank
> from the MPI receive and sends the next job to this node.  The jobs aren’t
> all identical, so they run for slightly different durations based on the
> input data.
>
>
>
> When I plot a histogram of the number of jobs each worker performed, the
> lower mpi ranks are doing much more work than the higher ranks.  For
> example, in a 120 process run, rank 1 did 32 jobs while rank 119 only did 2.
>  My guess is that openmpi returns the lowest rank from the MPI Recv when
> I’ve got MPI_ANY_SOURCE set and multiple sends have happened since the last
> call.
>

How long does each computation take? Is it possible that the computation
time for the longer tasks is much greater than for the shorter ones?

>
>
> Is there a different Recv call to make that will spread out the data better?
>
>
>
> THANXS!
>
> amb
>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



Re: [OMPI users] "self scheduled" work & mpi receive???

2010-09-23 Thread Bowen Zhou






> Is there a different Recv call to make that will spread out the data better?

 
How about using MPI_Irecv? Let the master issue an MPI_Irecv for each 
worker and call MPI_Test to get the list of idle workers, then choose 
one from the idle list by some randomization?
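
A rough sketch of that bookkeeping in plain Python (no MPI — this is not code from the thread; `completed()` is a stand-in for polling the outstanding MPI_Irecv requests with MPI_Test or MPI_Testsome, and the random pop models choosing among the idle workers):

```python
import random

rng = random.Random(42)
NUM_WORKERS = 8

def completed(outstanding):
    """Stand-in for testing the posted receives: in this toy version a
    random subset of the outstanding workers 'finishes' on each poll."""
    return [r for r in outstanding if rng.random() < 0.5]

jobs_left = 100
outstanding = set(range(NUM_WORKERS))   # one receive posted per worker
idle = []
counts = [0] * NUM_WORKERS
while jobs_left > 0:
    for r in completed(outstanding):    # harvest every finished receive,
        outstanding.discard(r)          # not just the first match
        idle.append(r)
    while idle and jobs_left > 0:
        r = idle.pop(rng.randrange(len(idle)))  # random idle worker
        counts[r] += 1                          # "send" the next job
        outstanding.add(r)                      # re-post its receive
        jobs_left -= 1
```

Because every finished receive is harvested before any assignment is made, no rank is systematically preferred.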












Re: [OMPI users] "self scheduled" work & mpi receive???

2010-09-23 Thread Lewis, Ambrose J.
That's a great suggestion...Thanks!
amb


-Original Message-
From: users-boun...@open-mpi.org on behalf of Bowen Zhou
Sent: Thu 9/23/2010 1:18 PM
To: Open MPI Users
Subject: Re: [OMPI users] "self scheduled" work & mpi receive???
 




> How about using MPI_Irecv? Let the master issue an MPI_Irecv for each 
> worker and call MPI_Test to get the list of idle workers, then choose 
> one from the idle list by some randomization?





Re: [OMPI users] "self scheduled" work & mpi receive???

2010-09-23 Thread Mikael Lavoie
Hi Ambrose,

I'm interested in your work; I have an app of my own to convert, and I don't
know the MPI structure and syntax well enough to do it...

So if you're willing to share your app, I'd be interested in taking a look at it!!

Thanks and have a nice day!!

Mikael Lavoie


Re: [OMPI users] "self scheduled" work & mpi receive???

2010-09-24 Thread Richard Treumann
Amb 

It sounds like you have more workers than you can keep fed. Workers are 
finishing up and requesting their next assignment but sit idle because 
there are so many other idle workers too.

Load balance does not really matter if the choke point is the master.  The 
work is being done as fast as the master can hand it out.

Consider using fewer workers and seeing if your load balance improves and 
your total throughput stays the same. If you want to use all the workers you 
have efficiently, you need to find a way to make the master deliver 
assignments as fast as workers finish them. 

Compute processes do not care about fairness. Having half the processes 
busy 100% of the time and the other half idle vs. having all the 
processes busy 50% of the time gives the same throughput, and the hard 
workers will not complain. 
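
That argument can be checked with a back-of-envelope model (plain Python, illustrative numbers only — not from the thread): once there are enough workers to keep the master busy, adding more does not shorten the run.

```python
def makespan(num_workers, num_jobs, dispatch_cost, job_time):
    """Time to finish all jobs when the master spends dispatch_cost
    handing out each one and every job runs for job_time."""
    ready = [0.0] * num_workers   # when each worker next becomes free
    t = 0.0                       # the master's clock
    for _ in range(num_jobs):
        t = max(t, min(ready))    # wait until some worker is free
        w = ready.index(min(ready))
        t += dispatch_cost        # hand the job out
        ready[w] = t + job_time
    return max(ready)

# 5 workers already saturate a master that needs 0.3 units per 1.0-unit
# job; 120 workers finish no sooner, while 2 workers are genuinely slower.
few, many, two = (makespan(n, 400, 0.3, 1.0) for n in (5, 120, 2))
```

With a 0.3-unit dispatch cost and 1.0-unit jobs, roughly five workers keep the master fully occupied; beyond that, extra workers only pad the idle pool.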


Dick Treumann  -  MPI Team 
IBM Systems & Technology Group
Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
Tele (845) 433-7846 Fax (845) 433-8363








Re: [OMPI users] "self scheduled" work & mpi receive???

2010-09-24 Thread Lewis, Ambrose J.
Good points...I'll see if anything can be done to speed up the master.  If we 
can shrink the number of MPI processes without hurting overall throughput, maybe 
I could save enough to fit another run on the freed cores.  Thanks for the 
ideas!
I was also worried about contention on the nodes, since I'm running multiple MPI 
processes on the same multi-core box.  A typical run is 120 MPI processes on 5 
nodes, each with 24 cores.  I may play a little with the "--bynode" parameter to 
see if it has any significant effect.
THANXS
amb

