The MPI standard does not provide explicit support for process
migration. However, some MPI implementations (including Open MPI) have
integrated such support based on checkpoint/restart functionality. For
more information about the checkpoint/restart process migration
functionality in Open MPI, see the links below:
  http://osl.iu.edu/research/ft/ompi-cr/
  http://osl.iu.edu/research/ft/ompi-cr/tools.php#ompi-migrate

I even implemented an MPI Extensions API for this functionality so you
can call it from within your application:
  http://osl.iu.edu/research/ft/ompi-cr/api.php#api-cr_migrate

These pieces of functionality are currently only available in the Open
MPI development trunk.

-- Josh

On Thu, Nov 10, 2011 at 8:19 AM, Jeff Squyres <jsquy...@cisco.com> wrote:
> On Nov 10, 2011, at 8:11 AM, Mudassar Majeed wrote:
>
>> Thank you for your reply. I am implementing a load balancing function for
>> MPI that will balance the computation load and the communication at the
>> same time. My algorithm assumes that, in the end, the cores may each get a
>> different number of processes to run.
>
> Are you talking about over-subscribing cores?  I.e., putting more than 1 MPI 
> process on each core?
>
> In general, that's not a good idea.
>
>> In the beginning (before that function is called), each core will have an
>> equal number of processes. So I am thinking of starting more processes on
>> each core than needed, running my load balancing function, and then
>> blocking the remaining processes on each core. In this way I will be able
>> to achieve a different number of processes per core.
>
> Open MPI spins aggressively looking for network progress.  For example, if 
> you block in an MPI_RECV waiting for a message, Open MPI is actively banging 
> on the CPU looking for network progress.  Because of this (and other 
> reasons), you probably do not want to over-subscribe your processors 
> (meaning: you probably don't want to put more than 1 process per core).
>
> --
> Jeff Squyres
> jsquy...@cisco.com
>
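To make Jeff's point about spinning concrete, here is a small standalone
sketch. It is plain Python and does not use Open MPI at all; the function
names (wait_spinning, wait_sleeping, cpu_seconds) are made up for this
illustration. It only models the general difference between a wait that
polls continuously and one that yields the CPU between polls, which is
why a "blocked" process that spins still competes for its core.

```python
import time


def wait_spinning(deadline):
    """Busy-poll until the deadline, like a progress loop that never yields."""
    polls = 0
    while time.monotonic() < deadline:
        polls += 1  # every iteration uses CPU just to check for "progress"
    return polls


def wait_sleeping(deadline, interval=0.01):
    """Poll until the deadline, but give up the CPU between checks."""
    polls = 0
    while time.monotonic() < deadline:
        polls += 1
        time.sleep(interval)  # the core is free for other work while we sleep
    return polls


def cpu_seconds(wait_fn, duration=0.2):
    """Return the CPU time this process consumed while waiting `duration` seconds."""
    start = time.process_time()
    wait_fn(time.monotonic() + duration)
    return time.process_time() - start


if __name__ == "__main__":
    print("CPU seconds while spinning:", cpu_seconds(wait_spinning))
    print("CPU seconds while sleeping:", cpu_seconds(wait_sleeping))
```

Both waits take the same wall-clock time, but the spinning one should
report far more CPU time. Two such waiters sharing one core would each
slow the other down, which is the cost of over-subscription described
in the quoted text.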



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey
