Re: [OMPI users] MPI_Comm_split and intercommunicator - Problem

2012-01-27 Thread Rodrigo Silva Oliveira
Hi Jeff, thanks for replying.

Does that mean you don't have it working properly yet? I read the thread
on the devel list where you addressed the problem and a possible solution,
but I was not able to find a conclusion.

I'm in trouble without this function. I'll probably need to redesign my whole
implementation to achieve what I need.


On Fri, Jan 27, 2012 at 2:35 PM, Jeff Squyres  wrote:

> Unfortunately, I think that this is a known problem with INTERCOMM_MERGE
> and COMM_SPAWN parents and children:
>
>https://svn.open-mpi.org/trac/ompi/ticket/2904
>
>
> On Jan 26, 2012, at 12:11 PM, Rodrigo Oliveira wrote:
>
> > Hi there, I tried to understand the behavior Thatyene described, and I think
> > it is a bug in the Open MPI implementation.
> >
> > I do not know exactly what is happening because I am not an expert in the
> > ompi code, but I could see that when one process defines its color as
> > MPI_UNDEFINED, one of the processes on the inter-communicator blocks in the
> > call to the function below:
> >
> > /* Step 3: set up the communicator   */
> > /* - */
> > /* Create the communicator finally */
> > rc = ompi_comm_set ( &newcomp,   /* new comm */
> >  comm,   /* old comm */
> >  my_size,/* local_size */
> >  lranks, /* local_ranks */
> >  my_rsize,   /* remote_size */
> >  rranks, /* remote_ranks */
> >  NULL,   /* attrs */
> >  comm->error_handler,/* error handler */
> >  (pass_on_topo)?
> >  (mca_base_component_t *)comm->c_topo_component:
> >  NULL,   /* topo component */
> >  NULL,   /* local group */
> >  NULL/* remote group */
> > );
> >
> > This function is called inside ompi_comm_split, in the file
> ompi/communicator/comm.c
> >
> > Is there a solution to this problem in some revision? I keep coming back to
> > this problem because I need to use this function for a similar purpose.
> >
> > Any idea?
> >
> >
> > On Wed, Jan 25, 2012 at 4:50 PM, Thatyene Louise Alves de Souza Ramos <
> thaty...@gmail.com> wrote:
> > It seems the split blocks when it must return MPI_COMM_NULL, in the case
> > where I have one process with a color that does not exist in the other
> > group or with color = MPI_UNDEFINED.
> >
> > On Wed, Jan 25, 2012 at 4:28 PM, Rodrigo Oliveira <
> rsilva.olive...@gmail.com> wrote:
> > Hi Thatyene,
> >
> > I took a look at your code and it seems to be logically correct. Maybe
> > there is some problem when you call the split function with one client
> > process using color = MPI_UNDEFINED. I understand you are trying to isolate
> > one of the client processes to do something applicable only to it, am I
> > wrong? According to the Open MPI documentation, this function can be used
> > to do that, but it is not working. Does anyone have any idea what the cause
> > could be?
> >
> > Best regards
> >
> > Rodrigo Oliveira
> >
> >
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
>


Re: [OMPI users] MPI_Comm_split and intercommunicator - Problem

2012-01-27 Thread Jeff Squyres
Unfortunately, I think that this is a known problem with INTERCOMM_MERGE and 
COMM_SPAWN parents and children:

https://svn.open-mpi.org/trac/ompi/ticket/2904


On Jan 26, 2012, at 12:11 PM, Rodrigo Oliveira wrote:

> Hi there, I tried to understand the behavior Thatyene described, and I think
> it is a bug in the Open MPI implementation.
> 
> I do not know exactly what is happening because I am not an expert in the
> ompi code, but I could see that when one process defines its color as
> MPI_UNDEFINED, one of the processes on the inter-communicator blocks in the
> call to the function below:
> 
> /* Step 3: set up the communicator   */
> /* - */
> /* Create the communicator finally */
> rc = ompi_comm_set ( &newcomp,   /* new comm */
>  comm,   /* old comm */
>  my_size,/* local_size */
>  lranks, /* local_ranks */
>  my_rsize,   /* remote_size */
>  rranks, /* remote_ranks */
>  NULL,   /* attrs */
>  comm->error_handler,/* error handler */
>  (pass_on_topo)?
>  (mca_base_component_t *)comm->c_topo_component:
>  NULL,   /* topo component */
>  NULL,   /* local group */
>  NULL/* remote group */
> );
> 
> This function is called inside ompi_comm_split, in the file 
> ompi/communicator/comm.c
> 
> Is there a solution to this problem in some revision? I keep coming back to
> this problem because I need to use this function for a similar purpose.
> 
> Any idea?
> 
> 
> On Wed, Jan 25, 2012 at 4:50 PM, Thatyene Louise Alves de Souza Ramos 
>  wrote:
> It seems the split blocks when it must return MPI_COMM_NULL, in the case
> where I have one process with a color that does not exist in the other group
> or with color = MPI_UNDEFINED.
> 
> On Wed, Jan 25, 2012 at 4:28 PM, Rodrigo Oliveira  
> wrote:
> Hi Thatyene,
> 
> I took a look at your code and it seems to be logically correct. Maybe there
> is some problem when you call the split function with one client process
> using color = MPI_UNDEFINED. I understand you are trying to isolate one of
> the client processes to do something applicable only to it, am I wrong?
> According to the Open MPI documentation, this function can be used to do
> that, but it is not working. Does anyone have any idea what the cause could be?
> 
> Best regards
> 
> Rodrigo Oliveira
> 
> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] MPI_Comm_split and intercommunicator - Problem

2012-01-26 Thread Rodrigo Oliveira
Hi there, I tried to understand the behavior Thatyene described, and I think it
is a bug in the Open MPI implementation.

I do not know exactly what is happening because I am not an expert in the ompi
code, but I could see that when one process defines its color as
MPI_UNDEFINED, one of the processes on the inter-communicator blocks in
the call to the function below:

/* Step 3: set up the communicator   */
/* - */
/* Create the communicator finally */
rc = ompi_comm_set ( &newcomp,   /* new comm */
 comm,   /* old comm */
 my_size,/* local_size */
 lranks, /* local_ranks */
 my_rsize,   /* remote_size */
 rranks, /* remote_ranks */
 NULL,   /* attrs */
 comm->error_handler,/* error handler */
 (pass_on_topo)?
 (mca_base_component_t *)comm->c_topo_component:
 NULL,   /* topo component */
 NULL,   /* local group */
 NULL/* remote group */
);

This function is called inside ompi_comm_split, in the file
ompi/communicator/comm.c.

Is there a solution to this problem in some revision? I keep coming back to
this problem because I need to use this function for a similar purpose.

Any idea?


On Wed, Jan 25, 2012 at 4:50 PM, Thatyene Louise Alves de Souza Ramos <
thaty...@gmail.com> wrote:

> It seems the split blocks when it must return MPI_COMM_NULL, in the case
> where I have one process with a color that does not exist in the other group
> or with color = MPI_UNDEFINED.
>
> On Wed, Jan 25, 2012 at 4:28 PM, Rodrigo Oliveira <
> rsilva.olive...@gmail.com> wrote:
>
>> Hi Thatyene,
>>
>> I took a look at your code and it seems to be logically correct. Maybe
>> there is some problem when you call the split function with one client
>> process using color = MPI_UNDEFINED. I understand you are trying to isolate
>> one of the client processes to do something applicable only to it, am I
>> wrong? According to the Open MPI documentation, this function can be used to
>> do that, but it is not working. Does anyone have any idea what the cause could be?
>>
>> Best regards
>>
>> Rodrigo Oliveira
>>
>>
>


Re: [OMPI users] MPI_Comm_split and intercommunicator - Problem

2012-01-25 Thread Thatyene Louise Alves de Souza Ramos
It seems the split blocks when it must return MPI_COMM_NULL, in the case
where I have one process with a color that does not exist in the other group
or with color = MPI_UNDEFINED.

On Wed, Jan 25, 2012 at 4:28 PM, Rodrigo Oliveira  wrote:

> Hi Thatyene,
>
> I took a look at your code and it seems to be logically correct. Maybe
> there is some problem when you call the split function with one client
> process using color = MPI_UNDEFINED. I understand you are trying to isolate
> one of the client processes to do something applicable only to it, am I
> wrong? According to the Open MPI documentation, this function can be used to
> do that, but it is not working. Does anyone have any idea what the cause could be?
>
> Best regards
>
> Rodrigo Oliveira
>
>


Re: [OMPI users] MPI_Comm_split and intercommunicator - Problem

2012-01-25 Thread Rodrigo Oliveira
Hi Thatyene,

I took a look at your code and it seems to be logically correct. Maybe
there is some problem when you call the split function with one client
process using color = MPI_UNDEFINED. I understand you are trying to isolate
one of the client processes to do something applicable only to it, am I
wrong? According to the Open MPI documentation, this function can be used to
do that, but it is not working. Does anyone have any idea what the cause could be?

Best regards

Rodrigo Oliveira

On Mon, Jan 23, 2012 at 4:53 PM, Thatyene Louise Alves de Souza Ramos <
thaty...@dcc.ufmg.br> wrote:

> Hi there!
>
> I've been trying to use the MPI_Comm_split function on an
> intercommunicator, but without success. My application is very simple
> and consists of a server that spawns 2 clients. After that, I want to split
> the intercommunicator between the server and the clients so that one client
> ends up not connected to the server.
>
> The processes block in the split call and do not return. Can anyone help
> me?
>
> == Simplified server code ==
>
> int main( int argc, char *argv[] ) {
>
> MPI::Intracomm spawn_communicator = MPI::COMM_SELF;
> MPI::Intercomm group1;
>
> MPI::Init(argc, argv);
> group1 = spawn_client ( /* spawns 2 processes and returns the
> intercommunicator with them */ );
>  /* Tryes to split the intercommunicator */
> int color = 0;
>  MPI::Intercomm new_G1 = group1.Split(color, 0);
> group1.Free();
> group1 = new_G1;
>
> cout << "server after splitting- size G1 = " << group1.Get_remote_size()
> << endl << endl;
> MPI::Finalize();
>  return 0;
> }
>
> == Simplified client code ==
>
> int main( int argc, char *argv[] ) {
>
>  MPI::Intracomm group_communicator;
> MPI::Intercomm parent;
> int group_rank;
>  MPI::Init(argc, argv);
>  parent = MPI::Comm::Get_parent ();
> group_communicator = MPI::COMM_WORLD;
>  group_rank = group_communicator.Get_rank();
>  if (group_rank == 0) {
> color = 0;
>  }
> else {
> color = MPI_UNDEFINED;
>  }
>  MPI::Intercomm new_parent = parent.Split(color, inter_rank);
>  if (new_parent != MPI::COMM_NULL) {
> parent.Free();
> parent = new_parent;
>  }
>  group_communicator.Free();
>  parent.Free();
> MPI::Finalize();
> return 0;
> }
>
> Thanks in advance.
>
> Thatyene Ramos
>
>


[OMPI users] MPI_Comm_split and intercommunicator - Problem

2012-01-23 Thread Thatyene Louise Alves de Souza Ramos
Hi there!

I've been trying to use the MPI_Comm_split function on an
intercommunicator, but without success. My application is very simple
and consists of a server that spawns 2 clients. After that, I want to split
the intercommunicator between the server and the clients so that one client
ends up not connected to the server.

The processes block in the split call and do not return. Can anyone help me?

== Simplified server code ==

int main( int argc, char *argv[] ) {

    MPI::Intracomm spawn_communicator = MPI::COMM_SELF;
    MPI::Intercomm group1;

    MPI::Init(argc, argv);
    group1 = spawn_client( /* spawns 2 processes and returns the
                              intercommunicator with them */ );

    /* Tries to split the intercommunicator */
    int color = 0;
    MPI::Intercomm new_G1 = group1.Split(color, 0);
    group1.Free();
    group1 = new_G1;

    cout << "server after splitting- size G1 = " << group1.Get_remote_size()
         << endl << endl;
    MPI::Finalize();
    return 0;
}

== Simplified client code ==

int main( int argc, char *argv[] ) {

    MPI::Intracomm group_communicator;
    MPI::Intercomm parent;
    int group_rank;

    MPI::Init(argc, argv);
    parent = MPI::Comm::Get_parent();
    group_communicator = MPI::COMM_WORLD;
    group_rank = group_communicator.Get_rank();

    if (group_rank == 0) {
        color = 0;
    }
    else {
        color = MPI_UNDEFINED;
    }

    MPI::Intercomm new_parent = parent.Split(color, inter_rank);
    if (new_parent != MPI::COMM_NULL) {
        parent.Free();
        parent = new_parent;
    }

    group_communicator.Free();
    parent.Free();
    MPI::Finalize();
    return 0;
}

Thanks in advance.

Thatyene Ramos


Re: [OMPI users] MPI_COMM_split hanging

2011-12-12 Thread Gary Gorbet

On Dec 12, 2011, at 9:45 AM, Josh Hursey wrote:


 For MPI_Comm_split, all processes in the input communicator (oldcomm
 or MPI_COMM_WORLD in your case) must call the operation since it is
 collective over the input communicator. In your program rank 0 is not
 calling the operation, so MPI_Comm_split is waiting for it to
 participate.

 If you want rank 0 to be excluded from any of the communicators,
 you can give it a special color that is distinct from all other ranks.
 Upon return from MPI_Comm_split, rank 0 will be given a new
 communicator containing just one process, itself. If you do not
 intend to use that communicator you can free it immediately
 afterwards.


You can also specify MPI_UNDEFINED as your color, in which case the 
output communicator in that process will be MPI_COMM_NULL.  See 
MPI-2.2 p205.


Thank you, Josh and Jeff. That did it! I called MPI_COMM_split from 
my supervisor with color of MPI_UNDEFINED and key of 0. Then all 
_split() calls returned and I was able to do the work in my test 
program.


All the best,
Gary


Re: [OMPI users] MPI_COMM_split hanging

2011-12-12 Thread Jeff Squyres
On Dec 12, 2011, at 9:45 AM, Josh Hursey wrote:

> For MPI_Comm_split, all processes in the input communicator (oldcomm
> or MPI_COMM_WORLD in your case) must call the operation since it is
> collective over the input communicator. In your program rank 0 is not
> calling the operation, so MPI_Comm_split is waiting for it to
> participate.
> 
> If you want rank 0 to be excluded from any of the communicators,
> you can give it a special color that is distinct from all other ranks.
> Upon return from MPI_Comm_split, rank 0 will be given a new
> communicator containing just one process, itself. If you do not
> intend to use that communicator you can free it immediately
> afterwards.

You can also specify MPI_UNDEFINED as your color, in which case the output 
communicator in that process will be MPI_COMM_NULL.  See MPI-2.2 p205.
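
For illustration, here is a minimal, self-contained sketch of that pattern
(hypothetical variable names and group size): every rank in MPI_COMM_WORLD
makes the call, rank 0 opts out with MPI_UNDEFINED and gets MPI_COMM_NULL
back, and the remaining ranks are grouped by their color.

#include <mpi.h>
#include <stdio.h>

int main( int argc, char *argv[] )
{
    int world_rank, color, cores_per_group = 4;   /* hypothetical group size */
    MPI_Comm group_comm;

    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &world_rank );

    /* Every rank must call MPI_Comm_split; rank 0 (the supervisor) passes
       MPI_UNDEFINED and receives MPI_COMM_NULL, the others get a group.   */
    color = ( world_rank == 0 ) ? MPI_UNDEFINED
                                : ( world_rank - 1 ) / cores_per_group;

    MPI_Comm_split( MPI_COMM_WORLD, color, world_rank, &group_comm );

    if ( group_comm != MPI_COMM_NULL ) {
        int group_rank;
        MPI_Comm_rank( group_comm, &group_rank );
        printf( "world rank %d is rank %d in its group\n",
                world_rank, group_rank );
        MPI_Comm_free( &group_comm );
    }

    MPI_Finalize();
    return 0;
}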

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] MPI_COMM_split hanging

2011-12-12 Thread Josh Hursey
For MPI_Comm_split, all processes in the input communicator (oldcomm
or MPI_COMM_WORLD in your case) must call the operation since it is
collective over the input communicator. In your program rank 0 is not
calling the operation, so MPI_Comm_split is waiting for it to
participate.

If you want rank 0 to be excluded from any of the communicators,
you can give it a special color that is distinct from all other ranks.
Upon return from MPI_Comm_split, rank 0 will be given a new
communicator containing just one process, itself. If you do not
intend to use that communicator you can free it immediately
afterwards.

Hope that helps,
Josh


On Fri, Dec 9, 2011 at 6:52 PM, Gary Gorbet  wrote:
> I am attempting to split my application into multiple master+workers
> groups using MPI_COMM_split. My MPI revision is shown as:
>
> mpirun --tag-output ompi_info -v ompi full --parsable
> [1,0]:package:Open MPI root@build-x86-64 Distribution
> [1,0]:ompi:version:full:1.4.3
> [1,0]:ompi:version:svn:r23834
> [1,0]:ompi:version:release_date:Oct 05, 2010
> [1,0]:orte:version:full:1.4.3
> [1,0]:orte:version:svn:r23834
> [1,0]:orte:version:release_date:Oct 05, 2010
> [1,0]:opal:version:full:1.4.3
> [1,0]:opal:version:svn:r23834
> [1,0]:opal:version:release_date:Oct 05, 2010
> [1,0]:ident:1.4.3
>
> The basic problem I am having is that none of the processor instances ever
> returns from the MPI_COMM_split call. I am pretty new to MPI and it is
> likely I am not doing things quite correctly. I'd appreciate some guidance.
>
> I am working with an application that has functioned nicely for a while
> now. It only uses a single MPI_COMM_WORLD communicator. It is standard
> stuff:  a master that hands out tasks to many workers, receives output
> and keeps track of workers that are ready to receive another task. The
> tasks are quite compute-intensive. When running a variation of the
> process that uses Monte Carlo iterations, jobs can exceed the 30 hours
> they are limited to. The MC iterations are independent of each other -
> adding random noise to an input - so I would like to run multiple
> iterations simultaneously so that four times the cores finish the job in a fourth of
> the time. This would entail a supervisor interacting with multiple
> master+workers groups.
>
> I had thought that I would just have to declare a communicator for each
> group so that broadcasts and syncs would work within a single group.
>
>   MPI_Comm_size( MPI_COMM_WORLD, &total_proc_count );
>   MPI_Comm_rank( MPI_COMM_WORLD, &my_rank );
>   ...
>   cores_per_group = total_proc_count / groups_count;
>   my_group = my_rank / cores_per_group;                // e.g., 0, 1, ...
>   group_rank = my_rank - my_group * cores_per_group;   // rank within a group
>   if ( my_rank == 0 )  continue;   // Do not create group for supervisor
>   MPI_Comm oldcomm = MPI_COMM_WORLD;
>   MPI_Comm my_communicator;        // Actually declared as a class variable
>   int sstat = MPI_Comm_split( oldcomm, my_group, group_rank,
>                               &my_communicator );
>
> There is never a return from the above _split() call. Do I need to do
> something else to set this up? I would have expected perhaps a non-zero
> status return, but not that I would get no return at all. I would
> appreciate any comments or guidance.
>
> - Gary
>



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey



[OMPI users] MPI_COMM_split hanging

2011-12-09 Thread Gary Gorbet

I am attempting to split my application into multiple master+workers
groups using MPI_COMM_split. My MPI revision is shown as:

mpirun --tag-output ompi_info -v ompi full --parsable
[1,0]:package:Open MPI root@build-x86-64 Distribution
[1,0]:ompi:version:full:1.4.3
[1,0]:ompi:version:svn:r23834
[1,0]:ompi:version:release_date:Oct 05, 2010
[1,0]:orte:version:full:1.4.3
[1,0]:orte:version:svn:r23834
[1,0]:orte:version:release_date:Oct 05, 2010
[1,0]:opal:version:full:1.4.3
[1,0]:opal:version:svn:r23834
[1,0]:opal:version:release_date:Oct 05, 2010
[1,0]:ident:1.4.3

The basic problem I am having is that none of the processor instances ever
returns from the MPI_COMM_split call. I am pretty new to MPI and it is
likely I am not doing things quite correctly. I'd appreciate some guidance.

I am working with an application that has functioned nicely for a while
now. It only uses a single MPI_COMM_WORLD communicator. It is standard
stuff:  a master that hands out tasks to many workers, receives output
and keeps track of workers that are ready to receive another task. The
tasks are quite compute-intensive. When running a variation of the
process that uses Monte Carlo iterations, jobs can exceed the 30 hours
they are limited to. The MC iterations are independent of each other -
adding random noise to an input - so I would like to run multiple
iterations simultaneously so that four times the cores finish the job in a fourth of
the time. This would entail a supervisor interacting with multiple
master+workers groups.

I had thought that I would just have to declare a communicator for each
group so that broadcasts and syncs would work within a single group.

   MPI_Comm_size( MPI_COMM_WORLD, &total_proc_count );
   MPI_Comm_rank( MPI_COMM_WORLD, &my_rank );
   ...
   cores_per_group = total_proc_count / groups_count;
   my_group = my_rank / cores_per_group;                // e.g., 0, 1, ...
   group_rank = my_rank - my_group * cores_per_group;   // rank within a group
   if ( my_rank == 0 )  continue;   // Do not create group for supervisor
   MPI_Comm oldcomm = MPI_COMM_WORLD;
   MPI_Comm my_communicator;        // Actually declared as a class variable
   int sstat = MPI_Comm_split( oldcomm, my_group, group_rank,
                               &my_communicator );

There is never a return from the above _split() call. Do I need to do
something else to set this up? I would have expected perhaps a non-zero
status return, but not that I would get no return at all. I would
appreciate any comments or guidance.

- Gary


Re: [OMPI users] MPI_Comm_split

2010-11-30 Thread Bill Rankin
> The tree is not symmetrical in that the valid values for the 10th
> parameter depend on the values selected in the 0th to 9th parameters
> (all the ancestry in the tree); e.g., we may have a lot more nodes on
> the left of the tree than on the right, see the attachment (I hope
> attachments are allowed).

Which is why you don't have the master hand out all the work at once.  Instead, 
it hands out a small(er) piece of work to each node from a large list where the 
length of the list is significantly larger than the number of nodes.  As each 
node finishes processing the bit of work it was given, it sends a message back 
to the master with its results and asks for more work.  You repeat until all
the data has been processed.

E.g., say you are looking to search through all possible combinations for 10
parameters (n0,...,n9).  The master would generate all possible combinations
for the first 3 parameters (n0,n1,n2) and then, for every element in that list,
send it to a slave process, which will use it as a basis vector for
searching the rest of the space (n3,...,n9).  As each slave finishes, it asks
the master for another basis vector to work on.

Lather, rinse, repeat until finished.

If you keep the number of basis vectors much higher than the number of slaves
(like 100x bigger), the code will load-balance itself, since the order in which
they finish processing a single basis really doesn't matter as long as they
are all kept busy.
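
To make that concrete, here is a rough, self-contained sketch of such a
self-scheduling loop (hypothetical message tags, and a plain int work index
standing in for the real basis vector; assumes at least two processes):

#include <mpi.h>
#include <stdio.h>

#define TAG_WORK 1
#define TAG_STOP 2

int main( int argc, char *argv[] )
{
    int rank, size, nwork = 1000;   /* keep nwork much larger than the slave count */

    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    MPI_Comm_size( MPI_COMM_WORLD, &size );

    if ( rank == 0 ) {                              /* master */
        double best = -1.0e300, result;
        int next = 0, active = 0;
        MPI_Status st;

        /* prime every slave with one work item (or tell it to stop right away) */
        for ( int w = 1; w < size; w++ ) {
            if ( next < nwork ) {
                MPI_Send( &next, 1, MPI_INT, w, TAG_WORK, MPI_COMM_WORLD );
                next++;  active++;
            } else {
                MPI_Send( &next, 1, MPI_INT, w, TAG_STOP, MPI_COMM_WORLD );
            }
        }

        /* hand out the remaining items as results come back */
        while ( active > 0 ) {
            MPI_Recv( &result, 1, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG,
                      MPI_COMM_WORLD, &st );
            if ( result > best ) best = result;
            if ( next < nwork ) {
                MPI_Send( &next, 1, MPI_INT, st.MPI_SOURCE, TAG_WORK, MPI_COMM_WORLD );
                next++;
            } else {
                MPI_Send( &next, 1, MPI_INT, st.MPI_SOURCE, TAG_STOP, MPI_COMM_WORLD );
                active--;
            }
        }
        printf( "global max = %g\n", best );
    } else {                                        /* slave */
        int item;
        MPI_Status st;
        while ( 1 ) {
            MPI_Recv( &item, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &st );
            if ( st.MPI_TAG == TAG_STOP ) break;
            double local_max = (double) item;       /* stand-in for the real search */
            MPI_Send( &local_max, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD );
        }
    }

    MPI_Finalize();
    return 0;
}

The master never waits on any particular slave, so a slow basis simply means
that slave comes back for more work less often than the others.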

I used this approach many years ago searching for numerical sequences known as 
Golomb Rulers.  Email me off list and I can give you some pointers to 
references.

Good luck,

-bill





Re: [OMPI users] MPI_Comm_split

2010-11-29 Thread Jeff Squyres
On Nov 24, 2010, at 4:55 PM, Hicham Mouline wrote:

> The tree is not symmetrical in that the valid values for the 10th parameter
> depend on the values selected in the 0th to 9th parameters (all the ancestry
> in the tree); e.g., we may have a lot more nodes on the left of the tree
> than on the right, see the attachment (I hope attachments are allowed).
> 
> The depth of the tree of course is the same everywhere, but not all nodes at
> some level have the same number of children.
> Is it better to just list vertically all the possible branches (n-tuples) at
> the master level and split that list uniformly over the slaves?

Yes, you certainly can MPI_COMM_SPLIT this way.  As Bill mentioned, if you do 
the splits at the beginning of a long computation, the cost of them is 
irrelevant.  I would expect for 128 MPI processes, doing 7-8 MPI_COMM_SPLITs 
will take far less than a second (although that's a total SWAG).  So if it 
helps your coding and you're going to be running for a little while, go ahead 
and do them.

Bill mentioned one good reason for splitting communicators: distinct subgroups
for MPI collectives.  Another good reason is for message separation.  If you 
want to do a parameter sweep in a specific set of procs, you can give them 
their own communicator in order to guarantee that you have "private" 
communications between them (e.g., that tag X will never collide with tag X on 
another process).

That being said, if all your communications will solely be within that subset 
of processes, then making new communicators may not be necessary. I.e., you'll 
never have colliding tags, so you don't need to create a private communication 
space.

That being said, it may be useful to have all your subgroups be able to have a 
starting rank of 0 and be able to use MPI_COMM_SIZE to find how many peers are 
in your subgroup.  Having distinct sub-communicators is helpful here.
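
As a quick sketch of what that buys you (hypothetical sizes, no error
handling): with 128 ranks split into groups of 16, each subgroup gets its own
rank numbering starting at 0, its own size, and a communication space whose
tags cannot collide with the same tags used on MPI_COMM_WORLD or on another
subgroup's communicator.

#include <mpi.h>
#include <stdio.h>

int main( int argc, char *argv[] )
{
    int world_rank, sub_rank, sub_size;
    MPI_Comm sub_comm;

    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &world_rank );

    /* group ranks 0-15, 16-31, ... into their own communicators */
    MPI_Comm_split( MPI_COMM_WORLD, world_rank / 16, world_rank, &sub_comm );

    MPI_Comm_rank( sub_comm, &sub_rank );   /* 0..15 within the subgroup */
    MPI_Comm_size( sub_comm, &sub_size );   /* 16 when started on a multiple of 16 */

    /* a collective (or a tagged send/recv) on sub_comm stays inside the subgroup */
    int token = sub_rank;
    MPI_Bcast( &token, 1, MPI_INT, 0, sub_comm );
    printf( "world rank %d: subgroup rank %d of %d, token %d\n",
            world_rank, sub_rank, sub_size, token );

    MPI_Comm_free( &sub_comm );
    MPI_Finalize();
    return 0;
}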

That being said... (this can go on ad nauseam -- it just depends on your app 
and whether you want to do the splitting or not; there's no universally right 
answer here)

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] MPI_Comm_split

2010-11-24 Thread Hicham Mouline
> -Original Message-
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
> Behalf Of Bill Rankin
> Sent: 24 November 2010 15:54
> To: Open MPI Users
> Subject: Re: [OMPI users] MPI_Comm_split
> 
> In this case, creating all those communicators really doesn't buy you
> anything since you aren't using any collective operations across all
> the subgroups you would be creating.
> 
> For this sort of coarse-grained parallelism, your best bet is probably
> a master/slave (producer/consumer, worker-pool) model.  Have one
> process (master) generate valid sets for your first X (of Y total)
> parameters.  The master then sends a unique set of these parameters to
> each slave process.  Each slave generates all possible sets of the
> remaining parameters, evaluates the function for those parameter sets,
> stores the local max/min and returns this value to the master.  Upon
> receiving the max/min from the slave, the master compares this to the
> global max/min and sends the slave a new set of the first X parameters.
> Repeat until the master has sent all possible sets of X parameters and
> all slaves have processed all their work.
> 
> Looking at it as a tree, the master process traverses the top of the
> tree, handing each slave a branch and letting the slave traverse the
> remainder of the tree.  For load balancing, you want a lot more
> branches than you want slaves so that each slave is always kept busy.
> But you also want enough work for each slave to where they are not
> constantly communicating with the master asking for the next set of
> parameters.  This is done by adjusting the depth to which the master
> process traverses the parameter tree.
> 
> Hope this helps.  Good luck.
> 
> -b
> 
It does very much, thanks a lot.

The tree is not symmetrical in that the valid values for the 10th parameter
depend on the values selected in the 0th to 9th parameters (all the ancestry
in the tree); e.g., we may have a lot more nodes on the left of the tree
than on the right, see the attachment (I hope attachments are allowed).

The depth of the tree of course is the same everywhere, but not all nodes at
some level have the same number of children.
Is it better to just list vertically all the possible branches (n-tuples) at
the master level and split that list uniformly over the slaves?

regards,


Re: [OMPI users] MPI_Comm_split

2010-11-23 Thread Hicham Mouline
> -Original Message-
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
> Behalf Of Bill Rankin
> Sent: 23 November 2010 19:32
> To: Open MPI Users
> Subject: Re: [OMPI users] MPI_Comm_split
> 
> Hicham:
> 
> > If I have 256 MPI processes in 1 communicator, am I able to split
> > that communicator, then again split the resulting 2 subgroups, then
> > again the resulting 4 subgroups and so on, until potentially having 256
> > subgroups?
> 
> You can.  But as the old saying goes: "just because you *can* do
> something doesn't necessarily mean you *should* do it." :-)
> 
> What is your intent in creating all these communicators?
> 
> > Is this insane in terms of performance?
> 
> Well, how much "real" work are you doing?  Operations on communicators
> are collectives, so they are expensive.  However if you do this only
> once at the beginning of something like a three-week long simulation
> run then you probably won't notice the impact.
> 
> 
> In any case, I suspect there is a better way.
> 
> -bill

I need to do a parallel parameter sweep. I have arguments, say x0 to x9, of
a function.
I need to evaluate this function for every acceptable combination of
x0,...,x9.
This list of acceptable combinations forms what I can view as a tree:
. under the root node, all possible values of x0 (say there are 10 of them
x0_0 to x0_9)
. under each of these nodes, all possible values of x1 that agree with the
args defined so far; e.g.,
if x1_0 is not possible with x0_0, then it's not part of the tree...
. and so on until reaching the leaf nodes. At those nodes, I evaluate the
function and I want the global maximum and/or minimum.

The order of magnitude is 128 for the depth of the tree, and 100 possible
values for each x.
Each evaluation takes a couple of ms, though.
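
However the work ends up being distributed, I imagine the per-process results
could be combined with a single reduction at the end. A sketch (hypothetical
eval function and a naive round-robin split over the leaves, just to show the
MPI_Reduce / MPI_MAX part):

#include <mpi.h>
#include <stdio.h>

static double eval( long combination )      /* stand-in for the real function */
{
    return (double)( combination % 1000 );
}

int main( int argc, char *argv[] )
{
    long ncomb = 1000000;                    /* hypothetical number of leaves */
    int rank, size;
    double local_max = -1.0e300, global_max;

    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    MPI_Comm_size( MPI_COMM_WORLD, &size );

    for ( long c = rank; c < ncomb; c += size ) {   /* round-robin over leaves */
        double v = eval( c );
        if ( v > local_max ) local_max = v;
    }

    MPI_Reduce( &local_max, &global_max, 1, MPI_DOUBLE, MPI_MAX,
                0, MPI_COMM_WORLD );
    if ( rank == 0 )
        printf( "global maximum = %g\n", global_max );

    MPI_Finalize();
    return 0;
}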

I thought this facility of splitting communicators maps nicely onto the
nature of my problem.

What do you think?
I'm actually not exactly sure how I'm going to do it, but wanted an
opinion on whether it's just crazy.

rds,



Re: [OMPI users] MPI_Comm_split

2010-11-23 Thread Bill Rankin
Hicham:

> If I have 256 MPI processes in 1 communicator, am I able to split
> that communicator, then again split the resulting 2 subgroups, then
> again the resulting 4 subgroups and so on, until potentially having 256
> subgroups?

You can.  But as the old saying goes: "just because you *can* do something 
doesn't necessarily mean you *should* do it." :-)

What is your intent in creating all these communicators?

> Is this insane in terms of performance?

Well, how much "real" work are you doing?  Operations on communicators are 
collectives, so they are expensive.  However if you do this only once at the 
beginning of something like a three-week long simulation run then you probably 
won't notice the impact.

In any case, I suspect there is a better way.

-bill





[OMPI users] MPI_Comm_split

2010-11-23 Thread Hicham Mouline
Hello

If I have 256 MPI processes in 1 communicator, am I able to split that
communicator, then again split the resulting 2 subgroups, then again the 
resulting 4 subgroups and so on, until potentially having 256 subgroups?
Is this insane in terms of performance?
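
In sketch form, the repeated halving I have in mind would look roughly like
this (assuming 256 ranks and ignoring error handling):

#include <mpi.h>

int main( int argc, char *argv[] )
{
    MPI_Comm comm = MPI_COMM_WORLD, half;

    MPI_Init( &argc, &argv );

    for ( int level = 0; level < 8; level++ ) {   /* 2, 4, ..., 256 subgroups */
        int r, s;
        MPI_Comm_rank( comm, &r );
        MPI_Comm_size( comm, &s );

        /* lower half keeps color 0, upper half color 1 */
        MPI_Comm_split( comm, ( r < s / 2 ) ? 0 : 1, r, &half );

        if ( comm != MPI_COMM_WORLD )
            MPI_Comm_free( &comm );
        comm = half;                              /* continue inside my half */
    }

    if ( comm != MPI_COMM_WORLD )
        MPI_Comm_free( &comm );
    MPI_Finalize();
    return 0;
}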

regards,