Re: [OMPI users] MPI_Comm_split and intercommunicator - Problem
Hi Jeff, thanks for replying. Does this mean you don't have it working properly yet? I read the thread on the devel list where you addressed the problem and a possible solution, but I could not find a conclusion. I am stuck without this function; I will probably need to redesign my whole implementation to achieve what I need.

On Fri, Jan 27, 2012 at 2:35 PM, Jeff Squyres wrote:
> Unfortunately, I think that this is a known problem with INTERCOMM_MERGE
> and COMM_SPAWN parents and children:
>
> https://svn.open-mpi.org/trac/ompi/ticket/2904
>
> [...]

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI users] MPI_Comm_split and intercommunicator - Problem
Unfortunately, I think that this is a known problem with INTERCOMM_MERGE and COMM_SPAWN parents and children:

https://svn.open-mpi.org/trac/ompi/ticket/2904

On Jan 26, 2012, at 12:11 PM, Rodrigo Oliveira wrote:
> Hi there, I tried to understand the behavior Thatyene described and I
> think it is a bug in the Open MPI implementation. [...]

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI users] MPI_Comm_split and intercommunicator - Problem
Hi there, I tried to understand the behavior Thatyene described, and I think it is a bug in the Open MPI implementation.

I do not know exactly what is happening because I am not an expert in the ompi code, but I could see that when one process defines its color as MPI_UNDEFINED, one of the processes on the inter-communicator blocks in the call to the function below:

    /* Step 3: set up the communicator */
    /* ------------------------------- */
    /* Create the communicator finally */
    rc = ompi_comm_set ( &newcomp,                /* new comm */
                         comm,                    /* old comm */
                         my_size,                 /* local_size */
                         lranks,                  /* local_ranks */
                         my_rsize,                /* remote_size */
                         rranks,                  /* remote_ranks */
                         NULL,                    /* attrs */
                         comm->error_handler,     /* error handler */
                         (pass_on_topo) ?
                         (mca_base_component_t *)comm->c_topo_component :
                         NULL,                    /* topo component */
                         NULL,                    /* local group */
                         NULL                     /* remote group */
                       );

This function is called inside ompi_comm_split, in the file ompi/communicator/comm.c.

Is there a solution for this problem in some revision? I insist on this problem because I need to use this function for a similar purpose.

Any idea?

On Wed, Jan 25, 2012 at 4:50 PM, Thatyene Louise Alves de Souza Ramos wrote:
> It seems the split is blocking when it must return MPI_COMM_NULL [...]
Re: [OMPI users] MPI_Comm_split and intercommunicator - Problem
It seems the split is blocking when it must return MPI_COMM_NULL, in the case where I have one process with a color that does not exist in the other group, or with color = MPI_UNDEFINED.

On Wed, Jan 25, 2012 at 4:28 PM, Rodrigo Oliveira wrote:
> Hi Thatyene,
>
> I took a look at your code and it seems to be logically correct. [...]
Re: [OMPI users] MPI_Comm_split and intercommunicator - Problem
Hi Thatyene,

I took a look at your code and it seems to be logically correct. Maybe there is some problem when you call the split function with one client process whose color is MPI_UNDEFINED. I understood that you are trying to isolate one of the client processes to do something applicable only to it, am I wrong? According to the Open MPI documentation, this function can be used to do that, but it is not working. Does anyone have any idea what it can be?

Best regards

Rodrigo Oliveira

On Mon, Jan 23, 2012 at 4:53 PM, Thatyene Louise Alves de Souza Ramos wrote:
> Hi there!
>
> I've been trying to use the MPI_Comm_split function on an
> intercommunicator, but I didn't have success. [...]
[OMPI users] MPI_Comm_split and intercommunicator - Problem
Hi there!

I've been trying to use the MPI_Comm_split function on an intercommunicator, but I didn't have success. My application is very simple: a server spawns 2 clients. After that, I want to split the intercommunicator between the server and the clients so that one client ends up not connected to the server.

The processes block in the split call and do not return. Can anyone help me?

== Simplified server code ==

int main( int argc, char *argv[] ) {

    MPI::Intracomm spawn_communicator = MPI::COMM_SELF;
    MPI::Intercomm group1;

    MPI::Init(argc, argv);
    group1 = spawn_client( /* spawns 2 processes and returns the
                              intercommunicator with them */ );

    /* Try to split the intercommunicator */
    int color = 0;
    MPI::Intercomm new_G1 = group1.Split(color, 0);
    group1.Free();
    group1 = new_G1;

    cout << "server after splitting - size G1 = "
         << group1.Get_remote_size() << endl << endl;
    MPI::Finalize();
    return 0;
}

== Simplified client code ==

int main( int argc, char *argv[] ) {

    MPI::Intracomm group_communicator;
    MPI::Intercomm parent;
    int group_rank;
    int color;

    MPI::Init(argc, argv);
    parent = MPI::Comm::Get_parent();
    group_communicator = MPI::COMM_WORLD;
    group_rank = group_communicator.Get_rank();

    if (group_rank == 0) {
        color = 0;
    }
    else {
        color = MPI_UNDEFINED;
    }

    MPI::Intercomm new_parent = parent.Split(color, group_rank);
    if (new_parent != MPI::COMM_NULL) {
        parent.Free();
        parent = new_parent;
    }
    group_communicator.Free();
    parent.Free();
    MPI::Finalize();
    return 0;
}

Thanks in advance.

Thatyene Ramos
Re: [OMPI users] MPI_COMM_split hanging
On Dec 12, 2011, at 9:45 AM, Josh Hursey wrote:
> For MPI_Comm_split, all processes in the input communicator (oldcomm,
> or MPI_COMM_WORLD in your case) must call the operation since it is
> collective over the input communicator. In your program rank 0 is not
> calling the operation, so MPI_Comm_split is waiting for it to
> participate. [...]
> You can also specify MPI_UNDEFINED as your color, in which case the
> output communicator in that process will be MPI_COMM_NULL. See MPI-2.2
> p205.

Thank you, Josh and Jeff. That did it! I called MPI_COMM_split from my supervisor with a color of MPI_UNDEFINED and a key of 0. Then all the _split() calls returned and I was able to do the work in my test program.

All the best,
Gary
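For reference, the pattern Gary describes (every rank calls MPI_Comm_split, while the supervisor passes MPI_UNDEFINED and receives MPI_COMM_NULL) can be sketched as below. This is a minimal sketch, not Gary's actual code; the group size of 4 and the rank arithmetic are hypothetical:

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int world_rank, world_size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    /* Rank 0 is the supervisor; everyone else is grouped in fours.
       Crucially, the supervisor still calls MPI_Comm_split -- the call
       is collective over MPI_COMM_WORLD -- but passes MPI_UNDEFINED. */
    int color = (world_rank == 0) ? MPI_UNDEFINED : (world_rank - 1) / 4;
    int key   = world_rank;            /* preserve relative ordering */

    MPI_Comm group_comm;
    MPI_Comm_split(MPI_COMM_WORLD, color, key, &group_comm);

    if (group_comm == MPI_COMM_NULL) {
        /* Only the supervisor lands here. */
        printf("rank %d: supervisor, no group communicator\n", world_rank);
    } else {
        int grank, gsize;
        MPI_Comm_rank(group_comm, &grank);
        MPI_Comm_size(group_comm, &gsize);
        printf("world rank %d -> group %d, rank %d of %d\n",
               world_rank, color, grank, gsize);
        MPI_Comm_free(&group_comm);
    }

    MPI_Finalize();
    return 0;
}
```

Run under mpirun with, e.g., 9 processes to get one supervisor plus two groups of four.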
Re: [OMPI users] MPI_COMM_split hanging
On Dec 12, 2011, at 9:45 AM, Josh Hursey wrote:
> For MPI_Comm_split, all processes in the input communicator (oldcomm,
> or MPI_COMM_WORLD in your case) must call the operation since it is
> collective over the input communicator. In your program rank 0 is not
> calling the operation, so MPI_Comm_split is waiting for it to
> participate.
>
> If you want rank 0 to be excluded from any of the communicators,
> you can give it a special color that is distinct from all other ranks.
> Upon return from MPI_Comm_split, rank 0 will be given a new
> communicator containing just one process, itself. If you do not
> intend to use that communicator you can free it immediately
> afterwards.

You can also specify MPI_UNDEFINED as your color, in which case the output communicator in that process will be MPI_COMM_NULL. See MPI-2.2 p205.

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI users] MPI_COMM_split hanging
For MPI_Comm_split, all processes in the input communicator (oldcomm, or MPI_COMM_WORLD in your case) must call the operation since it is collective over the input communicator. In your program rank 0 is not calling the operation, so MPI_Comm_split is waiting for it to participate.

If you want rank 0 to be excluded from any of the communicators, you can give it a special color that is distinct from all other ranks. Upon return from MPI_Comm_split, rank 0 will be given a new communicator containing just one process, itself. If you do not intend to use that communicator you can free it immediately afterwards.

Hope that helps,
Josh

On Fri, Dec 9, 2011 at 6:52 PM, Gary Gorbet wrote:
> I am attempting to split my application into multiple master+workers
> groups using MPI_COMM_split. [...]

--
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey
[OMPI users] MPI_COMM_split hanging
I am attempting to split my application into multiple master+workers groups using MPI_COMM_split. My MPI revision is shown as:

mpirun --tag-output ompi_info -v ompi full --parsable
[1,0]:package:Open MPI root@build-x86-64 Distribution
[1,0]:ompi:version:full:1.4.3
[1,0]:ompi:version:svn:r23834
[1,0]:ompi:version:release_date:Oct 05, 2010
[1,0]:orte:version:full:1.4.3
[1,0]:orte:version:svn:r23834
[1,0]:orte:version:release_date:Oct 05, 2010
[1,0]:opal:version:full:1.4.3
[1,0]:opal:version:svn:r23834
[1,0]:opal:version:release_date:Oct 05, 2010
[1,0]:ident:1.4.3

The basic problem I am having is that none of the processor instances ever returns from the MPI_COMM_split call. I am pretty new to MPI and it is likely I am not doing things quite correctly. I'd appreciate some guidance.

I am working with an application that has functioned nicely for a while now. It only uses a single MPI_COMM_WORLD communicator. It is standard stuff: a master that hands out tasks to many workers, receives output, and keeps track of workers that are ready to receive another task. The tasks are quite compute-intensive. When running a variation of the process that uses Monte Carlo iterations, jobs can exceed the 30 hours they are limited to. The MC iterations are independent of each other - adding random noise to an input - so I would like to run multiple iterations simultaneously so that 4 times the cores runs in a fourth of the time. This would entail a supervisor interacting with multiple master+workers groups.

I had thought that I would just have to declare a communicator for each group so that broadcasts and syncs would work within a single group.

    MPI_Comm_size( MPI_COMM_WORLD, &total_proc_count );
    MPI_Comm_rank( MPI_COMM_WORLD, &my_rank );
    ...
    cores_per_group = total_proc_count / groups_count;
    my_group   = my_rank / cores_per_group;             // e.g., 0, 1, ...
    group_rank = my_rank - my_group * cores_per_group;  // rank within a group
    if ( my_rank == 0 )
        continue;   // Do not create group for supervisor
    MPI_Comm oldcomm = MPI_COMM_WORLD;
    MPI_Comm my_communicator;   // Actually declared as a class variable
    int sstat = MPI_Comm_split( oldcomm, my_group,
                                group_rank, &my_communicator );

There is never a return from the above _split() call. Do I need to do something else to set this up? I would have expected perhaps a non-zero status return, but not that I would get no return at all. I would appreciate any comments or guidance.

- Gary
Re: [OMPI users] MPI_Comm_split
> The tree is not symmetrical in that the valid values for the 10th
> parameter depend on the values selected in the 0th to 9th parameters
> (all the ancestry in the tree) [...]

Which is why you don't have the master hand out all the work at once. Instead, it hands out a small(er) piece of work to each node from a large list, where the length of the list is significantly larger than the number of nodes. As each node finishes processing the bit of work it was given, it sends a message back to the master with its results and asks for more work. You repeat until all data has been processed.

E.g., say you are looking to search through all possible combinations of 10 parameters (n0,...,n9). The master would generate all possible combinations of the first 3 parameters (n0,n1,n2) and then, for every element in that list, start sending them to the slave processes, each of which will use its element as a basis vector for searching the rest of the space (n3,...,n9). As each slave finishes, it asks the master for another basis vector to work on. Lather, rinse, repeat until finished.

If you keep the number of basis vectors much higher than the number of slaves (like 100x bigger), the code will load-balance itself, since it really doesn't matter the order in which they finish processing a single basis as long as they are all kept busy.

I used this approach many years ago searching for numerical sequences known as Golomb Rulers. Email me off list and I can give you some pointers to references.

Good luck,

-bill
Re: [OMPI users] MPI_Comm_split
On Nov 24, 2010, at 4:55 PM, Hicham Mouline wrote:
> The tree is not symmetrical in that the valid values for the 10th
> parameter depend on the values selected in the 0th to 9th parameters
> (all the ancestry in the tree) [...]
> Is it better to just list vertically all the possible branches
> (n-tuples) at the master level and split that list uniformly over the
> slaves?

Yes, you certainly can MPI_COMM_SPLIT this way. As Bill mentioned, if you do the splits at the beginning of a long computation, their cost is irrelevant. I would expect that for 128 MPI processes, doing 7-8 MPI_COMM_SPLITs will take far less than a second (although that's a total SWAG). So if it helps your coding and you're going to be running for a little while, go ahead and do them.

Bill mentioned one good reason for splitting communicators: distinct subgroups for MPI collectives. Another good reason is message separation. If you want to do a parameter sweep in a specific set of procs, you can give them their own communicator in order to guarantee that you have "private" communication between them (e.g., that tag X will never collide with tag X on another process).

That being said, if all your communications will be solely within that subset of processes, then making new communicators may not be necessary. I.e., you'll never have colliding tags, so you don't need to create a private communication space.

That being said, it may be useful to have all your subgroups start at rank 0 and be able to use MPI_COMM_SIZE to find how many peers are in your subgroup. Having distinct sub-communicators is helpful here.

That being said... (this can go on ad nauseam -- it just depends on your app and whether you want to do the splitting or not; there's no universally right answer here)

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI users] MPI_Comm_split
> -----Original Message-----
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
> Behalf Of Bill Rankin
> Sent: 24 November 2010 15:54
> To: Open MPI Users
> Subject: Re: [OMPI users] MPI_Comm_split
>
> In this case, creating all those communicators really doesn't buy you
> anything since you aren't using any collective operations across all
> the subgroups you would be creating.
>
> For this sort of coarse-grained parallelism, your best bet is probably
> a master/slave (producer/consumer, worker-pool) model. Have one
> process (master) generate valid sets for your first X (of Y total)
> parameters. The master then sends a unique set of these parameters to
> each slave process. Each slave generates all possible sets of the
> remaining parameters, evaluates the function for those parameter sets,
> stores the local max/min, and returns this value to the master. Upon
> receiving the max/min from the slave, the master compares this to the
> global max/min and sends the slave a new set of the first X parameters.
> Repeat until the master has sent all possible sets of X parameters and
> all slaves have processed all their work.
>
> Looking at it as a tree, the master process traverses the top of the
> tree, handing each slave a branch and letting the slave traverse the
> remainder of the tree. For load balancing, you want a lot more
> branches than you have slaves so that each slave is always kept busy.
> But you also want enough work per slave so that they are not
> constantly communicating with the master asking for the next set of
> parameters. This is done by adjusting the depth to which the master
> process traverses the parameter tree.
>
> Hope this helps. Good luck.
>
> -b

It does very much, thanks a lot.

The tree is not symmetrical in that the valid values for the 10th parameter depend on the values selected in the 0th to 9th parameters (all the ancestry in the tree); for e.g., we may have many more nodes in the left of the tree than in the right, see attachment (I hope they're allowed).

The depth of the tree of course is the same everywhere, but not all nodes at a given level have the same number of children.

Is it better to just list vertically all the possible branches (n-tuples) at the master level and split that list uniformly over the slaves?

regards,
Re: [OMPI users] MPI_Comm_split
> -----Original Message-----
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
> Behalf Of Bill Rankin
> Sent: 23 November 2010 19:32
> To: Open MPI Users
> Subject: Re: [OMPI users] MPI_Comm_split
>
> You can. But as the old saying goes: "just because you *can* do
> something doesn't necessarily mean you *should* do it." :-)
>
> What is your intent in creating all these communicators?
> [...]
> In any case, I suspect there is a better way.
>
> -bill

I have need for a parallel parameter sweep. I have arguments x0 to x9, say, of a function. I need to evaluate this function for every acceptable combination of x0,...,x9. This list of acceptable combinations forms what I can view as a tree:

. under the root node, all possible values of x0 (say there are 10 of them, x0_0 to x0_9)
. under each of these nodes, all possible values of x1 that agree with the args defined so far; for e.g., if x1_0 is not possible with x0_0, then it's not part of the tree...
. and so on until reaching the leaf nodes. At those nodes, I evaluate the function, and I want the global maximum and/or minimum.

The order of magnitude is 128 for the depth of the tree, and 100 possible values for each x. Each eval takes a couple of ms though.

I thought this facility of splitting communicators maps nicely onto the nature of my problem. What do you think?

I'm actually not exactly sure how I'm going to do it, but wished to have an opinion about whether it's just crazy.

regards,
Re: [OMPI users] MPI_Comm_split
Hicham:

> If I have 256 MPI processes in 1 communicator, am I able to split
> that communicator, then again split the resulting 2 subgroups, then
> again the resulting 4 subgroups, and so on, until potentially having
> 256 subgroups?

You can. But as the old saying goes: "just because you *can* do something doesn't necessarily mean you *should* do it." :-)

What is your intent in creating all these communicators?

> Is this insane in terms of performance?

Well, how much "real" work are you doing? Operations on communicators are collectives, so they are expensive. However, if you do this only once at the beginning of something like a three-week-long simulation run then you probably won't notice the impact.

In any case, I suspect there is a better way.

-bill
[OMPI users] MPI_Comm_split
Hello,

If I have 256 MPI processes in 1 communicator, am I able to split that communicator, then again split the resulting 2 subgroups, then again the resulting 4 subgroups, and so on, until potentially having 256 subgroups?

Is this insane in terms of performance?

regards,