Re: [OMPI devel] [RFC] Sparse group implementation
On Thu, July 26, 2007 2:07 pm, Brian Barrett wrote: > On Jul 26, 2007, at 1:01 PM, Mohamad Chaarawi wrote: > >> >> On Thu, July 26, 2007 1:18 pm, Brian Barrett wrote: >>> On Jul 26, 2007, at 12:00 PM, Mohamad Chaarawi wrote: >>> > 2) I think it would be better to always have the flags and macros > available (like OMPI_GROUP_SPORADIC and OMPI_GROUP_IS_SPORADIC) > even > when sparse groups are disabled. They dont' take up any space, and > mean less #ifs in the general code base > That's what i actually was proposing.. keep the flags (there are no macros, just the GROUP_GET_PROC_POINTER) and the sparse parameters in the group strucutre, and this will mean, only 1 maybe 2 #ifs.. >>> >>> Why would this mean having the sparse parameters in the group >>> structure? >> >> not sure if i understood your question right, but in the group >> struct we >> added 5 integers and 3 pointer.. if we want to compile these out, >> we would >> then need all the #ifs around the code where we use these parameters.. > > I don't follow why you would need all the sparse stuff in > ompi_group_t when OMPI_GROUP_SPARSE is 0. The OMPI_GROUP_IS and > OMPI_GROU_SET macros only modify grp_flags, which is always present. > I don't need them, right now they are compiled out.. but since they are, all functions using these parameters (example: translate_ranks_strided, the long lookup function) have to be also compiled out, and other common functions that changed (like translate ranks, where we now check if sparse groups are enabled so we use an easier translate_ranks corresponding to the storage type) have to have the #ifs to compile stuff out. > Like the ompi_group_peer_lookup, much can be hidden inside the > functions rather than exposed through the interface, if you're > concerned about the other functionality currently #if'ed in the code. > in the ompi_group_peer_lookup that u suggested, we have an #if, so the same way with all functions that use sparse parameters.. I think i get what u are saying, Im don't want any functionality from including the sparse stuff when they are disabled, just easier code to look at.. > Brian > ___ > devel mailing list > de...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/devel > -- Mohamad Chaarawi Instructional Assistant http://www.cs.uh.edu/~mschaara Department of Computer ScienceUniversity of Houston 4800 Calhoun, PGH Room 526Houston, TX 77204, USA
Re: [OMPI devel] [RFC] Sparse group implementation
On Jul 26, 2007, at 1:01 PM, Mohamad Chaarawi wrote: On Thu, July 26, 2007 1:18 pm, Brian Barrett wrote: On Jul 26, 2007, at 12:00 PM, Mohamad Chaarawi wrote: 2) I think it would be better to always have the flags and macros available (like OMPI_GROUP_SPORADIC and OMPI_GROUP_IS_SPORADIC) even when sparse groups are disabled. They dont' take up any space, and mean less #ifs in the general code base That's what i actually was proposing.. keep the flags (there are no macros, just the GROUP_GET_PROC_POINTER) and the sparse parameters in the group strucutre, and this will mean, only 1 maybe 2 #ifs.. Why would this mean having the sparse parameters in the group structure? not sure if i understood your question right, but in the group struct we added 5 integers and 3 pointer.. if we want to compile these out, we would then need all the #ifs around the code where we use these parameters.. I don't follow why you would need all the sparse stuff in ompi_group_t when OMPI_GROUP_SPARSE is 0. The OMPI_GROUP_IS and OMPI_GROU_SET macros only modify grp_flags, which is always present. Like the ompi_group_peer_lookup, much can be hidden inside the functions rather than exposed through the interface, if you're concerned about the other functionality currently #if'ed in the code. Brian
Re: [OMPI devel] [RFC] Sparse group implementation
On Thu, July 26, 2007 1:18 pm, Brian Barrett wrote: > On Jul 26, 2007, at 12:00 PM, Mohamad Chaarawi wrote: > >>> 2) I think it would be better to always have the flags and macros >>> available (like OMPI_GROUP_SPORADIC and OMPI_GROUP_IS_SPORADIC) even >>> when sparse groups are disabled. They dont' take up any space, and >>> mean less #ifs in the general code base >>> >> That's what i actually was proposing.. keep the flags (there are no >> macros, just the GROUP_GET_PROC_POINTER) and the sparse parameters >> in the >> group strucutre, and this will mean, only 1 maybe 2 #ifs.. > > Why would this mean having the sparse parameters in the group structure? not sure if i understood your question right, but in the group struct we added 5 integers and 3 pointer.. if we want to compile these out, we would then need all the #ifs around the code where we use these parameters.. > >>> 3) Instead of the GROU_GET_PROC_POINTER macro, why not just change >>> the behavior of the ompi_group_peer_lookup() function, so that there >>> is symmetry with how you get a proc from a communicator? static >>> inline functions (especially short ones like that) are basically >>> free. We'll still have to fix all the places in the code where the >>> macro is used or people poke directly at the group structure, but I >>> like static inline over macros whenever possible. So much easier t >>> debug. >> >> Actually i never knew till this morning that this function was in the >> code.. I have an inline function ompi_group_lookup (which does the >> same >> thing), that actually checks if the group is dense or not and act >> accordingly.. but to use the inline function instead of the macro, >> means >> again that we need to compile in all the sparse parameters/code, >> which im >> for.. > > No, it doesn't. Just have something like: > > static inline ompi_proc_t* > ompi_group_lookup(ompi_group_t *group, int peer) > { > #if OMPI_GROUP_SPARSE > /* big long lookup function for sparse groups here */ > #else > return group->grp_proc_pointers[peer] > #endif > } > ok, i guess i can do that... > Brian > ___ > devel mailing list > de...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/devel > -- Mohamad Chaarawi Instructional Assistant http://www.cs.uh.edu/~mschaara Department of Computer ScienceUniversity of Houston 4800 Calhoun, PGH Room 526Houston, TX 77204, USA
Re: [OMPI devel] [RFC] Sparse group implementation
On Jul 26, 2007, at 12:00 PM, Mohamad Chaarawi wrote: 2) I think it would be better to always have the flags and macros available (like OMPI_GROUP_SPORADIC and OMPI_GROUP_IS_SPORADIC) even when sparse groups are disabled. They dont' take up any space, and mean less #ifs in the general code base That's what i actually was proposing.. keep the flags (there are no macros, just the GROUP_GET_PROC_POINTER) and the sparse parameters in the group strucutre, and this will mean, only 1 maybe 2 #ifs.. Why would this mean having the sparse parameters in the group structure? 3) Instead of the GROU_GET_PROC_POINTER macro, why not just change the behavior of the ompi_group_peer_lookup() function, so that there is symmetry with how you get a proc from a communicator? static inline functions (especially short ones like that) are basically free. We'll still have to fix all the places in the code where the macro is used or people poke directly at the group structure, but I like static inline over macros whenever possible. So much easier t debug. Actually i never knew till this morning that this function was in the code.. I have an inline function ompi_group_lookup (which does the same thing), that actually checks if the group is dense or not and act accordingly.. but to use the inline function instead of the macro, means again that we need to compile in all the sparse parameters/code, which im for.. No, it doesn't. Just have something like: static inline ompi_proc_t* ompi_group_lookup(ompi_group_t *group, int peer) { #if OMPI_GROUP_SPARSE /* big long lookup function for sparse groups here */ #else return group->grp_proc_pointers[peer] #endif } Brian
Re: [OMPI devel] [RFC] Sparse group implementation
On Thu, July 26, 2007 12:20 pm, Brian Barrett wrote: > Mohamad - > > A couple of comments / questions: > > 1) Why do you need the #if OMPI_GROUP_SPARSE in communicator/comm.c? > That seems like > part of the API that should under no conditions change based on > sparse/not sparse > I don't, there was one #if that i just removed.. but we do need to check in some cases like in ompi_comm_get_rprocs that we are not using the direct access to the pointers list. like for example: if(OMPI_GROUP_IS_DENSE(local_comm->c_local_group)) { rc = ompi_proc_pack(local_comm->c_local_group->grp_proc_pointers, local_size, sbuf); } /* get the proc list for the sparse implementations */ else { proc_list = (ompi_proc_t **) calloc (local_comm->c_local_group->grp_proc_count, sizeof (ompi_proc_t *)); for(i=0 ; ic_local_group->grp_proc_count ; i++) proc_list[i] = GROUP_GET_PROC_POINTER(local_comm->c_local_group,i); rc = ompi_proc_pack (proc_list, local_size, sbuf); } here if sparse groups are disabled, we don't really want to allocate and set this list of pointers that already exists (not to waste more memory and time).. > 2) I think it would be better to always have the flags and macros > available (like OMPI_GROUP_SPORADIC and OMPI_GROUP_IS_SPORADIC) even > when sparse groups are disabled. They dont' take up any space, and > mean less #ifs in the general code base > That's what i actually was proposing.. keep the flags (there are no macros, just the GROUP_GET_PROC_POINTER) and the sparse parameters in the group strucutre, and this will mean, only 1 maybe 2 #ifs.. > 3) Instead of the GROU_GET_PROC_POINTER macro, why not just change > the behavior of the ompi_group_peer_lookup() function, so that there > is symmetry with how you get a proc from a communicator? static > inline functions (especially short ones like that) are basically > free. We'll still have to fix all the places in the code where the > macro is used or people poke directly at the group structure, but I > like static inline over macros whenever possible. So much easier t > debug. Actually i never knew till this morning that this function was in the code.. I have an inline function ompi_group_lookup (which does the same thing), that actually checks if the group is dense or not and act accordingly.. but to use the inline function instead of the macro, means again that we need to compile in all the sparse parameters/code, which im for.. > > Other than that, I think you've got my concerns pretty much addressed. > > Brian > > On Jul 25, 2007, at 8:45 PM, Mohamad Chaarawi wrote: > >> In the current code, almost all #ifs are due to the fact that we don't >> want to add the extra memory by the sparse parameters that are >> added to >> the group structure. >> The additional parameters are 5 pointers and 3 integers. >> If nobody objects, i would actually keep those extra parameters, >> even if >> sparse groups are disabled (in the default case on configure), >> because it >> would reduce the number of #ifs in the code to only 2 (as i recall >> that i >> had it before) .. >> >> Thank you, >> >> Mohamad >> >> On Wed, July 25, 2007 4:23 pm, Brian Barrett wrote: >>> On Jul 25, 2007, at 3:14 PM, Jeff Squyres wrote: >>> On Jul 25, 2007, at 5:07 PM, Brian Barrett wrote: >> It just adds a lot of #if's throughout the code. Other than that, >> there's no reason to remove it. > > I agree, lots of #ifs are bad. But I guess I don't see the > problem. > The only real important thing people were directly accessing in the > ompi_group_t is the array of proc pointers. Indexing into them > could > be done with a static inline function that just has slightly > different time complexity based on compile options. Static inline > function that just does an index in the group proc pointer would > have > almost no added overhead (none if the compiler doesn't suck). Ya, that's what I proposed. :-) But I did also propose removing the extra #if's so that the sparse group code would be available and we'd add an extra "if" in the critical code path. But we can do it this way instead: Still use the MACRO to access proc_t's. In the --disable-sparse- groups scenario, have it map to comm.group.proc[i]. In the -- enable- sparse-groups scenario, have it like I listed in the original proposal: static inline ompi_proc_t lookup_group(ompi_group_t *group, int index) { if (group_is_dense(group)) { return group->procs[index]; } else { return sparse_group_lookup(group, index); } } With: a) groups are always dense if --enable and an MCA parameter turns off
Re: [OMPI devel] [RFC] Sparse group implementation
Mohamad - A couple of comments / questions: 1) Why do you need the #if OMPI_GROUP_SPARSE in communicator/comm.c? That seems like part of the API that should under no conditions change based on sparse/not sparse 2) I think it would be better to always have the flags and macros available (like OMPI_GROUP_SPORADIC and OMPI_GROUP_IS_SPORADIC) even when sparse groups are disabled. They dont' take up any space, and mean less #ifs in the general code base 3) Instead of the GROU_GET_PROC_POINTER macro, why not just change the behavior of the ompi_group_peer_lookup() function, so that there is symmetry with how you get a proc from a communicator? static inline functions (especially short ones like that) are basically free. We'll still have to fix all the places in the code where the macro is used or people poke directly at the group structure, but I like static inline over macros whenever possible. So much easier t debug. Other than that, I think you've got my concerns pretty much addressed. Brian On Jul 25, 2007, at 8:45 PM, Mohamad Chaarawi wrote: In the current code, almost all #ifs are due to the fact that we don't want to add the extra memory by the sparse parameters that are added to the group structure. The additional parameters are 5 pointers and 3 integers. If nobody objects, i would actually keep those extra parameters, even if sparse groups are disabled (in the default case on configure), because it would reduce the number of #ifs in the code to only 2 (as i recall that i had it before) .. Thank you, Mohamad On Wed, July 25, 2007 4:23 pm, Brian Barrett wrote: On Jul 25, 2007, at 3:14 PM, Jeff Squyres wrote: On Jul 25, 2007, at 5:07 PM, Brian Barrett wrote: It just adds a lot of #if's throughout the code. Other than that, there's no reason to remove it. I agree, lots of #ifs are bad. But I guess I don't see the problem. The only real important thing people were directly accessing in the ompi_group_t is the array of proc pointers. Indexing into them could be done with a static inline function that just has slightly different time complexity based on compile options. Static inline function that just does an index in the group proc pointer would have almost no added overhead (none if the compiler doesn't suck). Ya, that's what I proposed. :-) But I did also propose removing the extra #if's so that the sparse group code would be available and we'd add an extra "if" in the critical code path. But we can do it this way instead: Still use the MACRO to access proc_t's. In the --disable-sparse- groups scenario, have it map to comm.group.proc[i]. In the -- enable- sparse-groups scenario, have it like I listed in the original proposal: static inline ompi_proc_t lookup_group(ompi_group_t *group, int index) { if (group_is_dense(group)) { return group->procs[index]; } else { return sparse_group_lookup(group, index); } } With: a) groups are always dense if --enable and an MCA parameter turns off sparse groups, or b) there's an added check in the inline function for whether the MCA parameter is on I'm personally in favor of a) because it means only one conditional in the critical path. I don't really care about the sparse groups turned on case. I just want minimal #ifs in the global code and to not have an if() { ... } in the critical path when sparse groups are disabled :). Brian ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel -- Mohamad Chaarawi Instructional Assistant http://www.cs.uh.edu/~mschaara Department of Computer ScienceUniversity of Houston 4800 Calhoun, PGH Room 526Houston, TX 77204, USA ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel
Re: [OMPI devel] [RFC] Sparse group implementation
Sorry the additional parameters are 5 integers and 3 pointers.. Mohamad On Wed, July 25, 2007 9:45 pm, Mohamad Chaarawi wrote: > In the current code, almost all #ifs are due to the fact that we don't > want to add the extra memory by the sparse parameters that are added to > the group structure. > The additional parameters are 5 pointers and 3 integers. > If nobody objects, i would actually keep those extra parameters, even if > sparse groups are disabled (in the default case on configure), because it > would reduce the number of #ifs in the code to only 2 (as i recall that i > had it before) .. > > Thank you, > > Mohamad > > On Wed, July 25, 2007 4:23 pm, Brian Barrett wrote: >> On Jul 25, 2007, at 3:14 PM, Jeff Squyres wrote: >> >>> On Jul 25, 2007, at 5:07 PM, Brian Barrett wrote: >>> > It just adds a lot of #if's throughout the code. Other than that, > there's no reason to remove it. I agree, lots of #ifs are bad. But I guess I don't see the problem. The only real important thing people were directly accessing in the ompi_group_t is the array of proc pointers. Indexing into them could be done with a static inline function that just has slightly different time complexity based on compile options. Static inline function that just does an index in the group proc pointer would have almost no added overhead (none if the compiler doesn't suck). >>> >>> Ya, that's what I proposed. :-) >>> >>> But I did also propose removing the extra #if's so that the sparse >>> group code would be available and we'd add an extra "if" in the >>> critical code path. >>> >>> But we can do it this way instead: >>> >>> Still use the MACRO to access proc_t's. In the --disable-sparse- >>> groups scenario, have it map to comm.group.proc[i]. In the --enable- >>> sparse-groups scenario, have it like I listed in the original >>> proposal: >>> >>> static inline ompi_proc_t lookup_group(ompi_group_t *group, int >>> index) { >>> if (group_is_dense(group)) { >>> return group->procs[index]; >>> } else { >>> return sparse_group_lookup(group, index); >>> } >>> } >>> >>> With: >>> >>> a) groups are always dense if --enable and an MCA parameter turns off >>> sparse groups, or >>> b) there's an added check in the inline function for whether the MCA >>> parameter is on >>> >>> I'm personally in favor of a) because it means only one conditional >>> in the critical path. >> >> I don't really care about the sparse groups turned on case. I just >> want minimal #ifs in the global code and to not have an if() { ... } >> in the critical path when sparse groups are disabled :). >> >> Brian >> ___ >> devel mailing list >> de...@open-mpi.org >> http://www.open-mpi.org/mailman/listinfo.cgi/devel >> > > > -- > Mohamad Chaarawi > Instructional Assistant http://www.cs.uh.edu/~mschaara > Department of Computer Science University of Houston > 4800 Calhoun, PGH Room 526Houston, TX 77204, USA > > ___ > devel mailing list > de...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/devel > -- Mohamad Chaarawi Instructional Assistant http://www.cs.uh.edu/~mschaara Department of Computer ScienceUniversity of Houston 4800 Calhoun, PGH Room 526Houston, TX 77204, USA
Re: [OMPI devel] [RFC] Sparse group implementation
In the current code, almost all #ifs are due to the fact that we don't want to add the extra memory by the sparse parameters that are added to the group structure. The additional parameters are 5 pointers and 3 integers. If nobody objects, i would actually keep those extra parameters, even if sparse groups are disabled (in the default case on configure), because it would reduce the number of #ifs in the code to only 2 (as i recall that i had it before) .. Thank you, Mohamad On Wed, July 25, 2007 4:23 pm, Brian Barrett wrote: > On Jul 25, 2007, at 3:14 PM, Jeff Squyres wrote: > >> On Jul 25, 2007, at 5:07 PM, Brian Barrett wrote: >> It just adds a lot of #if's throughout the code. Other than that, there's no reason to remove it. >>> >>> I agree, lots of #ifs are bad. But I guess I don't see the problem. >>> The only real important thing people were directly accessing in the >>> ompi_group_t is the array of proc pointers. Indexing into them could >>> be done with a static inline function that just has slightly >>> different time complexity based on compile options. Static inline >>> function that just does an index in the group proc pointer would have >>> almost no added overhead (none if the compiler doesn't suck). >> >> Ya, that's what I proposed. :-) >> >> But I did also propose removing the extra #if's so that the sparse >> group code would be available and we'd add an extra "if" in the >> critical code path. >> >> But we can do it this way instead: >> >> Still use the MACRO to access proc_t's. In the --disable-sparse- >> groups scenario, have it map to comm.group.proc[i]. In the --enable- >> sparse-groups scenario, have it like I listed in the original >> proposal: >> >> static inline ompi_proc_t lookup_group(ompi_group_t *group, int >> index) { >> if (group_is_dense(group)) { >> return group->procs[index]; >> } else { >> return sparse_group_lookup(group, index); >> } >> } >> >> With: >> >> a) groups are always dense if --enable and an MCA parameter turns off >> sparse groups, or >> b) there's an added check in the inline function for whether the MCA >> parameter is on >> >> I'm personally in favor of a) because it means only one conditional >> in the critical path. > > I don't really care about the sparse groups turned on case. I just > want minimal #ifs in the global code and to not have an if() { ... } > in the critical path when sparse groups are disabled :). > > Brian > ___ > devel mailing list > de...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/devel > -- Mohamad Chaarawi Instructional Assistant http://www.cs.uh.edu/~mschaara Department of Computer ScienceUniversity of Houston 4800 Calhoun, PGH Room 526Houston, TX 77204, USA
Re: [OMPI devel] [RFC] Sparse group implementation
On Jul 25, 2007, at 3:14 PM, Jeff Squyres wrote: On Jul 25, 2007, at 5:07 PM, Brian Barrett wrote: It just adds a lot of #if's throughout the code. Other than that, there's no reason to remove it. I agree, lots of #ifs are bad. But I guess I don't see the problem. The only real important thing people were directly accessing in the ompi_group_t is the array of proc pointers. Indexing into them could be done with a static inline function that just has slightly different time complexity based on compile options. Static inline function that just does an index in the group proc pointer would have almost no added overhead (none if the compiler doesn't suck). Ya, that's what I proposed. :-) But I did also propose removing the extra #if's so that the sparse group code would be available and we'd add an extra "if" in the critical code path. But we can do it this way instead: Still use the MACRO to access proc_t's. In the --disable-sparse- groups scenario, have it map to comm.group.proc[i]. In the --enable- sparse-groups scenario, have it like I listed in the original proposal: static inline ompi_proc_t lookup_group(ompi_group_t *group, int index) { if (group_is_dense(group)) { return group->procs[index]; } else { return sparse_group_lookup(group, index); } } With: a) groups are always dense if --enable and an MCA parameter turns off sparse groups, or b) there's an added check in the inline function for whether the MCA parameter is on I'm personally in favor of a) because it means only one conditional in the critical path. I don't really care about the sparse groups turned on case. I just want minimal #ifs in the global code and to not have an if() { ... } in the critical path when sparse groups are disabled :). Brian
Re: [OMPI devel] [RFC] Sparse group implementation
On Jul 25, 2007, at 5:07 PM, Brian Barrett wrote: It just adds a lot of #if's throughout the code. Other than that, there's no reason to remove it. I agree, lots of #ifs are bad. But I guess I don't see the problem. The only real important thing people were directly accessing in the ompi_group_t is the array of proc pointers. Indexing into them could be done with a static inline function that just has slightly different time complexity based on compile options. Static inline function that just does an index in the group proc pointer would have almost no added overhead (none if the compiler doesn't suck). Ya, that's what I proposed. :-) But I did also propose removing the extra #if's so that the sparse group code would be available and we'd add an extra "if" in the critical code path. But we can do it this way instead: Still use the MACRO to access proc_t's. In the --disable-sparse- groups scenario, have it map to comm.group.proc[i]. In the --enable- sparse-groups scenario, have it like I listed in the original proposal: static inline ompi_proc_t lookup_group(ompi_group_t *group, int index) { if (group_is_dense(group)) { return group->procs[index]; } else { return sparse_group_lookup(group, index); } } With: a) groups are always dense if --enable and an MCA parameter turns off sparse groups, or b) there's an added check in the inline function for whether the MCA parameter is on I'm personally in favor of a) because it means only one conditional in the critical path. -- Jeff Squyres Cisco Systems
Re: [OMPI devel] [RFC] Sparse group implementation
On Jul 25, 2007, at 2:56 PM, Jeff Squyres wrote: On Jul 25, 2007, at 10:39 AM, Brian Barrett wrote: I have an even bigger objection than Rich. It's near impossible to measure the latency impact of something like this, but it does have an additive effect. It doesn't make sense to have all that code in the critical path for systems where it's not needed. We should leave the compile time decision available, unless there's a very good reason (which I did not see in this e-mail) to remove it. It just adds a lot of #if's throughout the code. Other than that, there's no reason to remove it. I agree, lots of #ifs are bad. But I guess I don't see the problem. The only real important thing people were directly accessing in the ompi_group_t is the array of proc pointers. Indexing into them could be done with a static inline function that just has slightly different time complexity based on compile options. Static inline function that just does an index in the group proc pointer would have almost no added overhead (none if the compiler doesn't suck). Brian
Re: [OMPI devel] [RFC] Sparse group implementation
I have an even bigger objection than Rich. It's near impossible to measure the latency impact of something like this, but it does have an additive effect. It doesn't make sense to have all that code in the critical path for systems where it's not needed. We should leave the compile time decision available, unless there's a very good reason (which I did not see in this e-mail) to remove it. Brian On Jul 25, 2007, at 8:00 AM, Richard Graham wrote: This is good work, so I am happy to see it come over. My initial understanding was that there would be compile time protection for this. In the absence of this, I think we need to see performance data on a variety of communication substrates. It seems like a latency measurement is, perhaps, the most sensitive measurement, and should be sufficient to see the impact on the critical path. Rich On 7/25/07 9:04 AM, "Jeff Squyres"wrote: WHAT:Merge the sparse groups work to the trunk; get the community's opinion on one remaining issue. WHY: For large MPI jobs, it can be memory-prohibitive to fully represent dense groups; you can save a lot of space by having "sparse" representations of groups that are (for example) derived from MPI_COMM_WORLD. WHERE: Main changes are (might have missed a few in this analysis, but this is 99% of it): - Big changes in ompi/group - Moderate changes in ompi/comm - Trivial changes in ompi/mpi/c, ompi/mca/pml/[dr|ob1], ompi/mca/comm/sm WHEN:The code is ready now in /tmp/sparse-groups (it is passing all Intel and IBM tests; see below). TIMEOUT: We'll merge all the work to the trunk and enable the possibility of using sparse groups (dense will still be the default, of course) if no one objects by COB Tuesday, 31 Aug 2007. = === === The sparse groups work from U. Houston is ready to be brought into the trunk. It is built on the premise that for very large MPI jobs, you don't want to fully represent MPI groups in memory if you don't have to. Specifically, you can save memory for communicators/groups that are derived from MPI_COMM_WORLD by representing them in a sparse storage format. The sparse groups work introduces 3 new ompi_group_t storage formats: * dense (i.e., what it is today -- an array of ompi_proc_t pointers) * sparse, where the current group's contents are based on the group from which the child was derived: 1. range: a series of (offset,length) tuples 2. stride: a single (first,stride,last) tuple 3. bitmap: a bitmap Currently, all the sparse groups code must be enabled by configuring with --enable-sparse-groups. If sparse groups are enabled, each MPI group that is created will automatically use the storage format that takes the least amount of space. The Big Issue with the sparse groups is that getting a pointer to an ompi_proc_t may no longer be an O(1) operation -- you can't just access it via comm->group->procs[i]. Instead, you have to call a macro. If sparse groups are enabled, this will call a function to do the resolution and return the proc pointer. If sparse groups are not enabled, the macro currently resolves to group->procs[i]. When sparse groups are enabled, looking up a proc pointer is an iterative process; you have to traverse up through one or more parent groups until you reach a "dense" group to get the pointer. So the time to lookup the proc pointer (essentially) depends on the group and how many times it has been derived from a parent group (there are corner cases where the lookup time is shorter). Lookup times in MPI_COMM_WORLD are O(1) because it is dense, but it now requires an inline function call rather than directly accessing the data structure (see below). Note that the code in /tmp/sparse-groups is currently out-of-date with respect to the orte and opal trees due to SVN merge mistakes and problems. Testing has occurred by copying full orte/opal branches from a trunk checkout into the sparse group tree, so we're confident that it's compatible with the trunk. Full integration will occur before commiting to the trunk, of course. The proposal we have for the community is as follows: 1. Remove the --enable-sparse-groups configure option 2. Default to use only dense groups (i.e., same as today) 3. If the new MCA parameter "mpi_use_sparse_groups" is enabled, enable the use of sparse groups 4. Eliminate the current macro used for group proc lookups and instead use an inline function of the form: static inline ompi_proc_t lookup_group(ompi_group_t *group, int index) { if (group_is_dense(group)) { return group->procs[index]; } else { return sparse_group_lookup(group, index); } } *** NOTE: This design adds a single "if" in some
Re: [OMPI devel] [RFC] Sparse group implementation
This is good work, so I am happy to see it come over. My initial understanding was that there would be compile time protection for this. In the absence of this, I think we need to see performance data on a variety of communication substrates. It seems like a latency measurement is, perhaps, the most sensitive measurement, and should be sufficient to see the impact on the critical path. Rich On 7/25/07 9:04 AM, "Jeff Squyres"wrote: > WHAT:Merge the sparse groups work to the trunk; get the community's > opinion on one remaining issue. > WHY: For large MPI jobs, it can be memory-prohibitive to fully > represent dense groups; you can save a lot of space by having > "sparse" representations of groups that are (for example) > derived from MPI_COMM_WORLD. > WHERE: Main changes are (might have missed a few in this analysis, > but this is 99% of it): > - Big changes in ompi/group > - Moderate changes in ompi/comm > - Trivial changes in ompi/mpi/c, ompi/mca/pml/[dr|ob1], > ompi/mca/comm/sm > WHEN:The code is ready now in /tmp/sparse-groups (it is passing > all Intel and IBM tests; see below). > TIMEOUT: We'll merge all the work to the trunk and enable the > possibility of using sparse groups (dense will still be the > default, of course) if no one objects by COB Tuesday, 31 Aug > 2007. > > > === > > The sparse groups work from U. Houston is ready to be brought into the > trunk. It is built on the premise that for very large MPI jobs, you > don't want to fully represent MPI groups in memory if you don't have > to. Specifically, you can save memory for communicators/groups that > are derived from MPI_COMM_WORLD by representing them in a sparse > storage format. > > The sparse groups work introduces 3 new ompi_group_t storage formats: > > * dense (i.e., what it is today -- an array of ompi_proc_t pointers) > * sparse, where the current group's contents are based on the group >from which the child was derived: >1. range: a series of (offset,length) tuples >2. stride: a single (first,stride,last) tuple >3. bitmap: a bitmap > > Currently, all the sparse groups code must be enabled by configuring > with --enable-sparse-groups. If sparse groups are enabled, each MPI > group that is created will automatically use the storage format that > takes the least amount of space. > > The Big Issue with the sparse groups is that getting a pointer to an > ompi_proc_t may no longer be an O(1) operation -- you can't just > access it via comm->group->procs[i]. Instead, you have to call a > macro. If sparse groups are enabled, this will call a function to do > the resolution and return the proc pointer. If sparse groups are not > enabled, the macro currently resolves to group->procs[i]. > > When sparse groups are enabled, looking up a proc pointer is an > iterative process; you have to traverse up through one or more parent > groups until you reach a "dense" group to get the pointer. So the > time to lookup the proc pointer (essentially) depends on the group and > how many times it has been derived from a parent group (there are > corner cases where the lookup time is shorter). Lookup times in > MPI_COMM_WORLD are O(1) because it is dense, but it now requires an > inline function call rather than directly accessing the data > structure (see below). > > Note that the code in /tmp/sparse-groups is currently out-of-date with > respect to the orte and opal trees due to SVN merge mistakes and > problems. Testing has occurred by copying full orte/opal branches > from a trunk checkout into the sparse group tree, so we're confident > that it's compatible with the trunk. Full integration will occur > before commiting to the trunk, of course. > > The proposal we have for the community is as follows: > > 1. Remove the --enable-sparse-groups configure option > 2. Default to use only dense groups (i.e., same as today) > 3. If the new MCA parameter "mpi_use_sparse_groups" is enabled, enable > the use of sparse groups > 4. Eliminate the current macro used for group proc lookups and instead > use an inline function of the form: > > static inline ompi_proc_t lookup_group(ompi_group_t *group, int > index) { > if (group_is_dense(group)) { > return group->procs[index]; > } else { > return sparse_group_lookup(group, index); > } > } > > *** NOTE: This design adds a single "if" in some > performance-critical paths. If the group is sparse, it will > add a function call and the overhead to do the lookup. > If the group is dense (which will be the default), the proc > will be returned directly from the inline function. > > The rationale is that adding a single "if" (perhaps
[OMPI devel] [RFC] Sparse group implementation
WHAT:Merge the sparse groups work to the trunk; get the community's opinion on one remaining issue. WHY: For large MPI jobs, it can be memory-prohibitive to fully represent dense groups; you can save a lot of space by having "sparse" representations of groups that are (for example) derived from MPI_COMM_WORLD. WHERE: Main changes are (might have missed a few in this analysis, but this is 99% of it): - Big changes in ompi/group - Moderate changes in ompi/comm - Trivial changes in ompi/mpi/c, ompi/mca/pml/[dr|ob1], ompi/mca/comm/sm WHEN:The code is ready now in /tmp/sparse-groups (it is passing all Intel and IBM tests; see below). TIMEOUT: We'll merge all the work to the trunk and enable the possibility of using sparse groups (dense will still be the default, of course) if no one objects by COB Tuesday, 31 Aug 2007. === The sparse groups work from U. Houston is ready to be brought into the trunk. It is built on the premise that for very large MPI jobs, you don't want to fully represent MPI groups in memory if you don't have to. Specifically, you can save memory for communicators/groups that are derived from MPI_COMM_WORLD by representing them in a sparse storage format. The sparse groups work introduces 3 new ompi_group_t storage formats: * dense (i.e., what it is today -- an array of ompi_proc_t pointers) * sparse, where the current group's contents are based on the group from which the child was derived: 1. range: a series of (offset,length) tuples 2. stride: a single (first,stride,last) tuple 3. bitmap: a bitmap Currently, all the sparse groups code must be enabled by configuring with --enable-sparse-groups. If sparse groups are enabled, each MPI group that is created will automatically use the storage format that takes the least amount of space. The Big Issue with the sparse groups is that getting a pointer to an ompi_proc_t may no longer be an O(1) operation -- you can't just access it via comm->group->procs[i]. Instead, you have to call a macro. If sparse groups are enabled, this will call a function to do the resolution and return the proc pointer. If sparse groups are not enabled, the macro currently resolves to group->procs[i]. When sparse groups are enabled, looking up a proc pointer is an iterative process; you have to traverse up through one or more parent groups until you reach a "dense" group to get the pointer. So the time to lookup the proc pointer (essentially) depends on the group and how many times it has been derived from a parent group (there are corner cases where the lookup time is shorter). Lookup times in MPI_COMM_WORLD are O(1) because it is dense, but it now requires an inline function call rather than directly accessing the data structure (see below). Note that the code in /tmp/sparse-groups is currently out-of-date with respect to the orte and opal trees due to SVN merge mistakes and problems. Testing has occurred by copying full orte/opal branches from a trunk checkout into the sparse group tree, so we're confident that it's compatible with the trunk. Full integration will occur before commiting to the trunk, of course. The proposal we have for the community is as follows: 1. Remove the --enable-sparse-groups configure option 2. Default to use only dense groups (i.e., same as today) 3. If the new MCA parameter "mpi_use_sparse_groups" is enabled, enable the use of sparse groups 4. Eliminate the current macro used for group proc lookups and instead use an inline function of the form: static inline ompi_proc_t lookup_group(ompi_group_t *group, int index) { if (group_is_dense(group)) { return group->procs[index]; } else { return sparse_group_lookup(group, index); } } *** NOTE: This design adds a single "if" in some performance-critical paths. If the group is sparse, it will add a function call and the overhead to do the lookup. If the group is dense (which will be the default), the proc will be returned directly from the inline function. The rationale is that adding a single "if" (perhaps with OPAL_[UN]LIKELY?) in a few code paths will not be a big deal. 5. Bring all these changes into the OMPI trunk and therefore into v1.3. Comments? -- Jeff Squyres Cisco Systems