Re: [OMPI devel] [OMPI svn] svn:open-mpi r19653
> we need to extend the existing RML function to handle the subsequent setting of the route to the proc itself. In the current dpm, we automatically assume that the route will be to a different job family, and hence send the routing info to the HNP. However, this may not be true - e.g., after a comm_spawn, there is no reason to route through the HNP since the job family is the same.

This is not correct. The current code in the DPM already takes care of the "usual" case where both ends are in the same job family; in that case it creates a "direct" route to the remote end (maybe it should just do nothing, though). This logic is pretty simple and is well contained in the DPM. Moving it to the RML would not change much: the complexity would just move from the dpm to the routed. The existing dpm code, in a single place, already does everything we need for current and future use, whereas we might have to update every routed component to take this special case into account. This is why I would advocate the lesser effort, since the functionality is exactly the same in the end.

> Haven't thought it all through yet, but wanted to suggest we think about it as we may (per the FT July discussions) need to define routes for things other than just DPM-related operations. Perhaps we should do some design discussion off-list to see what makes sense?

I'm always open to discussion. Let me know if you find this useful for some purpose.

> Thanks
> Ralph

Aurelien
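A minimal sketch of the routing decision debated in this message: route directly when both ends share a job family, otherwise hand the routing info to the HNP. The type names, the enum, and the assumption that the job family occupies the upper 16 bits of the jobid are illustrative only and are not taken from the ORTE sources.

```c
#include <stdint.h>

/* Hypothetical stand-in for the ORTE jobid type. */
typedef uint32_t jobid_t;

typedef enum {
    ROUTE_DIRECT,   /* same job family: talk to the peer directly */
    ROUTE_VIA_HNP   /* different job family: pass routing info to our HNP */
} route_kind_t;

/* Assumption: the job family is stored in the upper 16 bits of the jobid. */
static uint16_t job_family(jobid_t jobid)
{
    return (uint16_t)((jobid >> 16) & 0xffffu);
}

/* Decide how to reach `target` from `self`, based only on job family. */
static route_kind_t choose_route(jobid_t self, jobid_t target)
{
    return (job_family(self) == job_family(target)) ? ROUTE_DIRECT
                                                    : ROUTE_VIA_HNP;
}
```

Whichever framework ends up owning this check (dpm, RML, or routed), the decision itself stays this small; the disagreement in the thread is about where it lives, not what it does.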
Re: [OMPI devel] [OMPI svn] svn:open-mpi r19653
No religious fervor on my end either :-)

I have in my notes from the July meeting some thoughts/talk about needing a function like this, but (probably just in notes from thinking) some question as to the best place to put it. The dpm does create the port string, so having a "route_to_port" would make some sense if we are passing it the full port string (I couldn't tell from the commit if that is what you were planning to do, but it sounds like that is where you were heading). So breaking the port string into its component parts in that function would make sense. It also makes a lot of sense to provide a way to parse the port without having to create the route.

Perhaps this stems from my own ignorance, but my thought was that perhaps we need to extend the existing RML function to handle the subsequent setting of the route to the proc itself. In the current dpm, we automatically assume that the route will be to a different job family, and hence send the routing info to the HNP. However, this may not be true - e.g., after a comm_spawn, there is no reason to route through the HNP since the job family is the same.

Rather than implement all that logic in the dpm, it seems to me that perhaps we would be better served to let the dpm solely deal with parsing the port string into its component parts, and then let the RML or routed framework figure out what to do with each of those parts. So the dpm function would, for example, break out the tag and save/return/ignore it, then break off the proc's and the HNP's URIs and send them to the RML or routed "define_route_to_proc". That function would look at the proc to see if it is in our job family - if so, it would just pass it to the appropriate routed module for handling. If it is in a different job family, then it sends the HNP info to its own HNP, passes the info to the routed module, etc.

Haven't thought it all through yet, but wanted to suggest we think about it as we may (per the FT July discussions) need to define routes for things other than just DPM-related operations. Perhaps we should do some design discussion off-list to see what makes sense?

Thanks
Ralph
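The division of labor proposed in this message (the dpm only splits the port string into tag, proc URI, and HNP URI; another layer decides what to do with them) might look roughly like the sketch below. The port layout `<proc URI>+<HNP URI>:<tag>`, the struct, and the function name are hypothetical; the thread does not spell out the real port string format.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define URI_MAX 256

/* Hypothetical container for the pieces of a port string. */
typedef struct {
    char proc_uri[URI_MAX];  /* contact info of the remote proc */
    char hnp_uri[URI_MAX];   /* contact info of the remote proc's HNP */
    int  tag;                /* RML tag to use on that port */
} port_parts_t;

/*
 * Split "<proc URI>+<HNP URI>:<tag>" (assumed layout) into its parts.
 * URIs may contain ':' themselves, so the tag is taken after the LAST ':'.
 * Returns 0 on success, -1 on malformed input. No route is created here.
 */
static int split_port_string(const char *port, port_parts_t *out)
{
    const char *plus, *colon;
    char *end;
    long tag;
    size_t proc_len, hnp_len;

    if (port == NULL || out == NULL) return -1;

    plus = strchr(port, '+');
    colon = strrchr(port, ':');
    if (plus == NULL || colon == NULL || colon < plus) return -1;

    proc_len = (size_t)(plus - port);
    hnp_len = (size_t)(colon - plus - 1);
    if (proc_len == 0 || hnp_len == 0) return -1;
    if (proc_len >= URI_MAX || hnp_len >= URI_MAX) return -1;

    tag = strtol(colon + 1, &end, 10);
    if (end == colon + 1 || *end != '\0' || tag < 0 || tag > INT_MAX) return -1;

    memcpy(out->proc_uri, port, proc_len);
    out->proc_uri[proc_len] = '\0';
    memcpy(out->hnp_uri, plus + 1, hnp_len);
    out->hnp_uri[hnp_len] = '\0';
    out->tag = (int)tag;
    return 0;
}
```

With the pieces separated like this, the caller can keep or discard the tag and forward the two URIs to whatever "define_route_to_proc"-style routine is eventually agreed on.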
Re: [OMPI devel] [OMPI svn] svn:open-mpi r19653
Ralph,

I just split the existing static function from inside the dpm and exposed it to the outside world. The idea is that the dpm creates the (opaque) port strings and therefore knows how they are supposed to be formatted, so it is responsible for parsing them. Second, I split the parsing and the routing into two different functions, because sometimes you might want to parse without creating a route to the target. I'll check the RML function on Monday to see if it offers similar functionality.

I have no strong religious belief on whether this should be an RML or a dpm function. So I don't care as long as I have what I need :]

Thanks for the feedback,
Aurelien
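To illustrate why parsing and routing are kept as two separate calls, here is a small sketch in which the parse step has no side effects and only the route step touches a route table. The port layout `jobid.vpid:tag`, the fixed-size table, and all names are assumptions made for the example, not the actual dpm implementation.

```c
#include <stdio.h>

/* Hypothetical process name: job id plus rank within the job. */
typedef struct { unsigned jobid; unsigned vpid; } proc_name_t;

#define MAX_ROUTES 8
static proc_name_t route_table[MAX_ROUTES];
static int route_count = 0;

/* Parse "jobid.vpid:tag" (assumed layout). Creates no route. */
static int parse_port(const char *port, proc_name_t *rproc, int *tag)
{
    char trailing;

    if (port == NULL || rproc == NULL || tag == NULL) return -1;
    /* Exactly three conversions means no trailing garbage was present. */
    if (sscanf(port, "%u.%u:%d%c",
               &rproc->jobid, &rproc->vpid, tag, &trailing) != 3) {
        return -1;
    }
    return 0;
}

/* Record a route to rproc; the only function that changes state. */
static int route_to_port(const proc_name_t *rproc)
{
    int i;

    if (rproc == NULL) return -1;
    for (i = 0; i < route_count; i++) {
        if (route_table[i].jobid == rproc->jobid &&
            route_table[i].vpid == rproc->vpid) {
            return 0;  /* already routed */
        }
    }
    if (route_count == MAX_ROUTES) return -1;
    route_table[route_count++] = *rproc;
    return 0;
}
```

A caller that only needs the remote name and tag stops after `parse_port`; a caller that is about to send messages follows it with `route_to_port`.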
Re: [OMPI devel] [OMPI svn] svn:open-mpi r19653
Yo Aurelien

Regarding the dpm including a "route_to_port" API. This actually is pretty close to being an exact duplicate of an already existing function in the RML that takes a URI as its input, parses it to separate the proc name and the contact info, sets the contact info into the OOB, sets the route to that proc, and returns the proc name to the caller.

Take a look at orte/mca/rml/base/rml_base_contact.c. All we need to do is add the logic to that function so that, if the target proc is not in our job family, we update the route and contact info in the HNP instead of locally.

This would keep all the "setting_route_to_proc" functionality in one place, instead of duplicating it in the dpm, thus making maintenance much easier.

Make sense?
Ralph

On Sep 27, 2008, at 7:22 AM, boute...@osl.iu.edu wrote:

Author: bouteill
Date: 2008-09-27 09:22:32 EDT (Sat, 27 Sep 2008)
New Revision: 19653
URL: https://svn.open-mpi.org/trac/ompi/changeset/19653

Log:
Add functions to access the opaque port_string and to add routes to a remote port. This is usefull for FT, but could also turn usefull when considering MPI3 extentions to the MPI2 dynamics.

Text files modified:
   trunk/ompi/mca/dpm/base/base.h              |     3 +
   trunk/ompi/mca/dpm/base/dpm_base_null_fns.c |    12
   trunk/ompi/mca/dpm/base/dpm_base_open.c     |     2
   trunk/ompi/mca/dpm/dpm.h                    |    20 +++
   trunk/ompi/mca/dpm/orte/dpm_orte.c          |   114 +++++--
   5 files changed, 99 insertions(+), 52 deletions(-)

Modified: trunk/ompi/mca/dpm/base/base.h
==============================================================================
--- trunk/ompi/mca/dpm/base/base.h (original)
+++ trunk/ompi/mca/dpm/base/base.h 2008-09-27 09:22:32 EDT (Sat, 27 Sep 2008)
@@ -92,6 +92,9 @@
 int ompi_dpm_base_null_dyn_finalize (void);
 void ompi_dpm_base_null_mark_dyncomm (ompi_communicator_t *comm);
 int ompi_dpm_base_null_open_port(char *port_name, orte_rml_tag_t given_tag);
+int ompi_dpm_base_null_parse_port(char *port_name,
+                                  orte_process_name_t *rproc, orte_rml_tag_t *tag);
+int ompi_dpm_base_null_route_to_port(char *rml_uri, orte_process_name_t *rproc);
 int ompi_dpm_base_null_close_port(char *port_name);

 /* useful globals */

Modified: trunk/ompi/mca/dpm/base/dpm_base_null_fns.c
==============================================================================
--- trunk/ompi/mca/dpm/base/dpm_base_null_fns.c (original)
+++ trunk/ompi/mca/dpm/base/dpm_base_null_fns.c 2008-09-27 09:22:32 EDT (Sat, 27 Sep 2008)
@@ -36,6 +36,7 @@
 {
     return OMPI_ERR_NOT_SUPPORTED;
 }
+
 void ompi_dpm_base_null_disconnect(ompi_communicator_t *comm)
 {
     return;
@@ -70,6 +71,17 @@
     return OMPI_ERR_NOT_SUPPORTED;
 }

+int ompi_dpm_base_null_parse_port(char *port_name,
+                                  orte_process_name_t *rproc, orte_rml_tag_t *tag)
+{
+    return OMPI_ERR_NOT_SUPPORTED;
+}
+
+int ompi_dpm_base_null_route_to_port(char *rml_uri, orte_process_name_t *rproc)
+{
+    return OMPI_ERR_NOT_SUPPORTED;
+}
+
 int ompi_dpm_base_null_close_port(char *port_name)
 {
     return OMPI_ERR_NOT_SUPPORTED;

Modified: trunk/ompi/mca/dpm/base/dpm_base_open.c
==============================================================================
--- trunk/ompi/mca/dpm/base/dpm_base_open.c (original)
+++ trunk/ompi/mca/dpm/base/dpm_base_open.c 2008-09-27 09:22:32 EDT (Sat, 27 Sep 2008)
@@ -42,6 +42,8 @@
     ompi_dpm_base_null_dyn_finalize,
     ompi_dpm_base_null_mark_dyncomm,
     ompi_dpm_base_null_open_port,
+    ompi_dpm_base_null_parse_port,
+    ompi_dpm_base_null_route_to_port,
     ompi_dpm_base_null_close_port,
     NULL
 };

Modified: trunk/ompi/mca/dpm/dpm.h
==============================================================================
--- trunk/ompi/mca/dpm/dpm.h (original)
+++ trunk/ompi/mca/dpm/dpm.h 2008-09-27 09:22:32 EDT (Sat, 27 Sep 2008)
@@ -58,6 +58,8 @@

 #define OMPI_RML_TAG_DYNAMIC OMPI_RML_TAG_BASE+200

+
+
 /*
  * Initialize a module
  */
@@ -116,6 +118,20 @@
 typedef int (*ompi_dpm_base_module_open_port_fn_t)(char *port_name, orte_rml_tag_t tag);

 /*
+ * Converts an opaque port string to a RML process nane and tag.
+ */
+typedef int (*ompi_dpm_base_module_parse_port_name_t)(char *port_name,
+                                                      orte_process_name_t *rproc,
+                                                      orte_rml_tag_t *tag);
+
+/*
+ * Update the routed component to make sure that the RML can send messages to
+ * the remote port
+ */
+typedef int (*ompi_dpm_base_module_route_to_port_t)(char *rml_uri, orte_process_name_t *rproc);
+
+
+/*
  * Close a port
  */
 typedef int (*ompi_dpm_base_module_close_port_fn_t)(char *port_name);
@@ -145,6 +161,10 @@
     ompi_dpm_base_module_mark_dyncomm_fn_t mark
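The pattern visible in this diff (a module table of function pointers whose default entries are "null" functions returning a not-supported error) can be sketched in isolation as follows. All types, names, and error values below are simplified stand-ins invented for the example; they are not the Open MPI definitions.

```c
#include <stddef.h>

/* Hypothetical stand-ins for OMPI_SUCCESS / OMPI_ERR_NOT_SUPPORTED. */
#define DEMO_SUCCESS            0
#define DEMO_ERR_NOT_SUPPORTED (-1)

/* Simplified process name, not the real orte_process_name_t. */
typedef struct { unsigned jobid; unsigned vpid; } proc_name_t;

typedef int (*parse_port_fn_t)(const char *port_name,
                               proc_name_t *rproc, int *tag);
typedef int (*route_to_port_fn_t)(const char *rml_uri, proc_name_t *rproc);

/* Module table: one slot per operation a component may implement. */
typedef struct {
    parse_port_fn_t    parse_port;
    route_to_port_fn_t route_to_port;
} dpm_module_t;

/* Null implementations used until a real component is selected. */
static int null_parse_port(const char *port_name, proc_name_t *rproc, int *tag)
{
    (void)port_name; (void)rproc; (void)tag;
    return DEMO_ERR_NOT_SUPPORTED;
}

static int null_route_to_port(const char *rml_uri, proc_name_t *rproc)
{
    (void)rml_uri; (void)rproc;
    return DEMO_ERR_NOT_SUPPORTED;
}

/* Default table: every slot is callable, so callers never test for NULL. */
static dpm_module_t dpm = { null_parse_port, null_route_to_port };

/* A toy component function that a selected module could install. */
static int demo_parse_port(const char *port_name, proc_name_t *rproc, int *tag)
{
    if (port_name == NULL || rproc == NULL || tag == NULL) {
        return DEMO_ERR_NOT_SUPPORTED;
    }
    rproc->jobid = 1;  /* fixed values: this toy does not really parse */
    rproc->vpid = 0;
    *tag = 0;
    return DEMO_SUCCESS;
}
```

Filling every slot with a null function means that adding a new entry point, as this commit does for parse_port and route_to_port, requires touching the table, the null functions, and each component that wants to provide a real implementation.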