Re: [OMPI devel] [OMPI svn] svn:open-mpi r19653

2008-09-29 Thread Aurélien Bouteiller
> [...] we need to extend the existing RML function to handle the
> subsequent setting of the route to the proc itself. In the current
> dpm, we automatically assume that the route will be to a different
> job family, and hence send the routing info to the HNP. However,
> this may not be true - e.g., after a comm_spawn, there is no reason
> to route through the HNP since the job family is the same.


This is not correct. The current code in the DPM already takes care of
the "usual" case where both ends are in the same job family; in that
case it creates a "direct" route to the remote end (though maybe it
should simply do nothing). This logic is pretty simple and well
contained in the DPM. Moving it to the RML would not change much: the
complexity would just move from the dpm to the routed framework. The
existing dpm code already does everything we need for current and
future use in a single place, whereas we might have to update every
routed component to take this special case into account. This is why I
would advocate the lesser effort for exactly the same functionality in
the end.
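
To make the same-family case concrete, here is a minimal sketch of the
decision, not the actual dpm code. It assumes a 32-bit jobid whose upper
16 bits identify the job family; every type and function name below is
a hypothetical stand-in rather than a real ORTE symbol.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the ORTE types; not the real definitions. */
typedef uint32_t sketch_jobid_t;

typedef struct {
    sketch_jobid_t jobid;  /* assumed layout: upper 16 bits = job family */
    uint32_t       vpid;   /* rank of the process within its job */
} sketch_name_t;

typedef enum {
    SKETCH_ROUTE_DIRECT,   /* same job family: talk to the peer directly */
    SKETCH_ROUTE_VIA_HNP   /* other job family: hand the info to our HNP */
} sketch_route_t;

/* Extract the job family, assuming it lives in the upper 16 bits. */
static inline uint16_t sketch_job_family(sketch_jobid_t jobid)
{
    return (uint16_t)(jobid >> 16);
}

/* Pick a route kind for `peer` as seen from `me`. */
static sketch_route_t sketch_pick_route(const sketch_name_t *me,
                                        const sketch_name_t *peer)
{
    bool same_family =
        sketch_job_family(me->jobid) == sketch_job_family(peer->jobid);
    return same_family ? SKETCH_ROUTE_DIRECT : SKETCH_ROUTE_VIA_HNP;
}
```

Under these assumptions a process spawned by comm_spawn shares the
parent's job family and gets the direct route, while a peer reached
through connect/accept from another mpirun goes through the HNP.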


> Haven't thought it all through yet, but wanted to suggest we think
> about it as we may (per the FT July discussions) need to define
> routes for things other than just DPM-related operations. Perhaps we
> should do some design discussion off-list to see what makes sense?


I'm always open to discussion. Let me know if you find this useful for
some purpose.



> Thanks
> Ralph


Aurelien




Re: [OMPI devel] [OMPI svn] svn:open-mpi r19653

2008-09-28 Thread Ralph Castain

No religious fervor on my end either :-)

I have in my notes from the July meeting some thoughts/talk about  
needing a function like this, but (probably just in notes from  
thinking) some question as to the best place to put it. The dpm does  
create the port string, so having a "route_to_port" would make some  
sense if we are passing it the full port string (I couldn't tell from  
the commit if that is what you were planning to do, but it sounds like  
that is where you were heading). So breaking the port string into its  
component parts in that function would make sense. It also makes a lot  
of sense to provide a way to parse the port without having to create  
the route.


Perhaps this stems from my own ignorance, but my thought was that  
perhaps we need to extend the existing RML function to handle the  
subsequent setting of the route to the proc itself. In the current  
dpm, we automatically assume that the route will be to a different job  
family, and hence send the routing info to the HNP. However, this may  
not be true - e.g., after a comm_spawn, there is no reason to route  
through the HNP since the job family is the same.


Rather than implement all that logic in the dpm, it seems to me that
we would be better served by letting the dpm deal solely with parsing
the port string into its component parts, and then letting the RML or
routed framework figure out what to do with each of those parts. So
the dpm function would, for example, break out the tag and
save/return/ignore it, then break off the proc's and the HNP's URIs
and send them to the RML or routed "define_route_to_proc". That
function would look at the proc to see if it is in our job family - if
so, it would just pass it to the appropriate routed module for
handling. If it is in a different job family, then it sends the HNP
info to its own HNP, passes the info to the routed module, etc.
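
As a rough illustration of the "dpm only parses" half of that split,
here is a sketch that breaks a port string into a proc URI, an HNP URI
and a tag. The "<proc-uri>+<hnp-uri>:<tag>" layout is purely an
assumption made for the example (the real port string is opaque and
owned by the dpm), and all names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

#define SKETCH_URI_MAX 256

/* Hypothetical result of parsing a port string; not an ORTE type. */
typedef struct {
    char proc_uri[SKETCH_URI_MAX];
    char hnp_uri[SKETCH_URI_MAX];
    int  tag;
} sketch_port_t;

/*
 * Parse "<proc-uri>+<hnp-uri>:<tag>" (an assumed layout, see above).
 * Returns 0 on success, -1 on a malformed string.
 */
static int sketch_parse_port(const char *port, sketch_port_t *out)
{
    const char *plus  = strchr(port, '+');
    const char *colon = strrchr(port, ':');  /* URIs contain ':' too */
    char *end = NULL;
    long tag;

    if (NULL == plus || NULL == colon || colon < plus) {
        return -1;
    }

    size_t proc_len = (size_t)(plus - port);
    size_t hnp_len  = (size_t)(colon - plus - 1);
    if (0 == proc_len || 0 == hnp_len ||
        proc_len >= SKETCH_URI_MAX || hnp_len >= SKETCH_URI_MAX) {
        return -1;
    }

    /* The tag must be a non-empty decimal number ending the string. */
    tag = strtol(colon + 1, &end, 10);
    if (end == colon + 1 || '\0' != *end) {
        return -1;
    }

    memcpy(out->proc_uri, port, proc_len);
    out->proc_uri[proc_len] = '\0';
    memcpy(out->hnp_uri, plus + 1, hnp_len);
    out->hnp_uri[hnp_len] = '\0';
    out->tag = (int)tag;
    return 0;
}
```

The caller would then keep or ignore the tag and hand the two URIs to
whatever "define_route_to_proc" ends up being; that second half is
deliberately left out of the sketch.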


Haven't thought it all through yet, but wanted to suggest we think  
about it as we may (per the FT July discussions) need to define routes  
for things other than just DPM-related operations. Perhaps we should  
do some design discussion off-list to see what makes sense?


Thanks
Ralph


Re: [OMPI devel] [OMPI svn] svn:open-mpi r19653

2008-09-28 Thread Aurélien Bouteiller

Ralph,

I just split the existing static function out of the dpm and exposed
it to the outside world. The idea is that the dpm creates the (opaque)
port strings and therefore knows how they are supposed to be
formatted, so it is responsible for parsing them. Second, I split the
parsing and routing into two different functions because sometimes you
might want to parse without creating a route to the target.
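
The shape of that split can be sketched as a module table with two
separate entry points, so a caller can parse a port without installing
a route. This is only an illustration under assumed names: the types,
return codes, the demo module and sketch_lookup are hypothetical and do
not reproduce the real dpm interface.

```c
#include <stddef.h>

#define SKETCH_SUCCESS             0
#define SKETCH_ERR_NOT_SUPPORTED (-1)

/* Hypothetical stand-ins for the process name and RML tag types. */
typedef struct { unsigned jobid, vpid; } sketch_name_t;
typedef int sketch_tag_t;

typedef int (*sketch_parse_port_fn_t)(const char *port_name,
                                      sketch_name_t *rproc,
                                      sketch_tag_t *tag);
typedef int (*sketch_route_to_port_fn_t)(const char *rml_uri,
                                         const sketch_name_t *rproc);

/* Parsing and routing are separate entries in the module table. */
typedef struct {
    sketch_parse_port_fn_t    parse_port;
    sketch_route_to_port_fn_t route_to_port;
} sketch_dpm_module_t;

/* "Null" module: every entry reports that it is not supported. */
static int null_parse_port(const char *port_name, sketch_name_t *rproc,
                           sketch_tag_t *tag)
{
    (void)port_name; (void)rproc; (void)tag;
    return SKETCH_ERR_NOT_SUPPORTED;
}

static int null_route_to_port(const char *rml_uri,
                              const sketch_name_t *rproc)
{
    (void)rml_uri; (void)rproc;
    return SKETCH_ERR_NOT_SUPPORTED;
}

static const sketch_dpm_module_t sketch_null_module = {
    null_parse_port, null_route_to_port
};

/* Demo module: returns fixed values and counts installed routes. */
static int demo_routes_installed = 0;

static int demo_parse_port(const char *port_name, sketch_name_t *rproc,
                           sketch_tag_t *tag)
{
    (void)port_name;  /* a real module would decode the string here */
    rproc->jobid = 1;
    rproc->vpid  = 0;
    *tag = 42;
    return SKETCH_SUCCESS;
}

static int demo_route_to_port(const char *rml_uri,
                              const sketch_name_t *rproc)
{
    (void)rml_uri; (void)rproc;
    demo_routes_installed++;
    return SKETCH_SUCCESS;
}

static const sketch_dpm_module_t sketch_demo_module = {
    demo_parse_port, demo_route_to_port
};

/* Parse always; install a route only when the caller asks for one. */
static int sketch_lookup(const sketch_dpm_module_t *dpm,
                         const char *port, const char *rml_uri,
                         int want_route,
                         sketch_name_t *rproc, sketch_tag_t *tag)
{
    int rc = dpm->parse_port(port, rproc, tag);
    if (SKETCH_SUCCESS != rc || !want_route) {
        return rc;
    }
    return dpm->route_to_port(rml_uri, rproc);
}
```

Calling sketch_lookup with want_route set to 0 is the "parse without
creating a route" case mentioned above; the route counter stays
untouched until a caller explicitly asks for the route.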


I'll check the RML function on Monday to see if it offers similar
functionality. I have no strong religious belief on whether this
should be an RML or a dpm function, so I don't care as long as I have
what I need :]


Thanks for the feedback,
Aurelien


Re: [OMPI devel] [OMPI svn] svn:open-mpi r19653

2008-09-27 Thread Ralph Castain

Yo Aurelien

Regarding the dpm including a "route_to_port" API: this is actually
pretty close to being an exact duplicate of an existing function in
the RML that takes a URI as its input, parses it to separate the proc
name and the contact info, sets the contact info into the OOB, sets
the route to that proc, and returns the proc name to the caller. Take
a look at orte/mca/rml/base/rml_base_contact.c.
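
For readers without the source at hand, the parsing step of such a
function might look roughly like the sketch below. It is not the code
in rml_base_contact.c: the "<jobid>.<vpid>;<contact>" URI layout is an
assumption made for the example, the names are hypothetical, and the
OOB and routing updates are left out entirely.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical process name; not the real ORTE structure. */
typedef struct { unsigned long jobid, vpid; } sketch_name_t;

/*
 * Split "<jobid>.<vpid>;<contact>" (assumed layout) into the process
 * name and a pointer to the contact info inside `uri`.
 * Returns 0 on success, -1 on a malformed URI.
 */
static int sketch_parse_uri(const char *uri, sketch_name_t *name,
                            const char **contact)
{
    const char *semi = strchr(uri, ';');
    int consumed = 0;

    /* Need a separator followed by non-empty contact info. */
    if (NULL == semi || '\0' == semi[1]) {
        return -1;
    }
    /* %n records how many characters the name occupied. */
    if (2 != sscanf(uri, "%lu.%lu%n",
                    &name->jobid, &name->vpid, &consumed)) {
        return -1;
    }
    /* The name must end exactly at the separator. */
    if (uri + consumed != semi) {
        return -1;
    }
    *contact = semi + 1;
    return 0;
}
```

A caller would pass the contact string on to the OOB and then set the
route for the returned name; where that second step lives is what the
rest of this thread is about.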


All we need to do is add the logic to that function so that, if the  
target proc is not in our job family, we update the route and contact  
info in the HNP instead of locally.


This would keep all the "setting_route_to_proc" functionality in one  
place, instead of duplicating it in the dpm, thus making maintenance  
much easier.


Make sense?
Ralph


On Sep 27, 2008, at 7:22 AM, boute...@osl.iu.edu wrote:


Author: bouteill
Date: 2008-09-27 09:22:32 EDT (Sat, 27 Sep 2008)
New Revision: 19653
URL: https://svn.open-mpi.org/trac/ompi/changeset/19653

Log:
Add functions to access the opaque port_string and to add routes to
a remote port. This is useful for FT, but could also prove useful
when considering MPI3 extensions to the MPI2 dynamics.






Text files modified:
  trunk/ompi/mca/dpm/base/base.h              |    3 +
  trunk/ompi/mca/dpm/base/dpm_base_null_fns.c |   12
  trunk/ompi/mca/dpm/base/dpm_base_open.c     |    2
  trunk/ompi/mca/dpm/dpm.h                    |   20 +++
  trunk/ompi/mca/dpm/orte/dpm_orte.c          |  114 +++++--
  5 files changed, 99 insertions(+), 52 deletions(-)

Modified: trunk/ompi/mca/dpm/base/base.h
===================================================================
--- trunk/ompi/mca/dpm/base/base.h	(original)
+++ trunk/ompi/mca/dpm/base/base.h	2008-09-27 09:22:32 EDT (Sat, 27 Sep 2008)
@@ -92,6 +92,9 @@
 int ompi_dpm_base_null_dyn_finalize (void);
 void ompi_dpm_base_null_mark_dyncomm (ompi_communicator_t *comm);
 int ompi_dpm_base_null_open_port(char *port_name, orte_rml_tag_t given_tag);
+int ompi_dpm_base_null_parse_port(char *port_name,
+                                  orte_process_name_t *rproc, orte_rml_tag_t *tag);
+int ompi_dpm_base_null_route_to_port(char *rml_uri, orte_process_name_t *rproc);
 int ompi_dpm_base_null_close_port(char *port_name);
 
 /* useful globals */

Modified: trunk/ompi/mca/dpm/base/dpm_base_null_fns.c
===================================================================
--- trunk/ompi/mca/dpm/base/dpm_base_null_fns.c	(original)
+++ trunk/ompi/mca/dpm/base/dpm_base_null_fns.c	2008-09-27 09:22:32 EDT (Sat, 27 Sep 2008)
@@ -36,6 +36,7 @@
 {
     return OMPI_ERR_NOT_SUPPORTED;
 }
+
 void ompi_dpm_base_null_disconnect(ompi_communicator_t *comm)
 {
     return;
@@ -70,6 +71,17 @@
     return OMPI_ERR_NOT_SUPPORTED;
 }
 
+int ompi_dpm_base_null_parse_port(char *port_name,
+                                  orte_process_name_t *rproc, orte_rml_tag_t *tag)
+{
+    return OMPI_ERR_NOT_SUPPORTED;
+}
+
+int ompi_dpm_base_null_route_to_port(char *rml_uri, orte_process_name_t *rproc)
+{
+    return OMPI_ERR_NOT_SUPPORTED;
+}
+
 int ompi_dpm_base_null_close_port(char *port_name)
 {
     return OMPI_ERR_NOT_SUPPORTED;

Modified: trunk/ompi/mca/dpm/base/dpm_base_open.c
===================================================================
--- trunk/ompi/mca/dpm/base/dpm_base_open.c	(original)
+++ trunk/ompi/mca/dpm/base/dpm_base_open.c	2008-09-27 09:22:32 EDT (Sat, 27 Sep 2008)
@@ -42,6 +42,8 @@
     ompi_dpm_base_null_dyn_finalize,
     ompi_dpm_base_null_mark_dyncomm,
     ompi_dpm_base_null_open_port,
+    ompi_dpm_base_null_parse_port,
+    ompi_dpm_base_null_route_to_port,
     ompi_dpm_base_null_close_port,
     NULL
 };

Modified: trunk/ompi/mca/dpm/dpm.h
===================================================================
--- trunk/ompi/mca/dpm/dpm.h	(original)
+++ trunk/ompi/mca/dpm/dpm.h	2008-09-27 09:22:32 EDT (Sat, 27 Sep 2008)
@@ -58,6 +58,8 @@
 #define OMPI_RML_TAG_DYNAMIC OMPI_RML_TAG_BASE+200
 
 
+
+
 /*
  * Initialize a module
  */
@@ -116,6 +118,20 @@
 typedef int (*ompi_dpm_base_module_open_port_fn_t)(char *port_name, orte_rml_tag_t tag);
 
 /*
+ * Converts an opaque port string to a RML process nane and tag.
+ */
+typedef int (*ompi_dpm_base_module_parse_port_name_t)(char *port_name,
+                                                      orte_process_name_t *rproc,
+                                                      orte_rml_tag_t *tag);
+
+/*
+ * Update the routed component to make sure that the RML can send messages to
+ * the remote port
+ */
+typedef int (*ompi_dpm_base_module_route_to_port_t)(char *rml_uri, orte_process_name_t *rproc);
+
+
+/*
  * Close a port
  */
 typedef int (*ompi_dpm_base_module_close_port_fn_t)(char *port_name);
@@ -145,6 +161,10 @@
ompi_dpm_base_module_mark_dyncomm_fn_t  mark