Re: [PATCH] OpenSM - The DnUp routing algorithm.

2011-04-17 Thread Alex Netes

Hi Ken,

This is the first round of comments.
First of all, I noticed that there is a big amount of code that is almost
identical to UpDn. Have you considered to 'tweak' osm_ucast_updn.c instead of
creating a new file? I'm not sure that it's the best way to go, but it might
reduce the code. The question is whether this approach will keep the code
readable and clean enough.

On 16:04 Wed 23 Mar , Ken Schmidt wrote:
 This routing algorithm operates in a very similar fashion to UpDn, but
 is modified to allow optimal routing on certain network structures in
 which uplinks and CA nodes are connected to the same switch nodes. (For
 example Chinook at EMSL and RoadRunner at LANL.) In these networks the
 optimal paths between nodes connected to a single chassis would remain
 within the chassis.  However due to the uplinks being connected at the
 same level of the network as the CA nodes UpDn will not allow these paths
 to be used for communication between the CA nodes.
 
 DnUp follows the same procedure as UpDn with a few differences.  Ranking
 is based solely on the relative distance from CA nodes, any switch node
 with a CA node directly attached is assigned a rank of 0 any switch
 node without a CA node attached is assigned a rank of one greater than
 the minimum rank of their neighbors. Transitions are also reversed;
 The initial direction is down and only one transition to up
 is allowed.  There is also an option which relaxes this restriction to
 allow communication with switches nodes similar to the functionality of
 connect_roots in UpDn.
 
 ---
  include/opensm/osm_opensm.h |1 +
  include/opensm/osm_subnet.h |5 +
  man/opensm.8.in |   18 ++-
  opensm/Makefile.am  |4 +-
  opensm/main.c   |2 +-
  opensm/osm_opensm.c |6 +
  opensm/osm_subnet.c |   14 ++-
  opensm/osm_ucast_dnup.c |  467 
 +++
  8 files changed, 512 insertions(+), 5 deletions(-)
  create mode 100644 opensm/osm_ucast_dnup.c
 
 diff --git a/include/opensm/osm_opensm.h b/include/opensm/osm_opensm.h
 index 8d63111..3ebf533 100644
 --- a/include/opensm/osm_opensm.h
 +++ b/include/opensm/osm_opensm.h
 @@ -101,6 +101,7 @@ typedef enum _osm_routing_engine_type {
   OSM_ROUTING_ENGINE_TYPE_NONE = 0,
   OSM_ROUTING_ENGINE_TYPE_MINHOP,
   OSM_ROUTING_ENGINE_TYPE_UPDN,
 + OSM_ROUTING_ENGINE_TYPE_DNUP,
   OSM_ROUTING_ENGINE_TYPE_FILE,
   OSM_ROUTING_ENGINE_TYPE_FTREE,
   OSM_ROUTING_ENGINE_TYPE_LASH,
 diff --git a/include/opensm/osm_subnet.h b/include/opensm/osm_subnet.h
 index 42ae416..465f220 100644
 --- a/include/opensm/osm_subnet.h
 +++ b/include/opensm/osm_subnet.h
 @@ -236,6 +236,7 @@ typedef struct osm_subn_opt {
   struct osm_subn_opt *file_opts; /* used for update */
   uint8_t lash_start_vl;  /* starting vl to use in lash */
   uint8_t sm_sl;  /* which SL to use for SM/SA 
 communication */
 + uint8_t prune_weight;
  } osm_subn_opt_t;
  /*
  * FIELDS
 @@ -503,6 +504,10 @@ typedef struct osm_subn_opt {
  *no_clients_rereg
  *When TRUE disables clients reregistration request.
  *
 +*prune_weight
 +*when not zero, add the value to hops which should be
 +*pruned by DnUp to allow a completely connected subnet.
 +*
  * SEE ALSO
  *Subnet object
  */
 diff --git a/man/opensm.8.in b/man/opensm.8.in
 index cd3a24f..58ef291 100644
 --- a/man/opensm.8.in
 +++ b/man/opensm.8.in
 @@ -152,7 +152,7 @@ separated by commas so that specific ordering of routing 
 algorithms
  will be tried if earlier routing engines fail.  If all configured
  routing engines fail, OpenSM will always attempt to route with Min Hop
  unless 'no_fallback' is included in the list of routing engines.
 -Supported engines: minhop, updn, file, ftree, lash, dor, torus-2QoS.
 +Supported engines: minhop, updn, dnup, file, ftree, lash, dor, torus-2QoS.
  .TP
  \fB\-\-do_mesh_analysis\fR
  This option enables additional analysis for the lash routing engine to
 @@ -667,6 +667,10 @@ node, but it is constrained to ranking rules. This 
 algorithm should be chosen
  if the subnet is not a pure Fat Tree, and deadlock may occur due to a
  loop in the subnet.
  
 +3. DNUP Unicast routing algorithm - similar to UPDN but allows routing in
 +fabrics which have some Ca nodes attached closer to the roots than some 
 switch
 +nodes.
 +
  3.  Fat Tree Unicast routing algorithm - this algorithm optimizes routing
  for congestion-free shift communication pattern.
  It should be chosen if a subnet is a symmetrical or almost symmetrical
 @@ -836,6 +840,18 @@ format will be discarded.
  possible to specify CA guids; OpenSM will use the guid of the switch (if
  it exists) that connects the CA to the subnet as a root node.
  
 +Purpose of DNUP Algorithm
 +
 +The DNUP algorithm is designed to serve a similar purpose to UPDN. However
 +it is intended to work in network 

Issues with patchwork?

2011-04-17 Thread Weiny, Ira K.
I just checked patchwork and it looks like those patches which were submitted 
since ~2/25/2011 are no longer there?  This includes 3 patches which were in my 
todo list yesterday.

Does anyone know what the issue is?

Thanks,
Ira

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [opensm] routing segfault + LMC 0 routing bug?

2011-04-17 Thread Alex Netes
On 18:23 Tue 22 Mar , Albert Chu wrote:
 Hey Jim, Alex,
 
 Just hit a segfault on the main tree.  It appears patch 
 
 commit 9ddcf3419eade13bdc0a54f93930c49fe67efd63
 Author: Jim Schutt jasc...@sandia.gov
 Date:   Fri Sep 3 10:43:12 2010 -0600
 
 opensm: Avoid havoc in minhop caused by torus-2QoS persistent use of
 osm_port_t:priv.
 
 segfaults opensm on one of our systems w/ updn routing and lmc  0
 (would likely segfault dor, minhop, and maybe others too).  Our system
 has older switches that do not support enhanced port zero, thus do not
 support LMC  0.  (I imagine setting lmc_esp0 to FALSE, results in the
 same behavior.)  Subsequently even if you set LMC  0 in your opensm
 config file, there can be ports with LMC = 0 and LMC != 0 (e.g. from
 HCAs). Subsequently in alloc_ports_priv(), some ports will have priv set
 to NULL and some will not.  Because of assumptions in osm_switch.c about
 priv != NULL when lmc  0, we hit a segfault.  The issue didn't exist
 before b/c we allocated p_port-priv non-NULL no matter what.
 
 The attached patch fixes the problem w/ updn.  I haven't looked through
 all of the 2Qos code thoroughly to figure out the consequences of this
 change, so I'm just considering this a starting point for discussion.
 

Applied, thanks.

-- Alex
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] opensm: perfmgr, only set orig_lid when we have a valid port. Otherwise leave it as 0

2011-04-17 Thread Alex Netes
Hi Ira,

On 15:17 Fri 15 Apr , Ira Weiny wrote:
 
 Subject: [PATCH] opensm: perfmgr, only set orig_lid when we have a valid 
 port.  Otherwise leave it as 0
 
 
 Signed-off-by: Ira Weiny wei...@llnl.gov
 ---
  opensm/osm_perfmgr.c |4 +++-
  1 files changed, 3 insertions(+), 1 deletions(-)
 
 diff --git a/opensm/osm_perfmgr.c b/opensm/osm_perfmgr.c
 index 6a1fa63..3e1575a 100644
 --- a/opensm/osm_perfmgr.c
 +++ b/opensm/osm_perfmgr.c
 @@ -454,7 +454,9 @@ static void collect_guids(cl_map_item_t * p_map_item, 
 void *context)
 ib_switch_info_is_enhanced_port0(node-sw-
  
 switch_info));
   for (port = mon_node-esp0 ? 0 : 1; port  num_ports; port++) {
 - mon_node-port[port].orig_lid = get_base_lid(node, 
 port);
 + mon_node-port[port].orig_lid = 0;
 + if (osm_physp_is_valid(node-physp_table[port]))
 + mon_node-port[port].orig_lid = 
 get_base_lid(node, port);
   mon_node-port[port].valid = TRUE;

Shouldn't this port marked with mon_node-port[port].valid = FALSE ?


-- Alex
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] opensm: fix strtoull error handling

2011-04-17 Thread Alex Netes
On 15:19 Fri 15 Apr , Ira Weiny wrote:
 Subject: [PATCH] opensm: fix strtoull error handling
 
 The man page states that errno _may_ be returned when the string can not be
 converted.  Check that the entire string was consumed otherwise assume this is
 not a guid.
 
 Signed-off-by: Ira Weiny wei...@llnl.gov
 ---

Applied, thanks.

-- Alex
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] opensm: perfmgr, only set orig_lid when we have a valid port. Otherwise leave it as 0

2011-04-17 Thread Hal Rosenstock
On 4/15/2011 6:17 PM, Ira Weiny wrote:
 
 Subject: [PATCH] opensm: perfmgr, only set orig_lid when we have a valid 
 port.  Otherwise leave it as 0
 
 
 Signed-off-by: Ira Weiny wei...@llnl.gov
 ---
  opensm/osm_perfmgr.c |4 +++-
  1 files changed, 3 insertions(+), 1 deletions(-)
 
 diff --git a/opensm/osm_perfmgr.c b/opensm/osm_perfmgr.c
 index 6a1fa63..3e1575a 100644
 --- a/opensm/osm_perfmgr.c
 +++ b/opensm/osm_perfmgr.c
 @@ -454,7 +454,9 @@ static void collect_guids(cl_map_item_t * p_map_item, 
 void *context)
 ib_switch_info_is_enhanced_port0(node-sw-
  
 switch_info));
   for (port = mon_node-esp0 ? 0 : 1; port  num_ports; port++) {
 - mon_node-port[port].orig_lid = get_base_lid(node, 
 port);
 + mon_node-port[port].orig_lid = 0;
 + if (osm_physp_is_valid(node-physp_table[port]))
 + mon_node-port[port].orig_lid = 
 get_base_lid(node, port);
   mon_node-port[port].valid = TRUE;
   }
  

There are several other calls to get_base_lid. Is it also an issue there
too or is port always valid there ? If it's not always valid for those
cases, why not just move this check into get_base_lid ?
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] opensm: perfmgr, only set orig_lid when we have a valid port. Otherwise leave it as 0

2011-04-17 Thread Weiny, Ira K.
On Apr 17, 2011, at 8:21 AM, Hal Rosenstock wrote:

 On 4/15/2011 6:17 PM, Ira Weiny wrote:
 
 Subject: [PATCH] opensm: perfmgr, only set orig_lid when we have a valid 
 port.  Otherwise leave it as 0
 
 
 Signed-off-by: Ira Weiny wei...@llnl.gov
 ---
 opensm/osm_perfmgr.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)
 
 diff --git a/opensm/osm_perfmgr.c b/opensm/osm_perfmgr.c
 index 6a1fa63..3e1575a 100644
 --- a/opensm/osm_perfmgr.c
 +++ b/opensm/osm_perfmgr.c
 @@ -454,7 +454,9 @@ static void collect_guids(cl_map_item_t * p_map_item, 
 void *context)
ib_switch_info_is_enhanced_port0(node-sw-
 
 switch_info));
  for (port = mon_node-esp0 ? 0 : 1; port  num_ports; port++) {
 -mon_node-port[port].orig_lid = get_base_lid(node, 
 port);
 +mon_node-port[port].orig_lid = 0;
 +if (osm_physp_is_valid(node-physp_table[port]))
 +mon_node-port[port].orig_lid = 
 get_base_lid(node, port);
  mon_node-port[port].valid = TRUE;
  }
 
 
 There are several other calls to get_base_lid. Is it also an issue there
 too or is port always valid there ? If it's not always valid for those
 cases, why not just move this check into get_base_lid ?

I believe this is a special case because we are looping through the ports of a 
CA node and not all of it's ports are valid on this fabric.  Most of the other 
places where get_base_lid is called I believe the port is known to be good, 
therefore there is an assertion in there just to protect us.

I think the better way to fix this would probably be to change the perfmgr to 
track ports rather than nodes.[*]  Unfortunately this will take a significant 
amount of coding effort.  I think the best thing to do is just check if the 
physical port is valid as I did above.

As to Alex's comment I think the port should be marked invalid as well.  (I 
should have CC'ed the list on my response to him.)  I will resend the patch 
with that change.

Ira

[*] At the time I coded the perfmgr it seemed to make more sense to track nodes 
rather than ports.  In hindsight this might have been a mistake but for this 
small place I don't see a reason to re-architect it all yet.

 --
 To unsubscribe from this list: send the line unsubscribe linux-rdma in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] opensm: perfmgr, only set orig_lid when we have a valid port. Otherwise leave it as 0

2011-04-17 Thread Weiny, Ira K.

On Apr 17, 2011, at 8:06 AM, Alex Netes wrote:

 Hi Ira,
 
 On 15:17 Fri 15 Apr , Ira Weiny wrote:
 
 Subject: [PATCH] opensm: perfmgr, only set orig_lid when we have a valid 
 port.  Otherwise leave it as 0
 
 
 Signed-off-by: Ira Weiny wei...@llnl.gov
 ---
 opensm/osm_perfmgr.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)
 
 diff --git a/opensm/osm_perfmgr.c b/opensm/osm_perfmgr.c
 index 6a1fa63..3e1575a 100644
 --- a/opensm/osm_perfmgr.c
 +++ b/opensm/osm_perfmgr.c
 @@ -454,7 +454,9 @@ static void collect_guids(cl_map_item_t * p_map_item, 
 void *context)
ib_switch_info_is_enhanced_port0(node-sw-
 
 switch_info));
  for (port = mon_node-esp0 ? 0 : 1; port  num_ports; port++) {
 -mon_node-port[port].orig_lid = get_base_lid(node, 
 port);
 +mon_node-port[port].orig_lid = 0;
 +if (osm_physp_is_valid(node-physp_table[port]))
 +mon_node-port[port].orig_lid = 
 get_base_lid(node, port);
  mon_node-port[port].valid = TRUE;
 
 Shouldn't this port marked with mon_node-port[port].valid = FALSE ?

New patch below.


Subject: [PATCH 1/5] opensm/perfmgr: set redirect orig_lid and valid flag only 
when we have a valid port.

Signed-off-by: Ira Weiny wei...@llnl.gov
---
 opensm/osm_perfmgr.c |8 ++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/opensm/osm_perfmgr.c b/opensm/osm_perfmgr.c
index 5758587..3578e13 100644
--- a/opensm/osm_perfmgr.c
+++ b/opensm/osm_perfmgr.c
@@ -454,8 +454,12 @@ static void collect_guids(cl_map_item_t * p_map_item, void 
*context)
  ib_switch_info_is_enhanced_port0(node-sw-
   
switch_info));
for (port = mon_node-esp0 ? 0 : 1; port  num_ports; port++) {
-   mon_node-port[port].orig_lid = get_base_lid(node, 
port);
-   mon_node-port[port].valid = TRUE;
+   mon_node-port[port].orig_lid = 0;
+   mon_node-port[port].valid = FALSE;
+   if (osm_physp_is_valid(node-physp_table[port])) {
+   mon_node-port[port].orig_lid = 
get_base_lid(node, port);
+   mon_node-port[port].valid = TRUE;
+   }
}
 
cl_qmap_insert(pm-monitored_map, node_guid,
-- 
1.7.1


--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html