Re: [EXTERNAL] [PATCH opensm] osm_sa_path_record.c: path_sl may return SL different from requested SL

2013-12-11 Thread Jim Schutt
On 12/11/2013 11:36 AM, Hal Rosenstock wrote:
 From: Vladimir Koushnir vladim...@mellanox.com

Nice catch!

Acked-by: Jim Schutt jasc...@sandia.gov

 
 Signed-off-by: Vladimir Koushnir vladim...@mellanox.com
 Signed-off-by: Hal Rosenstock h...@mellanox.com
 ---
  opensm/osm_sa_path_record.c |   20 +++-
  1 files changed, 19 insertions(+), 1 deletions(-)
 
 diff --git a/opensm/osm_sa_path_record.c b/opensm/osm_sa_path_record.c
 index d2ff93b..8384ece 100644
 --- a/opensm/osm_sa_path_record.c
 +++ b/opensm/osm_sa_path_record.c
 @@ -839,10 +839,28 @@ static ib_api_status_t pr_rcv_get_path_parms(IN osm_sa_t * sa,
  	 * send the currently computed SL value as a hint and let the routing
  	 * engine override it.
  	 */
 -	if (p_re && p_re->path_sl)
 +	if (p_re && p_re->path_sl) {
 +		uint8_t pr_sl;
 +		pr_sl = sl;
 +
  		sl = p_re->path_sl(p_re->context, sl,
  				   cl_hton16(src_lid_ho), cl_hton16(dest_lid_ho));
  
 +		if ((comp_mask & IB_PR_COMPMASK_SL) && (sl != pr_sl)) {
 +			OSM_LOG(sa->p_log, OSM_LOG_ERROR, "ERR 1F2A: "
 +				"Requested SL (%u) doesn't match SL calculated"
 +				"by routing engine (%u) "
 +				"[%s port %d <-> %s port %d]\n",
 +				pr_sl,
 +				sl,
 +				p_src_alias_guid->p_base_port->p_node->print_desc,
 +				p_src_alias_guid->p_base_port->p_physp->port_num,
 +				p_dest_alias_guid->p_base_port->p_node->print_desc,
 +				p_dest_alias_guid->p_base_port->p_physp->port_num);
 +			status = IB_NOT_FOUND;
 +			goto Exit;
 +		}
 +	}
  	/* reset pkey when raw traffic */
  	if (comp_mask & IB_PR_COMPMASK_RAWTRAFFIC &&
  	    cl_ntoh32(p_pr->hop_flow_raw) & (1 << 31))
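
The rule this patch enforces can be sketched outside OpenSM. The following is a minimal illustration, not the OpenSM code: `COMPMASK_SL`, `path_sl_fn`, `resolve_sl` and `always_sl3` are hypothetical stand-ins for the component-mask bit, the routing engine's `path_sl` callback and the surrounding logic. It shows that an SL pinned by the requester is rejected when the engine answers with a different one, while an unpinned SL is treated as a hint.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the IB_PR_COMPMASK_SL component-mask bit. */
#define COMPMASK_SL (1u << 0)

/* Hypothetical routing-engine callback: takes the SL hint, returns its SL. */
typedef uint8_t (*path_sl_fn)(uint8_t sl_hint);

/*
 * Returns true and stores the SL to use in *out_sl. Returns false when the
 * requester pinned an SL (COMPMASK_SL set) and the engine chose another;
 * the patch reports IB_NOT_FOUND in that situation.
 */
static bool resolve_sl(uint32_t comp_mask, uint8_t requested_sl,
		       path_sl_fn engine, uint8_t *out_sl)
{
	uint8_t sl = requested_sl;

	assert(out_sl != NULL);
	if (engine) {
		sl = engine(requested_sl);
		if ((comp_mask & COMPMASK_SL) && sl != requested_sl)
			return false;
	}
	*out_sl = sl;
	return true;
}

/* Example engine that always answers SL 3, whatever the hint is. */
static uint8_t always_sl3(uint8_t sl_hint)
{
	(void)sl_hint;
	return 3;
}
```

With `always_sl3` as the engine, a request that does not set `COMPMASK_SL` simply receives SL 3, whereas a request that pins SL 1 fails.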
 

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [EXTERNAL] [PATCH opensm] Remove unused lid matrix calculation in Torus_2Qos routing

2013-11-13 Thread Jim Schutt
On 11/12/2013 04:18 AM, Hal Rosenstock wrote:

Acked-by: Jim Schutt jasc...@sandia.gov

 
 From: Vladimir Koushnir vladim...@mellanox.com
 
 Signed-off-by: Vladimir Koushnir vladim...@mellanox.com
 Signed-off-by: Hal Rosenstock h...@mellanox.com
 ---
 diff --git a/include/opensm/osm_ucast_mgr.h b/include/opensm/osm_ucast_mgr.h
 index c534b7e..b9c1ca1 100644
 --- a/include/opensm/osm_ucast_mgr.h
 +++ b/include/opensm/osm_ucast_mgr.h
 @@ -296,5 +296,7 @@ int osm_ucast_mgr_process(IN osm_ucast_mgr_t * p_mgr);
  * SEE ALSO
  *Unicast Manager, Node Info Response Controller
  */
 +
 +int ucast_dummy_build_lid_matrices(void *context);
  END_C_DECLS
  #endif   /* _OSM_UCAST_MGR_H_ */
 diff --git a/opensm/osm_torus.c b/opensm/osm_torus.c
 index 8139f1a..71753cf 100644
 --- a/opensm/osm_torus.c
 +++ b/opensm/osm_torus.c
 @@ -9550,6 +9550,7 @@ int osm_ucast_torus2QoS_setup(struct osm_routing_engine *r,
  
  	r->context = ctx;
  	r->ucast_build_fwd_tables = torus_build_lfts;
 +	r->build_lid_matrices = ucast_dummy_build_lid_matrices;
  	r->update_sl2vl = torus_update_osm_sl2vl;
  	r->update_vlarb = torus_update_osm_vlarb;
  	r->path_sl = torus_path_sl;
 diff --git a/opensm/osm_ucast_mgr.c b/opensm/osm_ucast_mgr.c
 index 6384362..9ef7947 100644
 --- a/opensm/osm_ucast_mgr.c
 +++ b/opensm/osm_ucast_mgr.c
 @@ -1182,3 +1182,8 @@ int osm_ucast_dor_setup(struct osm_routing_engine *r, osm_opensm_t * osm)
  	r->ucast_build_fwd_tables = ucast_dor_build_lfts;
   return 0;
  }
 +
 +int ucast_dummy_build_lid_matrices(void *context)
 +{
 + return 0;
 +}
 
 



Re: [PATCH] opensm/osm_torus.c: In dump_torus, make sure switch is present before dumping

2013-02-04 Thread Jim Schutt
On 02/04/2013 02:36 AM, Alex Netes wrote:
 Fix segmentation fault in osm_torus.c.
 
 Signed-off-by: Hal Rosenstock h...@mellanox.com
 Signed-off-by: Alex Netes ale...@mellanox.com

Acked-by: Jim Schutt jasc...@sandia.gov



Re: [PATCH] opensm/osm_torus.c: torus routing should fail with VLCap 1 on switch external ports

2013-02-04 Thread Jim Schutt
On 02/04/2013 02:36 AM, Alex Netes wrote:
 Signed-off-by: Hal Rosenstock h...@mellanox.com
 Signed-off-by: Alex Netes ale...@mellanox.com

Acked-by: Jim Schutt jasc...@sandia.gov



Re: [PATCH] opensm/osm_torus.c: Fix crash in torus_update_osm_vlarb

2012-12-03 Thread Jim Schutt

On 12/03/2012 08:26 AM, Hal Rosenstock wrote:


Signed-off-by: Alex Netes ale...@mellanox.com
Signed-off-by: Hal Rosenstock h...@mellanox.com
---
  opensm/osm_torus.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/opensm/osm_torus.c b/opensm/osm_torus.c
index c06f8d4..075f84a 100644
--- a/opensm/osm_torus.c
+++ b/opensm/osm_torus.c
@@ -8089,7 +8089,7 @@ void torus_update_osm_vlarb(void *context, osm_physp_t *osm_phys_port,
 	 * So, leave VL 0 alone, remap VL 4 to VL 1, zero out the rest,
 	 * and compress out the zero entries to the end.
 	 */
-	if (!sw || !port_num ||
+	if (!sw || !port_num || sw->port[port_num] ||
 	    sw->port[port_num]->pgrp->port_grp != 2 * TORUS_MAX_DIM)
 		return;



With the patch as-is, torus_update_osm_vlarb() returns early
for any non-NULL switch port, so it will never do any updates.

If the crash was that sw->port[port_num] was NULL,
shouldn't the check be !sw->port[port_num] ?

Can you tell me more about the test case that leads to the crash?

Is it that there's a switch with a port that's not connected
to anything, and torus_update_osm_vlarb() was called for it?

Testing for a non-NULL sw->port[port_num] is definitely the right
thing to do to handle that case, and I'm sorry I missed it earlier.

If not, then something else is likely broken, and we need to find
and fix that.
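
The difference between the two guards can be reproduced with a small stand-alone sketch. The structure definitions, `MAX_PORTS`, `WANTED_GRP`, `skip_v1` and `skip_v2` below are hypothetical simplifications introduced for illustration; only the shape of the `if` condition is taken from the patch. `skip_v1` mirrors the posted condition and reports "return early" for every non-NULL port, while `skip_v2` mirrors the suggested `!sw->port[port_num]` form.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PORTS  4	/* hypothetical, for this sketch only */
#define WANTED_GRP 6	/* plays the role of 2 * TORUS_MAX_DIM */

struct port_grp { unsigned port_grp; };
struct port { struct port_grp *pgrp; };
struct sw { struct port *port[MAX_PORTS]; };

/*
 * Guard as first posted: true (skip) whenever the port pointer is non-NULL.
 * Do not call this with a NULL port entry: the short-circuit falls through
 * to the dereference in that case.
 */
static bool skip_v1(const struct sw *sw, unsigned port_num)
{
	return !sw || !port_num || sw->port[port_num] ||
	       sw->port[port_num]->pgrp->port_grp != WANTED_GRP;
}

/* Guard with the negation: a NULL port is skipped before any dereference. */
static bool skip_v2(const struct sw *sw, unsigned port_num)
{
	assert(port_num < MAX_PORTS);
	return !sw || !port_num || !sw->port[port_num] ||
	       sw->port[port_num]->pgrp->port_grp != WANTED_GRP;
}
```

For a connected port in the wanted group, `skip_v1` returns true and `skip_v2` returns false; for an unconnected (NULL) port, only `skip_v2` is safe to evaluate.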

-- Jim



Re: [PATCHv2] opensm/osm_torus.c: Fix crash in torus_update_osm_vlarb

2012-12-03 Thread Jim Schutt

On 12/03/2012 02:25 PM, Hal Rosenstock wrote:


Signed-off-by: Alex Netes ale...@mellanox.com
Signed-off-by: Hal Rosenstock h...@mellanox.com


Acked-by: Jim Schutt jasc...@sandia.gov


---
Change since v1:
Fixed NULL pointer check on sw->port[port_num]
Pointed out by: Jim Schutt jasc...@sandia.gov

  opensm/osm_torus.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/opensm/osm_torus.c b/opensm/osm_torus.c
index c06f8d4..075f84a 100644
--- a/opensm/osm_torus.c
+++ b/opensm/osm_torus.c
@@ -8089,7 +8089,7 @@ void torus_update_osm_vlarb(void *context, osm_physp_t *osm_phys_port,
 	 * So, leave VL 0 alone, remap VL 4 to VL 1, zero out the rest,
 	 * and compress out the zero entries to the end.
 	 */
-	if (!sw || !port_num ||
+	if (!sw || !port_num || !sw->port[port_num] ||
 	    sw->port[port_num]->pgrp->port_grp != 2 * TORUS_MAX_DIM)
 		return;






Re: [PATCH 1v2] opensm: fixed port order configuration in torus routing engine

2011-06-01 Thread Jim Schutt

Alex Netes wrote:

Commit 1c2a298b295eba7e24205519abc24e47106d15df broke port order
configuration for torus routing engine. order was incorrectly initiated,
causing setting LFTs to fail.

Signed-off-by: Alex Netes ale...@mellanox.com


Acked-by: Jim Schutt jasc...@sandia.gov

Also, while reviewing this I noticed a couple things in the
port-order patch that I should have noticed earlier, but
didn't.  I have a couple of minor fixup patches that go on
top of this one, to send along in a minute.

-- Jim



[PATCH 1/2] opensm: fail if configured torus port order references a port not available in all switches

2011-06-01 Thread Jim Schutt

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/osm_torus.c |   24 
 1 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/opensm/osm_torus.c b/opensm/osm_torus.c
index 29c1bb4..47654ce 100644
--- a/opensm/osm_torus.c
+++ b/opensm/osm_torus.c
@@ -856,9 +856,6 @@ bool parse_port(unsigned *pnum, const char *parse_sep)
 	if (!val)
 		return false;
 	*pnum = strtoul(val, &nextchar, 0);
-	if (*pnum > IB_NODE_NUM_PORTS_MAX) {
-		*pnum = 0;
-	}
 	return true;
 }
 
@@ -7018,7 +7015,8 @@ static
 bool verify_setup(struct torus *t, struct fabric *f)
 {
 	struct coord_dirs *o;
-	unsigned n = 0;
+	struct f_switch *sw;
+	unsigned p, s, n = 0;
 	bool success = false;
 	bool all_sw_present, need_seed = true;
 
@@ -7044,6 +7042,24 @@ bool verify_setup(struct torus *t, struct fabric *f)
 			"with two QoS levels (have %d need 8)\n",
 			(int)t->osm->subn.min_data_vls);
 	/*
+	 * Be sure all the switches in the torus support the port
+	 * ordering that might have been configured.
+	 */
+	for (s = 0; s < f->switch_cnt; s++) {
+		sw = f->sw[s];
+		for (p = 0; p < sw->port_cnt; p++) {
+			if (t->port_order[p] >= sw->port_cnt) {
+				OSM_LOG(&t->osm->log, OSM_LOG_ERROR,
+					"Error: port_order configured using "
+					"port %u, but only %u ports in "
+					"switch w/ GUID 0x%04"PRIx64"\n",
+					t->port_order[p],  sw->port_cnt - 1,
+					cl_ntoh64(sw->n_id));
+				goto out;
+			}
+		}
+	}
+	/*
 	 * Unfortunately, there is a problem with non-unique topology for any
 	 * torus dimension which has radix four.  This problem requires extra
 	 * input, in the form of specifying both the positive and negative
-- 
1.6.2.2




Re: opensm: fixed port order configuration in torus routing engine

2011-05-31 Thread Jim Schutt

Hi Alex,

Alex Netes wrote:

Commit 1c2a298b295eba7e24205519abc24e47106d15df broke port order
configuration for torus routing engine. order was incorrectly initiated,
causing setting LFTs to fail.

Signed-off-by: Alex Netes ale...@mellanox.com
---
 opensm/osm_torus.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/opensm/osm_torus.c b/opensm/osm_torus.c
index cd3d490..75724d2 100644
--- a/opensm/osm_torus.c
+++ b/opensm/osm_torus.c
@@ -8484,7 +8484,7 @@ bool torus_lft(struct torus *t, struct t_switch *sw)
 	struct port_grp *pgrp;
 	struct t_switch *dsw;
 	osm_switch_t *osm_sw;
-	unsigned order[IB_NODE_NUM_PORTS_MAX+1];
+	unsigned char order[IB_NODE_NUM_PORTS_MAX+1];
 
 	if (!(sw->osm_switch && sw->osm_switch->priv == sw)) {
 		OSM_LOG(&t->osm->log, OSM_LOG_ERROR,
@@ -8506,7 +8506,7 @@ bool torus_lft(struct torus *t, struct t_switch *sw)
 
 		for (p = 0; p < ARRAY_SIZE(order); p++) {
 
-			unsigned px = order[t->port_order[p]];
+			unsigned char px = order[t->port_order[p]];
 
 			if (px == IB_INVALID_PORT_NUM)
 				continue;


I guess the memset(order, IB_INVALID_PORT_NUM, sizeof(order))
a few lines up does the wrong thing without your fix,
since we compare here with IB_INVALID_PORT_NUM?

Nice catch - I'm sorry I missed it.

FWIW, ib_types.h uses uint8_t for ports - maybe should do
that here as well rather than unsigned char?
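
The memset() behaviour guessed at above can be checked with a short stand-alone sketch. It assumes IB_INVALID_PORT_NUM is 0xFF; `INVALID_PORT`, `NPORTS` and both function names are hypothetical and exist only for this illustration. Because memset() fills bytes, each `unsigned` element ends up all-ones rather than 0xFF, so the comparison never matches; with one-byte elements it always does.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define INVALID_PORT 0xFF	/* assumed value of IB_INVALID_PORT_NUM */
#define NPORTS 8		/* hypothetical table size for this sketch */

/* Entries equal to INVALID_PORT after memset() on an array of unsigned. */
static unsigned invalid_entries_unsigned(void)
{
	unsigned order[NPORTS];
	unsigned i, n = 0;

	/* memset() writes the byte 0xFF into every byte of every element. */
	memset(order, INVALID_PORT, sizeof(order));
	for (i = 0; i < NPORTS; i++)
		if (order[i] == INVALID_PORT)	/* element is all-ones, not 0xFF */
			n++;
	return n;
}

/* Same test with one-byte elements, as in the fix under review. */
static unsigned invalid_entries_uint8(void)
{
	uint8_t order[NPORTS];
	unsigned i, n = 0;

	assert(sizeof(order) == NPORTS);
	memset(order, INVALID_PORT, sizeof(order));
	for (i = 0; i < NPORTS; i++)
		if (order[i] == INVALID_PORT)
			n++;
	return n;
}
```

The first function finds no "invalid" entries at all, which is consistent with the LFT setup failing before the fix; the second finds all of them.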

Acked-by: Jim Schutt jasc...@sandia.gov

-- Jim



Re: [PATCH] mlx4: allow for 4K mtu configuration of IB ports

2011-05-26 Thread Jim Schutt

Or Gerlitz wrote:

Roland Dreier wrote:

Bob Pearson rpear...@systemfabricworks.com wrote:
With lash+mesh redsky required 6-7 VLs to wire up without deadlocks.
I think that Jim's version uses 8 SLs but only 2 VLs to work.
If someone was using a torus and also wanted to support QOS and also
wanted to separate multicast and management on a separate VL to be
absolutely sure that there is no possibility of a deadlock you might
end up with #QOS * 2 + 1 which would be 5 using the current algorithm.



But again you don't need all those VLs on the HCAs' links, do you?


Jason Gunthorpe wrote:

Routing algorithms only need VLs on interswitch links, not on HCA to
switch links. The only use of the HCA to switch VLs is for QoS. Mesh
topologies can usually be routed with only two VLs, but you need a lot
of SLs to make that work.


Bob, Jim, Alex

I wasn't sure whether the SL-to-VL mapping done by OpenSM is dictated by
the directives in the user config file, or whether the routing engine
itself is VL aware. If the latter, do interswitch links use a different
mapping than HCA - switch links?


FWIW, the torus-2QoS routing engine uses VL bit 0 for torus deadlock
avoidance, VL bit 1 to route around a missing switch without deadlocks,
and VL bit 2 to provide two QoS levels.  It needs the port dependence
of the SL2VL maps to do this in switches.

The interswitch and HCAs use the same mapping, but only VL bit 2
is needed on HCAs, to provide the QoS levels.

I chose that bit usage because it seemed the proper ordering of
capabilities if there are fewer than 8 data VLs available - basic
deadlock avoidance is most important; some QoS is nice to have but
not that useful if the fabric can deadlock.
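
The bit assignment described above can be written down as a small sketch. The macro and function names are hypothetical, and the `usable_vl_bits` helper is only one reading of the "proper ordering of capabilities" remark, assuming the number of data VLs is a power of two; it is not taken from osm_torus.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit roles as described in the message above; the names are hypothetical. */
#define VL_BIT_DEADLOCK 0x1u	/* VL bit 0: torus deadlock avoidance */
#define VL_BIT_DETOUR   0x2u	/* VL bit 1: routing around a missing switch */
#define VL_BIT_QOS      0x4u	/* VL bit 2: which of the two QoS levels */

/* Build a VL number from the three independent decisions. */
static unsigned compose_vl(bool deadlock_bit, bool detour_bit, bool qos_bit)
{
	unsigned vl = 0;

	if (deadlock_bit)
		vl |= VL_BIT_DEADLOCK;
	if (detour_bit)
		vl |= VL_BIT_DETOUR;
	if (qos_bit)
		vl |= VL_BIT_QOS;
	return vl;
}

/*
 * VL bits that still fit when only data_vls data VLs are available
 * (data_vls assumed to be 1, 2, 4 or 8). Low bits survive first, so
 * deadlock avoidance is kept longest and QoS is given up first.
 */
static unsigned usable_vl_bits(unsigned data_vls)
{
	assert(data_vls == 1 || data_vls == 2 || data_vls == 4 || data_vls == 8);
	return data_vls - 1;
}
```

Under these assumptions, eight data VLs leave all three bits usable, four leave deadlock avoidance plus the detour bit, and two leave deadlock avoidance only.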

Is that what you were asking, at least WRT. torus-2QoS?

-- Jim



Or.







Re: [PATCHv2] OpenSM torus routing order list

2011-05-02 Thread Jim Schutt

Alex Netes wrote:

Enables to define list of switch ports so the SM will
go over this list when creating a routing.
It helps balancing links load on some communication patterns
where multiple links connect between the switches.

Signed-off-by: David McMillen da...@systemfabricworks.com
Signed-off-by: Alex Netes ale...@mellanox.com


Acked-by: Jim Schutt jasc...@sandia.gov


---
Changes since v1:
Added detailed description in man/torus-2QoS.conf.5.in
Added error messages on configuration file parsing
Fixed some styling
---
 man/torus-2QoS.conf.5.in |   26 ++-
 opensm/osm_torus.c   |   76 --
 2 files changed, 97 insertions(+), 5 deletions(-)




[PATCH] infiniband-diags/ibnetdiscover: Make --ports option use remap_node_name()

2011-04-25 Thread Jim Schutt
From: Marcus Epperson mrep...@sandia.gov

Signed-off-by: Marcus Epperson mrep...@sandia.gov
Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 src/ibnetdiscover.c |   20 +++-
 1 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/src/ibnetdiscover.c b/src/ibnetdiscover.c
index e081a36..fdbe3b5 100644
--- a/src/ibnetdiscover.c
+++ b/src/ibnetdiscover.c
@@ -574,6 +574,8 @@ void dump_ports_report(ibnd_node_t * node, void *user_data)
 {
 	int p = 0;
 	ibnd_port_t *port = NULL;
+	char *nodename = NULL;
+	char *rem_nodename = NULL;
 
 	/* for each port */
 	for (p = node->numports, port = node->ports[p]; p > 0;
@@ -585,6 +587,9 @@ void dump_ports_report(ibnd_node_t * node, void *user_data)
 		    mad_get_field(port->info, 0, IB_PORT_LINK_WIDTH_ACTIVE_F);
 		ispeed =
 		    mad_get_field(port->info, 0, IB_PORT_LINK_SPEED_ACTIVE_F);
+		nodename = remap_node_name(node_name_map,
+					   port->node->guid,
+					   port->node->nodedesc);
 		fprintf(stdout, "%2s %5d %2d 0x%016" PRIx64 " %s %s",
 			ports_nt_str_compat(node),
 			node->type ==
@@ -592,7 +597,10 @@ void dump_ports_report(ibnd_node_t * node, void *user_data)
 			port->portnum, port->guid,
 			dump_linkwidth_compat(iwidth),
 			dump_linkspeed_compat(ispeed));
-		if (port->remoteport)
+		if (port->remoteport) {
+			rem_nodename = remap_node_name(node_name_map,
+					port->remoteport->node->guid,
+					port->remoteport->node->nodedesc);
 			fprintf(stdout,
 				" - %2s %5d %2d 0x%016" PRIx64
 				" ( '%s' - '%s' )\n",
@@ -601,10 +609,12 @@ void dump_ports_report(ibnd_node_t * node, void *user_data)
 				port->remoteport->node->smalid :
 				port->remoteport->base_lid,
 				port->remoteport->portnum,
-				port->remoteport->guid, port->node->nodedesc,
-				port->remoteport->node->nodedesc);
-		else
-			fprintf(stdout, "%36s'%s'\n", "", port->node->nodedesc);
+				port->remoteport->guid, nodename, rem_nodename);
+			free(rem_nodename);
+		} else
+			fprintf(stdout, "%36s'%s'\n", "", nodename);
+
+		free(nodename);
 	}
 }
 
-- 
1.6.2.2




opensm: switch incorrectly reports IB_PORT_CAP_HAS_MCAST_FDB_TOP ?

2011-04-22 Thread Jim Schutt

Hi,

I've been testing the current opensm development head
(commit 83b67527d16 from git://git.openfabrics.org/~alexnetes/opensm),
and I've been getting some messages that are new since version 3.3.7:

Apr 22 12:08:09 646534 [411CD940] 0x01 - log_rcv_cb_error: ERR 3111: Received MAD with error status = 0x1C
SubnGetResp(SwitchInfo), attr_mod 0x0, TID 0x4802
Initial path: 0,1,1,4 Return path: 0,20,1,7

I get one of these messages for each switch in my fabric, on every
heavy sweep.

It appears these are caused by my switches incorrectly reporting
the capability IB_PORT_CAP_HAS_MCAST_FDB_TOP; i.e. this patch stops
the messages:

diff --git a/opensm/osm_mcast_mgr.c b/opensm/osm_mcast_mgr.c
index ea52bfe..63d2968 100644
--- a/opensm/osm_mcast_mgr.c
+++ b/opensm/osm_mcast_mgr.c
@@ -1041,7 +1041,7 @@ static void mcast_mgr_set_mfttop(IN osm_sm_t * sm, IN osm_switch_t * p_sw)
 	p_path = osm_physp_get_dr_path_ptr(p_physp);
 	p_tbl = osm_switch_get_mcast_tbl_ptr(p_sw);

-	if (p_physp->port_info.capability_mask & IB_PORT_CAP_HAS_MCAST_FDB_TOP) {
+	if (0 && p_physp->port_info.capability_mask & IB_PORT_CAP_HAS_MCAST_FDB_TOP) {
 		/*
 		   Set the top of the multicast forwarding table.
 		 */

IB_PORT_CAP_HAS_MCAST_FDB_TOP is bit 30 of the port capability mask,
which in at least IBA v1.2.1 was a reserved bit but apparently is
not anymore.

Should I file a bug report with my switch vendor about setting
a port capability bit for a capability they don't support, or
is there something else going on that I haven't figured out yet?

FWIW I think my switches have a base SP0; maybe it's got something
to do with that?

Thanks -- Jim



Re: opensm: switch incorrectly reports IB_PORT_CAP_HAS_MCAST_FDB_TOP ?

2011-04-22 Thread Jim Schutt

Weiny, Ira K. wrote:

On Apr 22, 2011, at 11:19 AM, Jim Schutt wrote:


Hi,

I've been testing the current opensm development head
(commit 83b67527d16 from git://git.openfabrics.org/~alexnetes/opensm),
and I've been getting some messages that are new since version 3.3.7:

Apr 22 12:08:09 646534 [411CD940] 0x01 - log_rcv_cb_error: ERR 3111: Received MAD with error status = 0x1C
SubnGetResp(SwitchInfo), attr_mod 0x0, TID 0x4802
Initial path: 0,1,1,4 Return path: 0,20,1,7

I get one of these messages for each switch in my fabric, on every
heavy sweep.

It appears these are caused by my switches incorrectly reporting
the capability IB_PORT_CAP_HAS_MCAST_FDB_TOP; i.e. this patch stops
the messages:

diff --git a/opensm/osm_mcast_mgr.c b/opensm/osm_mcast_mgr.c
index ea52bfe..63d2968 100644
--- a/opensm/osm_mcast_mgr.c
+++ b/opensm/osm_mcast_mgr.c
@@ -1041,7 +1041,7 @@ static void mcast_mgr_set_mfttop(IN osm_sm_t * sm, IN osm_switch_t * p_sw)
 	p_path = osm_physp_get_dr_path_ptr(p_physp);
 	p_tbl = osm_switch_get_mcast_tbl_ptr(p_sw);

-	if (p_physp->port_info.capability_mask & IB_PORT_CAP_HAS_MCAST_FDB_TOP) {
+	if (0 && p_physp->port_info.capability_mask & IB_PORT_CAP_HAS_MCAST_FDB_TOP) {
 		/*
 		   Set the top of the multicast forwarding table.
 		 */

IB_PORT_CAP_HAS_MCAST_FDB_TOP is bit 30 of the port capability mask,
which in at least IBA v1.2.1 was a reserved bit but apparently is
not anymore.


Yes these have been published as errata to the 1.2.1 specification.

smpquery portinfo <lid>

should show you if it is reporting that field.  Also what does

smpquery switchinfo <lid>

say?


# smpquery --version
smpquery BUILD VERSION: 1.5.8_f0526f4 Build date: Apr 22 2011 12:36:58

# smpquery -G switchinfo 0x21283a87200040
# Switch info: Lid 3
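LinearFdbCap:....................49152
RandomFdbCap:....................0
McastFdbCap:.....................4096
LinearFdbTop:....................105
DefPort:.........................0
DefMcastPrimPort:................255
DefMcastNotPrimPort:.............255
LifeTime:........................18
StateChange:.....................0
OptSLtoVLMapping:................1
LidsPerPort:.....................0
PartEnforceCap:..................32
InboundPartEnf:..................1
OutboundPartEnf:.................1
FilterRawInbound:................1
FilterRawOutbound:...............1
EnhancedPort0:...................0
MulticastFDBTop:.................0x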

# smpquery portinfo 3
# Port info: Lid 3 port 0
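Mkey:............................0x
GidPrefix:.......................0xfe80
Lid:.............................3
SMLid:...........................48
CapMask:.........................0x42500848
				IsTrapSupported
				IsSLMappingSupported
				IsSystemImageGUIDsupported
				IsVendorClassSupported
				IsCapabilityMaskNoticeSupported
				IsClientRegistrationSupported
				IsMulticastFDBTopSupported
DiagCode:........................0x
MkeyLeasePeriod:.................0
LocalPort:.......................20
LinkWidthEnabled:................1X or 4X
LinkWidthSupported:..............1X or 4X
LinkWidthActive:.................4X
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkState:.......................Active
PhysLinkState:...................LinkUp
LinkDownDefState:................Polling
ProtectBits:.....................0
LMC:.............................0
LinkSpeedActive:.................10.0 Gbps
LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
NeighborMTU:.....................4096
SMSL:............................0
VLCap:...........................VL0-3
InitType:........................0x00
VLHighLimit:.....................0
VLArbHighCap:....................0
VLArbLowCap:.....................0
InitReply:.......................0x00
MtuCap:..........................4096
VLStallCount:....................0
HoqLife:.........................0
OperVLs:.........................VL0-3
PartEnforceInb:..................0
PartEnforceOutb:.................0
FilterRawInb:....................0
FilterRawOutb:...................0
MkeyViolations:..................0
PkeyViolations:..................0
QkeyViolations:..................0
GuidCap:.........................1
ClientReregister:................0
McastPkeyTrapSuppressionEnabled:.0
SubnetTimeout:...................18
RespTimeVal:.....................19
LocalPhysErr:....................0
OverrunErr:......................0
MaxCreditHint:...................0
RoundTrip:.......................0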
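
The CapMask value in this output can be decoded by hand to confirm that the switch really advertises bit 30. The sketch below treats the mask as the host-order number smpquery prints; `MCAST_FDB_TOP_BIT`, `cap_bit_is_set` and `cap_bits_count` are hypothetical names used only here, not OpenSM or libibmad identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MCAST_FDB_TOP_BIT 30	/* bit position named in this thread */

/* True when the given bit is set in a host-order capability mask. */
static bool cap_bit_is_set(uint32_t cap_mask, unsigned bit)
{
	assert(bit < 32);
	return ((cap_mask >> bit) & 1u) != 0;
}

/* Number of capability bits set in the mask. */
static unsigned cap_bits_count(uint32_t cap_mask)
{
	unsigned n = 0;

	for (; cap_mask; cap_mask >>= 1)
		n += cap_mask & 1u;
	return n;
}
```

For 0x42500848, bit 30 is set and seven bits are set in total, which matches the seven capability names listed under CapMask.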

-- Jim



Ira


Should I file a bug report with my switch vendor about setting
a port capability bit for a capability they don't support, or
is there something else going on that I haven't figured out yet

Re: [PATCH] opensm: fixed segfault when enable qos on fabric with no switches

2011-04-13 Thread Jim Schutt

Alex Netes wrote:

Signed-off-by: Alex Netes ale...@mellanox.com
---
 opensm/osm_qos.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)


Acked-by: Jim Schutt jasc...@sandia.gov



Re: [opensm] routing segfault + LMC > 0 routing bug?

2011-03-23 Thread Jim Schutt

Hi Al,

Albert Chu wrote:

Hey Jim, Alex,

Just hit a segfault on the main tree.  It appears patch 


commit 9ddcf3419eade13bdc0a54f93930c49fe67efd63
Author: Jim Schutt jasc...@sandia.gov
Date:   Fri Sep 3 10:43:12 2010 -0600

opensm: Avoid havoc in minhop caused by torus-2QoS persistent use of
osm_port_t:priv.

segfaults opensm on one of our systems w/ updn routing and lmc > 0
(would likely segfault dor, minhop, and maybe others too).  Our system
has older switches that do not support enhanced port zero, thus do not
support LMC > 0.  (I imagine setting lmc_esp0 to FALSE, results in the
same behavior.)  Subsequently even if you set LMC > 0 in your opensm
config file, there can be ports with LMC = 0 and LMC != 0 (e.g. from
HCAs). Subsequently in alloc_ports_priv(), some ports will have priv set
to NULL and some will not.  Because of assumptions in osm_switch.c about
priv != NULL when lmc > 0, we hit a segfault.  The issue didn't exist
before b/c we allocated p_port->priv non-NULL no matter what.


OK, I think I see.  But this segfault can only occur in
the case where LMC is configured > 0, right?

The issue is in osm_switch_recommend_path() when
routing_for_lmc is true, but p_port->priv is NULL, right?



The attached patch fixes the problem w/ updn.  I haven't looked through
all of the 2Qos code thoroughly to figure out the consequences of this
change, so I'm just considering this a starting point for discussion.


Torus-2QoS's use of port->priv is unique because it persists
between routing sweeps.  So if another routing engine runs
after torus-2QoS and uses port->priv without having ensured
that it set it itself, there will be trouble.  9ddcf3419ea
was fixing such an issue.

I can find only two calls of osm_switch_recommend_path(),
and both seem to do the right thing, so I think



In addition, with the possibility that SP0 ports will be LMC = 0, this
code in osm_ucast_mgr.c ucast_mgr_process_tbl() does not look good.

lids_per_port = 1 << p_mgr->p_subn->opt.lmc;
for (i = 0; i < lids_per_port; i++) {
 cl_qlist_t *list = &p_mgr->port_order_list;
 cl_list_item_t *item;
 for (item = cl_qlist_head(list); item != cl_qlist_end(list);
  item = cl_qlist_next(item)) {
  osm_port_t *port = cl_item_obj(item, port, list_item);
  ucast_mgr_process_port(p_mgr, p_sw, port, i);
 }
}

It iterates over all ports with the configured LMC, not the LMC of the
port?  I haven't thought about this too deeply or investigated deeply,
so consider this another starting point for discussion.


Hmm, looks like ucast_mgr_process_port() DTRT, though;
it ignores lids that aren't in the range configured on
the port?
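
The point made in this reply can be illustrated with a small model. It assumes, as the reply suggests but does not confirm, that the per-port routine skips LID offsets outside the port's own LMC range; `lids_for_lmc` and `count_processed` are hypothetical helpers, not OpenSM functions. The outer loop runs over the configured LMC, and the inner filter keeps only offsets the port actually owns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of LIDs a port answers to for a given LMC: 2^lmc. */
static unsigned lids_for_lmc(uint8_t lmc)
{
	assert(lmc <= 7);	/* LMC is a 3-bit field */
	return 1u << lmc;
}

/*
 * Walk every LID offset implied by the configured LMC, over every port,
 * and count only the (port, offset) pairs that fall inside that port's
 * own LID range.
 */
static unsigned count_processed(const uint8_t *port_lmc, size_t nports,
				uint8_t configured_lmc)
{
	unsigned offset, processed = 0;
	size_t p;

	for (offset = 0; offset < lids_for_lmc(configured_lmc); offset++)
		for (p = 0; p < nports; p++)
			if (offset < lids_for_lmc(port_lmc[p]))
				processed++;
	return processed;
}
```

With a configured LMC of 2 and three ports whose own LMCs are 0, 2 and 2, nine pairs are processed (1 + 4 + 4) rather than twelve, so the LMC 0 port is only visited for its single LID.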



Al





Subject:
[PATCH] fix segfault corner case w/ updn routing and LMC > 0
From:
Albert L. Chu ch...@llnl.gov
Date:
Tue, 22 Mar 2011 17:36:16 -0700


Signed-off-by: Albert L. Chu ch...@llnl.gov


Reviewed-by: Jim Schutt jasc...@sandia.gov

-- Jim


---
 opensm/osm_ucast_mgr.c |4 
 1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/opensm/osm_ucast_mgr.c b/opensm/osm_ucast_mgr.c
index 4019589..211d6e0 100644
--- a/opensm/osm_ucast_mgr.c
+++ b/opensm/osm_ucast_mgr.c
@@ -318,10 +318,6 @@ static void alloc_ports_priv(osm_ucast_mgr_t * mgr)
 	     item = cl_qmap_next(item)) {
 		port = (osm_port_t *) item;
 		lmc = ib_port_info_get_lmc(&port->p_physp->port_info);
-		if (!lmc) {
-			port->priv = NULL;
-			continue;
-		}
 		r = malloc(sizeof(*r) + sizeof(r->guids[0]) * (1 << lmc));
 		if (!r) {
 			OSM_LOG(mgr->p_log, OSM_LOG_ERROR, "ERR 3A09: "




Re: [opensm] routing segfault + LMC > 0 routing bug?

2011-03-23 Thread Jim Schutt

Albert Chu wrote:

Hey Jim,

On Wed, 2011-03-23 at 09:01 -0700, Jim Schutt wrote:

Hi Al,




Torus-2QoS's use of port->priv is unique because it persists
between routing sweeps.  So if another routing engine runs
after torus-2QoS and uses port->priv without having ensured
that it set it itself, there will be trouble.  9ddcf3419ea
was fixing such an issue.

I can find only two calls of osm_switch_recommend_path(),
and both seem to do the right thing, so I think
your patch is OK.


Sounds good.  When reading over your comments about the 2Qos patches
that affected this area, I wasn't quite sure how you were dealing with
the p_port->priv, so I was unsure how my patch would affect things.



There are some comments at the top of osm_torus.c,
in the definitions of struct endpoint and struct t_switch,
that describe the rules for how torus-2QoS code can
safely use ->priv.

They may shed some extra light...

-- Jim



Re: [PATCH] OpenSM torus routing order list

2011-03-21 Thread Jim Schutt

Hi,

On Mon, 2011-03-21 at 03:58 -0600, Alex Netes wrote:
 Enables to define list of switch ports so the SM will
 go over this list when creating a routing.
 It helps balancing links load on some communication patterns
 where multiple links connect between the switches.
 
 Signed-off-by: David McMillen da...@systemfabricworks.com
 Signed-off-by: Alex Netes ale...@mellanox.com
 ---
  man/torus-2QoS.conf.5.in |   19 -
  opensm/osm_torus.c   |   65 +++--
  2 files changed, 79 insertions(+), 5 deletions(-)
 
 diff --git a/man/torus-2QoS.conf.5.in b/man/torus-2QoS.conf.5.in
 index 147a7b1..dd1aafb 100644
 --- a/man/torus-2QoS.conf.5.in
 +++ b/man/torus-2QoS.conf.5.in
 @@ -62,7 +62,7 @@ see \fBUNICAST ROUTING\fR in torus-2QoS(8).
  \fIsw0_GUID sw1_GUID
  \fR
  .RS
 -These keywords are used to seed the torus/mesh topolgy.
 +These keywords are used to seed the torus/mesh topology.
  For example, xp_link 0x2000 0x2001 specifies that a link from
  the switch with node GUID 0x2000 to the switch with node GUID 0x2001
  would point in the positive x direction,
 @@ -78,7 +78,7 @@ for torus dimensions of radix four (see \fBTOPOLOGY DISCOVERY\fR in
  torus-2QoS(8)).  For such cases both the positive and negative coordinate
  directions must be specified.
  .P
 -Based on the topology specifed via the \fBtorus\fR/\fBmesh\fR keyword,
 +Based on the topology specified via the \fBtorus\fR/\fBmesh\fR keyword,
  torus-2QoS will detect and log when it has insufficient seed configuration.
  .RE
  .
 @@ -140,6 +140,17 @@ parameter needs to be increased.
  If this keyword appears multiple times, the last instance prevails.
  .RE
  .
 +.P
 +\fBport_order
 +\fIp1 p2 p3 ...
 +\fR
 +.RS
 +This keyword specifies the order on which the ports would be chosen for routing.
 +This keyword is optional and if given overrides the default order.
 +It's possible to define any subset of ports that would be chosen before the
 +others.
 +.RE
 +.

This documentation needs to tell me a little more about 
how to choose port_order values.  

Something like this:

This keyword specifies the order in which CA ports on a 
destination switch are visited when computing routes.
When the fabric contains switches connected with multiple
parallel links, routes are distributed in a round-robin
fashion across such links, and so changing the order 
that CA ports are visited changes the distribution
of routes across such links.  This may be advantageous 
for some specific traffic patterns.

The default is to visit CA ports in increasing port
order on destination switches.

Duplicate values in the list will be ignored.


  .SH EXAMPLE
  .
  \f(RC
 @@ -171,6 +182,10 @@ z_dateline -1  # back to its original position.
  # on a host attached to a switch from the second seed.
  # Both instances should use this torus-2QoS.conf to ensure
  # path SL values do not change in the event of SM failover.
 +
 +# port_order defines the order on which the ports would be
 +# chosen for routing.
 +port_order 7 10 8 11 9 12 25 28 26 29 27 30
  .fi
  \fR
  .
 diff --git a/opensm/osm_torus.c b/opensm/osm_torus.c
 index add3cf9..7723a45 100644
 --- a/opensm/osm_torus.c
 +++ b/opensm/osm_torus.c
 @@ -287,6 +287,8 @@ struct torus {
   unsigned seed_cnt, seed_idx;
   unsigned x_sz, y_sz, z_sz;
  
 + unsigned port_order[IB_NODE_NUM_PORTS_MAX+1];
 +
   unsigned sw_pool_sz;
   unsigned link_pool_sz;
   unsigned seed_sz;
 @@ -844,6 +846,47 @@ out:
  }
  
  static
 +bool parse_port(unsigned *pnum, const char *parse_sep)
 +{
 + char *val, *nextchar;
 +
 + val = strtok(NULL, parse_sep);
 + if (!val)
 + return false;
 +	*pnum = strtoul(val, &nextchar, 0);
 +	if (*pnum > IB_NODE_NUM_PORTS_MAX)
 +		*pnum = 0;

Hmmm, user configuration was just silently discarded.
Please warn to give the user a chance to correct it.

 + return true;
 +}
 +
 +static
 +bool parse_port_order(struct torus *t, const char *parse_sep)
 +{
 + unsigned i, j, k, n;
 +
 + for (i = 0; i < (sizeof(t->port_order) / sizeof(unsigned)); i++) {

Please add this (from linux kernel):
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

and use it instead for all such loops.
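
A minimal sketch of what such a loop might look like with the macro.
struct order_tbl and its array length of 37 are hypothetical stand-ins
for struct torus and IB_NODE_NUM_PORTS_MAX+1, chosen only so that the
example compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Element count of a true array (not a pointer), as in the Linux kernel. */
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/* Hypothetical stand-in for the port_order member of struct torus. */
struct order_tbl {
	unsigned port_order[37];
};

/* Fill the table with the identity order 0, 1, 2, ... and return its length. */
static size_t init_identity_order(struct order_tbl *t)
{
	size_t i;

	assert(t != NULL);
	for (i = 0; i < ARRAY_SIZE(t->port_order); i++)
		t->port_order[i] = (unsigned)i;
	return ARRAY_SIZE(t->port_order);
}
```

The loop bound then follows the array's element type automatically,
instead of relying on a hard-coded sizeof(unsigned).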

 + if (!parse_port(&(t->port_order[i]), parse_sep))
 + break;
 + for (j = 0; j < i; j++) {
 + if (t->port_order[j] == t->port_order[i]) {
 + i--; /* Ignore duplicate port number */

Again, please warn that user configuration was discarded.

 + break;
 + }
 + }
 + }
 +
 + n = i;
 + for (j = 0; j < (sizeof(t->port_order) / sizeof(unsigned)); j++) {
 + for (k = 0; k < i; k++)
 + if (t->port_order[k] == j)
 + break;
 + if (k >= i) t->port_order[n++] = j;

Style nit: make that last line into two lines.

 + }
 +
 

Re: [PATCH] opensm: Fixed pointer validity check in report_torus_changes()

2011-03-07 Thread Jim Schutt

Hi Alex,

On Mon, 2011-03-07 at 02:59 -0700, Alex Netes wrote:
 struct torus *nt should be checked for validity before getting assignments.
 
 Signed-off-by: Alex Netes ale...@mellanox.com
 ---
  opensm/osm_torus.c |   10 +++---
  1 files changed, 7 insertions(+), 3 deletions(-)
 
 diff --git a/opensm/osm_torus.c b/opensm/osm_torus.c
 index add3cf9..7a2c252 100644
 --- a/opensm/osm_torus.c
 +++ b/opensm/osm_torus.c
 @@ -7423,13 +7423,17 @@ void report_torus_changes(struct torus *nt, struct 
 torus *ot)
  {
   unsigned cnt = 0;
   unsigned i, j, k;
 - unsigned x_sz = nt->x_sz;
 - unsigned y_sz = nt->y_sz;
 - unsigned z_sz = nt->z_sz;
 + unsigned x_sz;
 + unsigned y_sz;
 + unsigned z_sz;

Actually, nt is guaranteed to be non-NULL; check the
only caller (torus_build_lfts()).

  
   if (!(nt && ot))
   return;

This check for nt is redundant, I think.  Only ot has 
any possibility of being NULL.

-- Jim

  
 + x_sz = nt->x_sz;
 + y_sz = nt->y_sz;
 + z_sz = nt->z_sz;
 +
   if (x_sz != ot->x_sz) {
   cnt++;
   OSM_LOG(&nt->osm->log, OSM_LOG_INFO,




[PATCH] opensm.spec.in: Add in man5 for torus-2QoS.conf

2011-02-10 Thread Jim Schutt
From: Hal Rosenstock h...@mellanox.com

Signed-off-by: Hal Rosenstock h...@mellanox.com
Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/opensm.spec.in |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/opensm/opensm.spec.in b/opensm/opensm.spec.in
index c541804..65e46c9 100644
--- a/opensm/opensm.spec.in
+++ b/opensm/opensm.spec.in
@@ -129,6 +129,7 @@ fi
 %{_sbindir}/opensm
 %{_sbindir}/osmtest
 %{_mandir}/man8/*
+%{_mandir}/man5/*
 %doc AUTHORS COPYING README doc/performance-manager-HOWTO.txt 
doc/QoS_management_in_OpenSM.txt doc/opensm_release_notes-3.3.txt
 %{_sysconfdir}/init.d/opensmd
 %{_sbindir}/sldd.sh
-- 
1.6.2.2




Re: [PATCH 00/13] opensm: Cleanups and more documentation for torus-2QoS patchset

2011-01-31 Thread Jim Schutt
Hi Sasha,

On Sun, 2011-01-30 at 10:12 -0700, Sasha Khapyorsky wrote:
 Hi Jim,
 
 On 15:11 Fri 12 Nov , Jim Schutt wrote:
  
  These patches clean up and add documentation to the
  torus-2QoS routing module for OpenSM.  They apply on
  top of my previous bug-fix patchset from September
  (http://www.spinics.net/lists/linux-rdma/msg05809.html),
  which applies to your torus-2qos branch.
 
 Following your and others feedback. I've merge torus-2qos branch
 upstream. Thanks.

Great news! Thanks.

-- Jim

 
 Sasha
 




[PATCH 01/13] Revert "opensm: Do not require -Q option for torus-2QoS routing engine."

2010-11-12 Thread Jim Schutt
This reverts commit b9691580e29c6a8cf1f45995988350c02826786d.

Since all other routing engines require -Q to cause SL2VL maps to
be programmed, torus-2QoS should do the same.

Of course, torus-2QoS requires SL2VL maps to be programmed for correct
routing, so a check for that will need to be added.

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/opensm/osm_qos.c|7 ++-
 opensm/opensm/osm_subnet.c |   18 +-
 2 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/opensm/opensm/osm_qos.c b/opensm/opensm/osm_qos.c
index ab55918..ba198a0 100644
--- a/opensm/opensm/osm_qos.c
+++ b/opensm/opensm/osm_qos.c
@@ -314,9 +314,7 @@ int osm_qos_setup(osm_opensm_t * p_osm)
int ret = 0;
int vlarb_only;
 
-   if (!(p_osm->subn.opt.qos ||
- (p_osm->routing_engine_used &&
-  p_osm->routing_engine_used->update_sl2vl)))
+   if (!p_osm->subn.opt.qos)
return 0;
 
OSM_LOG_ENTER(&p_osm->log);
@@ -333,8 +331,7 @@ int osm_qos_setup(osm_opensm_t * p_osm)
cl_plock_excl_acquire(&p_osm->lock);
 
/* read QoS policy config file */
-   if (p_osm->subn.opt.qos)
-   osm_qos_parse_policy_file(&p_osm->subn);
+   osm_qos_parse_policy_file(&p_osm->subn);
 
p_tbl = &p_osm->subn.port_guid_tbl;
p_next = cl_qmap_head(p_tbl);
diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c
index f714af7..bc34a0f 100644
--- a/opensm/opensm/osm_subnet.c
+++ b/opensm/opensm/osm_subnet.c
@@ -1051,8 +1051,6 @@ static void subn_verify_qos_set(osm_qos_options_t *set, 
const char *prefix,
 
 int osm_subn_verify_config(IN osm_subn_opt_t * p_opts)
 {
-   osm_qos_options_t dflt;
-
if (p_opts->lmc > 7) {
log_report(" Invalid Cached Option Value:lmc = %u:"
   "Using Default:%u\n", p_opts->lmc, OSM_DEFAULT_LMC);
@@ -1103,15 +1101,17 @@ int osm_subn_verify_config(IN osm_subn_opt_t * p_opts)
p_opts->console = OSM_DEFAULT_CONSOLE;
}
 
+   if (p_opts->qos) {
+   osm_qos_options_t dflt;
 
-   /* the default options in qos_options must be correct.
-* every other one need not be, b/c those will default
-* back to whatever is in qos_options.
-*/
-   subn_set_default_qos_options(&dflt);
-   subn_verify_qos_set(&p_opts->qos_options, "qos", &dflt);
+   /* the default options in qos_options must be correct.
+* every other one need not be, b/c those will default
+* back to whatever is in qos_options.
+*/
 
-   if (p_opts->qos) {
+   subn_set_default_qos_options(&dflt);
+
+   subn_verify_qos_set(&p_opts->qos_options, "qos", &dflt);
subn_verify_qos_set(&p_opts->qos_ca_options, "qos_ca",
&p_opts->qos_options);
subn_verify_qos_set(&p_opts->qos_sw0_options, "qos_sw0",
-- 
1.6.2.2




[PATCH 02/13] opensm: torus-2QoS requires that QoS be enabled.

2010-11-12 Thread Jim Schutt
SL2VL maps are only programmed if QoS is enabled.  Require this to be
the case if torus-2QoS is configured, and print a message otherwise.

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/opensm/osm_torus.c |8 
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/opensm/opensm/osm_torus.c b/opensm/opensm/osm_torus.c
index 3b67f16..aeb4fe6 100644
--- a/opensm/opensm/osm_torus.c
+++ b/opensm/opensm/osm_torus.c
@@ -9045,6 +9045,14 @@ int torus_build_lfts(void *context)
struct fabric *fabric;
struct torus *torus;
 
+   if (!ctx->osm->subn.opt.qos) {
+   OSM_LOG(&ctx->osm->log, OSM_LOG_ERROR,
+   "Error: Routing engine list contains torus-2QoS. "
+   "Enable QoS for correct operation "
+   "(-Q or 'qos TRUE' in opensm.conf).\n");
+   return status;
+   }
+
fabric = ctx->fabric;
teardown_fabric(fabric);
 
-- 
1.6.2.2




[PATCH 04/13] opensm: Fill in default QoS values at last possible moment.

2010-11-12 Thread Jim Schutt
The comments for struct osm_qos_options in osm_subnet.h describe values that
flag default QoS values for struct members.  osm_qos_options structs are
initialized with these flag values in subn_init_qos_options(), but they are
overwritten via osm_subn_verify_config() with the actual default values.

It turns out to be easy to wait until qos_build_config() to detect the flag
values and use the actual default values as needed.   osm_qos_setup() +
qos_build_config() already had code that set unconfigured CA, switch port,
and router specific QoS parameters from configured default QoS parameters,
so that duplicate code can be removed from osm_subn_verify_config().

In addition to code simplification, such delay in replacing default flag
values with the actual default values makes it possible for a routing
engine to detect that configured rather than default values were used.

For example, torus-2QoS can never use any configured qos_sl2vl values,
but should only warn if such are configured.
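
A minimal sketch of the lookup order described above, assuming a
cut-down options struct.  struct sketch_qos_opts, the resolve_*()
helpers, and the built-in default values are all hypothetical and exist
only for this illustration; the real code works on osm_qos_options_t
and the OSM_DEFAULT_QOS_* macros.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical built-in defaults, for this sketch only. */
#define SKETCH_DEFAULT_MAX_VLS 15
static const char sketch_default_vlarb[] = "0:4,1:0";

/* Reduced stand-in for osm_qos_options_t: 0 and NULL flag "not configured". */
struct sketch_qos_opts {
	unsigned max_vls;
	const char *vlarb_high;
};

/* Prefer the target-specific value, then the overall default, then built-in. */
static unsigned resolve_max_vls(const struct sketch_qos_opts *opt,
				const struct sketch_qos_opts *dflt)
{
	assert(opt != NULL && dflt != NULL);
	if (opt->max_vls > 0)
		return opt->max_vls;
	if (dflt->max_vls > 0)
		return dflt->max_vls;
	return SKETCH_DEFAULT_MAX_VLS;
}

/* Same three-level fallback for a string-valued option. */
static const char *resolve_vlarb_high(const struct sketch_qos_opts *opt,
				      const struct sketch_qos_opts *dflt)
{
	assert(opt != NULL && dflt != NULL);
	if (opt->vlarb_high)
		return opt->vlarb_high;
	if (dflt->vlarb_high)
		return dflt->vlarb_high;
	return sketch_default_vlarb;
}

/*
 * Since the flag values are left in place until the values are resolved,
 * a caller can still tell a configured value from a defaulted one.
 */
static int vlarb_high_is_configured(const struct sketch_qos_opts *opt)
{
	assert(opt != NULL);
	return opt->vlarb_high != NULL;
}
```

The last helper is the point of the delay: once the flag values have
been overwritten with real defaults, that distinction is lost.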

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/opensm/osm_qos.c|   55 +++---
 opensm/opensm/osm_subnet.c |   70 ++--
 2 files changed, 66 insertions(+), 59 deletions(-)

diff --git a/opensm/opensm/osm_qos.c b/opensm/opensm/osm_qos.c
index ba198a0..afea7bb 100644
--- a/opensm/opensm/osm_qos.c
+++ b/opensm/opensm/osm_qos.c
@@ -376,7 +376,7 @@ int osm_qos_setup(osm_opensm_t * p_osm)
 /*
  *  QoS config stuff
  */
-static int parse_one_unsigned(char *str, char delim, unsigned *val)
+static int parse_one_unsigned(const char *str, char delim, unsigned *val)
 {
char *end;
*val = strtoul(str, &end, 0);
@@ -385,10 +385,10 @@ static int parse_one_unsigned(char *str, char delim, 
unsigned *val)
return (int)(end - str);
 }
 
-static int parse_vlarb_entry(char *str, ib_vl_arb_element_t * e)
+static int parse_vlarb_entry(const char *str, ib_vl_arb_element_t * e)
 {
unsigned val;
-   char *p = str;
+   const char *p = str;
p += parse_one_unsigned(p, ':', &val);
e->vl = val % 15;
p += parse_one_unsigned(p, ',', &val);
@@ -396,10 +396,10 @@ static int parse_vlarb_entry(char *str, 
ib_vl_arb_element_t * e)
return (int)(p - str);
 }
 
-static int parse_sl2vl_entry(char *str, uint8_t * raw)
+static int parse_sl2vl_entry(const char *str, uint8_t * raw)
 {
unsigned val1, val2;
-   char *p = str;
+   const char *p = str;
p += parse_one_unsigned(p, ',', &val1);
p += parse_one_unsigned(p, ',', &val2);
*raw = (val1 << 4) | (val2 & 0xf);
@@ -410,18 +410,36 @@ static void qos_build_config(struct qos_config *cfg, 
osm_qos_options_t * opt,
 osm_qos_options_t * dflt)
 {
int i;
-   char *p;
+   const char *p;
 
memset(cfg, 0, sizeof(*cfg));
 
-   cfg->max_vls = opt->max_vls > 0 ? opt->max_vls : dflt->max_vls;
+   if (opt->max_vls > 0)
+   cfg->max_vls = opt->max_vls;
+   else {
+   if (dflt->max_vls > 0)
+   cfg->max_vls = dflt->max_vls;
+   else
+   cfg->max_vls = OSM_DEFAULT_QOS_MAX_VLS;
+   }
 
if (opt->high_limit >= 0)
cfg->vl_high_limit = (uint8_t) opt->high_limit;
-   else
-   cfg->vl_high_limit = (uint8_t) dflt->high_limit;
+   else {
+   if (dflt->high_limit >= 0)
+   cfg->vl_high_limit = (uint8_t) dflt->high_limit;
+   else
+   cfg->vl_high_limit = (uint8_t) OSM_DEFAULT_QOS_HIGH_LIMIT;
+   }
 
-   p = opt->vlarb_high ? opt->vlarb_high : dflt->vlarb_high;
+   if (opt->vlarb_high)
+   p = opt->vlarb_high;
+   else {
+   if (dflt->vlarb_high)
+   p = dflt->vlarb_high;
+   else
+   p = OSM_DEFAULT_QOS_VLARB_HIGH;
+   }
for (i = 0; i < 2 * IB_NUM_VL_ARB_ELEMENTS_IN_BLOCK; i++) {
p += parse_vlarb_entry(p,
   &cfg->vlarb_high[i /
@@ -430,7 +448,14 @@ static void qos_build_config(struct qos_config *cfg, 
osm_qos_options_t * opt,

IB_NUM_VL_ARB_ELEMENTS_IN_BLOCK]);
}
 
-   p = opt->vlarb_low ? opt->vlarb_low : dflt->vlarb_low;
+   if (opt->vlarb_low)
+   p = opt->vlarb_low;
+   else {
+   if (dflt->vlarb_low)
+   p = dflt->vlarb_low;
+   else
+   p = OSM_DEFAULT_QOS_VLARB_LOW;
+   }
for (i = 0; i < 2 * IB_NUM_VL_ARB_ELEMENTS_IN_BLOCK; i++) {
p += parse_vlarb_entry(p,
   &cfg->vlarb_low[i /
@@ -440,6 +465,14 @@ static void qos_build_config(struct qos_config *cfg, 
osm_qos_options_t * opt,
}
 
p = opt->sl2vl ? opt->sl2vl : dflt->sl2vl;
+   if (opt->sl2vl)
+   p = opt->sl2vl;
+   else

[PATCH 08/13] opensm/osm_torus.c: Ignore multiple configurations of torus size.

2010-11-12 Thread Jim Schutt

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/opensm/osm_torus.c |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/opensm/opensm/osm_torus.c b/opensm/opensm/osm_torus.c
index 8e0435b..add3cf9 100644
--- a/opensm/opensm/osm_torus.c
+++ b/opensm/opensm/osm_torus.c
@@ -782,6 +782,12 @@ bool parse_torus(struct torus *t, const char *parse_sep)
char *ptr;
bool success = false;
 
+   /*
+* There can be only one.  Ignore the imposters.
+*/
+   if (t->sw_pool)
+   goto out;
+
if (!parse_size(&t->x_sz, &t->flags, X_MESH, parse_sep))
goto out;
 
-- 
1.6.2.2




[PATCH 10/13] opensm/main.c: Add description of no_fallback to --routing_engine option documentation.

2010-11-12 Thread Jim Schutt

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/opensm/main.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/opensm/opensm/main.c b/opensm/opensm/main.c
index e74dc46..756fe6f 100644
--- a/opensm/opensm/main.c
+++ b/opensm/opensm/main.c
@@ -174,6 +174,9 @@ static void show_usage(void)
 "Min Hop algorithm.  Multiple routing engines can be specified\n"
 "separated by commas so that specific ordering of routing\n"
 "algorithms will be tried if earlier routing engines fail.\n"
+"If all configured routing engines fail, OpenSM will always\n"
+"attempt to route with Min Hop unless 'no_fallback' is\n"
+"included in the list of routing engines.\n"
 "Supported engines: updn, file, ftree, lash, dor, torus-2QoS\n\n");
printf("--do_mesh_analysis\n"
 "This option enables additional analysis for the lash\n"
-- 
1.6.2.2




[PATCH 13/13] opensm/doc/current-routing.txt: Sync torus-2QoS information with new man pages.

2010-11-12 Thread Jim Schutt

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/doc/current-routing.txt |  141 +++
 1 files changed, 126 insertions(+), 15 deletions(-)

diff --git a/opensm/doc/current-routing.txt b/opensm/doc/current-routing.txt
index 4eaf861..5048c55 100644
--- a/opensm/doc/current-routing.txt
+++ b/opensm/doc/current-routing.txt
@@ -399,7 +399,7 @@ Use '-R dor' option to activate the DOR algorithm.
 Torus-2QoS Routing Algorithm
 
 
-Torus-2QoS is routing algorithm designed for large-scale 2D/3D torus fabrics.
+Torus-2QoS is a routing algorithm designed for large-scale 2D/3D torus fabrics.
 The torus-2QoS routing engine can provide the following functionality on
 a 2D/3D torus:
 - routing that is free of credit loops
@@ -411,6 +411,8 @@ a 2D/3D torus:
 - very short run times, with good scaling properties as fabric size
 increases
 
+Unicast Routing:
+
 Torus-2QoS is a DOR-based algorithm that avoids deadlocks that would otherwise
 occur in a torus using the concept of a dateline for each torus dimension.
 It encodes into a path SL which datelines the path crosses as follows:
@@ -423,17 +425,18 @@ It encodes into a path SL which datelines the path 
crosses as follows:
 For a 3D torus, that leaves one SL bit free, which torus-2QoS uses to
 implement two QoS levels.
 
-This is possible because torus-2QoS also makes use of the output port
-dependence of the switch SL2VL maps.  It computes in which torus coordinate
-direction each interswitch link points, and writes SL2VL maps for such
-ports as follows:
+Torus-2QoS also makes use of the output port dependence of switch SL2VL
+maps to encode into one VL bit the information encoded in three SL bits.
+It computes in which torus coordinate direction each inter-switch link
+points, and writes SL2VL maps for such ports as follows:
 
   for (sl = 0; sl < 16; sl ++)
 /* cdir(port) reports which torus coordinate direction a switch port
  * points in, and returns 0, 1, or 2 */
 sl2vl(iport,oport,sl) = 0x1 & (sl >> cdir(oport));
 
-Thus torus-2QoS consumes 8 SL values (SL bits 0-2) and 2 VL values (VL bit 0)
+Thus, on a pristine 3D torus, i.e., in the absence of failed fabric switches,
+torus-2QoS consumes 8 SL values (SL bits 0-2) and 2 VL values (VL bit 0)
 per QoS level to provide deadlock-free routing on a 3D torus.
 
 Torus-2QoS routes around link failure by taking the long way around any
@@ -454,7 +457,7 @@ torus below, where switches are denoted by [+a-zA-Z]:
 
   x=012345
 
-For a pristine fabric the path from S to D would be S-n-T-r-d.  In the
+For a pristine fabric the path from S to D would be S-n-T-r-D.  In the
 event that either link S-n or n-T has failed, torus-2QoS would use the path
 S-m-p-o-T-r-D.  Note that it can do this without changing the path SL
 value; once the 1D ring m-S-n-T-o-p-m has been broken by failure, path
@@ -463,11 +466,19 @@ dateline (between, say, x=5 and x=0) can be ignored for 
path segments on
 that ring.
 
 One result of this is that torus-2QoS can route around many simultaneous
-link failures, as long as no 1D ring is broken into disjoint regions.  For
+link failures, as long as no 1D ring is broken into disjoint segments.  For
 example, if links n-T and T-o have both failed, that ring has been broken
-into two disjoint regions, T and o-p-m-S-n.  Torus-2QoS checks for such
+into two disjoint segments, T and o-p-m-S-n.  Torus-2QoS checks for such
 issues, reports if they are found, and refuses to route such fabrics.
 
+Note that in the case where there are multiple parallel links between a pair
+of switches, torus-2QoS will allocate routes across such links in a round-
+robin fashion, based on ports at the path destination switch that are active
+and not used for inter-switch links.  Should a link that is one of several
+such parallel links fail, routes are redistributed across the remaining
+links.   When the last of such a set of parallel links fails, traffic is
+rerouted as described above.
+
 Handling a failed switch under DOR requires introducing into a path at
 least one turn that would be otherwise illegal, i.e. not allowed by DOR
 rules.  Torus-2QoS will introduce such a turn as close as possible to the
@@ -476,8 +487,9 @@ failed switch in order to route around it.
 In the above example, suppose switch T has failed, and consider the path
 from S to D.  Torus-2QoS will produce the path S-n-I-r-D, rather than the
 S-n-T-r-D path for a pristine torus, by introducing an early turn at n.
-For traffic arriving at switch I from n, normal DOR rules will generate an
-illegal turn in the path from S to D at I, and a legal turn at r.
+Normal DOR rules will cause traffic arriving at switch I to be forwarded
+to switch r; for traffic arriving from I due to the early turn at n,
+this will generate an illegal turn at I.
 
 Torus-2QoS will also use the input port dependence of SL2VL maps to set VL
 bit 1 (which would be otherwise unused) for y

[PATCH 07/13] opensm/osm_torus.c: Use PRIx64 for GUID printing.

2010-11-12 Thread Jim Schutt

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/opensm/osm_torus.c |  216 ++--
 1 files changed, 108 insertions(+), 108 deletions(-)

diff --git a/opensm/opensm/osm_torus.c b/opensm/opensm/osm_torus.c
index 804334f..8e0435b 100644
--- a/opensm/opensm/osm_torus.c
+++ b/opensm/opensm/osm_torus.c
@@ -60,8 +60,6 @@
 #define SWITCH_MAX_PORTGRPS  (1 + 2 * TORUS_MAX_DIM)
 
 typedef ib_net64_t guid_t;
-#define ntohllu(v_64bit) ((unsigned long long)cl_ntoh64(v_64bit))
-
 
 /*
  * An endpoint terminates a link, and is one of three types:
@@ -584,8 +582,8 @@ bool build_sw_endpoint(struct fabric *f, osm_port_t 
*osm_port)
sw = find_f_sw(f, sw_guid);
if (!sw) {
OSM_LOG(&f->osm->log, OSM_LOG_ERROR,
-   "Error: missing switch w/ GUID 0x%04llx\n",
-   ntohllu(sw_guid));
+   "Error: missing switch w/ GUID 0x%04"PRIx64"\n",
+   cl_ntoh64(sw_guid));
goto out;
}
/*
@@ -598,9 +596,9 @@ bool build_sw_endpoint(struct fabric *f, osm_port_t 
*osm_port)
} else
OSM_LOG(&f->osm->log, OSM_LOG_ERROR,
"Error: switch port %d has id "
-   "0x%04llx, expected 0x%04llx\n",
-   sw_port, ntohllu(sw->port[sw_port]->n_id),
-   ntohllu(sw_guid));
+   "0x%04"PRIx64", expected 0x%04"PRIx64"\n",
+   sw_port, cl_ntoh64(sw->port[sw_port]->n_id),
+   cl_ntoh64(sw_guid));
goto out;
}
ep = calloc(1, sizeof(*ep));
@@ -657,8 +655,8 @@ bool build_ca_link(struct fabric *f,
sw = find_f_sw(f, sw_guid);
if (!sw) {
OSM_LOG(&f->osm->log, OSM_LOG_ERROR,
-   "Error: missing switch w/ GUID 0x%04llx\n",
-   ntohllu(sw_guid));
+   "Error: missing switch w/ GUID 0x%04"PRIx64"\n",
+   cl_ntoh64(sw_guid));
goto out;
}
l = alloc_flink(f);
@@ -713,15 +711,15 @@ bool build_link(struct fabric *f,
sw0 = find_f_sw(f, sw_guid0);
if (!sw0) {
OSM_LOG(&f->osm->log, OSM_LOG_ERROR,
-   "Error: missing switch w/ GUID 0x%04llx\n",
-   ntohllu(sw_guid0));
+   "Error: missing switch w/ GUID 0x%04"PRIx64"\n",
+   cl_ntoh64(sw_guid0));
goto out;
}
sw1 = find_f_sw(f, sw_guid1);
if (!sw1) {
OSM_LOG(&f->osm->log, OSM_LOG_ERROR,
-   "Error: missing switch w/ GUID 0x%04llx\n",
-   ntohllu(sw_guid1));
+   "Error: missing switch w/ GUID 0x%04"PRIx64"\n",
+   cl_ntoh64(sw_guid1));
goto out;
}
l = alloc_flink(f);
@@ -1242,10 +1240,10 @@ void diagnose_fabric(struct fabric *f)
 
OSM_LOG(&f->osm->log, OSM_LOG_INFO,
"Found non-torus fabric link:"
-" sw GUID 0x%04llx port %d <->"
-" sw GUID 0x%04llx port %d\n",
-   ntohllu(l->end[0].n_id), l->end[0].port,
-   ntohllu(l->end[1].n_id), l->end[1].port);
+" sw GUID 0x%04"PRIx64" port %d <->"
+" sw GUID 0x%04"PRIx64" port %d\n",
+   cl_ntoh64(l->end[0].n_id), l->end[0].port,
+   cl_ntoh64(l->end[1].n_id), l->end[1].port);
}
/*
 * Report on any switches with ports using endpoints that didn't
@@ -1267,8 +1265,8 @@ void diagnose_fabric(struct fabric *f)
 
OSM_LOG(&f->osm->log, OSM_LOG_INFO,
"Found non-torus fabric port:"
-" sw GUID 0x%04llx port %d\n",
-   ntohllu(f->sw[k]->n_id), p);
+" sw GUID 0x%04"PRIx64" port %d\n",
+   cl_ntoh64(f->sw[k]->n_id), p);
}
 }
 
@@ -1423,15 +1421,15 @@ bool connect_tlink(struct port_grp *pg0, struct 
endpoint *f_ep0,
if (pg0->port_cnt == t->portgrp_sz) {
OSM_LOG(&t->osm->log, OSM_LOG_ERROR,
"Error: exceeded port group max "
-   "port count (%d): switch GUID 0x%04llx\n",
-   t->portgrp_sz, ntohllu(pg0->sw->n_id));
+   "port count (%d): switch GUID 0x%04"PRIx64"\n",
+   t->portgrp_sz, cl_ntoh64(pg0->sw->n_id));
goto out;
}
if (pg1->port_cnt == t->portgrp_sz) {
OSM_LOG(&t->osm->log, OSM_LOG_ERROR,
"Error: exceeded port group max "
-   "port count (%d): switch GUID 0x%04llx\n",
-   t->portgrp_sz, ntohllu(pg1->sw

[PATCH 05/13] opensm: Cause torus-2QoS to warn if QoS configuration will cause issues.

2010-11-12 Thread Jim Schutt
Torus-2QoS needs 8 VLs, and complete control over sl2vl maps, in order
to provide 2 QoS levels with routing that is free of credit loops on torus
fabrics.  Warn to this effect if an insufficient max_vls configuration
or a non-default qos_sl2vl configuration is detected.

Also, torus-2QoS needs to use VLs 0-3 to implement one QoS level, and
VLs 4-7 to implement the other.  The VLarb weights for VLs 0-3 should
all have the same value, and similarly for the weights for VLs 4-7.
Otherwise, differences in data rates for different paths may cause
hard-to-diagnose application issues.  Warn to this effect when
detected.
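
The uniformity requirement can be illustrated with a small standalone
check.  This is a sketch under stated assumptions, not the code in this
patch: vlarb_weights_uniform() is a hypothetical helper, and it assumes
the VLarb string is a comma-separated list of vl:weight pairs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define SKETCH_NUM_VLS 16

/*
 * Sum the configured weight per VL, then report whether VLs 0-3 share
 * one total weight and VLs 4-7 share one total weight.
 */
static bool vlarb_weights_uniform(const char *vlarb)
{
	unsigned long weight[SKETCH_NUM_VLS] = { 0 };
	char *end;
	unsigned i;

	assert(vlarb != NULL);
	while (*vlarb) {
		unsigned long vl, w;

		vl = strtoul(vlarb, &end, 0);
		if (end == vlarb || *end != ':')
			break;		/* malformed entry: stop parsing */
		vlarb = end + 1;
		w = strtoul(vlarb, &end, 0);
		if (end == vlarb)
			break;		/* missing weight: stop parsing */
		weight[vl & 0xf] += w & 0xff;
		vlarb = (*end == ',') ? end + 1 : end;
	}
	for (i = 1; i < 4; i++)
		if (weight[i] != weight[0] || weight[i + 4] != weight[4])
			return false;
	return true;
}
```

For example, "0:4,1:4,2:4,3:4,4:1,5:1,6:1,7:1" passes, while a table
that gives VL 1 a weight of 0 and VLs 0, 2, and 3 a weight of 4 does
not.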

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/opensm/osm_torus.c |   87 +
 1 files changed, 87 insertions(+), 0 deletions(-)

diff --git a/opensm/opensm/osm_torus.c b/opensm/opensm/osm_torus.c
index aeb4fe6..784955d 100644
--- a/opensm/opensm/osm_torus.c
+++ b/opensm/opensm/osm_torus.c
@@ -9038,6 +9038,84 @@ out:
 }
 
 static
+void check_vlarb_config(const char *vlarb_str, bool is_default,
+   const char *str, const char *pri, osm_log_t *log)
+{
+   unsigned total_weight[IB_MAX_NUM_VLS] = {0,};
+   unsigned i = 0, v, vl = 0;
+   char *end;
+   bool uniform;
+
+   while (*vlarb_str && i++ < 2 * IB_NUM_VL_ARB_ELEMENTS_IN_BLOCK) {
+   v = strtoul(vlarb_str, &end, 0);
+   if (*end)
+   end++;
+   vlarb_str = end;
+   if (i & 0x1)
+   vl = v & 0xf;
+   else
+   total_weight[vl] += v & 0xff;
+   }
+   uniform = true;
+   v = total_weight[0];
+   for (i = 1; i < 8; i++) {
+   if (i == 4)
+   v = total_weight[i];
+   if (total_weight[i] != v)
+   uniform = false;
+   }
+   if (!uniform)
+   OSM_LOG(log, OSM_LOG_INFO,
+   "Warning: torus-2QoS requires same VLarb weights for "
+   "VLs 0-3; also for VLs 4-7: not true for %s "
+   "%s_vlarb_%s\n",
+   (is_default ? "default" : "configured"), str, pri);
+}
+
+static
+void check_qos_config(osm_qos_options_t *opt, bool tgt_is_default,
+ const char *str, osm_log_t *log)
+{
+   const char *vlarb_str;
+   bool is_default;
+
+   if (opt->max_vls > 0 && opt->max_vls < 8)
+   OSM_LOG(log, OSM_LOG_INFO,
+   "Warning: full torus-2QoS functionality not available "
+   "for configured %s_max_vls = %d\n", str, opt->max_vls);
+
+   if (opt->vlarb_high) {
+   is_default = false;
+   vlarb_str = opt->vlarb_high;
+   } else {
+   is_default = true;
+   vlarb_str = OSM_DEFAULT_QOS_VLARB_HIGH;
+   }
+   /*
+* Only check values that were actually configured, or the overall
+* defaults that target-specific (CA, switch port, etc) defaults
+* are set from.
+*/
+   if (!is_default || tgt_is_default)
+   check_vlarb_config(vlarb_str, is_default, str, "high", log);
+
+   if (opt->vlarb_low) {
+   is_default = false;
+   vlarb_str = opt->vlarb_low;
+   } else {
+   is_default = true;
+   vlarb_str = OSM_DEFAULT_QOS_VLARB_LOW;
+   }
+   if (!is_default || tgt_is_default)
+   check_vlarb_config(vlarb_str, is_default, str, "low", log);
+
+   if (opt->sl2vl)
+   OSM_LOG(log, OSM_LOG_INFO,
+   "Warning: torus-2QoS must override configured "
+   "%s_sl2vl to generate deadlock-free routes\n", str);
+}
+
+static
 int torus_build_lfts(void *context)
 {
int status = -1;
@@ -9111,9 +9189,18 @@ out:
if (torus)
teardown_torus(torus);
} else {
+   osm_subn_opt_t *opt = &torus->osm->subn.opt;
+   osm_log_t *log = &torus->osm->log;
+
if (ctx->torus)
teardown_torus(ctx->torus);
ctx->torus = torus;
+
+   check_qos_config(&opt->qos_options, 1, "qos", log);
+   check_qos_config(&opt->qos_ca_options, 0, "qos_ca", log);
+   check_qos_config(&opt->qos_sw0_options, 0, "qos_sw0", log);
+   check_qos_config(&opt->qos_swe_options, 0, "qos_swe", log);
+   check_qos_config(&opt->qos_rtr_options, 0, "qos_rtr", log);
}
teardown_fabric(fabric);
return status;
-- 
1.6.2.2




[PATCH 11/13] opensm/man/opensm.8.in: Add references to torus-2QoS.

2010-11-12 Thread Jim Schutt

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/man/opensm.8.in |   29 ++---
 1 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/opensm/man/opensm.8.in b/opensm/man/opensm.8.in
index 47dff99..c021026 100644
--- a/opensm/man/opensm.8.in
+++ b/opensm/man/opensm.8.in
@@ -1,4 +1,4 @@
-.TH OPENSM 8 "October 22, 2009" "OpenIB" "OpenIB Management"
+.TH OPENSM 8 "November 3, 2010" "OpenIB" "OpenIB Management"
 
 .SH NAME
 opensm \- InfiniBand subnet manager and administration (SM/SA)
@@ -51,6 +51,7 @@ opensm \- InfiniBand subnet manager and administration (SM/SA)
 [\-\-prefix_routes_file <path>]
 [\-\-consolidate_ipv6_snm_req]
 [\-\-log_prefix <prefix text>]
+[\-\-torus_config <path to file>]
 [\-v(erbose)] [\-V] [\-D <flags>] [\-d(ebug) <number>]
 [\-h(elp)] [\-?]
 
@@ -148,8 +149,10 @@ LID assignments resolving multiple use of same LID.
 This option chooses routing engine(s) to use instead of Min Hop
 algorithm (default).  Multiple routing engines can be specified
 separated by commas so that specific ordering of routing algorithms
-will be tried if earlier routing engines fail.
-Supported engines: minhop, updn, file, ftree, lash, dor
+will be tried if earlier routing engines fail.  If all configured
+routing engines fail, OpenSM will always attempt to route with Min Hop
+unless 'no_fallback' is included in the list of routing engines.
+Supported engines: minhop, updn, file, ftree, lash, dor, torus-2QoS.
 .TP
 \fB\-\-do_mesh_analysis\fR
 This option enables additional analysis for the lash routing engine to
@@ -364,6 +367,11 @@ when two or more instances of OpenSM run in a single node 
to manage multiple
 fabrics. For example, in a dual-fabric (or dual-rail) IB cluster, the prefix
 for the first fabric could be "mpi" and the other fabric could be "storage".
 .TP
+\fB\-\-torus_config\fR <path to torus\-2QoS config file>
+This option defines the file name for the extra configuration
+information needed for the torus-2QoS routing engine.   The default
+name is \f...@opensm_config_dir@/@torus2qos_conf_f...@\fp
+.TP
 \fB\-v\fR, \fB\-\-verbose\fR
 This option increases the log verbosity level.
 The -v option may be specified multiple times
@@ -1004,6 +1012,14 @@ along the mesh dimension, or the -O option used as an 
override.
 
 Use '-R dor' option to activate the DOR algorithm.
 
+Torus-2QoS Routing Algorithm
+
+Torus-2QoS is routing algorithm designed for large-scale 2D/3D torus fabrics;
+see torus-2QoS(8) for full documentation.
+
+Use '-R torus-2QoS -Q' or '-R torus-2QoS,no_fallback -Q'
+to activate the torus-2QoS algorithm.
+
 
 Routing References
 
@@ -1113,6 +1129,10 @@ default QOS policy config file
 .B @OPENSM_CONFIG_DIR@/@PREFIX_ROUTES_FILE@
 default prefix routes file.
 
+.TP
+.B @OPENSM_CONFIG_DIR@/@TORUS2QOS_CONF_FILE@
+default torus-2QoS config file.
+
 .SH AUTHORS
 .TP
 Hal Rosenstock
@@ -1135,3 +1155,6 @@ Ira Weiny
 .TP
 Dale Purdy
 .RI  pu...@sgi.com 
+
+.SH SEE ALSO
+torus-2QoS(8), torus-2QoS.conf(5).
-- 
1.6.2.2




[PATCH 12/13] opensm: Add torus-2QoS man pages.

2010-11-12 Thread Jim Schutt

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/Makefile.am  |2 +-
 opensm/configure.in |6 +-
 opensm/man/torus-2QoS.8.in  |  476 +++
 opensm/man/torus-2QoS.conf.5.in |  184 +++
 4 files changed, 666 insertions(+), 2 deletions(-)
 create mode 100644 opensm/man/torus-2QoS.8.in
 create mode 100644 opensm/man/torus-2QoS.conf.5.in

diff --git a/opensm/Makefile.am b/opensm/Makefile.am
index 88ff9da..58a682b 100644
--- a/opensm/Makefile.am
+++ b/opensm/Makefile.am
@@ -12,7 +12,7 @@ install-exec-hook:
chmod 755 $(DESTDIR)/$(sysconfdir)/init.d/opensmd
 
 
-man_MANS = man/opensm.8 man/osmtest.8
+man_MANS = man/opensm.8 man/osmtest.8 man/torus-2QoS.8 man/torus-2QoS.conf.5
 
 various_scripts = $(wildcard scripts/*)
 docs = doc/performance-manager-HOWTO.txt doc/QoS_management_in_OpenSM.txt \
diff --git a/opensm/configure.in b/opensm/configure.in
index 8695965..aaad999 100644
--- a/opensm/configure.in
+++ b/opensm/configure.in
@@ -196,6 +196,10 @@ AC_DEFINE_UNQUOTED(HAVE_DEFAULT_QOS_POLICY_FILE,
[Define a QOS policy config file])
 AC_SUBST(QOS_POLICY_FILE)
 
+dnl For now, this does not need to be configurable
+TORUS2QOS_CONF_FILE=torus-2QoS.conf
+AC_SUBST(TORUS2QOS_CONF_FILE)
+
 dnl Check for a different prefix-routes file
 PREFIX_ROUTES_FILE=prefix-routes.conf
 AC_MSG_CHECKING(for --with-prefix-routes-conf)
@@ -226,7 +230,7 @@ dnl Checks for headers and libraries
 OPENIB_APP_OSMV_CHECK_HEADER
 OPENIB_APP_OSMV_CHECK_LIB
 
-AC_CONFIG_FILES([man/opensm.8 scripts/opensm.init scripts/redhat-opensm.init 
scripts/sldd.sh])
+AC_CONFIG_FILES([man/opensm.8 man/torus-2QoS.8 man/torus-2QoS.conf.5 
scripts/opensm.init scripts/redhat-opensm.init scripts/sldd.sh])
 
 dnl Create the following Makefiles
 AC_OUTPUT([include/opensm/osm_version.h Makefile include/Makefile 
complib/Makefile libvendor/Makefile opensm/Makefile osmeventplugin/Makefile 
osmtest/Makefile opensm.spec])
diff --git a/opensm/man/torus-2QoS.8.in b/opensm/man/torus-2QoS.8.in
new file mode 100644
index 000..68e2bce
--- /dev/null
+++ b/opensm/man/torus-2QoS.8.in
@@ -0,0 +1,476 @@
+.TH TORUS\-2QOS 8 "November 10, 2010" "OpenIB" "OpenIB Management"
+.
+.SH NAME
+torus\-2QoS \- Routing engine for OpenSM subnet manager
+.
+.SH DESCRIPTION
+.
+Torus-2QoS is routing algorithm designed for large-scale 2D/3D torus fabrics.
+The torus-2QoS routing engine can provide the following functionality on
+a 2D/3D torus:
+.br
+\" roff illiteracy leads to following brain-dead list implementation
+\"
+.na  \" otherwise line space adjustment can add spaces between dash and text
+.in +2m
+\[en]
+'in +2m
+Routing that is free of credit loops.
+.in
+\[en]
+'in +2m
+Two levels of Quality of Service (QoS), assuming switches and channel
+adapters support eight data VLs.
+.in
+\[en]
+'in +2m
+The ability to route around a single failed switch, and/or multiple failed
+links, without
+.in
+.in +2m
+\[en]
+'in +2
+introducing credit loops, or
+.in
+\[en]
+'in +2m
+changing path SL values.
+.in -4m
+\[en]
+'in +2m
+Very short run times, with good scaling properties as fabric size increases.
+.ad
+.
+.SH UNICAST ROUTING
+.
+Unicast routing in torus-2QoS is based on Dimension Order Routing (DOR).
+It avoids the deadlocks that would otherwise occur in a DOR-routed
+torus using the concept of a dateline for each torus dimension.
+It encodes into a path SL which datelines the path crosses, as follows:
+\f(CR
+.P
+.nf
+sl = 0;
+for (d = 0; d < torus_dimensions; d++) {
+/* path_crosses_dateline(d) returns 0 or 1 */
+sl |= path_crosses_dateline(d) << d;
+}
+.fi
+\fR
+.P
+On a 3D torus this consumes three SL bits, leaving one SL bit unused.
+Torus-2QoS uses this SL bit to implement two QoS levels.
+.P
+Torus-2QoS also makes use of the output port
+dependence of switch SL2VL maps to encode into one VL bit the
+information encoded in three SL bits.
+It computes in which torus coordinate direction each inter-switch link
+points, and writes SL2VL maps for such ports as follows:
+\f(CR
+.P
+.nf
+for (sl = 0; sl < 16; sl++) {
+/* cdir(port) computes which torus coordinate direction
+ * a switch port points in; returns 0, 1, or 2
+ */
+sl2vl(iport,oport,sl) = 0x1 & (sl >> cdir(oport));
+}
+.fi
+\fR
+.P
+Thus, on a pristine 3D torus,
+\fIi.e.\fR,
+in the absence of failed fabric switches,
+torus-2QoS consumes eight SL values (SL bits 0-2) and
+two VL values (VL bit 0) per QoS level to provide deadlock-free routing.
+.P
+Torus-2QoS routes around link failure by taking the long way around any
+1D ring interrupted by link failure.  For example, consider the 2D 6x5
+torus below, where switches are denoted by [+a-zA-Z]:
+.
+.
+\# define macros to start and end ascii art, assuming Roman font.
+\# the start macro takes an argument which is the width in ems of
+\# the ascii art, and is used to center it.
+\#
+.de ascii_art
+.nop \f(CR
+.nr indent_in_ems

[PATCH 09/13] opensm/osm_subnet.c: Add torus-2QoS config file option to those configurable via opensm config file.

2010-11-12 Thread Jim Schutt

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/opensm/osm_subnet.c |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c
index be406ac..f2ca36f 100644
--- a/opensm/opensm/osm_subnet.c
+++ b/opensm/opensm/osm_subnet.c
@@ -352,6 +352,7 @@ static const opt_rec_t opt_tbl[] = {
{ "guid_routing_order_file", OPT_OFFSET(guid_routing_order_file), 
opts_parse_charp, NULL, 0 },
{ "sa_db_file", OPT_OFFSET(sa_db_file), opts_parse_charp, NULL, 0 },
{ "sa_db_dump", OPT_OFFSET(sa_db_dump), opts_parse_boolean, NULL, 1 },
+   { "torus_config", OPT_OFFSET(torus_conf_file), opts_parse_charp, NULL, 
1 },
{ "do_mesh_analysis", OPT_OFFSET(do_mesh_analysis), opts_parse_boolean, 
NULL, 1 },
{ "exit_on_fatal", OPT_OFFSET(exit_on_fatal), opts_parse_boolean, NULL, 
1 },
{ "honor_guid2lid_file", OPT_OFFSET(honor_guid2lid_file), 
opts_parse_boolean, NULL, 1 },
@@ -1447,6 +1448,10 @@ int osm_subn_output_conf(FILE *out, IN osm_subn_opt_t * 
p_opts)
p_opts->sa_db_dump ? "TRUE" : "FALSE");
 
fprintf(out,
+   "# Torus-2QoS configuration file name\ntorus_config %s\n\n",
+   p_opts->torus_conf_file ? p_opts->torus_conf_file : null_str);
+
+   fprintf(out,
"#\n# HANDOVER - MULTIPLE SMs OPTIONS\n#\n"
"# SM priority used for deciding who is the master\n"
"# Range goes from 0 (lowest priority) to 15 (highest).\n"
-- 
1.6.2.2




[PATCH] opensm/osm_qos.c: Make offset of VL in VLarb block element match IBA spec

2010-11-02 Thread Jim Schutt
According to IBA 1.2.1, Table 152, page 845, the VL in a VLArbitration Table
Block Element has length 4 bits, starting at offset 4 in the 16 bit
Block Element.

Currently, the data being sent to the switches has the VL starting at
offset 0 in the 16 bit Block Element.

Fix things up to match the spec.

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/opensm/osm_qos.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/opensm/opensm/osm_qos.c b/opensm/opensm/osm_qos.c
index c90073e..cc38151 100644
--- a/opensm/opensm/osm_qos.c
+++ b/opensm/opensm/osm_qos.c
@@ -365,7 +365,7 @@ static int parse_vlarb_entry(char *str, ib_vl_arb_element_t 
* e)
unsigned val;
char *p = str;
p += parse_one_unsigned(p, ':', &val);
-   e->vl = val % 15;
+   e->vl = (val % 15) << 4;
p += parse_one_unsigned(p, ',', &val);
e->weight = (uint8_t) val;
return (int)(p - str);
-- 
1.6.2.2




Re: opensm: Add new torus routing engine: torus-2QoS

2010-10-28 Thread Jim Schutt

On Thu, 2010-10-28 at 07:42 -0600, Albino A. Aveleda wrote:
> Dear Jim,
> 
> I have compiled and installed the opensm with torus-2qos.
> There is comment below in config file.
> ---
> # We need to tell the routing engine what directions we
> # want the torus coordinate directions to be, by specifing
> # the endpoints (switch GUID + port) of a link in each
> # direction. These links need to share a common switch,
> # which we call the torus seed.
> # Here we specify positive coordinate directions:
> xp_link 0x20  0x200019   # S_0_0_0 -> S_1_0_0
> yp_link 0x20  0x25   # S_0_0_0 -> S_0_1_0
> zp_link 0x20  0x21   # S_0_0_0 -> S_0_0_1
> ---
> 
> How do I get xp, yp and zp_link address?

You can bring up your fabric once with some other routing
engine, say minhop, and run ibnetdiscover.

This will tell you the node GUIDs for all your switches.

Then you need to pick a switch to be the seed.  If you
know that, e.g. your fabric is wired such that port 1,
say, connects to the switch in the direction you want to
be +x, look in your ibnetdiscover output to find the 
node GUID for the switch connected to that port of your
seed switch.

For maximum resiliency, pick the switch that your
opensm host connects to as the seed.

-- Jim

 
> Best regards,
> Albino
> 
> - Jim Schutt jasc...@sandia.gov wrote:
> 
> > 
> > This posting http://www.spinics.net/lists/linux-rdma/msg02967.html
> > has some example input for a 5x5x5 torus.
> > 
> > You'll want to configure your torus (via opensm --torus_config
> > file)
> > so that the intra-NEM links are z-direction links.  This will allow
> > you to swap a QNEM and keep the fabric routable during the process.
> > 
> > Please look over the torus-2QoS section in
> > opensm/doc/current-routing.txt
> > to see why this is so, and to help understand why the info
> > in torus-2QoS.conf is required.
  
 




Re: opensm: Add new torus routing engine: torus-2QoS

2010-10-28 Thread Jim Schutt

On Thu, 2010-10-28 at 09:48 -0600, Hal Rosenstock wrote:
 On Thu, Oct 28, 2010 at 11:05 AM, Jim Schutt jasc...@sandia.gov wrote:
 
  On Thu, 2010-10-28 at 07:42 -0600, Albino A. Aveleda wrote:
  Dear Jim,
 
  I have compiled and installed the opensm with torus-2qos.
  There is comment below in config file.
  ---
  # We need to tell the routing engine what directions we
  # want the torus coordinate directions to be, by specifing
  # the endpoints (switch GUID + port) of a link in each
  # direction. These links need to share a common switch,
  # which we call the torus seed.
  # Here we specify positive coordinate directions:
  xp_link 0x20  0x200019   # S_0_0_0 - S_1_0_0
  yp_link 0x20  0x25   # S_0_0_0 - S_0_1_0
  zp_link 0x20  0x21   # S_0_0_0 - S_0_0_1
  ---
 
  How do I get xp, yp and zp_link address?
 
  You can bring up your fabric once with some other routing
  engine, say minhop, and run ibnetdiscover.
 
 
  This will tell you the node GUIDs for all your switches.
 
 A minor clarification to the above: you don't need to run OpenSM at
 all to run ibnetdiscover to get the switch GUIDs.

D'oh!!  Thanks, Hal.

-- Jim

 
 -- Hal
 
  Then you need to pick a switch to be the seed.  If you
  know that, e.g. your fabric is wired such that port 1,
  say, connects to the switch in the direction you want to
  be +x, look in your ibnetdiscover output to find the
  node GUID for the switch connected to that port of your
  seed switch.
 
  For maximum resiliency, pick the switch that your
  opensm host connects to as the seed.
 
  -- Jim
 
 
  Best regards,
  Albino
 
  - Jim Schutt jasc...@sandia.gov wrote:
 
  
   This posting http://www.spinics.net/lists/linux-rdma/msg02967.html
   has some example input for a 5x5x5 torus.
  
   You'll want to configure your torus (via opensm --torus_config
   file)
   so that the intra-NEM links are z-direction links.  This will allow
   you to swap a QNEM and keep the fabric routable during the process.
  
   Please look over the torus-2QoS section in
   opensm/doc/current-routing.txt
   to see why this is so, and to help understand why the info
   in torus-2QoS.conf is required.
  
 
 
 
  --
  To unsubscribe from this list: send the line unsubscribe linux-rdma in
  the body of a message to majord...@vger.kernel.org
  More majordomo info at  http://vger.kernel.org/majordomo-info.html
 
 




[PATCH 2/2] opensm/osm_torus.c: Handle calloc() failure on routing engine context creation.

2010-09-17 Thread Jim Schutt
Hal Rosenstock pointed out this calloc() could fail.
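
A minimal sketch of the pattern this patch applies, not the OpenSM code itself: the constructor returns NULL when calloc() fails, and the caller reports an error instead of dereferencing the result. The names demo_context, demo_context_create and demo_setup are hypothetical stand-ins, and logging goes to stderr rather than through OSM_LOG.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct torus_context. */
struct demo_context {
	void *owner;
};

/* Returns NULL if calloc() fails; logs the reason to stderr. */
static struct demo_context *demo_context_create(void *owner)
{
	struct demo_context *ctx = calloc(1, sizeof(*ctx));

	if (ctx)
		ctx->owner = owner;
	else
		fprintf(stderr, "Error: calloc: %s\n", strerror(errno));
	return ctx;
}

/* Caller side: report failure rather than using a NULL context. */
static int demo_setup(void **slot, void *owner)
{
	struct demo_context *ctx = demo_context_create(owner);

	if (!ctx)
		return -1;
	*slot = ctx;
	return 0;
}
```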

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/opensm/osm_torus.c |8 +++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/opensm/opensm/osm_torus.c b/opensm/opensm/osm_torus.c
index 12b480d..3b67f16 100644
--- a/opensm/opensm/osm_torus.c
+++ b/opensm/opensm/osm_torus.c
@@ -410,7 +410,11 @@ struct torus_context *torus_context_create(osm_opensm_t 
*osm)
struct torus_context *ctx;
 
ctx = calloc(1, sizeof(*ctx));
-   ctx->osm = osm;
+   if (ctx)
+   ctx->osm = osm;
+   else
+   OSM_LOG(&osm->log, OSM_LOG_ERROR,
+   "Error: calloc: %s\n", strerror(errno));
 
return ctx;
 }
@@ -9113,6 +9117,8 @@ int osm_ucast_torus2QoS_setup(struct osm_routing_engine 
*r,
struct torus_context *ctx;
 
ctx = torus_context_create(osm);
+   if (!ctx)
+   return -1;
 
r->context = ctx;
r->ucast_build_fwd_tables = torus_build_lfts;
-- 
1.6.2.2




[PATCH v4 04/18] opensm: Track the minimum value in the fabric of data VLs supported.

2010-09-03 Thread Jim Schutt
A routing engine that wants to make contributions to SL2VL maps in support
of routing free from credit loops may need to know the minimum number
of supported data VLs in the fabric.

This code tracks that value.
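
As an illustration only, the tracking can be sketched as below. The sketch assumes the OperationalVLs field uses the encodings 1 through 5 (VL0, VL0-1, VL0-3, VL0-7, VL0-14) and that IB_MAX_NUM_VLS is 16; demo_data_vls, demo_track_min and DEMO_MAX_NUM_VLS are hypothetical names, not OpenSM identifiers.

```c
/* Assumed value of IB_MAX_NUM_VLS (VL0-VL14 for data plus VL15). */
#define DEMO_MAX_NUM_VLS 16

/*
 * Map an OperationalVLs encoding to a count of data VLs.
 * Precondition: op_vls is in the range 1..5.
 */
static unsigned demo_data_vls(unsigned op_vls)
{
	unsigned data_vls = 1U << (op_vls - 1);

	/* Encoding 5 yields 16, but only VL0-VL14 carry data. */
	if (data_vls >= DEMO_MAX_NUM_VLS)
		data_vls = DEMO_MAX_NUM_VLS - 1;
	return data_vls;
}

/* Fold one port's value into the running fabric-wide minimum. */
static unsigned demo_track_min(unsigned current_min, unsigned op_vls)
{
	unsigned data_vls = demo_data_vls(op_vls);

	return data_vls < current_min ? data_vls : current_min;
}
```

Starting each sweep from DEMO_MAX_NUM_VLS - 1 and folding in every endport lets the minimum rise again once a limiting component leaves the fabric.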

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/include/opensm/osm_subnet.h |1 +
 opensm/opensm/osm_port_info_rcv.c  |   13 -
 opensm/opensm/osm_state_mgr.c  |6 ++
 opensm/opensm/osm_subnet.c |1 +
 4 files changed, 20 insertions(+), 1 deletions(-)

diff --git a/opensm/include/opensm/osm_subnet.h 
b/opensm/include/opensm/osm_subnet.h
index 95a635c..4fa0161 100644
--- a/opensm/include/opensm/osm_subnet.h
+++ b/opensm/include/opensm/osm_subnet.h
@@ -536,6 +536,7 @@ typedef struct osm_subn {
uint16_t max_mcast_lid_ho;
uint8_t min_ca_mtu;
uint8_t min_ca_rate;
+   uint8_t min_data_vls;
boolean_t ignore_existing_lfts;
boolean_t subnet_initialization_error;
boolean_t force_heavy_sweep;
diff --git a/opensm/opensm/osm_port_info_rcv.c 
b/opensm/opensm/osm_port_info_rcv.c
index 9260047..c05301e 100644
--- a/opensm/opensm/osm_port_info_rcv.c
+++ b/opensm/opensm/osm_port_info_rcv.c
@@ -83,6 +83,7 @@ static void pi_rcv_process_endport(IN osm_sm_t * sm, IN 
osm_physp_t * p_physp,
ib_api_status_t status;
ib_net64_t port_guid;
uint8_t rate, mtu;
+   unsigned data_vls;
cl_qmap_t *p_sm_tbl;
osm_remote_sm_t *p_sm;
 
@@ -92,7 +93,7 @@ static void pi_rcv_process_endport(IN osm_sm_t * sm, IN 
osm_physp_t * p_physp,
 
/* HACK extended port 0 should be handled too! */
if (osm_physp_get_port_num(p_physp) != 0) {
-   /* track the minimal endport MTU and rate */
+   /* track the minimal endport MTU, rate, and operational VLs */
mtu = ib_port_info_get_mtu_cap(p_pi);
if (mtu < sm->p_subn->min_ca_mtu) {
OSM_LOG(sm->p_log, OSM_LOG_VERBOSE,
@@ -108,6 +109,16 @@ static void pi_rcv_process_endport(IN osm_sm_t * sm, IN 
osm_physp_t * p_physp,
PRIx64 "\n", rate, cl_ntoh64(port_guid));
sm->p_subn->min_ca_rate = rate;
}
+
+   data_vls = 1U << (ib_port_info_get_op_vls(p_pi) - 1);
+   if (data_vls >= IB_MAX_NUM_VLS)
+   data_vls = IB_MAX_NUM_VLS - 1;
+   if ((uint8_t)data_vls < sm->p_subn->min_data_vls) {
+   OSM_LOG(sm->p_log, OSM_LOG_VERBOSE,
+   "Setting endport minimal data VLs to:%u defined by port:0x%"
+   PRIx64 "\n", data_vls, cl_ntoh64(port_guid));
+   sm->p_subn->min_data_vls = data_vls;
+   }
}
 
if (port_guid != sm->p_subn->sm_port_guid) {
diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c
index a3d09d8..bb60636 100644
--- a/opensm/opensm/osm_state_mgr.c
+++ b/opensm/opensm/osm_state_mgr.c
@@ -1171,6 +1171,12 @@ repeat_discovery:
sm->p_subn->force_reroute = FALSE;
sm->p_subn->subnet_initialization_error = FALSE;
 
+   /* Reset tracking values in case limiting component got removed
+* from fabric. */
+   sm->p_subn->min_ca_mtu = IB_MAX_MTU;
+   sm->p_subn->min_ca_rate = IB_MAX_RATE;
+   sm->p_subn->min_data_vls = IB_MAX_NUM_VLS - 1;
+
/* rescan configuration updates */
if (!config_parsed && osm_subn_rescan_conf_files(sm->p_subn) < 0)
OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 331A: "
diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c
index d5c5ab2..8224b5f 100644
--- a/opensm/opensm/osm_subnet.c
+++ b/opensm/opensm/osm_subnet.c
@@ -529,6 +529,7 @@ ib_api_status_t osm_subn_init(IN osm_subn_t * p_subn, IN 
osm_opensm_t * p_osm,
p_subn->max_mcast_lid_ho = IB_LID_MCAST_END_HO;
p_subn->min_ca_mtu = IB_MAX_MTU;
p_subn->min_ca_rate = IB_MAX_RATE;
+   p_subn->min_data_vls = IB_MAX_NUM_VLS - 1;
p_subn->ignore_existing_lfts = TRUE;
 
/* we assume master by default - so we only need to set it true if 
STANDBY */
-- 
1.6.2.2




[PATCH v4 02/18] opensm: Allow the routing engine to influence SL2VL calculations.

2010-09-03 Thread Jim Schutt
Note that the original code assumes that QoS setup is mostly static and
based only on user configuration.  As a result, there is no provision for
routing engines that want to compute contributions to the SL2VL maps.

Fix this up by adding a callback to struct osm_routing_engine that computes
a per-port SL2VL map, and call it from the appropriate place in the QoS
setup path.  Assume that if a routing engine provides an update_sl2vl()
callback, there will be input-port dependence in the SL2VL maps, and
so do not attempt to use optimized SL2VL map programming even if the
switch supports it.

Also need to move the call to osm_qos_setup() in do_sweep() to after the
call to the routing engine, so that any SL2VL map contributions from the
routing engine are based on the latest information.  Need to call
osm_qos_setup() for requested reroute for the same reason.
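
The callback flow described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions: demo_slvl_table stores one VL per byte (unlike the packed ib_slvl_table_t), the hook omits the osm_physp_t argument, and demo_engine, demo_select_sl2vl and demo_hook are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-entry SL-to-VL table, one byte per SL for clarity. */
struct demo_slvl_table {
	uint8_t vl[16];
};

/* Hypothetical routing engine with an optional SL2VL hook. */
struct demo_engine {
	void *context;
	void (*update_sl2vl)(void *context, uint8_t in_port,
			     uint8_t out_port, struct demo_slvl_table *t);
};

/*
 * Pick the table to program for one (in, out) port pair: start from the
 * configured table, and let the engine adjust a private copy if it
 * provides the hook.  The configured table itself is never modified.
 */
static const struct demo_slvl_table *
demo_select_sl2vl(const struct demo_engine *re,
		  const struct demo_slvl_table *configured,
		  struct demo_slvl_table *scratch,
		  uint8_t in_port, uint8_t out_port)
{
	if (!re || !re->update_sl2vl)
		return configured;

	*scratch = *configured;
	re->update_sl2vl(re->context, in_port, out_port, scratch);
	return scratch;
}

/* Example hook: force SL 0 onto VL 1 when leaving through port 2. */
static void demo_hook(void *context, uint8_t in_port, uint8_t out_port,
		      struct demo_slvl_table *t)
{
	(void)context;
	(void)in_port;
	if (out_port == 2)
		t->vl[0] = 1;
}
```

Working on a copy keeps the user-configured map intact, so an engine without the hook (or a later sweep with a different engine) still sees the original values.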

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/include/opensm/osm_opensm.h |   12 
 opensm/opensm/osm_qos.c|   27 +++
 opensm/opensm/osm_state_mgr.c  |5 +++--
 3 files changed, 38 insertions(+), 6 deletions(-)

diff --git a/opensm/include/opensm/osm_opensm.h 
b/opensm/include/opensm/osm_opensm.h
index e97142e..25a6f90 100644
--- a/opensm/include/opensm/osm_opensm.h
+++ b/opensm/include/opensm/osm_opensm.h
@@ -126,6 +126,9 @@ struct osm_routing_engine {
int (*build_lid_matrices) (void *context);
int (*ucast_build_fwd_tables) (void *context);
void (*ucast_dump_tables) (void *context);
+   void (*update_sl2vl)(void *context, IN osm_physp_t *port,
+IN uint8_t in_port_num, IN uint8_t out_port_num,
+IN OUT ib_slvl_table_t *t);
void (*delete) (void *context);
struct osm_routing_engine *next;
 };
@@ -147,6 +150,15 @@ struct osm_routing_engine {
 *  ucast_dump_tables
 *  The callback for dumping unicast routing tables.
 *
+*  update_sl2vl(void *context, IN osm_physp_t *port,
+*   IN uint8_t in_port_num, IN uint8_t out_port_num,
+*   OUT ib_slvl_table_t *t)
+*  The callback to allow routing engine input for SL2VL maps.
+*  *port is the phyical port for which the SL2VL map is to be
+*  updated. For switches, in_port_num/out_port_num identify
+*  which part of the SL2VL map to update.  For router/HCA ports,
+*  in_port_num/out_port_num should be ignored.
+*
 *  delete
 *  The delete method, may be used for routing engine
 *  internals cleanup.
diff --git a/opensm/opensm/osm_qos.c b/opensm/opensm/osm_qos.c
index c90073e..e0f4411 100644
--- a/opensm/opensm/osm_qos.c
+++ b/opensm/opensm/osm_qos.c
@@ -207,6 +207,7 @@ static int qos_extports_setup(osm_sm_t * sm, osm_node_t 
*node,
osm_physp_t *p0, *p;
unsigned force_update;
unsigned num_ports = osm_node_get_num_physp(node);
+   struct osm_routing_engine *re = sm->p_subn->p_osm->routing_engine_used;
int ret = 0;
unsigned in, out;
uint8_t op_vl1;
@@ -224,7 +225,7 @@ static int qos_extports_setup(osm_sm_t * sm, osm_node_t 
*node,
return ret;
 
if (ib_switch_info_get_opt_sl2vlmapping(&node->sw->switch_info) &&
-   sm->p_subn->opt.use_optimized_slvl) {
+   sm->p_subn->opt.use_optimized_slvl && !re->update_sl2vl) {
p = osm_node_get_physp_ptr(node, 1);
op_vl1 = ib_port_info_get_op_vls(&p->port_info);
force_update = p->need_update || sm->p_subn->need_update;
@@ -249,10 +250,20 @@ static int qos_extports_setup(osm_sm_t * sm, osm_node_t 
*node,
p = osm_node_get_physp_ptr(node, out);
force_update = p->need_update || sm->p_subn->need_update;
/* go over all in ports */
-   for (in = 0; in < num_ports; in++)
+   for (in = 0; in < num_ports; in++) {
+   const ib_slvl_table_t *port_sl2vl = &qcfg->sl2vl;
+   ib_slvl_table_t routing_sl2vl;
+
+   if (re->update_sl2vl) {
+   routing_sl2vl = *port_sl2vl;
+   re->update_sl2vl(re->context,
+p, in, out, &routing_sl2vl);
+   port_sl2vl = &routing_sl2vl;
+   }
if (sl2vl_update_table(sm, p, in, in << 8 | out,
-  force_update, &qcfg->sl2vl))
+  force_update, port_sl2vl))
ret = -1;
+   }
}
 
return ret;
@@ -262,6 +273,9 @@ static int qos_endport_setup(osm_sm_t * sm, osm_physp_t * p,
 const struct qos_config *qcfg, int vlarb_only)
 {
unsigned force_update = p->need_update || sm->p_subn->need_update;
+   struct

[PATCH v4 12/18] opensm: Enable torus-2QoS routing engine.

2010-09-03 Thread Jim Schutt

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/include/opensm/osm_opensm.h |1 +
 opensm/opensm/main.c   |2 +-
 opensm/opensm/osm_opensm.c |6 ++
 3 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/opensm/include/opensm/osm_opensm.h 
b/opensm/include/opensm/osm_opensm.h
index fddcf53..8d63111 100644
--- a/opensm/include/opensm/osm_opensm.h
+++ b/opensm/include/opensm/osm_opensm.h
@@ -105,6 +105,7 @@ typedef enum _osm_routing_engine_type {
OSM_ROUTING_ENGINE_TYPE_FTREE,
OSM_ROUTING_ENGINE_TYPE_LASH,
OSM_ROUTING_ENGINE_TYPE_DOR,
+   OSM_ROUTING_ENGINE_TYPE_TORUS_2QOS,
OSM_ROUTING_ENGINE_TYPE_UNKNOWN
 } osm_routing_engine_type_t;
 /***/
diff --git a/opensm/opensm/main.c b/opensm/opensm/main.c
index 6e6c733..3a6d565 100644
--- a/opensm/opensm/main.c
+++ b/opensm/opensm/main.c
@@ -174,7 +174,7 @@ static void show_usage(void)
 Min Hop algorithm.  Multiple routing engines can be 
specified\n
 separated by commas so that specific ordering of 
routing\n
 algorithms will be tried if earlier routing engines 
fail.\n
-Supported engines: updn, file, ftree, lash, dor\n\n);
+Supported engines: updn, file, ftree, lash, dor, 
torus-2QoS\n\n);
printf(--do_mesh_analysis\n
 This option enables additional analysis for the 
lash\n
 routing engine to precondition switch port 
assignments\n
diff --git a/opensm/opensm/osm_opensm.c b/opensm/opensm/osm_opensm.c
index 1c865b0..a69b7bb 100644
--- a/opensm/opensm/osm_opensm.c
+++ b/opensm/opensm/osm_opensm.c
@@ -70,6 +70,7 @@ extern int osm_ucast_file_setup(struct osm_routing_engine *, 
osm_opensm_t *);
 extern int osm_ucast_ftree_setup(struct osm_routing_engine *, osm_opensm_t *);
 extern int osm_ucast_lash_setup(struct osm_routing_engine *, osm_opensm_t *);
 extern int osm_ucast_dor_setup(struct osm_routing_engine *, osm_opensm_t *);
+extern int osm_ucast_torus2QoS_setup(struct osm_routing_engine *, osm_opensm_t 
*);
 
 const static struct routing_engine_module routing_modules[] = {
{"minhop", osm_ucast_minhop_setup},
@@ -78,6 +79,7 @@ const static struct routing_engine_module routing_modules[] = 
{
{"ftree", osm_ucast_ftree_setup},
{"lash", osm_ucast_lash_setup},
{"dor", osm_ucast_dor_setup},
+   {"torus-2QoS", osm_ucast_torus2QoS_setup},
{NULL, NULL}
 };
 
@@ -98,6 +100,8 @@ const char *osm_routing_engine_type_str(IN 
osm_routing_engine_type_t type)
return "lash";
case OSM_ROUTING_ENGINE_TYPE_DOR:
return "dor";
+   case OSM_ROUTING_ENGINE_TYPE_TORUS_2QOS:
+   return "torus-2QoS";
default:
break;
}
@@ -124,6 +128,8 @@ osm_routing_engine_type_t osm_routing_engine_type(IN const 
char *str)
return OSM_ROUTING_ENGINE_TYPE_LASH;
else if (!strcasecmp(str, "dor"))
return OSM_ROUTING_ENGINE_TYPE_DOR;
+   else if (!strcasecmp(str, "torus-2QoS"))
+   return OSM_ROUTING_ENGINE_TYPE_TORUS_2QOS;
else
return OSM_ROUTING_ENGINE_TYPE_UNKNOWN;
 }
-- 
1.6.2.2




[PATCH v4 18/18] opensm: Cause status of unicast routing attempt to propogate to callers of osm_ucast_mgr_process().

2010-09-03 Thread Jim Schutt
If unicast routing fails, there is no point to continuing with fabric bring-up.
Just restart a new heavy sweep instead.
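
A rough sketch of the status propagation, assuming a simplified engine list: each engine's route() returns 0 on success, and the first success stops the search. The names demo_router and demo_route_fabric are hypothetical, and unlike the patch, which initializes its status to 0, this sketch returns -1 for an empty engine list.

```c
#include <stddef.h>

/* Hypothetical routing engine: route() returns 0 on success. */
struct demo_router {
	int (*route)(void *context);
	void *context;
	struct demo_router *next;
};

/*
 * Try each configured engine in order.  Returns 0 as soon as one
 * succeeds, otherwise the status of the last engine tried (or -1 if
 * the list is empty), so the caller can abandon the sweep.
 */
static int demo_route_fabric(struct demo_router *engines,
			     struct demo_router **used)
{
	int failed = -1;
	struct demo_router *r;

	*used = NULL;
	for (r = engines; r; r = r->next) {
		failed = r->route(r->context);
		if (!failed) {
			*used = r;
			break;
		}
	}
	return failed;
}

/* Two trivial engines for exercising the loop. */
static int demo_always_fails(void *context) { (void)context; return 1; }
static int demo_always_works(void *context) { (void)context; return 0; }
```

A caller would skip the remaining bring-up steps, such as QoS setup, whenever the return value is nonzero.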

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/opensm/osm_state_mgr.c |   12 +---
 opensm/opensm/osm_ucast_mgr.c |   14 +-
 2 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c
index bb60636..1befbfe 100644
--- a/opensm/opensm/osm_state_mgr.c
+++ b/opensm/opensm/osm_state_mgr.c
@@ -1142,7 +1142,11 @@ static void do_sweep(osm_sm_t * sm)
/* Re-program the switches fully */
sm->p_subn->ignore_existing_lfts = TRUE;
 
-   osm_ucast_mgr_process(&sm->ucast_mgr);
+   if (osm_ucast_mgr_process(&sm->ucast_mgr)) {
+   OSM_LOG_MSG_BOX(sm->p_log, OSM_LOG_VERBOSE,
+   "REROUTE FAILED");
+   return;
+   }
osm_qos_setup(sm->p_subn->p_osm);
 
/* Reset flag */
@@ -1313,12 +1317,14 @@ repeat_discovery:
"LID ASSIGNMENT COMPLETE - STARTING SWITCH TABLE CONFIG");
 
/*
-* Proceed with unicast forwarding table configuration.
+* Proceed with unicast forwarding table configuration; if it fails
+* return early to wait for a trap or the next sweep interval.
 */
 
if (!sm->ucast_mgr.cache_valid ||
osm_ucast_cache_process(&sm->ucast_mgr))
-   osm_ucast_mgr_process(&sm->ucast_mgr);
+   if (osm_ucast_mgr_process(&sm->ucast_mgr))
+   return;
 
osm_qos_setup(sm->p_subn->p_osm);
 
diff --git a/opensm/opensm/osm_ucast_mgr.c b/opensm/opensm/osm_ucast_mgr.c
index f5a715f..85495eb 100644
--- a/opensm/opensm/osm_ucast_mgr.c
+++ b/opensm/opensm/osm_ucast_mgr.c
@@ -1069,6 +1069,7 @@ int osm_ucast_mgr_process(IN osm_ucast_mgr_t * p_mgr)
osm_opensm_t *p_osm;
struct osm_routing_engine *p_routing_eng;
cl_qmap_t *p_sw_guid_tbl;
+   int failed = 0;
 
OSM_LOG_ENTER(p_mgr->p_log);
 
@@ -1087,7 +1088,8 @@ int osm_ucast_mgr_process(IN osm_ucast_mgr_t * p_mgr)
 
p_osm->routing_engine_used = NULL;
while (p_routing_eng) {
-   if (!ucast_mgr_route(p_routing_eng, p_osm))
+   failed = ucast_mgr_route(p_routing_eng, p_osm);
+   if (!failed)
break;
p_routing_eng = p_routing_eng->next;
}
@@ -1098,9 +1100,11 @@ int osm_ucast_mgr_process(IN osm_ucast_mgr_t * p_mgr)
struct osm_routing_engine *r = p_osm->default_routing_engine;
 
r->build_lid_matrices(r->context);
-   r->ucast_build_fwd_tables(r->context);
-   p_osm->routing_engine_used = r;
-   osm_ucast_mgr_set_fwd_tables(p_mgr);
+   failed = r->ucast_build_fwd_tables(r->context);
+   if (!failed) {
+   p_osm->routing_engine_used = r;
+   osm_ucast_mgr_set_fwd_tables(p_mgr);
+   }
}
 
if (p_osm->routing_engine_used) {
@@ -1120,7 +1124,7 @@ int osm_ucast_mgr_process(IN osm_ucast_mgr_t * p_mgr)
 Exit:
CL_PLOCK_RELEASE(p_mgr->p_lock);
OSM_LOG_EXIT(p_mgr->p_log);
-   return 0;
+   return failed;
 }
 
 static int ucast_build_lid_matrices(void *context)
-- 
1.6.2.2




[PATCH v4 17/18] opensm: Avoid havoc in dump_ucast_routes() caused by torus-2QoS persistent use of osm_port_t:priv.

2010-09-03 Thread Jim Schutt
Torus-2QoS makes persistent use of osm_port_t:priv to speed calculation
of path SL values.

However, osm_switch_recommend_path() uses a non-NULL osm_port_t:priv
as a flag that osm_port_t:priv holds a tracking array used when
LMC > 0.  It turns out that 1) dump_ucast_routes() does not need
osm_switch_recommend_path() to consider alternate routes, and 2)
before the addition of torus-2QoS, osm_port_t:priv use never
persisted past the unicast routing function, so it was always
NULL on entry to dump_ucast_routes().

Fix this up by making the routing_for_lmc flag explicitly set by
the caller of osm_switch_recommend_path(), rather than inferring
it from osm_port_t:priv.  This retains existing behavior for
existing routing engines, and allows torus-2QoS to make persistent
use of osm_port_t:priv.

The alternative would be to add another member to osm_port_t,
say osm_port_t:priv2.
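
To illustrate why an explicit flag is safer than inferring intent from a shared pointer, here is a small sketch. All names (demo_port, demo_lmc_track, demo_recommend_path) are hypothetical, and the path selection itself is omitted; only the decision about whether priv may be treated as LMC tracking state is shown.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical port with an opaque per-engine pointer, like osm_port_t:priv. */
struct demo_port {
	void *priv;
};

/* Hypothetical LMC tracking state that a router may hang off priv. */
struct demo_lmc_track {
	unsigned num_used;
};

/*
 * Returns true if LMC-aware tracking was consulted.  The decision comes
 * from the caller's flag, never from whether priv happens to be set, so
 * another owner of priv cannot be mistaken for a tracking array.
 */
static bool demo_recommend_path(struct demo_port *port, bool routing_for_lmc)
{
	if (routing_for_lmc && port->priv) {
		struct demo_lmc_track *track = port->priv;

		track->num_used++;
		return true;
	}
	return false;
}
```

A dump routine would pass false and leave whatever priv points to untouched, while the unicast manager would pass true only when it has installed its own tracking state.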

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/include/opensm/osm_switch.h |   12 
 opensm/opensm/osm_dump.c   |2 +-
 opensm/opensm/osm_switch.c |7 ---
 opensm/opensm/osm_ucast_mgr.c  |1 +
 4 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/opensm/include/opensm/osm_switch.h 
b/opensm/include/opensm/osm_switch.h
index 51a8427..f407dd9 100644
--- a/opensm/include/opensm/osm_switch.h
+++ b/opensm/include/opensm/osm_switch.h
@@ -918,6 +918,7 @@ uint8_t osm_switch_recommend_path(IN const osm_switch_t * 
p_sw,
  IN osm_port_t * p_port, IN uint16_t lid_ho,
  IN unsigned start_from,
  IN boolean_t ignore_existing,
+ IN boolean_t routing_for_lmc,
  IN boolean_t dor);
 /*
 * PARAMETERS
@@ -940,6 +941,17 @@ uint8_t osm_switch_recommend_path(IN const osm_switch_t * 
p_sw,
 *  If false, the switch will choose an existing route if one
 *  exists, otherwise will choose the optimal route.
 *
+*  routing_for_lmc
+*  [in] We support an enhanced LMC aware routing mode:
+*  In the case of LMC > 0, we can track the remote side
+*  system and node for all of the lids of the target
+*  and try and avoid routing again through the same
+*  system / node.
+*
+*  Assume if routing_for_lmc is TRUE that this procedure
+*  was provided with the tracking array and counter via
+*  p_port->priv, and we can conduct this algorithm.
+*
 *  dor
 *  [in] If TRUE, Dimension Order Routing will be done.
 *
diff --git a/opensm/opensm/osm_dump.c b/opensm/opensm/osm_dump.c
index bfff1a0..535a03f 100644
--- a/opensm/opensm/osm_dump.c
+++ b/opensm/opensm/osm_dump.c
@@ -221,7 +221,7 @@ static void dump_ucast_routes(cl_map_item_t * item, FILE * 
file, void *cxt)
/* No LMC Optimization */
best_port = osm_switch_recommend_path(p_sw, p_port,
  lid_ho, 1, TRUE,
- dor);
+ FALSE, dor);
fprintf(file, "No %u hop path possible via port %u!",
best_hops, best_port);
}
diff --git a/opensm/opensm/osm_switch.c b/opensm/opensm/osm_switch.c
index b621852..9785a9d 100644
--- a/opensm/opensm/osm_switch.c
+++ b/opensm/opensm/osm_switch.c
@@ -216,6 +216,7 @@ uint8_t osm_switch_recommend_path(IN const osm_switch_t * 
p_sw,
  IN osm_port_t * p_port, IN uint16_t lid_ho,
  IN unsigned start_from,
  IN boolean_t ignore_existing,
+ IN boolean_t routing_for_lmc,
  IN boolean_t dor)
 {
/*
@@ -225,10 +226,10 @@ uint8_t osm_switch_recommend_path(IN const osm_switch_t * 
p_sw,
   and try and avoid routing again through the same
   system / node.
 
-  If this procedure is provided with the tracking array
-  and counter we can conduct this algorithm.
+  Assume if routing_for_lmc is true that this procedure was
+  provided the tracking array and counter via p_port->priv,
+  and we can conduct this algorithm.
 */
-   boolean_t routing_for_lmc = (p_port->priv != NULL);
uint16_t base_lid;
uint8_t hops;
uint8_t least_hops;
diff --git a/opensm/opensm/osm_ucast_mgr.c b/opensm/opensm/osm_ucast_mgr.c
index e6e40f0..f5a715f 100644
--- a/opensm/opensm/osm_ucast_mgr.c
+++ b/opensm/opensm/osm_ucast_mgr.c
@@ -252,6 +252,7 @@ static void ucast_mgr_process_port(IN osm_ucast_mgr_t * 
p_mgr,
 */
port = osm_switch_recommend_path(p_sw, p_port, lid_ho, start_from,
 p_mgr

[PATCH v4 15/18] opensm: Make it possible to configure no fallback routing engine.

2010-09-03 Thread Jim Schutt
For a fabric that requires routing with an engine with special properties,
say avoiding credit loops via making use of SLs in routing, it might
be preferable to not fall back to minhop if the configured routing engine
fails.

E.g. the torus-2QoS routing engine uses both SL2VL maps and path SL values
to provide routing free of credit loops, but cannot route fabrics for
some patterns of failed switches.  Should a switch fail that creates such
a pattern, it may be preferable to keep the previous routing information
loaded in the switches until a switch can be replaced that restores
torus-2QoS's ability to route the fabric.

The alternative, having some other engine route the fabric, will immediately
introduce credit loops.
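
The intended decision can be sketched as below, as an illustration rather than the OpenSM implementation. The enum and function names are hypothetical; the only detail taken from the patch is that the pseudo-engine name no_fallback sets a flag instead of selecting an engine.

```c
#include <stdbool.h>
#include <string.h>

/* Outcome of one routing pass in this sketch. */
enum demo_outcome {
	DEMO_ROUTED_CONFIGURED,	/* a configured engine succeeded */
	DEMO_ROUTED_FALLBACK,	/* the default engine was used */
	DEMO_KEPT_PREVIOUS	/* nothing reprogrammed */
};

/* Treat the pseudo-engine name "no_fallback" as a flag, not an engine. */
static bool demo_is_no_fallback(const char *name)
{
	return strcmp(name, "no_fallback") == 0;
}

/* Decide what happens after the configured engines have been tried. */
static enum demo_outcome demo_after_routing(bool configured_ok,
					    bool no_fallback)
{
	if (configured_ok)
		return DEMO_ROUTED_CONFIGURED;
	if (!no_fallback)
		return DEMO_ROUTED_FALLBACK;
	return DEMO_KEPT_PREVIOUS;
}
```

With the flag set, a failed routing attempt leaves the previously programmed tables in the switches until the fabric becomes routable again.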

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/include/opensm/osm_subnet.h |1 +
 opensm/opensm/osm_opensm.c |5 +
 opensm/opensm/osm_qos.c|6 ++
 opensm/opensm/osm_ucast_mgr.c  |   23 +++
 4 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/opensm/include/opensm/osm_subnet.h 
b/opensm/include/opensm/osm_subnet.h
index fa3e46e..42ae416 100644
--- a/opensm/include/opensm/osm_subnet.h
+++ b/opensm/include/opensm/osm_subnet.h
@@ -219,6 +219,7 @@ typedef struct osm_subn_opt {
osm_qos_options_t qos_rtr_options;
boolean_t enable_quirks;
boolean_t no_clients_rereg;
+   boolean_t no_fallback_routing_engine;
 #ifdef ENABLE_OSM_PERF_MGR
boolean_t perfmgr;
boolean_t perfmgr_redir;
diff --git a/opensm/opensm/osm_opensm.c b/opensm/opensm/osm_opensm.c
index a69b7bb..82aa987 100644
--- a/opensm/opensm/osm_opensm.c
+++ b/opensm/opensm/osm_opensm.c
@@ -159,6 +159,11 @@ static struct osm_routing_engine *setup_routing_engine(osm_opensm_t *osm,
 	struct osm_routing_engine *re;
 	const struct routing_engine_module *m;
 
+	if (!strcmp(name, "no_fallback")) {
+		osm->subn.opt.no_fallback_routing_engine = TRUE;
+		return NULL;
+	}
+
 	for (m = routing_modules; m->name && *m->name; m++) {
 		if (!strcmp(m->name, name)) {
 			re = malloc(sizeof(struct osm_routing_engine));
diff --git a/opensm/opensm/osm_qos.c b/opensm/opensm/osm_qos.c
index 204c69c..ab55918 100644
--- a/opensm/opensm/osm_qos.c
+++ b/opensm/opensm/osm_qos.c
@@ -212,6 +212,12 @@ static int qos_extports_setup(osm_sm_t * sm, osm_node_t *node,
 	unsigned in, out;
 	uint8_t op_vl1;
 
+	/*
+	 * Do nothing unless the most recent routing attempt was successful.
+	 */
+	if (!re)
+		return ret;
+
 	for (out = 1; out < num_ports; out++) {
 		p = osm_node_get_physp_ptr(node, out);
 		force_update = p->need_update || sm->p_subn->need_update;
diff --git a/opensm/opensm/osm_ucast_mgr.c b/opensm/opensm/osm_ucast_mgr.c
index 10629cb..d1c485f 100644
--- a/opensm/opensm/osm_ucast_mgr.c
+++ b/opensm/opensm/osm_ucast_mgr.c
@@ -1091,7 +1091,8 @@ int osm_ucast_mgr_process(IN osm_ucast_mgr_t * p_mgr)
 		p_routing_eng = p_routing_eng->next;
 	}
 
-	if (!p_osm->routing_engine_used) {
+	if (!p_osm->routing_engine_used &&
+	    p_osm->subn.opt.no_fallback_routing_engine != TRUE) {
 		/* If configured routing algorithm failed, use default MinHop */
 		struct osm_routing_engine *r = p_osm->default_routing_engine;
 
@@ -1101,14 +1102,20 @@ int osm_ucast_mgr_process(IN osm_ucast_mgr_t * p_mgr)
 		osm_ucast_mgr_set_fwd_tables(p_mgr);
 	}
 
-	OSM_LOG(p_mgr->p_log, OSM_LOG_INFO,
-		"%s tables configured on all switches\n",
-		osm_routing_engine_type_str(p_osm->
-					    routing_engine_used->type));
-
-	if (p_mgr->p_subn->opt.use_ucast_cache)
-		p_mgr->cache_valid = TRUE;
+	if (p_osm->routing_engine_used) {
+		OSM_LOG(p_mgr->p_log, OSM_LOG_INFO,
+			"%s tables configured on all switches\n",
+			osm_routing_engine_type_str(p_osm->
+						    routing_engine_used->type));
 
+		if (p_mgr->p_subn->opt.use_ucast_cache)
+			p_mgr->cache_valid = TRUE;
+	} else {
+		p_mgr->p_subn->subnet_initialization_error = TRUE;
+		OSM_LOG(p_mgr->p_log, OSM_LOG_ERROR,
+			"No routing engine able to successfully configure "
+			"switch tables on current fabric\n");
+	}
 Exit:
 	CL_PLOCK_RELEASE(p_mgr->p_lock);
 	OSM_LOG_EXIT(p_mgr->p_log);
-- 
1.6.2.2




[PATCH v4 14/18] opensm: Do not require -Q option for torus-2QoS routing engine.

2010-09-03 Thread Jim Schutt
The torus-2QoS engine provides a deadlock-free routing for a 2D/3D torus,
but requires that switch SL2VL maps be programmed.  Before this change,
opensm -Q was required for that to happen.

When a routing engine sets the struct osm_routing_engine:update_sl2vl
pointer, it is signalling its intent to participate in SL2VL map programming.
So, don't return early from osm_qos_setup() in that case; instead do everything
except attempt to read QoS configuration information.

For that to work properly, need to also always set up the default QoS config
information, instead of just when QoS is requested via -Q.

With that in place, the -Q option now means the same thing to torus-2QoS that
it means to other routing engines: QoS configuration is requested.
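
The two conditions this change separates can be sketched as below.  The types
and function names (struct sm_state, need_qos_setup(), need_policy_file()) are
hypothetical and only model the logic described in this message; they are not
the opensm definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the opensm structures. */
struct engine {
	void (*update_sl2vl)(void *context);	/* NULL: no SL2VL input */
};

struct sm_state {
	bool qos_requested;		/* models opensm -Q */
	struct engine *engine_used;	/* NULL if no engine routed the fabric */
};

/* SL2VL maps get programmed if -Q was given or the engine asks for it. */
static bool need_qos_setup(const struct sm_state *s)
{
	assert(s != NULL);
	return s->qos_requested ||
	       (s->engine_used && s->engine_used->update_sl2vl);
}

/* The QoS policy file is read only when QoS itself was requested. */
static bool need_policy_file(const struct sm_state *s)
{
	assert(s != NULL);
	return s->qos_requested;
}

/* Placeholder callback used only to exercise the sketch. */
static void dummy_update(void *context) { (void)context; }
```

An engine that sets update_sl2vl gets its maps programmed without -Q, while
the policy file is still parsed only under -Q.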

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/opensm/osm_qos.c|7 +--
 opensm/opensm/osm_subnet.c |   18 +-
 2 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/opensm/opensm/osm_qos.c b/opensm/opensm/osm_qos.c
index e0f4411..204c69c 100644
--- a/opensm/opensm/osm_qos.c
+++ b/opensm/opensm/osm_qos.c
@@ -308,7 +308,9 @@ int osm_qos_setup(osm_opensm_t * p_osm)
 	int ret = 0;
 	int vlarb_only;
 
-	if (!p_osm->subn.opt.qos)
+	if (!(p_osm->subn.opt.qos ||
+	      (p_osm->routing_engine_used &&
+	       p_osm->routing_engine_used->update_sl2vl)))
 		return 0;
 
 	OSM_LOG_ENTER(&p_osm->log);
@@ -325,7 +327,8 @@ int osm_qos_setup(osm_opensm_t * p_osm)
 	cl_plock_excl_acquire(&p_osm->lock);
 
 	/* read QoS policy config file */
-	osm_qos_parse_policy_file(&p_osm->subn);
+	if (p_osm->subn.opt.qos)
+		osm_qos_parse_policy_file(&p_osm->subn);
 
 	p_tbl = &p_osm->subn.port_guid_tbl;
 	p_next = cl_qmap_head(p_tbl);
diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c
index bc34a0f..f714af7 100644
--- a/opensm/opensm/osm_subnet.c
+++ b/opensm/opensm/osm_subnet.c
@@ -1051,6 +1051,8 @@ static void subn_verify_qos_set(osm_qos_options_t *set, const char *prefix,
 
 int osm_subn_verify_config(IN osm_subn_opt_t * p_opts)
 {
+	osm_qos_options_t dflt;
+
 	if (p_opts->lmc > 7) {
 		log_report(" Invalid Cached Option Value:lmc = %u:"
 			   "Using Default:%u\n", p_opts->lmc, OSM_DEFAULT_LMC);
@@ -1101,17 +1103,15 @@ int osm_subn_verify_config(IN osm_subn_opt_t * p_opts)
 		p_opts->console = OSM_DEFAULT_CONSOLE;
 	}
 
-	if (p_opts->qos) {
-		osm_qos_options_t dflt;
-
-		/* the default options in qos_options must be correct.
-		 * every other one need not be, b/c those will default
-		 * back to whatever is in qos_options.
-		 */
 
-		subn_set_default_qos_options(&dflt);
+	/* the default options in qos_options must be correct.
+	 * every other one need not be, b/c those will default
+	 * back to whatever is in qos_options.
+	 */
+	subn_set_default_qos_options(&dflt);
+	subn_verify_qos_set(&p_opts->qos_options, "qos", &dflt);
 
-		subn_verify_qos_set(&p_opts->qos_options, "qos", &dflt);
+	if (p_opts->qos) {
 		subn_verify_qos_set(&p_opts->qos_ca_options, "qos_ca",
 				    &p_opts->qos_options);
 		subn_verify_qos_set(&p_opts->qos_sw0_options, "qos_sw0",
-- 
1.6.2.2




Re: [PATCH] opensm: Fix sl2vl configuration

2010-07-29 Thread Jim Schutt
Hi Eli,

On Wed, 2010-07-28 at 10:26 -0600, Eli Dorfman (Voltaire) wrote:
> Subject: [PATCH] Fix sl2vl configuration
>
> For non-optimized sl2vl configuration in and out ports were reversed.

Nice catch.  I think this is also the correct fix for the problem
I was trying to fix with commit e1c253e893.

> For optimal sl2vl added override of default ALL setting with port's
> sl2vl when operational VL was other than the default port.
>
> Signed-off-by: Eli Dorfman e...@voltaire.com
> ---
>  opensm/opensm/osm_qos.c |   25 ++++++++++++++++++++-----
>  1 files changed, 20 insertions(+), 5 deletions(-)
>
> diff --git a/opensm/opensm/osm_qos.c b/opensm/opensm/osm_qos.c
> index a571370..de0ae23 100644
> --- a/opensm/opensm/osm_qos.c
> +++ b/opensm/opensm/osm_qos.c
> @@ -182,7 +182,7 @@ static ib_api_status_t sl2vl_update_table(osm_sm_t * sm, osm_physp_t * p,
>  		tbl.raw_vl_by_sl[i] = (vl1 << 4) | vl2;
>  	}
>
> -	if (!force_update && (p_tbl = osm_physp_get_slvl_tbl(p, in_port)) &&
> +	if (!force_update && in_port && (p_tbl = osm_physp_get_slvl_tbl(p, in_port)) &&
>  	    !memcmp(p_tbl, &tbl, sizeof(tbl)))
>  		return IB_SUCCESS;


I'm confused.  Why do we want to always send the sl2vl update
if in_port is zero?


  
> @@ -209,6 +209,7 @@ static int qos_extports_setup(osm_sm_t * sm, osm_node_t *node,
>  	unsigned num_ports = osm_node_get_num_physp(node);
>  	int ret = 0;
>  	unsigned i, j;
> +	uint8_t op_vl1;
>
>  	for (i = 1; i < num_ports; i++) {
>  		p = osm_node_get_physp_ptr(node, i);
> @@ -225,17 +226,31 @@ static int qos_extports_setup(osm_sm_t * sm, osm_node_t *node,
>  	if (ib_switch_info_get_opt_sl2vlmapping(&node->sw->switch_info) &&
>  	    sm->p_subn->opt.use_optimized_slvl) {
>  		p = osm_node_get_physp_ptr(node, 1);
> +		op_vl1 = ib_port_info_get_op_vls(&p->port_info);
>  		force_update = p->need_update || sm->p_subn->need_update;
> -		return sl2vl_update_table(sm, p, 1, 0x3, force_update,
> -					  &qcfg->sl2vl);
> +		if (sl2vl_update_table(sm, p, 0, 0x3, force_update,
> +					&qcfg->sl2vl))
> +			ret = -1;
> +		/* overwrite default ALL configuration if port's
> +		   op_vl is different */
> +		for (i = 2; i < num_ports; i++) {
> +			p = osm_node_get_physp_ptr(node, i);
> +			if (ib_port_info_get_op_vls(&p->port_info) != op_vl1 &&
> +				sl2vl_update_table(sm, p, 0, 0x2 | i, force_update,
> +					&qcfg->sl2vl))
> +				ret = -1;
> +		}
> +		return ret;
>  	}
>

I think below we only need to avoid port 0 when it is the output port,
and is not an enhanced port 0.

> -	for (i = 0; i < num_ports; i++) {
> +	/* non optimized sl2vl configuration */
> +	i = ib_switch_info_is_enhanced_port0(&node->sw->switch_info) ? 0 : 1;
> +	for (; i < num_ports; i++) {
>  		p = osm_node_get_physp_ptr(node, i);
>  		force_update = p->need_update || sm->p_subn->need_update;
>  		j = ib_switch_info_is_enhanced_port0(&node->sw->switch_info) ? 0 : 1;
>  		for (; j < num_ports; j++)
> -			if (sl2vl_update_table(sm, p, i, i << 8 | j,
> +			if (sl2vl_update_table(sm, p, j, j << 8 | i,
>  					       force_update, &qcfg->sl2vl))
>  				ret = -1;
>  	}


So I think the above should be:

-	for (i = 0; i < num_ports; i++) {
+	/* non optimized sl2vl configuration */
+	i = ib_switch_info_is_enhanced_port0(&node->sw->switch_info) ? 0 : 1;
+	for (; i < num_ports; i++) {
 		p = osm_node_get_physp_ptr(node, i);
 		force_update = p->need_update || sm->p_subn->need_update;
-		j = ib_switch_info_is_enhanced_port0(&node->sw->switch_info) ? 0 : 1;
-		for (; j < num_ports; j++)
-			if (sl2vl_update_table(sm, p, i, i << 8 | j,
+		for (j = 0; j < num_ports; j++)
+			if (sl2vl_update_table(sm, p, j, j << 8 | i,
 					       force_update, &qcfg->sl2vl))
 				ret = -1;
 	}

I've tested this version of your fix, and it also stops the 
messages logged as described in the commit log for e1c253e893.
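
For reference, the iteration order of the version above can be sketched on its
own.  This is only an illustration: sl2vl_attr_mod() and
for_each_sl2vl_pair() are hypothetical helpers, and the sketch assumes
num_ports counts port 0, as the loops above do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Modifier layout used above: input port shifted left by 8, OR output port. */
static uint32_t sl2vl_attr_mod(uint8_t in_port, uint8_t out_port)
{
	return (uint32_t)in_port << 8 | out_port;
}

/*
 * Visit every (in, out) pair for a switch with num_ports table entries.
 * Output port 0 is skipped unless it is an enhanced port 0; every input
 * port, including 0, is visited.  Returns the number of pairs visited.
 */
static unsigned for_each_sl2vl_pair(unsigned num_ports, bool enhanced_port0,
				    void (*visit)(uint32_t attr_mod, void *arg),
				    void *arg)
{
	unsigned in, out, count = 0;

	assert(num_ports <= 256);	/* port numbers must fit in 8 bits */

	for (out = enhanced_port0 ? 0 : 1; out < num_ports; out++)
		for (in = 0; in < num_ports; in++) {
			if (visit)
				visit(sl2vl_attr_mod((uint8_t)in, (uint8_t)out),
				      arg);
			count++;
		}
	return count;
}
```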

Thanks -- Jim





rds-tools issue starting with OFED-1.5.2-20100718-0600.tgz

2010-07-27 Thread Jim Schutt
Hi,

It looks like starting on July 18, the daily snapshot for OFED-1.5.2
changed what version of rds-tools it was based on:

diff -urN OFED-1.5.2-20100717-0600/BUILD_ID OFED-1.5.2-20100718-0600/BUILD_ID
--- OFED-1.5.2-20100717-0600/BUILD_ID   2010-07-17 07:13:29.0 -0600
+++ OFED-1.5.2-20100718-0600/BUILD_ID   2010-07-18 07:13:29.0 -0600
@@ -1,4 +1,4 @@
-OFED-1.5.2-20100717-0600:
+OFED-1.5.2-20100718-0600:
 
 compat-dapl:
 http://www.openfabrics.org/downloads/dapl/compat-dapl-1.2.18.tar.gz
@@ -77,7 +77,7 @@
 
 ofa_kernel:
 git://git.openfabrics.org/ofed_1_5/linux-2.6.git ofed_kernel_1_5
-commit 0c842405cd3d204b23125836a8749fe7cd40b566
+commit 6bcc8f2eb4f005f430ee8f1d6962ba6778d6bbd8
 
 ofed-docs:
 git://git.openfabrics.org/~tziporet/docs.git ofed_1_5
@@ -105,7 +105,7 @@
 http://www.openfabrics.org/downloads/qperf/qperf-0.4.6-0.1.gb81434e.tar.gz
 
 rds-tools:
-http://www.openfabrics.org/~vlad/ofed_1_5/rds-tools/rds-tools-1.5-1.src.rpm
+http://www.openfabrics.org/downloads/rds-tools/rds-tools-2.0.3.tar.gz
 


It also looks like, from 
  http://oss.oracle.com/git/?p=agrover/rds-tools.git;a=summary

that rds-tools now builds into two RPMs, rds-tools and rds-devel,
but the OFED build scripts don't seem to know about that change.

I'd like to learn how to write apps that use RDS, so I thought
I needed rds.h to compile against, in hopes of running against the
latest upstream kernel RDS.  But, I can't seem to get it from
the 1.5.2 daily snapshot, as a rds-devel rpm isn't getting installed.

Is there somewhere else to get the appropriate rds.h from?

Thanks -- Jim





Re: [PATCH v3 01/17] opensm: Prepare for routing engine input to path record SL lookup and SL2VL map setup.

2010-07-07 Thread Jim Schutt
Hi Sasha:

On Wed, 2010-07-07 at 11:06 -0600, Sasha Khapyorsky wrote:
> Hi Jim,
>
> On 13:53 Tue 15 Jun , Jim Schutt wrote:
> > diff --git a/opensm/opensm/osm_opensm.c b/opensm/opensm/osm_opensm.c
> > index d3dc02e..5614240 100644
> > --- a/opensm/opensm/osm_opensm.c
> > +++ b/opensm/opensm/osm_opensm.c
> > @@ -147,7 +147,8 @@ static void append_routing_engine(osm_opensm_t *osm,
> >  	r->next = routing_engine;
> >  }
> >
> > -static void setup_routing_engine(osm_opensm_t *osm, const char *name)
> > +static struct osm_routing_engine *setup_routing_engine(osm_opensm_t *osm,
> > +						       const char *name)
> >  {
> >  	struct osm_routing_engine *re;
> >  	const struct routing_engine_module *m;
> > @@ -158,47 +159,53 @@ static void setup_routing_engine(osm_opensm_t *osm, const char *name)
> >  			if (!re) {
> >  				OSM_LOG(&osm->log, OSM_LOG_VERBOSE,
> >  					"memory allocation failed\n");
> > -				return;
> > +				return NULL;
> >  			}
> >  			memset(re, 0, sizeof(struct osm_routing_engine));
> >
> >  			re->name = m->name;
> > +			re->type = osm_routing_engine_type(m->name);
> >  			if (m->setup(re, osm)) {
> >  				OSM_LOG(&osm->log, OSM_LOG_VERBOSE,
> >  					"setup of routing"
> >  					" engine \'%s\' failed\n", name);
> > -				return;
> > +				free(re);
> > +				return NULL;
> >  			}
> >  			OSM_LOG(&osm->log, OSM_LOG_DEBUG,
> >  				"\'%s\' routing engine set up\n", re->name);
> > -			append_routing_engine(osm, re);
> > -			return;
> > +			if (re->type == OSM_ROUTING_ENGINE_TYPE_MINHOP)
> > +				osm->default_routing_engine = re;
> > +			return re;
> >  		}
> >  	}
> >
> >  	OSM_LOG(&osm->log, OSM_LOG_ERROR,
> >  		"cannot find or setup routing engine \'%s\'\n", name);
> > +	return NULL;
> >  }
> >
> >  static void setup_routing_engines(osm_opensm_t *osm, const char *engine_names)
> >  {
> >  	char *name, *str, *p;
> > +	struct osm_routing_engine *re;
> >
> > -	if (!engine_names || !*engine_names) {
> > -		setup_routing_engine(osm, "minhop");
> > -		return;
> > +	if (engine_names && *engine_names) {
> > +		str = strdup(engine_names);
> > +		name = strtok_r(str, ", \t\n", &p);
> > +		while (name && *name) {
> > +			re = setup_routing_engine(osm, name);
> > +			if (re)
> > +				append_routing_engine(osm, re);
> > +			name = strtok_r(NULL, ", \t\n", &p);
> > +		}
> > +		free(str);
> >  	}
> > -
> > -	str = strdup(engine_names);
> > -	name = strtok_r(str, ", \t\n", &p);
> > -	while (name && *name) {
> > -		setup_routing_engine(osm, name);
> > -		name = strtok_r(NULL, ", \t\n", &p);
> > +	if (!osm->default_routing_engine) {
> > +		re = setup_routing_engine(osm, "minhop");
> > +		if (!osm->routing_engine_list && re)
> > +			append_routing_engine(osm, re);
>
> Shouldn't here be:
>
> 	osm->default_routing_engine = re;
>
> too?

I think the above call to setup_routing_engine(osm, "minhop")
does that, because we're explicitly calling it for minhop?

But now that I look at this again, I'm confused why I
thought I needed to append a minhop routing engine to
the routing engine list when the list was empty and there 
was no default routing engine.

I was trying to exactly duplicate old functionality, where
minhop is only in the routing engine list if explicitly
configured, but always called if no routing engines are
configured or all configured engines fail.
   
So I think the end of the above chunk only needs to be

-
-	str = strdup(engine_names);
-	name = strtok_r(str, ", \t\n", &p);
-	while (name && *name) {
-		setup_routing_engine(osm, name);
-		name = strtok_r(NULL, ", \t\n", &p);
-	}
+	if (!osm->default_routing_engine)
+		setup_routing_engine(osm, "minhop");

-- Jim
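
The intended behaviour (minhop appears in the engine list only when named
explicitly, yet is always available as the default) can be sketched as below.
The helpers parse_engine_names() and need_implicit_minhop() are hypothetical,
and the sketch uses strspn()/strcspn() on the unmodified string where the
patch uses strdup() and strtok_r(); the delimiter set is the one from the
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DELIMS ", \t\n"

/*
 * Walk a comma/whitespace separated engine list and report whether
 * "minhop" was named explicitly.  *count receives the number of engine
 * names found.
 */
static bool parse_engine_names(const char *names, unsigned *count)
{
	bool minhop_named = false;

	assert(count != NULL);
	*count = 0;

	if (!names)
		return false;

	while (*names) {
		size_t len;

		names += strspn(names, DELIMS);	/* skip separators */
		len = strcspn(names, DELIMS);	/* length of next name */
		if (!len)
			break;
		if (len == strlen("minhop") && !strncmp(names, "minhop", len))
			minhop_named = true;
		(*count)++;
		names += len;
	}
	return minhop_named;
}

/*
 * True when the default minhop engine has to be set up separately,
 * because the configured list did not already create it.
 */
static bool need_implicit_minhop(const char *names)
{
	unsigned n;

	return !parse_engine_names(names, &n);
}
```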

 
 
> >  	}
> > -	free(str);
> > -
> > -	if (!osm->routing_engine_list)
> > -		setup_routing_engine(osm, "minhop");
> >  }
> >
> >  void osm_opensm_construct(IN osm_opensm_t * p_osm)
>
>
> So that this chunk in osm_ucast_mgr_process() (below) will not break
> over NULL pointer?
>
> > -	if (p_osm->routing_engine_used == OSM_ROUTING_ENGINE_TYPE_NONE) {
> > +	if (!p_osm->routing_engine_used) {
> >  		/* If configured routing algorithm failed, use default MinHop */
> > -		osm_ucast_mgr_build_lid_matrices(p_mgr);
> > -		ucast_mgr_build_lfts(p_mgr);
> > +		struct osm_routing_engine *r = p_osm->default_routing_engine;
> > +
> > +		r->build_lid_matrices(r->context);
> > +		r->ucast_build_fwd_tables

[PATCH v3 08/17] opensm: Add new torus routing engine: torus-2QoS, part 2.

2010-06-16 Thread Jim Schutt

Signed-off-by: Jim Schutt jasc...@sandia.gov
---

Hmmm, I tried to break up the addition of osm_torus.c into
mailing-list-size hunks, but evidently failed on this one;
it doesn't seem to have made it to the list.

I've attached the patch as a compressed file.

Sorry.

-- Jim

 opensm/opensm/osm_torus.c | 3993 +
 1 files changed, 3993 insertions(+), 0 deletions(-)



0008-opensm-Add-torus-2QoS-routing-engine-part-2.patch.bz2
Description: application/bzip


[PATCH v3 04/17] opensm: Track the minimum value in the fabric of data VLs supported.

2010-06-15 Thread Jim Schutt
A routing engine that wants to make contributions to SL2VL maps in support
of routing free from credit loops may need to know the minimum number
of supported data VLs in the fabric.

This code tracks that value.
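
As a standalone illustration of the calculation in the patch below: the
PortInfo OperationalVLs code is turned into a data VL count by a shift, then
clamped.  The names op_vls_to_data_vls(), track_min_data_vls() and
MAX_NUM_VLS are hypothetical; MAX_NUM_VLS is assumed to match IB_MAX_NUM_VLS
(16).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror IB_MAX_NUM_VLS from the opensm headers. */
#define MAX_NUM_VLS 16

/*
 * Convert an OperationalVLs code (1..5) to a data VL count using the
 * same shift-and-clamp as the patch: code 5 would give 16, which is
 * clamped to 15.
 */
static uint8_t op_vls_to_data_vls(uint8_t op_vls)
{
	unsigned data_vls;

	assert(op_vls >= 1 && op_vls <= 5);

	data_vls = 1U << (op_vls - 1);
	if (data_vls >= MAX_NUM_VLS)
		data_vls = MAX_NUM_VLS - 1;
	return (uint8_t)data_vls;
}

/* Fold one port's value into the running fabric-wide minimum. */
static uint8_t track_min_data_vls(uint8_t current_min, uint8_t op_vls)
{
	uint8_t v = op_vls_to_data_vls(op_vls);

	return v < current_min ? v : current_min;
}
```

Starting each sweep from MAX_NUM_VLS - 1, as the osm_state_mgr.c hunk does,
lets the minimum recover once a limiting port leaves the fabric.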

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/include/opensm/osm_subnet.h |1 +
 opensm/opensm/osm_port_info_rcv.c  |   13 -
 opensm/opensm/osm_state_mgr.c  |6 ++
 opensm/opensm/osm_subnet.c |1 +
 4 files changed, 20 insertions(+), 1 deletions(-)

diff --git a/opensm/include/opensm/osm_subnet.h b/opensm/include/opensm/osm_subnet.h
index 95a635c..4fa0161 100644
--- a/opensm/include/opensm/osm_subnet.h
+++ b/opensm/include/opensm/osm_subnet.h
@@ -536,6 +536,7 @@ typedef struct osm_subn {
uint16_t max_mcast_lid_ho;
uint8_t min_ca_mtu;
uint8_t min_ca_rate;
+   uint8_t min_data_vls;
boolean_t ignore_existing_lfts;
boolean_t subnet_initialization_error;
boolean_t force_heavy_sweep;
diff --git a/opensm/opensm/osm_port_info_rcv.c b/opensm/opensm/osm_port_info_rcv.c
index 9260047..c05301e 100644
--- a/opensm/opensm/osm_port_info_rcv.c
+++ b/opensm/opensm/osm_port_info_rcv.c
@@ -83,6 +83,7 @@ static void pi_rcv_process_endport(IN osm_sm_t * sm, IN osm_physp_t * p_physp,
 	ib_api_status_t status;
 	ib_net64_t port_guid;
 	uint8_t rate, mtu;
+	unsigned data_vls;
 	cl_qmap_t *p_sm_tbl;
 	osm_remote_sm_t *p_sm;
 
@@ -92,7 +93,7 @@ static void pi_rcv_process_endport(IN osm_sm_t * sm, IN osm_physp_t * p_physp,
 
 	/* HACK extended port 0 should be handled too! */
 	if (osm_physp_get_port_num(p_physp) != 0) {
-		/* track the minimal endport MTU and rate */
+		/* track the minimal endport MTU, rate, and operational VLs */
 		mtu = ib_port_info_get_mtu_cap(p_pi);
 		if (mtu < sm->p_subn->min_ca_mtu) {
 			OSM_LOG(sm->p_log, OSM_LOG_VERBOSE,
@@ -108,6 +109,16 @@ static void pi_rcv_process_endport(IN osm_sm_t * sm, IN osm_physp_t * p_physp,
 				PRIx64 "\n", rate, cl_ntoh64(port_guid));
 			sm->p_subn->min_ca_rate = rate;
 		}
+
+		data_vls = 1U << (ib_port_info_get_op_vls(p_pi) - 1);
+		if (data_vls >= IB_MAX_NUM_VLS)
+			data_vls = IB_MAX_NUM_VLS - 1;
+		if ((uint8_t)data_vls < sm->p_subn->min_data_vls) {
+			OSM_LOG(sm->p_log, OSM_LOG_VERBOSE,
+				"Setting endport minimal data VLs to:%u defined by port:0x%"
+				PRIx64 "\n", data_vls, cl_ntoh64(port_guid));
+			sm->p_subn->min_data_vls = data_vls;
+		}
 	}
 
 	if (port_guid != sm->p_subn->sm_port_guid) {
diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c
index cdd72c1..762bb27 100644
--- a/opensm/opensm/osm_state_mgr.c
+++ b/opensm/opensm/osm_state_mgr.c
@@ -1164,6 +1164,12 @@ repeat_discovery:
 	sm->p_subn->force_reroute = FALSE;
 	sm->p_subn->subnet_initialization_error = FALSE;
 
+	/* Reset tracking values in case limiting component got removed
+	 * from fabric. */
+	sm->p_subn->min_ca_mtu = IB_MAX_MTU;
+	sm->p_subn->min_ca_rate = IB_MAX_RATE;
+	sm->p_subn->min_data_vls = IB_MAX_NUM_VLS - 1;
+
 	/* rescan configuration updates */
 	if (!config_parsed && osm_subn_rescan_conf_files(sm->p_subn) < 0)
 		OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 331A: "
diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c
index d5c5ab2..8224b5f 100644
--- a/opensm/opensm/osm_subnet.c
+++ b/opensm/opensm/osm_subnet.c
@@ -529,6 +529,7 @@ ib_api_status_t osm_subn_init(IN osm_subn_t * p_subn, IN osm_opensm_t * p_osm,
 	p_subn->max_mcast_lid_ho = IB_LID_MCAST_END_HO;
 	p_subn->min_ca_mtu = IB_MAX_MTU;
 	p_subn->min_ca_rate = IB_MAX_RATE;
+	p_subn->min_data_vls = IB_MAX_NUM_VLS - 1;
 	p_subn->ignore_existing_lfts = TRUE;
 
 	/* we assume master by default - so we only need to set it true if STANDBY */
-- 
1.6.2.2




[PATCH v3 02/17] opensm: Allow the routing engine to influence SL2VL calculations.

2010-06-15 Thread Jim Schutt
Note that the original code assumes that QoS setup is mostly static and
based only on user configuration.  As a result, there is no provision for
routing engines that want to compute contributions to the SL2VL maps.

Fix this up by adding a callback to struct osm_routing_engine that computes
a per-port SL2VL map, and call it from the appropriate place in the QoS
setup path.  Assume that if a routing engine provides an update_sl2vl()
callback, there will be input-port dependence in the SL2VL maps, and
so do not attempt to use optimized SL2VL map programming even if the
switch supports it.

Also need to move the call to osm_qos_setup() in do_sweep() to after the
call to the routing engine, so that any SL2VL map contributions from the
routing engine are based on the latest information.  Need to call
osm_qos_setup() for requested reroute for the same reason.
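
The callback pattern described above can be sketched as follows.  This is a
simplified model, not the opensm types: struct slvl_table stores one VL per
SL, whereas ib_slvl_table_t packs two per byte, and effective_sl2vl() and
example_update() are hypothetical names.  The example hook is arbitrary and is
not the torus-2QoS mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_SLS 16

/* Hypothetical simplified table: one VL per SL. */
struct slvl_table {
	uint8_t vl_by_sl[NUM_SLS];
};

struct engine {
	void *context;
	/* Optional: NULL means the engine has no SL2VL input. */
	void (*update_sl2vl)(void *context, uint8_t in_port, uint8_t out_port,
			     struct slvl_table *t);
};

/*
 * Produce the table to program for one (in, out) pair: start from the
 * configured defaults, then let the engine adjust a private copy so the
 * shared configuration is never modified.
 */
static struct slvl_table effective_sl2vl(const struct engine *re,
					 const struct slvl_table *configured,
					 uint8_t in_port, uint8_t out_port)
{
	struct slvl_table t;

	assert(re != NULL && configured != NULL);

	t = *configured;
	if (re->update_sl2vl)
		re->update_sl2vl(re->context, in_port, out_port, &t);
	return t;
}

/* Arbitrary example hook: VL depends on the low SL bit and on in != out. */
static void example_update(void *context, uint8_t in_port, uint8_t out_port,
			   struct slvl_table *t)
{
	unsigned sl;

	(void)context;
	for (sl = 0; sl < NUM_SLS; sl++)
		t->vl_by_sl[sl] =
			(uint8_t)((sl & 1) | ((in_port != out_port) << 1));
}
```

Because the result can differ per input port, a single table programmed for
all ports at once (the optimized path) is not usable with such an engine.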

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/include/opensm/osm_opensm.h |   12 
 opensm/opensm/osm_qos.c|   27 +++
 opensm/opensm/osm_state_mgr.c  |5 +++--
 3 files changed, 38 insertions(+), 6 deletions(-)

diff --git a/opensm/include/opensm/osm_opensm.h b/opensm/include/opensm/osm_opensm.h
index e97142e..25a6f90 100644
--- a/opensm/include/opensm/osm_opensm.h
+++ b/opensm/include/opensm/osm_opensm.h
@@ -126,6 +126,9 @@ struct osm_routing_engine {
int (*build_lid_matrices) (void *context);
int (*ucast_build_fwd_tables) (void *context);
void (*ucast_dump_tables) (void *context);
+   void (*update_sl2vl)(void *context, IN osm_physp_t *port,
+IN uint8_t in_port_num, IN uint8_t out_port_num,
+IN OUT ib_slvl_table_t *t);
void (*delete) (void *context);
struct osm_routing_engine *next;
 };
@@ -147,6 +150,15 @@ struct osm_routing_engine {
 *  ucast_dump_tables
 *  The callback for dumping unicast routing tables.
 *
+*  update_sl2vl(void *context, IN osm_physp_t *port,
+*   IN uint8_t in_port_num, IN uint8_t out_port_num,
+*   OUT ib_slvl_table_t *t)
+*  The callback to allow routing engine input for SL2VL maps.
+*  *port is the physical port for which the SL2VL map is to be
+*  updated. For switches, in_port_num/out_port_num identify
+*  which part of the SL2VL map to update.  For router/HCA ports,
+*  in_port_num/out_port_num should be ignored.
+*
 *  delete
 *  The delete method, may be used for routing engine
 *  internals cleanup.
diff --git a/opensm/opensm/osm_qos.c b/opensm/opensm/osm_qos.c
index cce59ee..dadef29 100644
--- a/opensm/opensm/osm_qos.c
+++ b/opensm/opensm/osm_qos.c
@@ -207,6 +207,7 @@ static int qos_extports_setup(osm_sm_t * sm, osm_node_t *node,
 	osm_physp_t *p0, *p;
 	unsigned force_update;
 	unsigned num_ports = osm_node_get_num_physp(node);
+	struct osm_routing_engine *re = sm->p_subn->p_osm->routing_engine_used;
 	int ret = 0;
 	unsigned i, j;
 
@@ -223,7 +224,7 @@ static int qos_extports_setup(osm_sm_t * sm, osm_node_t *node,
 		return ret;
 
 	if (ib_switch_info_get_opt_sl2vlmapping(&node->sw->switch_info) &&
-	    sm->p_subn->opt.use_optimized_slvl) {
+	    sm->p_subn->opt.use_optimized_slvl && !re->update_sl2vl) {
 		p = osm_node_get_physp_ptr(node, 1);
 		force_update = p->need_update || sm->p_subn->need_update;
 		return sl2vl_update_table(sm, p, 1, 0x3, force_update,
@@ -234,10 +235,20 @@ static int qos_extports_setup(osm_sm_t * sm, osm_node_t *node,
 		p = osm_node_get_physp_ptr(node, i);
 		force_update = p->need_update || sm->p_subn->need_update;
 		j = ib_switch_info_is_enhanced_port0(&node->sw->switch_info) ? 0 : 1;
-		for (; j < num_ports; j++)
+		for (; j < num_ports; j++) {
+			const ib_slvl_table_t *port_sl2vl = &qcfg->sl2vl;
+			ib_slvl_table_t routing_sl2vl;
+
+			if (re->update_sl2vl) {
+				routing_sl2vl = *port_sl2vl;
+				re->update_sl2vl(re->context,
+						 p, i, j, &routing_sl2vl);
+				port_sl2vl = &routing_sl2vl;
+			}
 			if (sl2vl_update_table(sm, p, i, i << 8 | j,
-					       force_update, &qcfg->sl2vl))
+					       force_update, port_sl2vl))
 				ret = -1;
+		}
 	}
 
 	return ret;
@@ -247,6 +258,9 @@ static int qos_endport_setup(osm_sm_t * sm, osm_physp_t * p,
 			     const struct qos_config *qcfg)
 {
 	unsigned force_update = p->need_update || sm->p_subn->need_update;
+	struct

[PATCH v3 00/17] opensm: Add new torus routing engine: torus-2QoS

2010-06-15 Thread Jim Schutt
This is v3 of a patchset to add to opensm a new routing engine designed
to handle large fabrics connected with a 2D/3D torus topology.

Changes since v2:

- Rebased to a3dec3a87a.
- Divide "Add torus-2QoS routing engine" patch into three parts
   to avoid rejection by mailing list.
- Bug fix: reduce number of required seed links for a torus
   with one or more dimensions of radix four.
- Bug fix: don't let torus-2QoS be fooled into thinking it can route
   a torus with two or more blocks of switches adjacent in z missing.
- Bug fix: if osm_ucast_mgr_process() fails, no configured routing engine
   could route the fabric, so wait for a trap or sweep interval before
   next heavy sweep.
- Bug fix: cut-n-paste error in handle_case_0x731().

Changes since initial version:

- Merged my patchsets from 11/20/2009, 12/18/2009, 2/16/2010.
- Moved information contained in the earlier patch series introduction
emails into the appropriate commit messages.
- Rebased to c183eb8c4c.
- Addressed issues found by Yevgeny Kliteynik in original patchsets.
Yevgeny's --no_default_routing option patch is not included
in the merging, but would be a good addition.
- Renamed osm_ucast_torus.c to osm_torus.c.
Since osm_torus.c contains code to implement both unicast and
multicast routing, the new name seems more appropriate.  The
multicast support depends heavily on the unicast routing code,
so it is more convenient to keep everything in one file.
- Removed redundant check for changed sl2vl map.
This functionality already exists in sl2vl_update_table().
- Set sl2vl maps on CA ports for torus-2QoS.
This was missing in the original patches.
- Do not force torus-2QoS to use SLs 8-15 when not using opensm -Q.
This was an interim measure introduced before multicast support was
working, that allowed multicast to use SL/VL 0 and thus not deadlock
against unicast.  I forgot to take it out in the multicast patchset,
so I took it out when I merged.
- Renamed torus variables referencing "origin" to "seed".
These things refer to switches used to seed the torus topology
appropriately, so the new name should reduce confusion going forward.
This also contains a keyword change in the torus configuration file,
so I'll repost an updated example.


Jim Schutt (17):
  opensm: Prepare for routing engine input to path record SL lookup and
SL2VL map setup.
  opensm: Allow the routing engine to influence SL2VL calculations.
  opensm: Allow the routing engine to participate in path SL
calculations.
  opensm: Track the minimum value in the fabric of data VLs supported.
  opensm: Add struct osm_routing_engine callback to build spanning
trees for multicast.
  opensm: Make mcast_mgr_purge_tree() available outside
osm_mcast_mgr.c.
  opensm: Add torus-2QoS routing engine, part 1.
  opensm: Add torus-2QoS routing engine, part 2.
  opensm: Add torus-2QoS routing engine, part 3.
  opensm: Update documentation to describe torus-2QoS.
  opensm: Enable torus-2QoS routing engine.
  opensm: Add opensm option to specify file name for extra torus-2QoS
configuration information.
  opensm: Do not require -Q option for torus-2QoS routing engine.
  opensm: Make it possible to configure no fallback routing engine.
  opensm: Avoid havoc in minhop caused by torus-2QoS persistent use of
osm_port_t:priv.
  opensm: Avoid havoc in dump_ucast_routes() caused by torus-2QoS
persistent use of osm_port_t:priv.
  opensm: Cause status of unicast routing attempt to propogate to
callers of osm_ucast_mgr_process().

 opensm/doc/current-routing.txt |  269 +-
 opensm/include/opensm/osm_base.h   |   18 +
 opensm/include/opensm/osm_multicast.h  |   33 +
 opensm/include/opensm/osm_opensm.h |   29 +-
 opensm/include/opensm/osm_subnet.h |7 +
 opensm/include/opensm/osm_switch.h |   12 +
 opensm/include/opensm/osm_ucast_lash.h |3 -
 opensm/man/opensm.8.in |9 +-
 opensm/opensm/Makefile.am  |2 +-
 opensm/opensm/main.c   |   11 +-
 opensm/opensm/osm_console.c|   10 +-
 opensm/opensm/osm_dump.c   |5 +-
 opensm/opensm/osm_link_mgr.c   |   16 +-
 opensm/opensm/osm_mcast_mgr.c  |   11 +-
 opensm/opensm/osm_opensm.c |   54 +-
 opensm/opensm/osm_port_info_rcv.c  |   13 +-
 opensm/opensm/osm_qos.c|   40 +-
 opensm/opensm/osm_sa_path_record.c |   33 +-
 opensm/opensm/osm_state_mgr.c  |   23 +-
 opensm/opensm/osm_subnet.c |   20 +-
 opensm/opensm/osm_switch.c |7 +-
 opensm/opensm/osm_torus.c  | 9120 
 opensm/opensm/osm_ucast_lash.c |   11 +-
 opensm/opensm/osm_ucast_mgr.c  |   55 +-
 24 files changed, 9702 insertions(+), 109 deletions(-)
 create mode 100644 opensm/opensm/osm_torus.c



[PATCH v3 01/17] opensm: Prepare for routing engine input to path record SL lookup and SL2VL map setup.

2010-06-15 Thread Jim Schutt
In the event a routing engine needs to participate in SL assignment and
SL2VL map setup in order to avoid credit loops in a fabric, it will be
useful to make the routing engine context more widely available.

To this end, have osm_opensm_t save a pointer to the routing engine used,
rather than its type.  This will make the routing engine context easily
available in, e.g., sl2vl_update() and pr_rcv_get_path_parms().

Make the necessary adjustments to the code that used the old
routing_engine_used as an enum _osm_routing_engine_type.  In order to
keep the behavior where minhop was used if the configured routing engines
failed, the easiest solution was to add a pointer to osm_opensm_t which
pointed to the minhop struct osm_routing_engine.

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/include/opensm/osm_opensm.h |4 ++-
 opensm/opensm/osm_console.c|   10 ++--
 opensm/opensm/osm_dump.c   |3 +-
 opensm/opensm/osm_link_mgr.c   |5 ++-
 opensm/opensm/osm_opensm.c |   43 +---
 opensm/opensm/osm_sa_path_record.c |3 +-
 opensm/opensm/osm_ucast_lash.c |3 +-
 opensm/opensm/osm_ucast_mgr.c  |   17 --
 8 files changed, 54 insertions(+), 34 deletions(-)

diff --git a/opensm/include/opensm/osm_opensm.h b/opensm/include/opensm/osm_opensm.h
index c6c9bdb..e97142e 100644
--- a/opensm/include/opensm/osm_opensm.h
+++ b/opensm/include/opensm/osm_opensm.h
@@ -120,6 +120,7 @@ typedef enum _osm_routing_engine_type {
 *  added later.
 */
 struct osm_routing_engine {
+   osm_routing_engine_type_t type;
const char *name;
void *context;
int (*build_lid_matrices) (void *context);
@@ -183,7 +184,8 @@ typedef struct osm_opensm {
cl_dispatcher_t disp;
cl_plock_t lock;
struct osm_routing_engine *routing_engine_list;
-   osm_routing_engine_type_t routing_engine_used;
+   struct osm_routing_engine *routing_engine_used;
+   struct osm_routing_engine *default_routing_engine;
osm_stats_t stats;
osm_console_t console;
nn_map_t *node_name_map;
diff --git a/opensm/opensm/osm_console.c b/opensm/opensm/osm_console.c
index bc7bea3..b99bb84 100644
--- a/opensm/opensm/osm_console.c
+++ b/opensm/opensm/osm_console.c
@@ -382,6 +382,8 @@ static void print_status(osm_opensm_t * p_osm, FILE * out)
 	cl_list_item_t *item;
 
 	if (out) {
+		const char *re_str;
+
 		cl_plock_acquire(&p_osm->lock);
 		fprintf(out, "OpenSM Version   : %s\n",
 			p_osm->osm_version);
 		fprintf(out, "SM State : %s\n",
@@ -390,9 +392,11 @@ static void print_status(osm_opensm_t * p_osm, FILE * out)
 			p_osm->subn.opt.sm_priority);
 		fprintf(out, "SA State : %s\n",
 			sa_state_str(p_osm->sa.state));
-		fprintf(out, "Routing Engine   : %s\n",
-			osm_routing_engine_type_str(p_osm->
-						    routing_engine_used));
+
+		re_str = p_osm->routing_engine_used ?
+			osm_routing_engine_type_str(p_osm->routing_engine_used->type) :
+			osm_routing_engine_type_str(OSM_ROUTING_ENGINE_TYPE_NONE);
+		fprintf(out, "Routing Engine   : %s\n", re_str);
 
 		fprintf(out, "Loaded event plugins :");
 		if (cl_qlist_head(&p_osm->plugin_list) ==
diff --git a/opensm/opensm/osm_dump.c b/opensm/opensm/osm_dump.c
index fe2c3bc..bfff1a0 100644
--- a/opensm/opensm/osm_dump.c
+++ b/opensm/opensm/osm_dump.c
@@ -135,7 +135,8 @@ static void dump_ucast_routes(cl_map_item_t * item, FILE * file, void *cxt)
            "Switch 0x%016" PRIx64 "\nLID: Port : Hops : Optimal\n",
            cl_ntoh64(osm_node_get_node_guid(p_node)));
 
-   dor = (p_osm->routing_engine_used == OSM_ROUTING_ENGINE_TYPE_DOR);
+   dor = (p_osm->routing_engine_used &&
+          p_osm->routing_engine_used->type == OSM_ROUTING_ENGINE_TYPE_DOR);
 
    for (lid_ho = 1; lid_ho <= max_lid_ho; lid_ho++) {
            fprintf(file, "0x%04X : ", lid_ho);
diff --git a/opensm/opensm/osm_link_mgr.c b/opensm/opensm/osm_link_mgr.c
index e6c9b3b..c309916 100644
--- a/opensm/opensm/osm_link_mgr.c
+++ b/opensm/opensm/osm_link_mgr.c
@@ -64,8 +64,9 @@ static uint8_t link_mgr_get_smsl(IN osm_sm_t * sm, IN osm_physp_t * p_physp)
 
    OSM_LOG_ENTER(sm->p_log);
 
-   if (p_osm->routing_engine_used != OSM_ROUTING_ENGINE_TYPE_LASH
-       || !(slid = osm_physp_get_base_lid(p_physp))) {
+   if (!(p_osm->routing_engine_used &&
+         p_osm->routing_engine_used->type == OSM_ROUTING_ENGINE_TYPE_LASH &&
+         (slid = osm_physp_get_base_lid(p_physp)))) {
            /* Use default SL if lash routing is not used */
            OSM_LOG_EXIT(sm->p_log);
            return sm->p_subn->opt.sm_sl;
diff --git

[PATCH v3 05/17] opensm: Add struct osm_routing_engine callback to build spanning trees for multicast.

2010-06-15 Thread Jim Schutt
If a routing engine needs to compute spanning trees with special
properties, it needs a way to override the default implementation.
A routing engine callback provides that mechanism.  Routing engines
that can use the default implementation can leave the callback
pointer set to NULL.

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/include/opensm/osm_opensm.h |6 ++
 opensm/opensm/osm_mcast_mgr.c  |7 ++-
 2 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/opensm/include/opensm/osm_opensm.h 
b/opensm/include/opensm/osm_opensm.h
index 734a6db..fddcf53 100644
--- a/opensm/include/opensm/osm_opensm.h
+++ b/opensm/include/opensm/osm_opensm.h
@@ -132,6 +132,8 @@ struct osm_routing_engine {
uint8_t (*path_sl)(void *context, IN uint8_t path_sl_hint,
   IN const osm_port_t *src_port,
   IN const osm_port_t *dst_port);
+   ib_api_status_t (*mcast_build_stree)(void *context,
+IN OUT osm_mgrp_box_t *mgb);
void (*delete) (void *context);
struct osm_routing_engine *next;
 };
@@ -165,6 +167,10 @@ struct osm_routing_engine {
 *  path_sl
 *  The callback for computing path SL.
 *
+*  mcast_build_stree
+*  The callback for building the spanning tree for multicast
+*  forwarding, called per MLID.
+*
 *  delete
 *  The delete method, may be used for routing engine
 *  internals cleanup.
diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c
index 322635d..bd67d4e 100644
--- a/opensm/opensm/osm_mcast_mgr.c
+++ b/opensm/opensm/osm_mcast_mgr.c
@@ -986,6 +986,7 @@ Exit:
 static ib_api_status_t mcast_mgr_process_mlid(osm_sm_t * sm, uint16_t mlid)
 {
ib_api_status_t status = IB_SUCCESS;
+   struct osm_routing_engine *re = sm->p_subn->p_osm->routing_engine_used;
    osm_mgrp_box_t *mbox;
 
    OSM_LOG_ENTER(sm->p_log);
@@ -1000,7 +1001,11 @@ static ib_api_status_t mcast_mgr_process_mlid(osm_sm_t * sm, uint16_t mlid)
 
    mbox = osm_get_mbox_by_mlid(sm->p_subn, cl_hton16(mlid));
    if (mbox) {
-           status = mcast_mgr_build_spanning_tree(sm, mbox);
+           if (re && re->mcast_build_stree)
+                   status = re->mcast_build_stree(re->context, mbox);
+           else
+                   status = mcast_mgr_build_spanning_tree(sm, mbox);
+
            if (status != IB_SUCCESS)
                    OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A17: "
                            "Unable to create spanning tree (%s) for mlid "
-- 
1.6.2.2
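
The optional-callback pattern this patch describes can be sketched in isolation. The following is a minimal illustration, not the OpenSM code: `status_t`, `struct mgrp_box`, `default_build_stree()` and `engine_build_stree()` are hypothetical stand-ins for `ib_api_status_t`, `osm_mgrp_box_t` and the real builders, and only the NULL-check dispatch mirrors the patch.

```c
#include <stddef.h>

/* Hypothetical stand-ins for ib_api_status_t and osm_mgrp_box_t. */
typedef enum { ST_SUCCESS = 0, ST_ERROR = 1 } status_t;

struct mgrp_box {
	int mlid;
	int built_by_engine;	/* 1 if the engine callback built the tree */
};

struct routing_engine {
	void *context;
	/* Optional override; NULL selects the default implementation. */
	status_t (*mcast_build_stree)(void *context, struct mgrp_box *mgb);
};

/* Default spanning tree builder (placeholder body for illustration). */
status_t default_build_stree(struct mgrp_box *mgb)
{
	mgb->built_by_engine = 0;
	return ST_SUCCESS;
}

/* Example engine-specific builder (placeholder body for illustration). */
status_t engine_build_stree(void *context, struct mgrp_box *mgb)
{
	(void)context;
	mgb->built_by_engine = 1;
	return ST_SUCCESS;
}

/* Use the engine callback when one is set, else fall back to the default. */
status_t build_stree(const struct routing_engine *re, struct mgrp_box *mgb)
{
	if (re && re->mcast_build_stree)
		return re->mcast_build_stree(re->context, mgb);
	return default_build_stree(mgb);
}
```

An engine that is happy with the default simply leaves `mcast_build_stree` as NULL, so existing engines need no changes.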




[PATCH v3 06/17] opensm: Make mcast_mgr_purge_tree() available outside osm_mcast_mgr.c.

2010-06-15 Thread Jim Schutt
A routing engine that needs to compute multicast spanning trees with
special properties will need to delete old trees.  There's already
a function that does this: mcast_mgr_purge_tree().

Make it available outside osm_mcast_mgr.c, and change the name
to follow the naming convention (osm_ prefix) for global functions.

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/include/opensm/osm_multicast.h |   33 +
 opensm/opensm/osm_mcast_mgr.c |4 ++--
 2 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/opensm/include/opensm/osm_multicast.h 
b/opensm/include/opensm/osm_multicast.h
index 1da575d..df6ac6c 100644
--- a/opensm/include/opensm/osm_multicast.h
+++ b/opensm/include/opensm/osm_multicast.h
@@ -53,6 +53,7 @@
 #include <opensm/osm_mcm_port.h>
 #include <opensm/osm_subnet.h>
 #include <opensm/osm_log.h>
+#include <opensm/osm_sm.h>
 
 #ifdef __cplusplus
 #  define BEGIN_C_DECLS extern "C" {
@@ -193,6 +194,38 @@ osm_mgrp_t *osm_mgrp_new(IN osm_subn_t * subn, IN 
ib_net16_t mlid,
 *  Multicast Group, osm_mgrp_delete
 */
 
+/*
+ * Need a forward declaration to work around include loop:
+ * osm_sm.h - osm_multicast.h
+ */
+struct osm_sm;
+
+/****f* OpenSM: Multicast Tree/osm_purge_mtree
+* NAME
+*  osm_purge_mtree
+*
+* DESCRIPTION
+*  Frees all the nodes in a multicast spanning tree
+*
+* SYNOPSIS
+*/
+void osm_purge_mtree(IN struct osm_sm * sm, IN osm_mgrp_box_t * mgb);
+/*
+* PARAMETERS
+*  sm
+*  [in] Pointer to osm_sm_t object.
+*  mgb
+*  [in] Pointer to an osm_mgrp_box_t object.
+*
+* RETURN VALUES
+*  None.
+*
+*
+* NOTES
+*
+* SEE ALSO
+*/
+
 /****f* OpenSM: Multicast Group/osm_mgrp_is_guid
 * NAME
 *  osm_mgrp_is_guid
diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c
index bd67d4e..e6db6db 100644
--- a/opensm/opensm/osm_mcast_mgr.c
+++ b/opensm/opensm/osm_mcast_mgr.c
@@ -146,7 +146,7 @@ static void mcast_mgr_purge_tree_node(IN osm_mtree_node_t * p_mtn)
    free(p_mtn);
 }
 
-static void mcast_mgr_purge_tree(osm_sm_t * sm, IN osm_mgrp_box_t * mbox)
+void osm_purge_mtree(osm_sm_t * sm, IN osm_mgrp_box_t * mbox)
 {
    OSM_LOG_ENTER(sm->p_log);
 
@@ -735,7 +735,7 @@ static ib_api_status_t 
mcast_mgr_build_spanning_tree(osm_sm_t * sm,
   on multicast forwarding table information if the user wants to
   preserve existing multicast routes.
 */
-   mcast_mgr_purge_tree(sm, mbox);
+   osm_purge_mtree(sm, mbox);
 
/* build the first subset containing all member ports */
if (make_port_list(port_list, mbox)) {
-- 
1.6.2.2




[PATCH v3 14/17] opensm: Make it possible to configure no fallback routing engine.

2010-06-15 Thread Jim Schutt
For a fabric that requires routing with an engine with special properties,
say avoiding credit loops via making use of SLs in routing, it might
be preferable to not fall back to minhop if the configured routing engine
fails.

E.g. the torus-2QoS routing engine uses both SL2VL maps and path SL values
to provide routing free of credit loops, but cannot route fabrics for
some patterns of failed switches.  Should a switch fail that creates such
a pattern, it may be preferable to keep the previous routing information
loaded in the switches until a switch can be replaced that restores
torus-2QoS's ability to route the fabric.

The alternative, having some other engine route the fabric, will immediately
introduce credit loops.

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/include/opensm/osm_subnet.h |1 +
 opensm/opensm/osm_opensm.c |5 +
 opensm/opensm/osm_qos.c|6 ++
 opensm/opensm/osm_ucast_mgr.c  |   23 +++
 4 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/opensm/include/opensm/osm_subnet.h 
b/opensm/include/opensm/osm_subnet.h
index fa3e46e..42ae416 100644
--- a/opensm/include/opensm/osm_subnet.h
+++ b/opensm/include/opensm/osm_subnet.h
@@ -219,6 +219,7 @@ typedef struct osm_subn_opt {
osm_qos_options_t qos_rtr_options;
boolean_t enable_quirks;
boolean_t no_clients_rereg;
+   boolean_t no_fallback_routing_engine;
 #ifdef ENABLE_OSM_PERF_MGR
boolean_t perfmgr;
boolean_t perfmgr_redir;
diff --git a/opensm/opensm/osm_opensm.c b/opensm/opensm/osm_opensm.c
index 8b03947..e296812 100644
--- a/opensm/opensm/osm_opensm.c
+++ b/opensm/opensm/osm_opensm.c
@@ -159,6 +159,11 @@ static struct osm_routing_engine *setup_routing_engine(osm_opensm_t *osm,
    struct osm_routing_engine *re;
    const struct routing_engine_module *m;
 
+   if (!strcmp(name, "no_fallback")) {
+           osm->subn.opt.no_fallback_routing_engine = TRUE;
+           return NULL;
+   }
+
    for (m = routing_modules; m->name && *m->name; m++) {
            if (!strcmp(m->name, name)) {
                    re = malloc(sizeof(struct osm_routing_engine));
diff --git a/opensm/opensm/osm_qos.c b/opensm/opensm/osm_qos.c
index 6d2af55..dc6a8ff 100644
--- a/opensm/opensm/osm_qos.c
+++ b/opensm/opensm/osm_qos.c
@@ -211,6 +211,12 @@ static int qos_extports_setup(osm_sm_t * sm, osm_node_t *node,
    int ret = 0;
    unsigned i, j;
 
+   /*
+    * Do nothing unless the most recent routing attempt was successful.
+    */
+   if (!re)
+           return ret;
+
    for (i = 1; i < num_ports; i++) {
            p = osm_node_get_physp_ptr(node, i);
            force_update = p->need_update || sm->p_subn->need_update;
diff --git a/opensm/opensm/osm_ucast_mgr.c b/opensm/opensm/osm_ucast_mgr.c
index 10629cb..d1c485f 100644
--- a/opensm/opensm/osm_ucast_mgr.c
+++ b/opensm/opensm/osm_ucast_mgr.c
@@ -1091,7 +1091,8 @@ int osm_ucast_mgr_process(IN osm_ucast_mgr_t * p_mgr)
            p_routing_eng = p_routing_eng->next;
    }
 
-   if (!p_osm->routing_engine_used) {
+   if (!p_osm->routing_engine_used &&
+       p_osm->subn.opt.no_fallback_routing_engine != TRUE) {
            /* If configured routing algorithm failed, use default MinHop */
            struct osm_routing_engine *r = p_osm->default_routing_engine;
 
@@ -1101,14 +1102,20 @@ int osm_ucast_mgr_process(IN osm_ucast_mgr_t * p_mgr)
            osm_ucast_mgr_set_fwd_tables(p_mgr);
    }
 
-   OSM_LOG(p_mgr->p_log, OSM_LOG_INFO,
-           "%s tables configured on all switches\n",
-           osm_routing_engine_type_str(p_osm->
-                                       routing_engine_used->type));
-
-   if (p_mgr->p_subn->opt.use_ucast_cache)
-           p_mgr->cache_valid = TRUE;
+   if (p_osm->routing_engine_used) {
+           OSM_LOG(p_mgr->p_log, OSM_LOG_INFO,
+                   "%s tables configured on all switches\n",
+                   osm_routing_engine_type_str(p_osm->
+                                               routing_engine_used->type));
 
+           if (p_mgr->p_subn->opt.use_ucast_cache)
+                   p_mgr->cache_valid = TRUE;
+   } else {
+           p_mgr->p_subn->subnet_initialization_error = TRUE;
+           OSM_LOG(p_mgr->p_log, OSM_LOG_ERROR,
+                   "No routing engine able to successfully configure "
+                   "switch tables on current fabric\n");
+   }
 Exit:
    CL_PLOCK_RELEASE(p_mgr->p_lock);
    OSM_LOG_EXIT(p_mgr->p_log);
-- 
1.6.2.2
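
The selection logic this patch introduces can be summarized with a small sketch. It is only an illustration under simplifying assumptions: `struct engine`, `route_ok()`, `route_fail()` and `select_engine()` are hypothetical names, and each engine is reduced to a single `route()` callback that returns 0 on success.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a routing engine. */
struct engine {
	const char *name;
	int (*route)(void);		/* returns 0 on success */
	const struct engine *next;	/* next configured engine, or NULL */
};

/* Placeholder routing callbacks used for illustration. */
int route_ok(void) { return 0; }
int route_fail(void) { return -1; }

/*
 * Try the configured engines in order.  If all fail, use the default
 * engine unless fallback is disabled.  A NULL result means no engine
 * routed the fabric, so the caller should report an error and leave the
 * previously programmed tables in place.
 */
const struct engine *select_engine(const struct engine *configured,
				   const struct engine *dflt,
				   bool no_fallback)
{
	const struct engine *e;

	for (e = configured; e; e = e->next)
		if (e->route() == 0)
			return e;

	if (!no_fallback && dflt && dflt->route() == 0)
		return dflt;

	return NULL;
}
```

With fallback disabled, a failing torus-2QoS yields NULL rather than silently switching to minhop, which is the behavior the commit message argues for.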




[PATCH v3 13/17] opensm: Do not require -Q option for torus-2QoS routing engine.

2010-06-15 Thread Jim Schutt
The torus-2QoS engine provides a deadlock-free routing for a 2D/3D torus,
but requires that switch SL2VL maps be programmed.  Before this change,
opensm -Q was required for that to happen.

When a routing engine sets the struct osm_routing_engine:update_sl2vl
pointer, it is signalling its intent to participate in SL2VL map programming.
So, don't return early from osm_qos_setup() in that case; instead do everything
except attempt to read QoS configuration information.

For that to work properly, need to also always set up the default QoS config
information, instead of just when QoS is requested via -Q.

With that in place, the -Q option now means the same thing to torus-2QoS that
it means to other routing engines: QoS configuration is requested.

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/opensm/osm_qos.c|7 +--
 opensm/opensm/osm_subnet.c |   18 +-
 2 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/opensm/opensm/osm_qos.c b/opensm/opensm/osm_qos.c
index dadef29..6d2af55 100644
--- a/opensm/opensm/osm_qos.c
+++ b/opensm/opensm/osm_qos.c
@@ -290,7 +290,9 @@ int osm_qos_setup(osm_opensm_t * p_osm)
osm_node_t *p_node;
int ret = 0;
 
-   if (!p_osm->subn.opt.qos)
+   if (!(p_osm->subn.opt.qos ||
+         (p_osm->routing_engine_used &&
+          p_osm->routing_engine_used->update_sl2vl)))
            return 0;
 
    OSM_LOG_ENTER(&p_osm->log);
@@ -307,7 +309,8 @@ int osm_qos_setup(osm_opensm_t * p_osm)
    cl_plock_excl_acquire(&p_osm->lock);
 
    /* read QoS policy config file */
-   osm_qos_parse_policy_file(&p_osm->subn);
+   if (p_osm->subn.opt.qos)
+           osm_qos_parse_policy_file(&p_osm->subn);
 
    p_tbl = &p_osm->subn.port_guid_tbl;
    p_next = cl_qmap_head(p_tbl);
diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c
index bc34a0f..f714af7 100644
--- a/opensm/opensm/osm_subnet.c
+++ b/opensm/opensm/osm_subnet.c
@@ -1051,6 +1051,8 @@ static void subn_verify_qos_set(osm_qos_options_t *set, const char *prefix,
 
 int osm_subn_verify_config(IN osm_subn_opt_t * p_opts)
 {
+   osm_qos_options_t dflt;
+
    if (p_opts->lmc > 7) {
            log_report(" Invalid Cached Option Value:lmc = %u:"
                       "Using Default:%u\n", p_opts->lmc, OSM_DEFAULT_LMC);
@@ -1101,17 +1103,15 @@ int osm_subn_verify_config(IN osm_subn_opt_t * p_opts)
            p_opts->console = OSM_DEFAULT_CONSOLE;
    }
 
-   if (p_opts->qos) {
-           osm_qos_options_t dflt;
-
-           /* the default options in qos_options must be correct.
-            * every other one need not be, b/c those will default
-            * back to whatever is in qos_options.
-            */
 
-           subn_set_default_qos_options(&dflt);
+   /* the default options in qos_options must be correct.
+    * every other one need not be, b/c those will default
+    * back to whatever is in qos_options.
+    */
+   subn_set_default_qos_options(&dflt);
+   subn_verify_qos_set(&p_opts->qos_options, "qos", &dflt);
 
-           subn_verify_qos_set(&p_opts->qos_options, "qos", &dflt);
+   if (p_opts->qos) {
            subn_verify_qos_set(&p_opts->qos_ca_options, "qos_ca",
                                &p_opts->qos_options);
            subn_verify_qos_set(&p_opts->qos_sw0_options, "qos_sw0",
-- 
1.6.2.2
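
The two conditions changed in osm_qos_setup() can be restated as a small decision function. This is a sketch only: `struct qos_plan` and `plan_qos_setup()` are hypothetical names that do not exist in OpenSM, and the booleans stand in for `subn.opt.qos` and a non-NULL `update_sl2vl` callback on the routing engine in use.

```c
#include <stdbool.h>

/* Hypothetical summary of what osm_qos_setup() would do. */
struct qos_plan {
	bool run_setup;		/* program SL2VL/VL arbitration at all? */
	bool parse_policy;	/* read the QoS policy file? */
};

/*
 * qos_requested:        opensm was started with -Q
 * engine_updates_sl2vl: the routing engine in use set update_sl2vl
 */
struct qos_plan plan_qos_setup(bool qos_requested, bool engine_updates_sl2vl)
{
	struct qos_plan plan;

	/* Setup runs if QoS was requested or the engine needs SL2VL maps. */
	plan.run_setup = qos_requested || engine_updates_sl2vl;
	/* The policy file is only read when QoS was explicitly requested. */
	plan.parse_policy = plan.run_setup && qos_requested;
	return plan;
}
```

For torus-2QoS without -Q this gives setup without policy parsing, which is why the default QoS options must always be initialized.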




[PATCH v3 12/17] opensm: Add opensm option to specify file name for extra torus-2QoS configuration information.

2010-06-15 Thread Jim Schutt

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/include/opensm/osm_base.h   |   18 ++
 opensm/include/opensm/osm_subnet.h |5 +
 opensm/opensm/main.c   |9 +
 opensm/opensm/osm_subnet.c |1 +
 opensm/opensm/osm_torus.c  |2 +-
 5 files changed, 34 insertions(+), 1 deletions(-)

diff --git a/opensm/include/opensm/osm_base.h b/opensm/include/opensm/osm_base.h
index e0d6c66..fa4c78d 100644
--- a/opensm/include/opensm/osm_base.h
+++ b/opensm/include/opensm/osm_base.h
@@ -271,6 +271,24 @@ BEGIN_C_DECLS
 #endif
 /***/
 
+/****d* OpenSM: Base/OSM_DEFAULT_TORUS_CONF_FILE
+* NAME
+*  OSM_DEFAULT_TORUS_CONF_FILE
+*
+* DESCRIPTION
+*  Specifies the default file name for extra torus-2QoS configuration
+*
+* SYNOPSIS
+*/
+#ifdef __WIN__
+#define OSM_DEFAULT_TORUS_CONF_FILE strcat(GetOsmCachePath(), "osm-torus-2QoS.conf")
+#elif defined(OPENSM_CONFIG_DIR)
+#define OSM_DEFAULT_TORUS_CONF_FILE OPENSM_CONFIG_DIR "/torus-2QoS.conf"
+#else
+#define OSM_DEFAULT_TORUS_CONF_FILE "/etc/opensm/torus-2QoS.conf"
+#endif /* __WIN__ */
+/***/
+
 /****d* OpenSM: Base/OSM_DEFAULT_PREFIX_ROUTES_FILE
 * NAME
 *  OSM_DEFAULT_PREFIX_ROUTES_FILE
diff --git a/opensm/include/opensm/osm_subnet.h 
b/opensm/include/opensm/osm_subnet.h
index 4fa0161..fa3e46e 100644
--- a/opensm/include/opensm/osm_subnet.h
+++ b/opensm/include/opensm/osm_subnet.h
@@ -204,6 +204,7 @@ typedef struct osm_subn_opt {
char *guid_routing_order_file;
char *sa_db_file;
boolean_t sa_db_dump;
+   char *torus_conf_file;
boolean_t do_mesh_analysis;
boolean_t exit_on_fatal;
boolean_t honor_guid2lid_file;
@@ -431,6 +432,10 @@ typedef struct osm_subn_opt {
 *  When TRUE causes OpenSM to dump SA DB at the end of every
 *  light sweep regardless the current verbosity level.
 *
+*  torus_conf_file
+*  Name of the file with extra configuration info for torus-2QoS
+*  routing engine.
+*
 *  exit_on_fatal
 *  If TRUE (default) - SM will exit on fatal subnet initialization
 *  issues.
diff --git a/opensm/opensm/main.c b/opensm/opensm/main.c
index abc3282..b0bc372 100644
--- a/opensm/opensm/main.c
+++ b/opensm/opensm/main.c
@@ -231,6 +231,10 @@ static void show_usage(void)
           "Set the order port guids will be routed for the MinHop\n"
           "and Up/Down routing algorithms to the guids provided in the\n"
           "given file (one to a line)\n\n");
+   printf("--torus_config <path to file>\n"
+          "This option defines the file name for the extra configuration\n"
+          "info needed for the torus-2QoS routing engine.   The default\n"
+          "name is \'" OSM_DEFAULT_TORUS_CONF_FILE "\'\n\n");
    printf("--once, -o\n"
           "This option causes OpenSM to configure the subnet\n"
           "once, then exit.  Ports remain in the ACTIVE state.\n\n");
@@ -615,6 +619,7 @@ int main(int argc, char *argv[])
            {"sm_sl", 1, NULL, 7},
            {"retries", 1, NULL, 8},
            {"log_prefix", 1, NULL, 9},
+           {"torus_config", 1, NULL, 10},
{NULL, 0, NULL, 0}  /* Required at the end of the array */
};
 
@@ -1003,6 +1008,10 @@ int main(int argc, char *argv[])
            SET_STR_OPT(opt.log_prefix, optarg);
            printf("Log prefix = %s\n", opt.log_prefix);
            break;
+   case 10:
+           SET_STR_OPT(opt.torus_conf_file, optarg);
+           printf("Torus-2QoS config file = %s\n", opt.torus_conf_file);
+           break;
case 'h':
case '?':
case ':':
diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c
index 8224b5f..bc34a0f 100644
--- a/opensm/opensm/osm_subnet.c
+++ b/opensm/opensm/osm_subnet.c
@@ -753,6 +753,7 @@ void osm_subn_set_default_opt(IN osm_subn_opt_t * p_opt)
    p_opt->guid_routing_order_file = NULL;
    p_opt->sa_db_file = NULL;
    p_opt->sa_db_dump = FALSE;
+   p_opt->torus_conf_file = strdup(OSM_DEFAULT_TORUS_CONF_FILE);
    p_opt->do_mesh_analysis = FALSE;
    p_opt->exit_on_fatal = TRUE;
    p_opt->enable_quirks = FALSE;
diff --git a/opensm/opensm/osm_torus.c b/opensm/opensm/osm_torus.c
index fe643f2..871a3f5 100644
--- a/opensm/opensm/osm_torus.c
+++ b/opensm/opensm/osm_torus.c
@@ -9049,7 +9049,7 @@ int torus_build_lfts(void *context)
    torus->osm = ctx->osm;
    fabric->osm = ctx->osm;
 
-   if (!parse_config(OPENSM_CONFIG_DIR "/opensm-torus.conf",
+   if (!parse_config(ctx->osm->subn.opt.torus_conf_file,
                      fabric, torus))
goto out;
 
-- 
1.6.2.2



[PATCH v3 11/17] opensm: Enable torus-2QoS routing engine.

2010-06-15 Thread Jim Schutt

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/include/opensm/osm_opensm.h |1 +
 opensm/opensm/main.c   |2 +-
 opensm/opensm/osm_opensm.c |6 ++
 3 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/opensm/include/opensm/osm_opensm.h 
b/opensm/include/opensm/osm_opensm.h
index fddcf53..8d63111 100644
--- a/opensm/include/opensm/osm_opensm.h
+++ b/opensm/include/opensm/osm_opensm.h
@@ -105,6 +105,7 @@ typedef enum _osm_routing_engine_type {
OSM_ROUTING_ENGINE_TYPE_FTREE,
OSM_ROUTING_ENGINE_TYPE_LASH,
OSM_ROUTING_ENGINE_TYPE_DOR,
+   OSM_ROUTING_ENGINE_TYPE_TORUS_2QOS,
OSM_ROUTING_ENGINE_TYPE_UNKNOWN
 } osm_routing_engine_type_t;
 /***/
diff --git a/opensm/opensm/main.c b/opensm/opensm/main.c
index 0093aa7..abc3282 100644
--- a/opensm/opensm/main.c
+++ b/opensm/opensm/main.c
@@ -174,7 +174,7 @@ static void show_usage(void)
           "Min Hop algorithm.  Multiple routing engines can be specified\n"
           "separated by commas so that specific ordering of routing\n"
           "algorithms will be tried if earlier routing engines fail.\n"
-          "Supported engines: updn, file, ftree, lash, dor\n\n");
+          "Supported engines: updn, file, ftree, lash, dor, torus-2QoS\n\n");
    printf("--do_mesh_analysis\n"
           "This option enables additional analysis for the lash\n"
           "routing engine to precondition switch port assignments\n"
diff --git a/opensm/opensm/osm_opensm.c b/opensm/opensm/osm_opensm.c
index 5614240..8b03947 100644
--- a/opensm/opensm/osm_opensm.c
+++ b/opensm/opensm/osm_opensm.c
@@ -70,6 +70,7 @@ extern int osm_ucast_file_setup(struct osm_routing_engine *, 
osm_opensm_t *);
 extern int osm_ucast_ftree_setup(struct osm_routing_engine *, osm_opensm_t *);
 extern int osm_ucast_lash_setup(struct osm_routing_engine *, osm_opensm_t *);
 extern int osm_ucast_dor_setup(struct osm_routing_engine *, osm_opensm_t *);
+extern int osm_ucast_torus2QoS_setup(struct osm_routing_engine *, osm_opensm_t *);
 
 const static struct routing_engine_module routing_modules[] = {
    {"minhop", osm_ucast_minhop_setup},
@@ -78,6 +79,7 @@ const static struct routing_engine_module routing_modules[] = {
    {"ftree", osm_ucast_ftree_setup},
    {"lash", osm_ucast_lash_setup},
    {"dor", osm_ucast_dor_setup},
+   {"torus-2QoS", osm_ucast_torus2QoS_setup},
    {NULL, NULL}
 };
 
@@ -98,6 +100,8 @@ const char *osm_routing_engine_type_str(IN osm_routing_engine_type_t type)
            return "lash";
    case OSM_ROUTING_ENGINE_TYPE_DOR:
            return "dor";
+   case OSM_ROUTING_ENGINE_TYPE_TORUS_2QOS:
+           return "torus-2QoS";
    default:
            break;
    }
@@ -124,6 +128,8 @@ osm_routing_engine_type_t osm_routing_engine_type(IN const char *str)
            return OSM_ROUTING_ENGINE_TYPE_LASH;
    else if (!strcasecmp(str, "dor"))
            return OSM_ROUTING_ENGINE_TYPE_DOR;
+   else if (!strcasecmp(str, "torus-2QoS"))
+           return OSM_ROUTING_ENGINE_TYPE_TORUS_2QOS;
else
return OSM_ROUTING_ENGINE_TYPE_UNKNOWN;
 }
-- 
1.6.2.2




[PATCH v3 15/17] opensm: Avoid havoc in minhop caused by torus-2QoS persistent use of osm_port_t:priv.

2010-06-15 Thread Jim Schutt
Torus-2QoS makes persistent use of osm_port_t:priv to speed calculation
of path SL values.

It cannot clear osm_port_t:priv members when it tears down its persistent
data for the following reason: If a port is removed from the fabric, the
opensm core will delete the corresponding osm_port_t object, leaving
torus-2QoS holding a dangling reference.  Torus-2QoS then has a use-after-free
error when tearing down its persistent data if it tries to use its dangling
osm_port_t reference to clear the priv member.

When torus-2QoS is unable to route a fabric due to missing switches and
opensm is configured to fall back to minhop, havoc will ensue because
minhop uses a non-NULL osm_port_t:priv as a proxy for LMC > 0: it
assumes if osm_port_t:priv is non-NULL it can only be because
alloc_ports_priv() has been called.

Fix this up by always calling alloc_ports_priv(), and have it set
priv = NULL if LMC == 0.

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/opensm/osm_ucast_mgr.c |   10 +-
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/opensm/opensm/osm_ucast_mgr.c b/opensm/opensm/osm_ucast_mgr.c
index d1c485f..e6e40f0 100644
--- a/opensm/opensm/osm_ucast_mgr.c
+++ b/opensm/opensm/osm_ucast_mgr.c
@@ -315,8 +315,10 @@ static void alloc_ports_priv(osm_ucast_mgr_t * mgr)
 item = cl_qmap_next(item)) {
            port = (osm_port_t *) item;
            lmc = ib_port_info_get_lmc(&port->p_physp->port_info);
-           if (!lmc)
+           if (!lmc) {
+                   port->priv = NULL;
                    continue;
+           }
            r = malloc(sizeof(*r) + sizeof(r->guids[0]) * (1 << lmc));
            if (!r) {
                    OSM_LOG(mgr->p_log, OSM_LOG_ERROR, "ERR 3A09: "
@@ -363,8 +365,7 @@ static void ucast_mgr_process_tbl(IN cl_map_item_t * p_map_item,
    /* Initialize LIDs in buffer to invalid port number. */
    memset(p_sw->new_lft, OSM_NO_PATH, p_sw->max_lid_ho + 1);
 
-   if (p_mgr->p_subn->opt.lmc)
-           alloc_ports_priv(p_mgr);
+   alloc_ports_priv(p_mgr);
 
    /*
       Iterate through every port setting LID routes for each
@@ -381,8 +382,7 @@ static void ucast_mgr_process_tbl(IN cl_map_item_t * p_map_item,
            }
    }
 
-   if (p_mgr->p_subn->opt.lmc)
-           free_ports_priv(p_mgr);
+   free_ports_priv(p_mgr);
 
    OSM_LOG_EXIT(p_mgr->p_log);
 }
-- 
1.6.2.2
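
The hazard and the fix can be reproduced with a reduced model. This is a sketch under stated assumptions, not the OpenSM code: `struct port` is a hypothetical stand-in for `osm_port_t`, the private data is a plain array, and the stale pointer is simulated by pointing `priv` at an unrelated object. It clears every `priv` up front, which has the same effect as the patch's "set NULL when LMC == 0" and keeps the error path safe.

```c
#include <stdlib.h>

/* Hypothetical stand-in for osm_port_t. */
struct port {
	unsigned lmc;
	void *priv;	/* may hold a stale pointer left by another engine */
};

/*
 * Always run before routing: drop whatever another engine left in priv,
 * then allocate per-port data only for ports with LMC > 0.  After this
 * call, "priv != NULL" really does imply "LMC > 0".
 */
int alloc_ports_priv(struct port *ports, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		ports[i].priv = NULL;	/* not ours to free, just forget it */

	for (i = 0; i < n; i++) {
		if (!ports[i].lmc)
			continue;
		ports[i].priv = calloc((size_t)1 << ports[i].lmc,
				       sizeof(unsigned long long));
		if (!ports[i].priv)
			return -1;
	}
	return 0;
}

/* Release everything alloc_ports_priv() handed out. */
void free_ports_priv(struct port *ports, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		free(ports[i].priv);	/* free(NULL) is a no-op */
		ports[i].priv = NULL;
	}
}
```

Calling the pair unconditionally, as the patch does, is what guarantees the invariant even when the subnet-wide LMC option is 0.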




[PATCH v3 10/17] opensm: Update documentation to describe torus-2QoS.

2010-06-15 Thread Jim Schutt

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/doc/current-routing.txt |  269 +++-
 opensm/man/opensm.8.in |9 ++-
 2 files changed, 275 insertions(+), 3 deletions(-)

diff --git a/opensm/doc/current-routing.txt b/opensm/doc/current-routing.txt
index 1302860..78a2e01 100644
--- a/opensm/doc/current-routing.txt
+++ b/opensm/doc/current-routing.txt
@@ -1,7 +1,7 @@
 Current OpenSM Routing
-7/9/07
+10/9/09
 
-OpenSM offers five routing engines:
+OpenSM offers six routing engines:
 
 1.  Min Hop Algorithm - based on the minimum hops to each node where the
 path length is optimized.
@@ -28,6 +28,13 @@ two switches.  This provides deadlock free routes for 
hypercubes when
 the fabric is cabled as a hypercube and for meshes when cabled as a
 mesh (see details below).
 
+6. Torus-2QoS unicast routing algorithm - a DOR-based routing algorithm
+specialized for 2D/3D torus topologies.  Torus-2QoS provides deadlock-free
+routing while supporting two quality of service (QoS) levels.  In addition
+it is able to route around multiple failed fabric links or a single failed
+fabric switch without introducing deadlocks, and without changing path SL
+values granted before the failure.
+
 OpenSM provides an optional unicast routing cache (enabled by -A or
 --ucast_cache options). When enabled, unicast routing cache prevents
 routing recalculation (which is a heavy task in a large cluster) when
@@ -388,3 +395,261 @@ ports, one port on one end of the cable, and the other 
port on the
 other end, continuing along the mesh dimension.
 
 Use '-R dor' option to activate the DOR algorithm.
+
+Torus-2QoS Routing Algorithm
+----------------------------
+
+Torus-2QoS is a routing algorithm designed for large-scale 2D/3D torus fabrics.
+The torus-2QoS routing engine can provide the following functionality on
+a 2D/3D torus:
+- routing that is free of credit loops
+- two levels of QoS, assuming switches support 8 data VLs
+- ability to route around a single failed switch, and/or multiple failed
+  links, without
+  - introducing credit loops
+  - changing path SL values
+- very short run times, with good scaling properties as fabric size
+  increases
+
+Torus-2QoS is a DOR-based algorithm that avoids deadlocks that would otherwise
+occur in a torus using the concept of a dateline for each torus dimension.
+It encodes into a path SL which datelines the path crosses as follows:
+
+  sl = 0;
+  for (d = 0; d < torus_dimensions; d++)
+    /* path_crosses_dateline(d) returns 0 or 1 */
+    sl |= path_crosses_dateline(d) << d;
+
+For a 3D torus, that leaves one SL bit free, which torus-2QoS uses to
+implement two QoS levels.
+
+This is possible because torus-2QoS also makes use of the output port
+dependence of the switch SL2VL maps.  It computes in which torus coordinate
+direction each interswitch link points, and writes SL2VL maps for such
+ports as follows:
+
+  for (sl = 0; sl < 16; sl ++)
+    /* cdir(port) reports which torus coordinate direction a switch port
+     * points in, and returns 0, 1, or 2 */
+    sl2vl(iport,oport,sl) = 0x1 & (sl >> cdir(oport));
+
+Thus torus-2QoS consumes 8 SL values (SL bits 0-2) and 2 VL values (VL bit 0)
+per QoS level to provide deadlock-free routing on a 3D torus.
+
+Torus-2QoS routes around link failure by taking the long way around any
+1D ring interrupted by a link failure.  For example, consider the 2D 6x5
+torus below, where switches are denoted by [+a-zA-Z]:
+
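+            |    |    |    |    |    |
+       4  --+----+----+----+----+----+--
+            |    |    |    |    |    |
+       3  --+----+----+----D----+----+--
+            |    |    |    |    |    |
+       2  --+----+----I----r----+----+--
+            |    |    |    |    |    |
+       1  --m----S----n----T----o----p--
+            |    |    |    |    |    |
+     y=0  --+----+----+----+----+----+--
+            |    |    |    |    |    |
+
+          x=0    1    2    3    4    5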
+
+For a pristine fabric the path from S to D would be S-n-T-r-D.  In the
+event that either link S-n or n-T has failed, torus-2QoS would use the path
+S-m-p-o-T-r-D.  Note that it can do this without changing the path SL
+value; once the 1D ring m-S-n-T-o-p-m has been broken by failure, path
+segments using it cannot contribute to deadlock, and the x-direction
+dateline (between, say, x=5 and x=0) can be ignored for path segments on
+that ring.
+
+One result of this is that torus-2QoS can route around many simultaneous
+link failures, as long as no 1D ring is broken into disjoint regions.  For
+example, if links n-T and T-o have both failed, that ring has been broken
+into two disjoint regions, T and o-p-m-S-n.  Torus-2QoS checks for such
+issues, reports if they are found, and refuses to route such fabrics.
+
+Handling a failed switch under DOR requires introducing into a path at
+least one turn that would be otherwise illegal, i.e. not allowed by DOR
+rules.  Torus-2QoS will introduce such a turn as close as possible to the
+failed switch

Re: [PATCH v2 10/15] opensm: Add opensm option to specify file name for extra torus-2QoS configuration information.

2010-06-10 Thread Jim Schutt


Hi Sasha,

Thanks for taking a look at this.


On Thu, 2010-06-10 at 05:25 -0600, Sasha Khapyorsky wrote:
> Hi Jim,
> 
> On 11:06 Wed 10 Mar , Jim Schutt wrote:
> > 
> > Signed-off-by: Jim Schutt jasc...@sandia.gov
> > ---
> >  opensm/include/opensm/osm_base.h   |   18 ++
> >  opensm/include/opensm/osm_subnet.h |5 +
> >  opensm/opensm/main.c   |9 +
> >  opensm/opensm/osm_subnet.c |1 +
> >  opensm/opensm/osm_torus.c  |2 +-
> 
> It breaks to apply at this point. It is because file
> 'opensm/opensm/osm_torus.c' doesn't exist in previous patches. Could you
> please resend the patch series with files included? Thanks.

So 7/15 has the patch that adds osm_torus.c as a compressed attachment,
because the patch is so big.

I sent it that way because I was afraid it would otherwise be
rejected by vger.

So you want me to resend with that big patch inline?

Also, I have accumulated a few bug fixes to torus-2QoS
that I haven't posted yet.  I can

1) repost the patch series with no attachments, and
   add the bugfix patches at the end of series
2) repost a v3 patchset with these fixes merged.
3) do something else that you prefer.

Let me know?

-- Jim

 
> Sasha
 




[PATCH v2] opensm/qos.c: Revert port ranges for calls to sl2vl_update_table().

2010-06-01 Thread Jim Schutt
Before commit 051a1dd5 (opensm/osm_qos.c: split switch external and end
ports setup), osm_qos_setup() would end up calling sl2vl_update_table()
for output ports 1-N, and input ports 0-N.

Commit 051a1dd5 changed this around to be output ports 0-N, and input
ports 1-N, and an InfiniScale IV-based fabric would log lots of errors
like these:

  log_rcv_cb_error: ERR 3111: Received MAD with error status = 0x1C
  SubnGetResp(SLtoVLMappingTable), attr_mod 0x2300, TID 0xad069
  Initial path: 0,1,1,4,13 Return path: 0,25,1,7,10

The attr_mod in every such message has 0x00 in the least significant
byte, which specifies the output port.

With the port ranges restored to their old values, the above log messages
stop.  Hal Rosenstock pointed out that we should not be attempting
to program a base SP0 with SL2VL maps; see, e.g.,  IBA 1.2.1, section
14.2.5.8, page 844.  So, this patch is a full reversion for
switches supporting base SP0, but only a partial reversion for
switches supporting enhanced SP0.

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/opensm/osm_qos.c |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/opensm/opensm/osm_qos.c b/opensm/opensm/osm_qos.c
index f814ea8..cce59ee 100644
--- a/opensm/opensm/osm_qos.c
+++ b/opensm/opensm/osm_qos.c
@@ -230,10 +230,11 @@ static int qos_extports_setup(osm_sm_t * sm, osm_node_t *node,
                                 &qcfg->sl2vl);
    }
 
-   for (i = 1; i < num_ports; i++) {
+   for (i = 0; i < num_ports; i++) {
            p = osm_node_get_physp_ptr(node, i);
            force_update = p->need_update || sm->p_subn->need_update;
-           for (j = 0; j < num_ports; j++)
+           j = ib_switch_info_is_enhanced_port0(&node->sw->switch_info) ? 0 : 1;
+           for (; j < num_ports; j++)
                    if (sl2vl_update_table(sm, p, i, i << 8 | j,
                                           force_update, &qcfg->sl2vl))
                            ret = -1;
-- 
1.6.2.2
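
The attr_mod values in the log above pack two port numbers, matching the
`i << 8 | j` argument in the patch: the input port sits in bits 15:8 and the
output port in the least significant byte. A minimal sketch of that packing
follows. The helper names are hypothetical and are not OpenSM functions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative helpers (hypothetical names, not part of OpenSM).
 * Build the SLtoVLMappingTable attribute modifier the way the patch
 * does: (input port << 8) | output port.
 */
static uint32_t sl2vl_attr_mod(unsigned in_port, unsigned out_port)
{
	assert(in_port <= 0xff && out_port <= 0xff);
	return ((uint32_t)in_port << 8) | out_port;
}

/* Input port: bits 15:8 of the attribute modifier. */
static unsigned sl2vl_attr_mod_in_port(uint32_t attr_mod)
{
	return (attr_mod >> 8) & 0xff;
}

/* Output port: least significant byte of the attribute modifier. */
static unsigned sl2vl_attr_mod_out_port(uint32_t attr_mod)
{
	return attr_mod & 0xff;
}
```

With this layout, the logged attr_mod 0x2300 decodes to input port 35 (0x23)
and output port 0, which is the port a base SP0 switch rejects.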




Re: [PATCH] opensm/qos.c: Revert port ranges for calls to sl2vl_update_table().

2010-05-24 Thread Jim Schutt
Hi Hal,

On Mon, 2010-05-24 at 12:03 -0600, Hal Rosenstock wrote:
 Hi Jim,
 
 On Fri, May 21, 2010 at 4:29 PM, Jim Schutt jasc...@sandia.gov wrote:
 
  Sorry, I somehow got this wrong.
 
  Corrected patch below.
 
  -- Jim
 
  On Fri, 2010-05-21 at 14:18 -0600, Jim Schutt wrote:
  Before commit 051a1dd5 (opensm/osm_qos.c: split switch external and end
  ports setup), osm_qos_setup() would end up calling sl2vl_update_table()
  for output ports 1-N, and input ports 0-N.
 
  Commit 051a1dd5 changed this around to be output ports 0-N, and input
  ports 1-N, and an InfiniScale IV based fabric would log lots of errors
  like these:
 
log_rcv_cb_error: ERR 3111: Received MAD with error status = 0x1C
SubnGetResp(SLtoVLMappingTable), attr_mod 0x2300, TID 0xad069
Initial path: 0,1,1,4,13 Return path: 0,25,1,7,10
 
  The attr_mod in every such message has 0x00 in the least significant
  byte.
 
 This is the output port.
 
  With the port ranges restored to their old values, the above log messages
  stop.
 
 Is this with base or enhanced port 0 ? I'm assuming base. See comment below.

See extra patch below.

 
 Also, what firmware version is in use ?

flint reports the FW version as 7.2.0

 
  Signed-off-by: Jim Schutt jasc...@sandia.gov
  ---
   opensm/opensm/osm_qos.c |8 
   1 files changed, 4 insertions(+), 4 deletions(-)
 
  diff --git a/opensm/opensm/osm_qos.c b/opensm/opensm/osm_qos.c
  index 6bbbfa2..b8c3111 100644
  --- a/opensm/opensm/osm_qos.c
  +++ b/opensm/opensm/osm_qos.c
  @@ -230,12 +230,12 @@ static int qos_extports_setup(osm_sm_t * sm, osm_node_t *node,
                         &qcfg->sl2vl);
      }
  
  -   for (i = 1; i < num_ports; i++) {
  +   for (i = 0; i < num_ports; i++) {
          p = osm_node_get_physp_ptr(node, i);
          force_update = p->need_update || sm->p_subn->need_update;
  -       for (j = 0; j < num_ports; j++)
  -           if (sl2vl_update_table(sm, p, i, i << 8 | j,
  -                          force_update, &qcfg->sl2vl))
  +       for (j = 1; j < num_ports; j++)
  +           if (sl2vl_update_table(sm, p, i, j, force_update,
  +                          &qcfg->sl2vl))
                  ret = -1;
      }
  
  
  diff --git a/opensm/opensm/osm_qos.c b/opensm/opensm/osm_qos.c
  index 6bbbfa2..7d76c75 100644
  @@ -230,10 +230,10 @@ static int qos_extports_setup(osm_sm_t * sm, osm_node_t *node,
                         &qcfg->sl2vl);
      }
  
  -   for (i = 1; i < num_ports; i++) {
  +   for (i = 0; i < num_ports; i++) {
          p = osm_node_get_physp_ptr(node, i);
          force_update = p->need_update || sm->p_subn->need_update;
  -       for (j = 0; j < num_ports; j++)
  +       for (j = 1; j < num_ports; j++)
              if (sl2vl_update_table(sm, p, i, i << 8 | j,
                             force_update, &qcfg->sl2vl))
                  ret = -1;
 
 
 I think the start for j depends on whether it is base or enhanced port
 0. Start should be 0 for enhanced and 1 for base.

Ah.  I see now in the IBA 1.2.1 spec, Table 146 PortInfo, p 837, that
only an enhanced SP0 supports OperationalVLs.  So if a base SP0 doesn't
support it, such port also doesn't support VLs, and it makes sense that 
you shouldn't attempt to program SL2VL maps.

Is that what you're thinking?

I added this patch on top of above, and things still work w/ no
error messages.  I guess that means our InfiniScale IV gear has
only base SP0.

--- a/opensm/opensm/osm_qos.c
+++ b/opensm/opensm/osm_qos.c
@@ -233,7 +233,8 @@ static int qos_extports_setup(osm_sm_t * sm, osm_node_t *node,
    for (i = 0; i < num_ports; i++) {
        p = osm_node_get_physp_ptr(node, i);
        force_update = p->need_update || sm->p_subn->need_update;
-       for (j = 1; j < num_ports; j++)
+       j = ib_switch_info_is_enhanced_port0(&node->sw->switch_info) ? 0 : 1;
+       for (; j < num_ports; j++)
            if (sl2vl_update_table(sm, p, i, i << 8 | j,
                           force_update, &qcfg->sl2vl))
                ret = -1;

Does that look OK to you?

Thanks -- Jim
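
The loop bounds discussed in this message can be sketched as a standalone
function. This is a simplified illustration, not the OpenSM code:
count_sl2vl_tables() and its parameters are hypothetical, and num_ports is
assumed to count port 0 as well as the external ports.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the nested loop in qos_extports_setup().
 * Counts the (input port, output port) pairs for which an SL2VL table
 * would be programmed.  Assumption: num_ports includes port 0.
 */
static unsigned count_sl2vl_tables(unsigned num_ports, bool enhanced_sp0)
{
	unsigned in, out, count = 0;

	assert(num_ports >= 1);
	for (in = 0; in < num_ports; in++) {
		/* A base SP0 takes no SL2VL map, so skip output port 0. */
		out = enhanced_sp0 ? 0 : 1;
		for (; out < num_ports; out++)
			count++;
	}
	return count;
}
```

For a switch with N ports including port 0, this gives N * (N - 1) tables
with a base SP0 and N * N tables with an enhanced SP0.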


 
 -- Hal
 




[PATCH] opensm/qos.c: Revert port ranges for calls to sl2vl_update_table().

2010-05-21 Thread Jim Schutt
Before commit 051a1dd5 (opensm/osm_qos.c: split switch external and end
ports setup), osm_qos_setup() would end up calling sl2vl_update_table()
for output ports 1-N, and input ports 0-N.

Commit 051a1dd5 changed this around to be output ports 0-N, and input
ports 1-N, and an InfiniScale IV based fabric would log lots of errors
like these:

  log_rcv_cb_error: ERR 3111: Received MAD with error status = 0x1C
  SubnGetResp(SLtoVLMappingTable), attr_mod 0x2300, TID 0xad069
  Initial path: 0,1,1,4,13 Return path: 0,25,1,7,10

The attr_mod in every such message has 0x00 in the least significant
byte.

With the port ranges restored to their old values, the above log messages
stop.

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/opensm/osm_qos.c |8 
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/opensm/opensm/osm_qos.c b/opensm/opensm/osm_qos.c
index 6bbbfa2..b8c3111 100644
--- a/opensm/opensm/osm_qos.c
+++ b/opensm/opensm/osm_qos.c
@@ -230,12 +230,12 @@ static int qos_extports_setup(osm_sm_t * sm, osm_node_t *node,
                       &qcfg->sl2vl);
    }
 
-   for (i = 1; i < num_ports; i++) {
+   for (i = 0; i < num_ports; i++) {
        p = osm_node_get_physp_ptr(node, i);
        force_update = p->need_update || sm->p_subn->need_update;
-       for (j = 0; j < num_ports; j++)
-           if (sl2vl_update_table(sm, p, i, i << 8 | j,
-                          force_update, &qcfg->sl2vl))
+       for (j = 1; j < num_ports; j++)
+           if (sl2vl_update_table(sm, p, i, j, force_update,
+                          &qcfg->sl2vl))
                ret = -1;
    }
 
-- 
1.6.2.2




Re: [PATCH] opensm/qos.c: Revert port ranges for calls to sl2vl_update_table().

2010-05-21 Thread Jim Schutt

Sorry, I somehow got this wrong.

Corrected patch below.

-- Jim

On Fri, 2010-05-21 at 14:18 -0600, Jim Schutt wrote:
 Before commit 051a1dd5 (opensm/osm_qos.c: split switch external and end
 ports setup), osm_qos_setup() would end up calling sl2vl_update_table()
 for output ports 1-N, and input ports 0-N.
 
 Commit 051a1dd5 changed this around to be output ports 0-N, and input
 ports 1-N, and an InfiniScale IV based fabric would log lots of errors
 like these:
 
   log_rcv_cb_error: ERR 3111: Received MAD with error status = 0x1C
   SubnGetResp(SLtoVLMappingTable), attr_mod 0x2300, TID 0xad069
   Initial path: 0,1,1,4,13 Return path: 0,25,1,7,10
 
 The attr_mod in every such message has 0x00 in the least significant
 byte.
 
 With the port ranges restored to their old values, the above log messages
 stop.
 
 Signed-off-by: Jim Schutt jasc...@sandia.gov
 ---
  opensm/opensm/osm_qos.c |8 
  1 files changed, 4 insertions(+), 4 deletions(-)
 
 diff --git a/opensm/opensm/osm_qos.c b/opensm/opensm/osm_qos.c
 index 6bbbfa2..b8c3111 100644
 --- a/opensm/opensm/osm_qos.c
 +++ b/opensm/opensm/osm_qos.c
 @@ -230,12 +230,12 @@ static int qos_extports_setup(osm_sm_t * sm, osm_node_t *node,
                        &qcfg->sl2vl);
     }
  
 -   for (i = 1; i < num_ports; i++) {
 +   for (i = 0; i < num_ports; i++) {
         p = osm_node_get_physp_ptr(node, i);
         force_update = p->need_update || sm->p_subn->need_update;
 -       for (j = 0; j < num_ports; j++)
 -           if (sl2vl_update_table(sm, p, i, i << 8 | j,
 -                          force_update, &qcfg->sl2vl))
 +       for (j = 1; j < num_ports; j++)
 +           if (sl2vl_update_table(sm, p, i, j, force_update,
 +                          &qcfg->sl2vl))
                 ret = -1;
     }
  

diff --git a/opensm/opensm/osm_qos.c b/opensm/opensm/osm_qos.c
index 6bbbfa2..7d76c75 100644
@@ -230,10 +230,10 @@ static int qos_extports_setup(osm_sm_t * sm, osm_node_t *node,
                       &qcfg->sl2vl);
    }
 
-   for (i = 1; i < num_ports; i++) {
+   for (i = 0; i < num_ports; i++) {
        p = osm_node_get_physp_ptr(node, i);
        force_update = p->need_update || sm->p_subn->need_update;
-       for (j = 0; j < num_ports; j++)
+       for (j = 1; j < num_ports; j++)
            if (sl2vl_update_table(sm, p, i, i << 8 | j,
                           force_update, &qcfg->sl2vl))
                ret = -1;






Re: [PATCH] opensm/osm_dump.c: dump SL2VL tables

2010-04-16 Thread Jim Schutt
Hi Yevgeny,

If this patch is included on top of my torus-2QoS
patchset, you might want the addition below.

This is because my patchset causes SL2VL maps to
always be set when a routing engine registers its
intent to do so by providing an update_sl2vl() function,
without requiring opensm to be invoked with -Q.

So we want to dump SL2VL tables also in that case.

On Thu, 2010-03-25 at 09:56 -0600, Yevgeny Kliteynik wrote:
 Hi Sasha,
 
 Dumping SL2VL tables in ROUTING verbosity level when QoS is on.
 This is needed for SL2VL tables analysis in general, and for
 routing engines that are using IB VLs in particular, such as
 torus-2QoS.
 
 Signed-off-by: Yevgeny Kliteynik klit...@dev.mellanox.co.il
 ---
  opensm/opensm/osm_dump.c |   61 
 +-
  1 files changed, 60 insertions(+), 1 deletions(-)

[snip]

 @@ -630,6 +684,11 @@ void osm_dump_all(osm_opensm_t * osm)
         osm_dump_qmap_to_file(osm, "opensm.mcfdbs",
                       &osm->subn.sw_guid_tbl,
                       dump_mcast_routes, osm);
 +       /* SL2VL tables */
 +       if (osm->subn.opt.qos)

-       if (osm->subn.opt.qos)
+       if (osm->subn.opt.qos ||
+           (osm->routing_engine_used &&
+            osm->routing_engine_used->update_sl2vl))

 +           osm_dump_qmap_to_file(osm, "opensm-sl2vl.dump",
 +                         &osm->subn.port_guid_tbl,
 +                         dump_sl2vl_tbl, osm);
     }
     osm_dump_qmap_to_file(osm, "opensm-subnet.lst",
                   &osm->subn.node_guid_tbl, dump_topology_node,

Sorry for the delay in noticing this.

-- Jim
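
The condition proposed above can be sketched in isolation. The structures
below are hypothetical, cut-down stand-ins for osm_opensm_t and
struct osm_routing_engine, and the update_sl2vl() signature is simplified
for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct osm_routing_engine. */
struct routing_engine {
	/* NULL when the engine does not contribute to SL2VL maps. */
	void (*update_sl2vl)(void *context);
};

/* Hypothetical, simplified stand-in for the relevant osm_opensm_t state. */
struct sm_state {
	bool qos_enabled;				/* opensm run with -Q */
	const struct routing_engine *engine_used;	/* NULL if none */
};

/* Example callback so a non-NULL update_sl2vl can be registered. */
static void noop_update_sl2vl(void *context)
{
	(void)context;
}

/*
 * Dump SL2VL tables when QoS is on, or when the engine in use has
 * registered an update_sl2vl() callback.
 */
static bool should_dump_sl2vl(const struct sm_state *s)
{
	assert(s != NULL);
	return s->qos_enabled ||
	       (s->engine_used && s->engine_used->update_sl2vl);
}
```

The NULL check on the engine pointer comes first because no routing engine
may have been selected yet when the dump runs.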





Re: [PATCH 1/4] opensm: added function that dumps PathRecords

2010-04-14 Thread Jim Schutt
Hi Yevgeny,

On Thu, 2010-04-08 at 07:29 -0600, Yevgeny Kliteynik wrote:
 Dumping SL, MTU and Rate for all the
 non-switch-2-non-switch paths in the subnet.
 
 PRs that are dumped:
 
   for every non-switch source port
   for every non-switch target LID in the subnet
   dump PR between source port and target LID
 
 This way number of sources is equal to number of physical
 non-switch ports in the subnet, and only number of targets
 depends on LMC that is used.
 
 Signed-off-by: Yevgeny Kliteynik klit...@dev.mellanox.co.il
 ---

[snip]

 +
  /****f* OpenSM: MC Member Record Receiver/osm_mcmr_rcv_find_or_create_new_mgrp
  * NAME
  *osm_mcmr_rcv_find_or_create_new_mgrp
 diff --git a/opensm/opensm/osm_sa.c b/opensm/opensm/osm_sa.c
 index 8aab548..83da258 100644
 --- a/opensm/opensm/osm_sa.c
 +++ b/opensm/opensm/osm_sa.c
 @@ -718,6 +718,87 @@ int osm_sa_db_file_dump(osm_opensm_t * p_osm)
   return res;
  }
 
 +typedef struct _path_parms {
 + ib_net16_t pkey;
 + uint8_t mtu;
 + uint8_t rate;
 + uint8_t sl;
 + uint8_t pkt_life;
 + boolean_t reversible;
 +} path_parms_t;
 +
 +extern ib_api_status_t osm_get_path_params(IN osm_sa_t * sa,
 + IN const osm_port_t * p_src_port,
 + IN const osm_port_t * p_dest_port,
 + IN const uint16_t dlid_ho,
 + OUT path_parms_t * p_parms);
 +
 +static void sa_dump_path_records(osm_opensm_t * p_osm, FILE * file)
 +{
 + osm_port_t *p_src_port;
 + osm_port_t *p_dest_port;
 + osm_node_t *p_node;
 + uint16_t dlid_ho;
 + uint32_t vector_size;
 + osm_physp_t *p_physp;
 + path_parms_t path_parms;
 + ib_api_status_t status;
 +
 +   vector_size = cl_ptr_vector_get_size(&p_osm->subn.port_lid_tbl);
 +   for (p_src_port = (osm_port_t *) cl_qmap_head(&p_osm->subn.port_guid_tbl);
 +        p_src_port != (osm_port_t *) cl_qmap_end(&p_osm->subn.port_guid_tbl);
 +        p_src_port = (osm_port_t *) cl_qmap_next(&p_src_port->map_item)) {
 +
 +       p_node = p_src_port->p_node;
 +       if (p_node->node_info.node_type == IB_NODE_TYPE_SWITCH)
 +           return;

-   return;
+   continue;

Otherwise we stop dumping at the first switch we encounter?

-- Jim
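
The difference between the two statements can be shown with a small
standalone sketch. The node_type enum and both functions are hypothetical
and only model the control flow of the loop, not the OpenSM data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified node classification. */
enum node_type { NODE_CA, NODE_SWITCH };

/* Models the posted patch: the first switch ends the whole walk. */
static size_t dump_with_return(const enum node_type *nodes, size_t n)
{
	size_t i, dumped = 0;

	assert(nodes != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (nodes[i] == NODE_SWITCH)
			return dumped;	/* every later port is skipped */
		dumped++;
	}
	return dumped;
}

/* Models the suggested fix: switches are skipped, the walk goes on. */
static size_t dump_with_continue(const enum node_type *nodes, size_t n)
{
	size_t i, dumped = 0;

	assert(nodes != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (nodes[i] == NODE_SWITCH)
			continue;
		dumped++;
	}
	return dumped;
}
```

For a port table ordered CA, switch, CA, CA, the first variant handles one
source port and the second handles three.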





Re: [PATCH] opensm/osm_dump.c: dump SL2VL tables

2010-03-25 Thread Jim Schutt

Hi Yevgeny,

On Thu, 2010-03-25 at 09:56 -0600, Yevgeny Kliteynik wrote:
 Hi Sasha,
 
 Dumping SL2VL tables in ROUTING verbosity level when QoS is on.
 This is needed for SL2VL tables analysis in general, and for
 routing engines that are using IB VLs in particular, such as
 torus-2QoS.
 

Very cool.  Thanks.

-- Jim

 Signed-off-by: Yevgeny Kliteynik klit...@dev.mellanox.co.il
 ---
  opensm/opensm/osm_dump.c |   61 
 +-
  1 files changed, 60 insertions(+), 1 deletions(-)





Re: [PATCH v2 15/15] opensm: Cause status of unicast routing attempt to propagate to callers of osm_ucast_mgr_process().

2010-03-11 Thread Jim Schutt

On Wed, 2010-03-10 at 11:06 -0700, Jim Schutt wrote:
 If unicast routing fails, there is no point to continuing with fabric 
 bring-up.
 Just restart a new heavy sweep instead.
 
 Signed-off-by: Jim Schutt jasc...@sandia.gov
 ---
  opensm/opensm/osm_state_mgr.c |   12 +---
  opensm/opensm/osm_ucast_mgr.c |   14 +-
  2 files changed, 18 insertions(+), 8 deletions(-)
 
 diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c
 index 96ad348..e666034 100644
 --- a/opensm/opensm/osm_state_mgr.c
 +++ b/opensm/opensm/osm_state_mgr.c
 @@ -1140,7 +1140,11 @@ static void do_sweep(osm_sm_t * sm)
         /* Re-program the switches fully */
         sm->p_subn->ignore_existing_lfts = TRUE;
  
 -       osm_ucast_mgr_process(&sm->ucast_mgr);
 +       if (osm_ucast_mgr_process(&sm->ucast_mgr)) {
 +           OSM_LOG_MSG_BOX(sm->p_log, OSM_LOG_VERBOSE,
 +                   "REROUTE FAILED");
 +           return;
 +       }
         osm_qos_setup(sm->p_subn->p_osm);
  
         /* Reset flag */
 @@ -1299,12 +1303,14 @@ repeat_discovery:
             "LID ASSIGNMENT COMPLETE - STARTING SWITCH TABLE CONFIG");
  
     /*
 -    * Proceed with unicast forwarding table configuration.
 +    * Proceed with unicast forwarding table configuration; repeat
 +    * if unicast routing fails.
      */
  
     if (!sm->ucast_mgr.cache_valid ||
         osm_ucast_cache_process(&sm->ucast_mgr))
 -       osm_ucast_mgr_process(&sm->ucast_mgr);
 +       if (osm_ucast_mgr_process(&sm->ucast_mgr))
 +           goto repeat_discovery;
  
     osm_qos_setup(sm->p_subn->p_osm);
  

Sorry I missed this: do_sweep() should just return early on 
unicast route failure.

If osm_ucast_mgr_process() fails, no configured routing engine was able
to route the fabric.  In that case, do_sweep() should just return,
and a new sweep will be triggered either on a trap due to a fabric
change, or by the configured sweep_interval.

I think this should just be:

@@ -1299,12 +1303,14 @@ repeat_discovery:
            "LID ASSIGNMENT COMPLETE - STARTING SWITCH TABLE CONFIG");
 
    /*
-    * Proceed with unicast forwarding table configuration.
+    * Proceed with unicast forwarding table configuration; if it fails
+    * return early to wait for a trap or the next sweep interval.
     */
 
    if (!sm->ucast_mgr.cache_valid ||
        osm_ucast_cache_process(&sm->ucast_mgr))
-       osm_ucast_mgr_process(&sm->ucast_mgr);
+       if (osm_ucast_mgr_process(&sm->ucast_mgr))
+           return;
 
    osm_qos_setup(sm->p_subn->p_osm);
 


 diff --git a/opensm/opensm/osm_ucast_mgr.c b/opensm/opensm/osm_ucast_mgr.c
 index fbc9244..8ea2e52 100644
 --- a/opensm/opensm/osm_ucast_mgr.c
 +++ b/opensm/opensm/osm_ucast_mgr.c
 @@ -955,6 +955,7 @@ int osm_ucast_mgr_process(IN osm_ucast_mgr_t * p_mgr)
     osm_opensm_t *p_osm;
     struct osm_routing_engine *p_routing_eng;
     cl_qmap_t *p_sw_guid_tbl;
 +   int failed = 0;
  
     OSM_LOG_ENTER(p_mgr->p_log);
  
 @@ -973,7 +974,8 @@ int osm_ucast_mgr_process(IN osm_ucast_mgr_t * p_mgr)
  
     p_osm->routing_engine_used = NULL;
     while (p_routing_eng) {
 -       if (!ucast_mgr_route(p_routing_eng, p_osm))
 +       failed = ucast_mgr_route(p_routing_eng, p_osm);
 +       if (!failed)
             break;
         p_routing_eng = p_routing_eng->next;
     }
 @@ -984,9 +986,11 @@ int osm_ucast_mgr_process(IN osm_ucast_mgr_t * p_mgr)
         struct osm_routing_engine *r = p_osm->default_routing_engine;
  
         r->build_lid_matrices(r->context);
 -       r->ucast_build_fwd_tables(r->context);
 -       p_osm->routing_engine_used = r;
 -       osm_ucast_mgr_set_fwd_tables(p_mgr);
 +       failed = r->ucast_build_fwd_tables(r->context);
 +       if (!failed) {
 +           p_osm->routing_engine_used = r;
 +           osm_ucast_mgr_set_fwd_tables(p_mgr);
 +       }
     }
  
     if (p_osm->routing_engine_used) {
 @@ -1006,7 +1010,7 @@ int osm_ucast_mgr_process(IN osm_ucast_mgr_t * p_mgr)
  Exit:
     CL_PLOCK_RELEASE(p_mgr->p_lock);
     OSM_LOG_EXIT(p_mgr->p_log);
 -   return 0;
 +   return failed;
  }
  }
  
  static int ucast_build_lid_matrices(void *context)
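
The status propagation in the quoted patch can be sketched as follows. This
is a simplified model under stated assumptions, not the OpenSM
implementation: struct engine, route_fabric() and the two example route
callbacks are hypothetical, and a route callback is assumed to return 0 on
success.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified routing engine; route() returns 0 on success. */
struct engine {
	int (*route)(void *context);
	void *context;
	struct engine *next;
};

/* Example callbacks used to exercise the sketch. */
static int route_ok(void *context)
{
	(void)context;
	return 0;
}

static int route_fail(void *context)
{
	(void)context;
	return -1;
}

/*
 * Try each configured engine in order, then the fallback engine.
 * Returns 0 and sets *used to the engine that succeeded; otherwise
 * returns the last failure status and leaves *used set to NULL, so
 * the caller can abandon the sweep instead of continuing bring-up.
 */
static int route_fabric(struct engine *list, struct engine *fallback,
			struct engine **used)
{
	struct engine *e;
	int failed;

	assert(fallback != NULL && used != NULL);
	*used = NULL;
	for (e = list; e; e = e->next) {
		failed = e->route(e->context);
		if (!failed) {
			*used = e;
			return 0;
		}
	}

	failed = fallback->route(fallback->context);
	if (!failed)
		*used = fallback;
	return failed;
}
```

A caller in the position of do_sweep() would test the return value and
return early on a nonzero status, leaving the next sweep to be triggered by
a trap or by the sweep interval.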




[PATCH v2 04/15] opensm: Track the minimum value in the fabric of data VLs supported.

2010-03-10 Thread Jim Schutt
A routing engine that wants to make contributions to SL2VL maps in support
of routing free from credit loops may need to know the minimum number
of supported data VLs in the fabric.

This code tracks that value.

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/include/opensm/osm_subnet.h |1 +
 opensm/opensm/osm_port_info_rcv.c  |   13 -
 opensm/opensm/osm_state_mgr.c  |6 ++
 opensm/opensm/osm_subnet.c |1 +
 4 files changed, 20 insertions(+), 1 deletions(-)

diff --git a/opensm/include/opensm/osm_subnet.h 
b/opensm/include/opensm/osm_subnet.h
index 3970e98..d74a57c 100644
--- a/opensm/include/opensm/osm_subnet.h
+++ b/opensm/include/opensm/osm_subnet.h
@@ -520,6 +520,7 @@ typedef struct osm_subn {
uint16_t max_mcast_lid_ho;
uint8_t min_ca_mtu;
uint8_t min_ca_rate;
+   uint8_t min_data_vls;
boolean_t ignore_existing_lfts;
boolean_t subnet_initialization_error;
boolean_t force_heavy_sweep;
diff --git a/opensm/opensm/osm_port_info_rcv.c 
b/opensm/opensm/osm_port_info_rcv.c
index 9260047..c05301e 100644
--- a/opensm/opensm/osm_port_info_rcv.c
+++ b/opensm/opensm/osm_port_info_rcv.c
@@ -83,6 +83,7 @@ static void pi_rcv_process_endport(IN osm_sm_t * sm, IN osm_physp_t * p_physp,
    ib_api_status_t status;
    ib_net64_t port_guid;
    uint8_t rate, mtu;
+   unsigned data_vls;
    cl_qmap_t *p_sm_tbl;
    osm_remote_sm_t *p_sm;
 
@@ -92,7 +93,7 @@ static void pi_rcv_process_endport(IN osm_sm_t * sm, IN osm_physp_t * p_physp,
 
    /* HACK extended port 0 should be handled too! */
    if (osm_physp_get_port_num(p_physp) != 0) {
-       /* track the minimal endport MTU and rate */
+       /* track the minimal endport MTU, rate, and operational VLs */
        mtu = ib_port_info_get_mtu_cap(p_pi);
        if (mtu < sm->p_subn->min_ca_mtu) {
            OSM_LOG(sm->p_log, OSM_LOG_VERBOSE,
@@ -108,6 +109,16 @@ static void pi_rcv_process_endport(IN osm_sm_t * sm, IN osm_physp_t * p_physp,
                PRIx64 "\n", rate, cl_ntoh64(port_guid));
            sm->p_subn->min_ca_rate = rate;
        }
+
+       data_vls = 1U << (ib_port_info_get_op_vls(p_pi) - 1);
+       if (data_vls >= IB_MAX_NUM_VLS)
+           data_vls = IB_MAX_NUM_VLS - 1;
+       if ((uint8_t)data_vls < sm->p_subn->min_data_vls) {
+           OSM_LOG(sm->p_log, OSM_LOG_VERBOSE,
+               "Setting endport minimal data VLs to:%u defined by port:0x%"
+               PRIx64 "\n", data_vls, cl_ntoh64(port_guid));
+           sm->p_subn->min_data_vls = data_vls;
+       }
    }
 
    if (port_guid != sm->p_subn->sm_port_guid) {
diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c
index 6fcccba..96ad348 100644
--- a/opensm/opensm/osm_state_mgr.c
+++ b/opensm/opensm/osm_state_mgr.c
@@ -1164,6 +1164,12 @@ repeat_discovery:
    sm->p_subn->force_reroute = FALSE;
    sm->p_subn->subnet_initialization_error = FALSE;
 
+   /* Reset tracking values in case limiting component got removed
+    * from fabric. */
+   sm->p_subn->min_ca_mtu = IB_MAX_MTU;
+   sm->p_subn->min_ca_rate = IB_MAX_RATE;
+   sm->p_subn->min_data_vls = IB_MAX_NUM_VLS - 1;
+
    /* rescan configuration updates */
    if (!config_parsed && osm_subn_rescan_conf_files(sm->p_subn) < 0)
        OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 331A: "
diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c
index e4126bc..55b9384 100644
--- a/opensm/opensm/osm_subnet.c
+++ b/opensm/opensm/osm_subnet.c
@@ -525,6 +525,7 @@ ib_api_status_t osm_subn_init(IN osm_subn_t * p_subn, IN osm_opensm_t * p_osm,
    p_subn->max_mcast_lid_ho = IB_LID_MCAST_END_HO;
    p_subn->min_ca_mtu = IB_MAX_MTU;
    p_subn->min_ca_rate = IB_MAX_RATE;
+   p_subn->min_data_vls = IB_MAX_NUM_VLS - 1;
    p_subn->ignore_existing_lfts = TRUE;
 
    /* we assume master by default - so we only need to set it true if STANDBY */
-- 
1.6.6.1
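
The data VL computation in this patch can be sketched on its own. The
sketch assumes the PortInfo OperationalVLs encoding of 1 through 5 (VL0,
VL0-1, VL0-3, VL0-7, VL0-14) and assumes MAX_NUM_VLS mirrors
IB_MAX_NUM_VLS with a value of 16. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: same value as IB_MAX_NUM_VLS (15 data VLs plus VL15). */
#define MAX_NUM_VLS 16

/*
 * Convert an OperationalVLs code (1..5) to a count of data VLs.
 * Code 5 would give 16 by the shift alone, so it is capped at 15.
 */
static unsigned data_vls_from_op_vls(unsigned op_vls)
{
	unsigned data_vls;

	assert(op_vls >= 1 && op_vls <= 5);
	data_vls = 1U << (op_vls - 1);
	if (data_vls >= MAX_NUM_VLS)
		data_vls = MAX_NUM_VLS - 1;
	return data_vls;
}

/* Fold one port's OperationalVLs into the running fabric minimum. */
static uint8_t track_min_data_vls(uint8_t current_min, unsigned op_vls)
{
	unsigned data_vls = data_vls_from_op_vls(op_vls);

	return data_vls < current_min ? (uint8_t)data_vls : current_min;
}
```

Starting each sweep from MAX_NUM_VLS - 1 and folding in every end port
leaves the smallest data VL count seen in the fabric, which is the value a
routing engine would consult before writing SL2VL maps.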




[PATCH v2 06/15] opensm: Make mcast_mgr_purge_tree() available outside osm_mcast_mgr.c.

2010-03-10 Thread Jim Schutt
A routing engine that needs to compute multicast spanning trees with
special properties will need to delete old trees.  There's already
a function that does this: mcast_mgr_purge_tree().

Make it available outside osm_mcast_mgr.c, and change the name
to follow the naming convention (osm_ prefix) for global functions.

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/include/opensm/osm_multicast.h |   33 +
 opensm/opensm/osm_mcast_mgr.c |4 ++--
 2 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/opensm/include/opensm/osm_multicast.h 
b/opensm/include/opensm/osm_multicast.h
index 1da575d..df6ac6c 100644
--- a/opensm/include/opensm/osm_multicast.h
+++ b/opensm/include/opensm/osm_multicast.h
@@ -53,6 +53,7 @@
 #include <opensm/osm_mcm_port.h>
 #include <opensm/osm_subnet.h>
 #include <opensm/osm_log.h>
+#include <opensm/osm_sm.h>
 
 #ifdef __cplusplus
 #  define BEGIN_C_DECLS extern "C" {
@@ -193,6 +194,38 @@ osm_mgrp_t *osm_mgrp_new(IN osm_subn_t * subn, IN ib_net16_t mlid,
 *  Multicast Group, osm_mgrp_delete
 */
 
+/*
+ * Need a forward declaration to work around include loop:
+ * osm_sm.h <-> osm_multicast.h
+ */
+struct osm_sm;
+
+/****f* OpenSM: Multicast Tree/osm_purge_mtree
+* NAME
+*  osm_purge_mtree
+*
+* DESCRIPTION
+*  Frees all the nodes in a multicast spanning tree
+*
+* SYNOPSIS
+*/
+void osm_purge_mtree(IN struct osm_sm * sm, IN osm_mgrp_box_t * mgb);
+/*
+* PARAMETERS
+*  sm
+*  [in] Pointer to osm_sm_t object.
+*  mgb
+*  [in] Pointer to an osm_mgrp_box_t object.
+*
+* RETURN VALUES
+*  None.
+*
+*
+* NOTES
+*
+* SEE ALSO
+*/
+
 /****f* OpenSM: Multicast Group/osm_mgrp_is_guid
 * NAME
 *  osm_mgrp_is_guid
diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c
index bd67d4e..e6db6db 100644
--- a/opensm/opensm/osm_mcast_mgr.c
+++ b/opensm/opensm/osm_mcast_mgr.c
@@ -146,7 +146,7 @@ static void mcast_mgr_purge_tree_node(IN osm_mtree_node_t * p_mtn)
    free(p_mtn);
 }
 
-static void mcast_mgr_purge_tree(osm_sm_t * sm, IN osm_mgrp_box_t * mbox)
+void osm_purge_mtree(osm_sm_t * sm, IN osm_mgrp_box_t * mbox)
 {
    OSM_LOG_ENTER(sm->p_log);
 
@@ -735,7 +735,7 @@ static ib_api_status_t mcast_mgr_build_spanning_tree(osm_sm_t * sm,
       on multicast forwarding table information if the user wants to
       preserve existing multicast routes.
     */
-   mcast_mgr_purge_tree(sm, mbox);
+   osm_purge_mtree(sm, mbox);
 
    /* build the first subset containing all member ports */
    if (make_port_list(&port_list, mbox)) {
-- 
1.6.6.1




[PATCH v2 05/15] opensm: Add struct osm_routing_engine callback to build spanning trees for multicast.

2010-03-10 Thread Jim Schutt
If a routing engine needs to compute spanning trees with special
properties, it needs a way to override the default implementation.
A routing engine callback provides that mechanism.  Routing engines
that can use the default implementation can leave the callback
pointer set to NULL.

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/include/opensm/osm_opensm.h |6 ++
 opensm/opensm/osm_mcast_mgr.c  |7 ++-
 2 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/opensm/include/opensm/osm_opensm.h 
b/opensm/include/opensm/osm_opensm.h
index 734a6db..fddcf53 100644
--- a/opensm/include/opensm/osm_opensm.h
+++ b/opensm/include/opensm/osm_opensm.h
@@ -132,6 +132,8 @@ struct osm_routing_engine {
uint8_t (*path_sl)(void *context, IN uint8_t path_sl_hint,
   IN const osm_port_t *src_port,
   IN const osm_port_t *dst_port);
+   ib_api_status_t (*mcast_build_stree)(void *context,
+IN OUT osm_mgrp_box_t *mgb);
void (*delete) (void *context);
struct osm_routing_engine *next;
 };
@@ -165,6 +167,10 @@ struct osm_routing_engine {
 *  path_sl
 *  The callback for computing path SL.
 *
+*  mcast_build_stree
+*  The callback for building the spanning tree for multicast
+*  forwarding, called per MLID.
+*
 *  delete
 *  The delete method, may be used for routing engine
 *  internals cleanup.
diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c
index 322635d..bd67d4e 100644
--- a/opensm/opensm/osm_mcast_mgr.c
+++ b/opensm/opensm/osm_mcast_mgr.c
@@ -986,6 +986,7 @@ Exit:
 static ib_api_status_t mcast_mgr_process_mlid(osm_sm_t * sm, uint16_t mlid)
 {
    ib_api_status_t status = IB_SUCCESS;
+   struct osm_routing_engine *re = sm->p_subn->p_osm->routing_engine_used;
    osm_mgrp_box_t *mbox;
 
    OSM_LOG_ENTER(sm->p_log);
@@ -1000,7 +1001,11 @@ static ib_api_status_t mcast_mgr_process_mlid(osm_sm_t * sm, uint16_t mlid)
 
    mbox = osm_get_mbox_by_mlid(sm->p_subn, cl_hton16(mlid));
    if (mbox) {
-       status = mcast_mgr_build_spanning_tree(sm, mbox);
+       if (re && re->mcast_build_stree)
+           status = re->mcast_build_stree(re->context, mbox);
+       else
+           status = mcast_mgr_build_spanning_tree(sm, mbox);
+
        if (status != IB_SUCCESS)
            OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 0A17: "
                "Unable to create spanning tree (%s) for mlid "
-- 
1.6.6.1
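
The optional-callback pattern this patch adds can be sketched in a
self-contained form. Everything below is hypothetical and simplified: the
struct, the enum that records which builder ran, and both builder functions
exist only for illustration and do not match the OpenSM signatures.

```c
#include <assert.h>
#include <stddef.h>

/* Records which implementation ran; for illustration only. */
enum builder { BUILDER_DEFAULT, BUILDER_ENGINE };

/* Hypothetical, simplified routing engine. */
struct engine {
	void *context;
	/* Optional override; NULL selects the default implementation. */
	enum builder (*build_stree)(void *context, unsigned mlid);
};

/* Stand-in for the default spanning tree builder. */
static enum builder default_build_stree(unsigned mlid)
{
	(void)mlid;
	return BUILDER_DEFAULT;
}

/* Stand-in for an engine-specific spanning tree builder. */
static enum builder engine_build_stree(void *context, unsigned mlid)
{
	(void)context;
	(void)mlid;
	return BUILDER_ENGINE;
}

/* Dispatch per MLID: use the engine's callback only if one is set. */
static enum builder build_stree(const struct engine *re, unsigned mlid)
{
	if (re && re->build_stree)
		return re->build_stree(re->context, mlid);
	return default_build_stree(mlid);
}
```

Checking both the engine pointer and the callback pointer lets engines that
are satisfied with the default builder leave the field NULL, which is the
behavior the commit message describes.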




[PATCH v2 01/15] opensm: Prepare for routing engine input to path record SL lookup and SL2VL map setup.

2010-03-10 Thread Jim Schutt
In the event a routing engine needs to participate in SL assignment and
SL2VL map setup in order to avoid credit loops in a fabric, it will be
useful to make the routing engine context more widely available.

To this end, have osm_opensm_t save a pointer to the routing engine used,
rather than its type.  This will make the routing engine context easily
available in, e.g., sl2vl_update() and pr_rcv_get_path_parms().

Make the necessary adjustments to the code that used the old
routing_engine_used as an enum _osm_routing_engine_type.  In order to
keep the behavior where minhop was used if the configured routing engines
failed, the easiest solution was to add a pointer to osm_opensm_t which
pointed to the minhop struct osm_routing_engine.

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/include/opensm/osm_opensm.h |4 ++-
 opensm/opensm/osm_console.c|   10 ++--
 opensm/opensm/osm_dump.c   |3 +-
 opensm/opensm/osm_link_mgr.c   |5 ++-
 opensm/opensm/osm_opensm.c |   43 +---
 opensm/opensm/osm_sa_path_record.c |3 +-
 opensm/opensm/osm_ucast_lash.c |3 +-
 opensm/opensm/osm_ucast_mgr.c  |   17 --
 8 files changed, 54 insertions(+), 34 deletions(-)

diff --git a/opensm/include/opensm/osm_opensm.h 
b/opensm/include/opensm/osm_opensm.h
index c6c9bdb..e97142e 100644
--- a/opensm/include/opensm/osm_opensm.h
+++ b/opensm/include/opensm/osm_opensm.h
@@ -120,6 +120,7 @@ typedef enum _osm_routing_engine_type {
 *  added later.
 */
 struct osm_routing_engine {
+   osm_routing_engine_type_t type;
const char *name;
void *context;
int (*build_lid_matrices) (void *context);
@@ -183,7 +184,8 @@ typedef struct osm_opensm {
cl_dispatcher_t disp;
cl_plock_t lock;
struct osm_routing_engine *routing_engine_list;
-   osm_routing_engine_type_t routing_engine_used;
+   struct osm_routing_engine *routing_engine_used;
+   struct osm_routing_engine *default_routing_engine;
osm_stats_t stats;
osm_console_t console;
nn_map_t *node_name_map;
diff --git a/opensm/opensm/osm_console.c b/opensm/opensm/osm_console.c
index a27bee3..31394a7 100644
--- a/opensm/opensm/osm_console.c
+++ b/opensm/opensm/osm_console.c
@@ -372,6 +372,8 @@ static void print_status(osm_opensm_t * p_osm, FILE * out)
    cl_list_item_t *item;
 
    if (out) {
+       const char *re_str;
+
        cl_plock_acquire(&p_osm->lock);
        fprintf(out, "OpenSM Version   : %s\n", p_osm->osm_version);
        fprintf(out, "SM State : %s\n",
@@ -380,9 +382,11 @@ static void print_status(osm_opensm_t * p_osm, FILE * out)
            p_osm->subn.opt.sm_priority);
        fprintf(out, "SA State : %s\n",
            sa_state_str(p_osm->sa.state));
-       fprintf(out, "Routing Engine   : %s\n",
-           osm_routing_engine_type_str(p_osm->
-                           routing_engine_used));
+
+       re_str = p_osm->routing_engine_used ?
+           osm_routing_engine_type_str(p_osm->routing_engine_used->type) :
+           osm_routing_engine_type_str(OSM_ROUTING_ENGINE_TYPE_NONE);
+       fprintf(out, "Routing Engine   : %s\n", re_str);
 
        fprintf(out, "Loaded event plugins :");
        if (cl_qlist_head(&p_osm->plugin_list) ==
diff --git a/opensm/opensm/osm_dump.c b/opensm/opensm/osm_dump.c
index 86e9c00..f3f4623 100644
--- a/opensm/opensm/osm_dump.c
+++ b/opensm/opensm/osm_dump.c
@@ -135,7 +135,8 @@ static void dump_ucast_routes(cl_map_item_t * item, FILE * file, void *cxt)
        "Switch 0x%016" PRIx64 "\nLID: Port : Hops : Optimal\n",
        cl_ntoh64(osm_node_get_node_guid(p_node)));
 
-   dor = (p_osm->routing_engine_used == OSM_ROUTING_ENGINE_TYPE_DOR);
+   dor = (p_osm->routing_engine_used &&
+          p_osm->routing_engine_used->type == OSM_ROUTING_ENGINE_TYPE_DOR);
 
    for (lid_ho = 1; lid_ho <= max_lid_ho; lid_ho++) {
        fprintf(file, "0x%04X : ", lid_ho);
diff --git a/opensm/opensm/osm_link_mgr.c b/opensm/opensm/osm_link_mgr.c
index 03a585b..aaeebc7 100644
--- a/opensm/opensm/osm_link_mgr.c
+++ b/opensm/opensm/osm_link_mgr.c
@@ -64,8 +64,9 @@ static uint8_t link_mgr_get_smsl(IN osm_sm_t * sm, IN osm_physp_t * p_physp)
 
    OSM_LOG_ENTER(sm->p_log);
 
-   if (p_osm->routing_engine_used != OSM_ROUTING_ENGINE_TYPE_LASH
-       || !(slid = osm_physp_get_base_lid(p_physp))) {
+   if (!(p_osm->routing_engine_used &&
+         p_osm->routing_engine_used->type == OSM_ROUTING_ENGINE_TYPE_LASH &&
+         (slid = osm_physp_get_base_lid(p_physp)))) {
        /* Use default SL if lash routing is not used */
        OSM_LOG_EXIT(sm->p_log);
        return sm->p_subn->opt.sm_sl;
diff --git
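
The move from an enum to a pointer means every former comparison against an
engine type now needs a NULL check first, as the hunks above show. A minimal
sketch of that pattern follows; the types and helper names are hypothetical
and much smaller than the OpenSM definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified engine types. */
enum engine_type { ENGINE_NONE, ENGINE_MINHOP, ENGINE_DOR, ENGINE_LASH };

/* Hypothetical, simplified routing engine carrying its own type. */
struct engine {
	enum engine_type type;
	const char *name;
	void *context;
};

/* A NULL pointer means no engine has routed the fabric. */
static enum engine_type engine_type_used(const struct engine *used)
{
	return used ? used->type : ENGINE_NONE;
}

/* NULL-safe replacement for "routing_engine_used == TYPE". */
static int engine_is(const struct engine *used, enum engine_type type)
{
	return used && used->type == type;
}
```

Keeping the type inside the engine structure preserves the old comparisons
while also giving callers access to the engine's context pointer.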

[PATCH v2 08/15] opensm: Update documentation to describe torus-2QoS.

2010-03-10 Thread Jim Schutt

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/doc/current-routing.txt |  269 +++-
 opensm/man/opensm.8.in |9 ++-
 2 files changed, 275 insertions(+), 3 deletions(-)

diff --git a/opensm/doc/current-routing.txt b/opensm/doc/current-routing.txt
index 1302860..78a2e01 100644
--- a/opensm/doc/current-routing.txt
+++ b/opensm/doc/current-routing.txt
@@ -1,7 +1,7 @@
 Current OpenSM Routing
-7/9/07
+10/9/09
 
-OpenSM offers five routing engines:
+OpenSM offers six routing engines:
 
 1.  Min Hop Algorithm - based on the minimum hops to each node where the
 path length is optimized.
@@ -28,6 +28,13 @@ two switches.  This provides deadlock free routes for 
hypercubes when
 the fabric is cabled as a hypercube and for meshes when cabled as a
 mesh (see details below).
 
+6. Torus-2QoS unicast routing algorithm - a DOR-based routing algorithm
+specialized for 2D/3D torus topologies.  Torus-2QoS provides deadlock-free
+routing while supporting two quality of service (QoS) levels.  In addition
+it is able to route around multiple failed fabric links or a single failed
+fabric switch without introducing deadlocks, and without changing path SL
+values granted before the failure.
+
 OpenSM provides an optional unicast routing cache (enabled by -A or
 --ucast_cache options). When enabled, unicast routing cache prevents
 routing recalculation (which is a heavy task in a large cluster) when
@@ -388,3 +395,261 @@ ports, one port on one end of the cable, and the other port on the
 other end, continuing along the mesh dimension.
 
 Use '-R dor' option to activate the DOR algorithm.
+
+Torus-2QoS Routing Algorithm
+
+
+Torus-2QoS is a routing algorithm designed for large-scale 2D/3D torus fabrics.
+The torus-2QoS routing engine can provide the following functionality on
+a 2D/3D torus:
+  - routing that is free of credit loops
+  - two levels of QoS, assuming switches support 8 data VLs
+  - ability to route around a single failed switch, and/or multiple failed
+    links, without
+    - introducing credit loops
+    - changing path SL values
+  - very short run times, with good scaling properties as fabric size
+    increases
+
+Torus-2QoS is a DOR-based algorithm that avoids deadlocks that would otherwise
+occur in a torus using the concept of a dateline for each torus dimension.
+It encodes into a path SL which datelines the path crosses as follows:
+
+  sl = 0;
+  for (d = 0; d < torus_dimensions; d++)
+    /* path_crosses_dateline(d) returns 0 or 1 */
+    sl |= path_crosses_dateline(d) << d;
+
+For a 3D torus, that leaves one SL bit free, which torus-2QoS uses to
+implement two QoS levels.
+
+This is possible because torus-2QoS also makes use of the output port
+dependence of the switch SL2VL maps.  It computes in which torus coordinate
+direction each interswitch link points, and writes SL2VL maps for such
+ports as follows:
+
+  for (sl = 0; sl < 16; sl ++)
+    /* cdir(port) reports which torus coordinate direction a switch port
+     * points in, and returns 0, 1, or 2 */
+    sl2vl(iport,oport,sl) = 0x1 & (sl >> cdir(oport));
+
+Thus torus-2QoS consumes 8 SL values (SL bits 0-2) and 2 VL values (VL bit 0)
+per QoS level to provide deadlock-free routing on a 3D torus.
+
+Torus-2QoS routes around link failure by taking the long way around any
+1D ring interrupted by a link failure.  For example, consider the 2D 6x5
+torus below, where switches are denoted by [+a-zA-Z]:
+
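+            |    |    |    |    |    |
+       4  --+----+----+----+----+----+--
+            |    |    |    |    |    |
+       3  --+----+----+----D----+----+--
+            |    |    |    |    |    |
+       2  --+----+----I----r----+----+--
+            |    |    |    |    |    |
+       1  --m----S----n----T----o----p--
+            |    |    |    |    |    |
+     y=0  --+----+----+----+----+----+--
+            |    |    |    |    |    |
+
+          x=0    1    2    3    4    5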
+
+For a pristine fabric the path from S to D would be S-n-T-r-D.  In the
+event that either link S-n or n-T has failed, torus-2QoS would use the path
+S-m-p-o-T-r-D.  Note that it can do this without changing the path SL
+value; once the 1D ring m-S-n-T-o-p-m has been broken by failure, path
+segments using it cannot contribute to deadlock, and the x-direction
+dateline (between, say, x=5 and x=0) can be ignored for path segments on
+that ring.
+
+One result of this is that torus-2QoS can route around many simultaneous
+link failures, as long as no 1D ring is broken into disjoint regions.  For
+example, if links n-T and T-o have both failed, that ring has been broken
+into two disjoint regions, T and o-p-m-S-n.  Torus-2QoS checks for such
+issues, reports if they are found, and refuses to route such fabrics.
+
+Handling a failed switch under DOR requires introducing into a path at
+least one turn that would be otherwise illegal, i.e. not allowed by DOR
+rules.  Torus-2QoS will introduce such a turn as close as possible to the
+failed switch
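
The two pseudocode fragments in the documentation patch above (path SL from dateline crossings, and the per-output-port SL2VL map) can be restated as a small self-contained C sketch. The function names, the `crosses[]` array and the `qos_level` argument are hypothetical and exist only for illustration; they are not the identifiers used in osm_torus.c. Placing the QoS level in SL bit 3 follows the description elsewhere in this patch series; how the QoS level selects a VL is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: crosses[d] is nonzero if the path crosses the
 * dateline of torus dimension d.  qos_level (0 or 1) goes into SL bit 3. */
static uint8_t torus_path_sl(const int crosses[], unsigned dims,
			     unsigned qos_level)
{
	uint8_t sl = 0;
	unsigned d;

	assert(dims <= 3 && qos_level <= 1);
	for (d = 0; d < dims; d++)
		sl |= (uint8_t)((crosses[d] ? 1u : 0u) << d);
	return (uint8_t)(sl | (qos_level << 3));
}

/* Dateline VL bit for an interswitch output port that points in torus
 * coordinate direction cdir (0, 1, or 2), exactly as in the pseudocode:
 * 0x1 & (sl >> cdir). */
static uint8_t torus_sl2vl(uint8_t sl, unsigned cdir)
{
	assert(sl < 16 && cdir <= 2);
	return (uint8_t)(0x1 & (sl >> cdir));
}
```

For a path that crosses the x and z datelines but not the y dateline, this gives SL 5 at the first QoS level, and the dateline VL bit is set only on output ports pointing in the x or z direction.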

[PATCH v2 02/15] opensm: Allow the routing engine to influence SL2VL calculations.

2010-03-10 Thread Jim Schutt
Note that the original code assumes that QoS setup is mostly static and
based only on user configuration.  As a result, there is no provision for
routing engines that want to compute contributions to the SL2VL maps.

Fix this up by adding a callback to struct osm_routing_engine that computes
a per-port SL2VL map, and call it from the appropriate place in the QoS
setup path.  Assume that if a routing engine provides an update_sl2vl()
callback then there will be input-port dependence in the SL2VL maps, and
so do not attempt to use optimized SL2VL map programming even if the
switch supports it.

Also need to move the call to osm_qos_setup() in do_sweep() to after the
call to the routing engine, so that any SL2VL map contributions from the
routing engine are based on the latest information.  Need to call
osm_qos_setup() for requested reroute for the same reason.
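
The callback flow described above can be sketched in isolation as follows. This is a minimal illustration under stated assumptions, not the patch itself: `struct slvl_table`, `struct routing_engine`, `pick_sl2vl()` and `toy_update()` are simplified stand-ins invented for this sketch (the real ib_slvl_table_t packs two VLs per byte, and the real callback also receives the osm_physp_t).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the OpenSM types; not the real definitions. */
struct slvl_table { uint8_t vl[16]; };

struct routing_engine {
	void *context;
	/* Optional: lets the engine adjust the SL2VL map for one
	 * (in_port, out_port) pair. */
	void (*update_sl2vl)(void *context, uint8_t in_port,
			     uint8_t out_port, struct slvl_table *t);
};

/* Return the map to program for (in_port, out_port): the configured map,
 * or a per-port copy adjusted by the routing engine if it has a callback. */
static const struct slvl_table *
pick_sl2vl(const struct routing_engine *re,
	   const struct slvl_table *configured, struct slvl_table *scratch,
	   uint8_t in_port, uint8_t out_port)
{
	assert(configured && scratch);
	if (!re || !re->update_sl2vl)
		return configured;
	*scratch = *configured;
	re->update_sl2vl(re->context, in_port, out_port, scratch);
	return scratch;
}

/* Toy callback for illustration only: map every SL to VL (out_port & 1). */
static void toy_update(void *context, uint8_t in_port, uint8_t out_port,
		       struct slvl_table *t)
{
	size_t sl;

	(void)context;
	(void)in_port;
	for (sl = 0; sl < 16; sl++)
		t->vl[sl] = out_port & 1;
}
```

Copying the configured map before calling the engine keeps the user's QoS configuration untouched, so the same base map can be reused for every port pair.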

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/include/opensm/osm_opensm.h |   12 
 opensm/opensm/osm_qos.c|   27 +++
 opensm/opensm/osm_state_mgr.c  |5 +++--
 3 files changed, 38 insertions(+), 6 deletions(-)

diff --git a/opensm/include/opensm/osm_opensm.h b/opensm/include/opensm/osm_opensm.h
index e97142e..25a6f90 100644
--- a/opensm/include/opensm/osm_opensm.h
+++ b/opensm/include/opensm/osm_opensm.h
@@ -126,6 +126,9 @@ struct osm_routing_engine {
int (*build_lid_matrices) (void *context);
int (*ucast_build_fwd_tables) (void *context);
void (*ucast_dump_tables) (void *context);
+   void (*update_sl2vl)(void *context, IN osm_physp_t *port,
+IN uint8_t in_port_num, IN uint8_t out_port_num,
+IN OUT ib_slvl_table_t *t);
void (*delete) (void *context);
struct osm_routing_engine *next;
 };
@@ -147,6 +150,15 @@ struct osm_routing_engine {
 *  ucast_dump_tables
 *  The callback for dumping unicast routing tables.
 *
+*  update_sl2vl(void *context, IN osm_physp_t *port,
+*   IN uint8_t in_port_num, IN uint8_t out_port_num,
+*   OUT ib_slvl_table_t *t)
+*  The callback to allow routing engine input for SL2VL maps.
+*  *port is the physical port for which the SL2VL map is to be
+*  updated. For switches, in_port_num/out_port_num identify
+*  which part of the SL2VL map to update.  For router/HCA ports,
+*  in_port_num/out_port_num should be ignored.
+*
 *  delete
 *  The delete method, may be used for routing engine
 *  internals cleanup.
diff --git a/opensm/opensm/osm_qos.c b/opensm/opensm/osm_qos.c
index f814ea8..23fd316 100644
--- a/opensm/opensm/osm_qos.c
+++ b/opensm/opensm/osm_qos.c
@@ -207,6 +207,7 @@ static int qos_extports_setup(osm_sm_t * sm, osm_node_t *node,
 	osm_physp_t *p0, *p;
 	unsigned force_update;
 	unsigned num_ports = osm_node_get_num_physp(node);
+	struct osm_routing_engine *re = sm->p_subn->p_osm->routing_engine_used;
 	int ret = 0;
 	unsigned i, j;
 
@@ -223,7 +224,7 @@ static int qos_extports_setup(osm_sm_t * sm, osm_node_t *node,
 		return ret;
 
 	if (ib_switch_info_get_opt_sl2vlmapping(&node->sw->switch_info) &&
-	    sm->p_subn->opt.use_optimized_slvl) {
+	    sm->p_subn->opt.use_optimized_slvl && !re->update_sl2vl) {
 		p = osm_node_get_physp_ptr(node, 1);
 		force_update = p->need_update || sm->p_subn->need_update;
 		return sl2vl_update_table(sm, p, 1, 0x3, force_update,
@@ -233,10 +234,20 @@ static int qos_extports_setup(osm_sm_t * sm, osm_node_t *node,
 	for (i = 1; i < num_ports; i++) {
 		p = osm_node_get_physp_ptr(node, i);
 		force_update = p->need_update || sm->p_subn->need_update;
-		for (j = 0; j < num_ports; j++)
+		for (j = 0; j < num_ports; j++) {
+			const ib_slvl_table_t *port_sl2vl = &qcfg->sl2vl;
+			ib_slvl_table_t routing_sl2vl;
+
+			if (re->update_sl2vl) {
+				routing_sl2vl = *port_sl2vl;
+				re->update_sl2vl(re->context,
+						 p, i, j, &routing_sl2vl);
+				port_sl2vl = &routing_sl2vl;
+			}
 			if (sl2vl_update_table(sm, p, i, i << 8 | j,
-					       force_update, &qcfg->sl2vl))
+					       force_update, port_sl2vl))
 				ret = -1;
+		}
 	}
 
 	return ret;
@@ -246,6 +257,9 @@ static int qos_endport_setup(osm_sm_t * sm, osm_physp_t * p,
 			     const struct qos_config *qcfg)
 {
 	unsigned force_update = p->need_update || sm->p_subn->need_update;
+	struct osm_routing_engine *re = sm->p_subn->p_osm

[PATCH v2 10/15] opensm: Add opensm option to specify file name for extra torus-2QoS configuration information.

2010-03-10 Thread Jim Schutt

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/include/opensm/osm_base.h   |   18 ++
 opensm/include/opensm/osm_subnet.h |5 +
 opensm/opensm/main.c   |9 +
 opensm/opensm/osm_subnet.c |1 +
 opensm/opensm/osm_torus.c  |2 +-
 5 files changed, 34 insertions(+), 1 deletions(-)

diff --git a/opensm/include/opensm/osm_base.h b/opensm/include/opensm/osm_base.h
index 4e9aaa9..8720c38 100644
--- a/opensm/include/opensm/osm_base.h
+++ b/opensm/include/opensm/osm_base.h
@@ -277,6 +277,24 @@ BEGIN_C_DECLS
 #endif /* __WIN__ */
 /***/
 
+/d* OpenSM: Base/OSM_DEFAULT_TORUS_CONF_FILE
+* NAME
+*  OSM_DEFAULT_TORUS_CONF_FILE
+*
+* DESCRIPTION
+*  Specifies the default file name for extra torus-2QoS configuration
+*
+* SYNOPSIS
+*/
+#ifdef __WIN__
+#define OSM_DEFAULT_TORUS_CONF_FILE strcat(GetOsmCachePath(), "osm-torus-2QoS.conf")
+#elif defined(OPENSM_CONFIG_DIR)
+#define OSM_DEFAULT_TORUS_CONF_FILE OPENSM_CONFIG_DIR "/torus-2QoS.conf"
+#else
+#define OSM_DEFAULT_TORUS_CONF_FILE "/etc/opensm/torus-2QoS.conf"
+#endif /* __WIN__ */
+/***/
+
 /d* OpenSM: Base/OSM_DEFAULT_PREFIX_ROUTES_FILE
 * NAME
 *  OSM_DEFAULT_PREFIX_ROUTES_FILE
diff --git a/opensm/include/opensm/osm_subnet.h b/opensm/include/opensm/osm_subnet.h
index d74a57c..d2d9661 100644
--- a/opensm/include/opensm/osm_subnet.h
+++ b/opensm/include/opensm/osm_subnet.h
@@ -201,6 +201,7 @@ typedef struct osm_subn_opt {
char *guid_routing_order_file;
char *sa_db_file;
boolean_t sa_db_dump;
+   char *torus_conf_file;
boolean_t do_mesh_analysis;
boolean_t exit_on_fatal;
boolean_t honor_guid2lid_file;
@@ -418,6 +419,10 @@ typedef struct osm_subn_opt {
 *  When TRUE causes OpenSM to dump SA DB at the end of every
 *  light sweep regardless the current verbosity level.
 *
+*  torus_conf_file
+*  Name of the file with extra configuration info for torus-2QoS
+*  routing engine.
+*
 *  exit_on_fatal
 *  If TRUE (default) - SM will exit on fatal subnet initialization
 *  issues.
diff --git a/opensm/opensm/main.c b/opensm/opensm/main.c
index f396de4..578ae9f 100644
--- a/opensm/opensm/main.c
+++ b/opensm/opensm/main.c
@@ -231,6 +231,10 @@ static void show_usage(void)
 	       "          Set the order port guids will be routed for the MinHop\n"
 	       "          and Up/Down routing algorithms to the guids provided in the\n"
 	       "          given file (one to a line)\n\n");
+	printf("--torus_config <path to file>\n"
+	       "          This option defines the file name for the extra configuration\n"
+	       "          info needed for the torus-2QoS routing engine.   The default\n"
+	       "          name is \'"OSM_DEFAULT_TORUS_CONF_FILE"\'\n\n");
 	printf("--once, -o\n"
 	       "          This option causes OpenSM to configure the subnet\n"
 	       "          once, then exit.  Ports remain in the ACTIVE state.\n\n");
@@ -610,6 +614,7 @@ int main(int argc, char *argv[])
 		{"sm_sl", 1, NULL, 7},
 		{"retries", 1, NULL, 8},
 		{"log_prefix", 1, NULL, 9},
+		{"torus_config", 1, NULL, 10},
 		{NULL, 0, NULL, 0}	/* Required at the end of the array */
 	};
 
@@ -992,6 +997,10 @@ int main(int argc, char *argv[])
 			SET_STR_OPT(opt.log_prefix, optarg);
 			printf("Log prefix = %s\n", opt.log_prefix);
 			break;
+		case 10:
+			SET_STR_OPT(opt.torus_conf_file, optarg);
+			printf("Torus-2QoS config file = %s\n", opt.torus_conf_file);
+			break;
 		case 'h':
 		case '?':
 		case ':':
diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c
index 55b9384..47aa529 100644
--- a/opensm/opensm/osm_subnet.c
+++ b/opensm/opensm/osm_subnet.c
@@ -758,6 +758,7 @@ void osm_subn_set_default_opt(IN osm_subn_opt_t * p_opt)
 	p_opt->guid_routing_order_file = NULL;
 	p_opt->sa_db_file = NULL;
 	p_opt->sa_db_dump = FALSE;
+	p_opt->torus_conf_file = strdup(OSM_DEFAULT_TORUS_CONF_FILE);
 	p_opt->do_mesh_analysis = FALSE;
 	p_opt->exit_on_fatal = TRUE;
 	p_opt->enable_quirks = FALSE;
diff --git a/opensm/opensm/osm_torus.c b/opensm/opensm/osm_torus.c
index 7f80034..7c3b550 100644
--- a/opensm/opensm/osm_torus.c
+++ b/opensm/opensm/osm_torus.c
@@ -9043,7 +9043,7 @@ int torus_build_lfts(void *context)
 	torus->osm = ctx->osm;
 	fabric->osm = ctx->osm;
 
-	if (!parse_config(OPENSM_CONFIG_DIR "/opensm-torus.conf",
+	if (!parse_config(ctx->osm->subn.opt.torus_conf_file,
 			  fabric, torus))
 		goto out;
 
-- 
1.6.6.1



[PATCH v2 00/15] opensm: Add new torus routing engine: torus-2QoS

2010-03-10 Thread Jim Schutt
This is v2 of a patchset to add to opensm a new routing engine designed
to handle large fabrics connected with a 2D/3D topology.

Changes since initial version:

- Merged my patchsets from 11/20/2009, 12/18/2009, 2/16/2010.
- Moved information contained in the earlier patch series introduction
  emails into the appropriate commit messages.
- Rebased to c183eb8c4c.
- Addressed issues found by Yevgeny Kliteynik in original patchsets.
    Yevgeny's --no_default_routing option patch is not included
    in the merging, but would be a good addition.
- Renamed osm_ucast_torus.c to osm_torus.c.
    Since osm_torus.c contains code to implement both unicast and
    multicast routing, the new name seems more appropriate.  The
    multicast support depends heavily on the unicast routing code,
    so it is more convenient to keep everything in one file.
- Removed redundant check for changed sl2vl map.
    This functionality already exists in sl2vl_update_table().
- Set sl2vl maps on CA ports for torus-2QoS.
    This was missing in the original patches.
- Do not force torus-2QoS to use SLs 8-15 when not using opensm -Q.
    This was an interim measure introduced before multicast support was
    working, that allowed multicast to use SL/VL 0 and thus not deadlock
    against unicast.  I forgot to take it out in the multicast patchset,
    so I took it out when I merged.
- Renamed torus variables referencing "origin" to "seed".
    These things refer to switches used to seed the torus topology
    appropriately, so the new name should reduce confusion going forward.
    This also contains a keyword change in the torus configuration file,
    so I'll repost an updated example.

Jim Schutt (15):
  opensm: Prepare for routing engine input to path record SL lookup and
SL2VL map setup.
  opensm: Allow the routing engine to influence SL2VL calculations.
  opensm: Allow the routing engine to participate in path SL
calculations.
  opensm: Track the minimum value in the fabric of data VLs supported.
  opensm: Add struct osm_routing_engine callback to build spanning
trees for multicast.
  opensm: Make mcast_mgr_purge_tree() available outside
osm_mcast_mgr.c.
  opensm: Add torus-2QoS routing engine.
  opensm: Update documentation to describe torus-2QoS.
  opensm: Enable torus-2QoS routing engine.
  opensm: Add opensm option to specify file name for extra torus-2QoS
configuration information.
  opensm: Do not require -Q option for torus-2QoS routing engine.
  opensm: Make it possible to configure no fallback routing engine.
  opensm: Avoid havoc in minhop caused by torus-2QoS persistent use of
osm_port_t:priv.
  opensm: Avoid havoc in dump_ucast_routes() caused by torus-2QoS
persistent use of osm_port_t:priv.
  opensm: Cause status of unicast routing attempt to propagate to
    callers of osm_ucast_mgr_process().

 opensm/doc/current-routing.txt |  269 +-
 opensm/include/opensm/osm_base.h   |   18 +
 opensm/include/opensm/osm_multicast.h  |   33 +
 opensm/include/opensm/osm_opensm.h |   29 +-
 opensm/include/opensm/osm_subnet.h |7 +
 opensm/include/opensm/osm_switch.h |   12 +
 opensm/include/opensm/osm_ucast_lash.h |3 -
 opensm/man/opensm.8.in |9 +-
 opensm/opensm/Makefile.am  |2 +-
 opensm/opensm/main.c   |   11 +-
 opensm/opensm/osm_console.c|   10 +-
 opensm/opensm/osm_dump.c   |5 +-
 opensm/opensm/osm_link_mgr.c   |   16 +-
 opensm/opensm/osm_mcast_mgr.c  |   11 +-
 opensm/opensm/osm_opensm.c |   54 +-
 opensm/opensm/osm_port_info_rcv.c  |   13 +-
 opensm/opensm/osm_qos.c|   40 +-
 opensm/opensm/osm_sa_path_record.c |   33 +-
 opensm/opensm/osm_state_mgr.c  |   23 +-
 opensm/opensm/osm_subnet.c |   20 +-
 opensm/opensm/osm_switch.c |7 +-
 opensm/opensm/osm_torus.c  | 9114 
 opensm/opensm/osm_ucast_lash.c |   11 +-
 opensm/opensm/osm_ucast_mgr.c  |   55 +-
 24 files changed, 9696 insertions(+), 109 deletions(-)
 create mode 100644 opensm/opensm/osm_torus.c




[PATCH v2 13/15] opensm: Avoid havoc in minhop caused by torus-2QoS persistent use of osm_port_t:priv.

2010-03-10 Thread Jim Schutt
Torus-2QoS makes persistent use of osm_port_t:priv to speed calculation
of path SL values.

It cannot clear osm_port_t:priv members when it tears down its persistent
data for the following reason: If a port is removed from the fabric, the
opensm core will delete the corresponding osm_port_t object, leaving
torus-2QoS holding a dangling reference.  Torus-2QoS then has a use-after-free
error when tearing down its persistent data if it tries to use its dangling
osm_port_t reference to clear the priv member.

When torus-2QoS is unable to route a fabric due to missing switches and
opensm is configured to fall back to minhop, havoc will ensue because
minhop uses a non-NULL osm_port_t:priv as a proxy for LMC > 0: it
assumes if osm_port_t:priv is non-NULL it can only be because
alloc_ports_priv() has been called.

Fix this up by always calling alloc_ports_priv(), and have it set
priv = NULL if LMC == 0.
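
The effect of the fix can be sketched with simplified types. This is an illustration under stated assumptions only: `struct port` and the `_sketch` functions are hypothetical stand-ins, and the per-port data is reduced to a plain array rather than the structure opensm actually allocates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for osm_port_t; only the fields this sketch needs. */
struct port {
	unsigned lmc;
	void *priv;	/* may hold a stale pointer left by another engine */
};

/* Always run over every port: allocate per-port data when LMC > 0 and
 * explicitly clear priv otherwise, so a later "priv != NULL" test really
 * does mean "LMC > 0".  Returns 0 on success, -1 on allocation failure. */
static int alloc_ports_priv_sketch(struct port *ports, size_t n)
{
	size_t i;

	assert(ports || n == 0);
	for (i = 0; i < n; i++) {
		if (!ports[i].lmc) {
			ports[i].priv = NULL;
			continue;
		}
		ports[i].priv = calloc((size_t)1 << ports[i].lmc,
				       sizeof(unsigned long long));
		if (!ports[i].priv)
			return -1;
	}
	return 0;
}

/* Release whatever alloc_ports_priv_sketch() set up; free(NULL) is a no-op. */
static void free_ports_priv_sketch(struct port *ports, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		free(ports[i].priv);
		ports[i].priv = NULL;
	}
}
```

Note that the stale pointer is overwritten, never freed or dereferenced: it belongs to the other routing engine, which is the point of the use-after-free discussion above.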

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/opensm/osm_ucast_mgr.c |   10 +-
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/opensm/opensm/osm_ucast_mgr.c b/opensm/opensm/osm_ucast_mgr.c
index d7a4a8c..9a3ea25 100644
--- a/opensm/opensm/osm_ucast_mgr.c
+++ b/opensm/opensm/osm_ucast_mgr.c
@@ -314,8 +314,10 @@ static void alloc_ports_priv(osm_ucast_mgr_t * mgr)
 	     item = cl_qmap_next(item)) {
 		port = (osm_port_t *) item;
 		lmc = ib_port_info_get_lmc(&port->p_physp->port_info);
-		if (!lmc)
+		if (!lmc) {
+			port->priv = NULL;
 			continue;
+		}
 		r = malloc(sizeof(*r) + sizeof(r->guids[0]) * (1 << lmc));
 		if (!r) {
 			OSM_LOG(mgr->p_log, OSM_LOG_ERROR, "ERR 3A09: "
@@ -362,8 +364,7 @@ static void ucast_mgr_process_tbl(IN cl_map_item_t * p_map_item,
 	/* Initialize LIDs in buffer to invalid port number. */
 	memset(p_sw->new_lft, OSM_NO_PATH, p_sw->max_lid_ho + 1);
 
-	if (p_mgr->p_subn->opt.lmc)
-		alloc_ports_priv(p_mgr);
+	alloc_ports_priv(p_mgr);
 
 	/*
 	   Iterate through every port setting LID routes for each
@@ -380,8 +381,7 @@ static void ucast_mgr_process_tbl(IN cl_map_item_t * p_map_item,
 		}
 	}
 
-	if (p_mgr->p_subn->opt.lmc)
-		free_ports_priv(p_mgr);
+	free_ports_priv(p_mgr);
 
 	OSM_LOG_EXIT(p_mgr->p_log);
 }
-- 
1.6.6.1




[PATCH v2 11/15] opensm: Do not require -Q option for torus-2QoS routing engine.

2010-03-10 Thread Jim Schutt
The torus-2QoS engine provides a deadlock-free routing for a 2D/3D torus,
but requires that switch SL2VL maps be programmed.  Before this change,
opensm -Q was required for that to happen.

When a routing engine sets the struct osm_routing_engine:update_sl2vl
pointer, it is signalling its intent to participate in SL2VL map programming.
So, don't return early from osm_qos_setup() in that case; instead do everything
except attempt to read QoS configuration information.

For that to work properly, need to also always set up the default QoS config
information, instead of just when QoS is requested via -Q.

With that in place, the -Q option now means the same thing to torus-2QoS that
it means to other routing engines: QoS configuration is requested.
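
The resulting decision logic is small enough to state as two predicates. The sketch below is only an illustration of the rule described above; `struct engine`, `dummy_update()` and both function names are hypothetical and do not appear in opensm.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for struct osm_routing_engine; only the callback matters here. */
struct engine {
	void (*update_sl2vl)(void);
};

/* QoS setup runs if -Q was given, or if the active routing engine wants
 * to contribute to the SL2VL maps. */
static bool qos_setup_needed(bool qos_opt, const struct engine *re)
{
	return qos_opt || (re != NULL && re->update_sl2vl != NULL);
}

/* The QoS policy file is only read when -Q was given. */
static bool qos_policy_file_needed(bool qos_opt)
{
	return qos_opt;
}

/* Placeholder callback so an engine with update_sl2vl set can be built. */
static void dummy_update(void)
{
}
```

With these predicates, torus-2QoS without -Q still gets its SL2VL maps programmed, but no policy file is parsed.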

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/opensm/osm_qos.c|7 +--
 opensm/opensm/osm_subnet.c |   18 +-
 2 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/opensm/opensm/osm_qos.c b/opensm/opensm/osm_qos.c
index 23fd316..d78531b 100644
--- a/opensm/opensm/osm_qos.c
+++ b/opensm/opensm/osm_qos.c
@@ -289,7 +289,9 @@ int osm_qos_setup(osm_opensm_t * p_osm)
 	osm_node_t *p_node;
 	int ret = 0;
 
-	if (!p_osm->subn.opt.qos)
+	if (!(p_osm->subn.opt.qos ||
+	      (p_osm->routing_engine_used &&
+	       p_osm->routing_engine_used->update_sl2vl)))
 		return 0;
 
 	OSM_LOG_ENTER(&p_osm->log);
@@ -306,7 +308,8 @@ int osm_qos_setup(osm_opensm_t * p_osm)
 	cl_plock_excl_acquire(&p_osm->lock);
 
 	/* read QoS policy config file */
-	osm_qos_parse_policy_file(&p_osm->subn);
+	if (p_osm->subn.opt.qos)
+		osm_qos_parse_policy_file(&p_osm->subn);
 
 	p_tbl = &p_osm->subn.port_guid_tbl;
 	p_next = cl_qmap_head(p_tbl);
diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c
index 47aa529..5478eae 100644
--- a/opensm/opensm/osm_subnet.c
+++ b/opensm/opensm/osm_subnet.c
@@ -1056,6 +1056,8 @@ static void subn_verify_qos_set(osm_qos_options_t *set, const char *prefix,
 
 int osm_subn_verify_config(IN osm_subn_opt_t * p_opts)
 {
+	osm_qos_options_t dflt;
+
 	if (p_opts->lmc > 7) {
 		log_report(" Invalid Cached Option Value:lmc = %u:"
 			   "Using Default:%u\n", p_opts->lmc, OSM_DEFAULT_LMC);
@@ -1099,17 +1101,15 @@ int osm_subn_verify_config(IN osm_subn_opt_t * p_opts)
 		p_opts->console = OSM_DEFAULT_CONSOLE;
 	}
 
-	if (p_opts->qos) {
-		osm_qos_options_t dflt;
-
-		/* the default options in qos_options must be correct.
-		 * every other one need not be, b/c those will default
-		 * back to whatever is in qos_options.
-		 */
 
-		subn_set_default_qos_options(&dflt);
+	/* the default options in qos_options must be correct.
+	 * every other one need not be, b/c those will default
+	 * back to whatever is in qos_options.
+	 */
+	subn_set_default_qos_options(&dflt);
+	subn_verify_qos_set(&p_opts->qos_options, "qos", &dflt);
 
-		subn_verify_qos_set(&p_opts->qos_options, "qos", &dflt);
+	if (p_opts->qos) {
 		subn_verify_qos_set(&p_opts->qos_ca_options, "qos_ca",
 				    &p_opts->qos_options);
 		subn_verify_qos_set(&p_opts->qos_sw0_options, "qos_sw0",
-- 
1.6.6.1




[PATCH v2 07/15] opensm: Add torus-2QoS routing engine.

2010-03-10 Thread Jim Schutt
, there
will be no credit loops.

Since torus-2QoS uses all available SL values for unicast traffic,
multicast traffic must share SL values with unicast traffic.  This
in turn means that multicast routing must be compatible with unicast
routing to prevent credit loops.

Since torus-2QoS unicast routing is based on DOR, it turns out to
be possible to construct spanning trees so that when multicast
and unicast traffic are overlaid, credit loops are not possible.

Here is a 2D example of such a spanning tree, where x is the
root switch, and each + is a non-root switch:

   +  +  +  +  +
   |  |  |  |  |
   +  +  +  +  +
   |  |  |  |  |
   +--+--x--+--+
   |  |  |  |  |
   +  +  +  +  +

For multicast traffic routed from root to tip, every turn in the
above spanning tree is a legal DOR turn.

For traffic routed from tip to root, and traffic routed through
the root, turns are not legal DOR turns.  However, to construct
a credit loop, the union of multicast routing on this spanning
tree with DOR unicast routing can only provide 3 of the 4 turns
needed for the loop.

In addition, if none of the above spanning tree branches crosses
a dateline used for unicast credit loop avoidance on a torus,
and multicast traffic is confined to SL 0 or SL 8 (recall that
torus-2QoS uses SL bit 3 to differentiate QoS level), then
multicast traffic also cannot contribute to the ring credit
loops that are otherwise possible in a torus.

Torus-2QoS uses these ideas to create a master spanning tree.
Every multicast group spanning tree will be constructed as a
subset of the master tree, with the same root as the master
tree.

Such multicast group spanning trees will in general not be
optimal for groups which are a subset of the full fabric.
However, this compromise must be made to enable support for
two QoS levels on a torus while preventing credit loops.

To build a spanning tree for a particular MLID, torus-2QoS just
needs to mark all the ports that participate in that multicast
group, then walk the master spanning tree and add switches
hosting the marked ports to the multicast group spanning tree.
A depth-first search of the master spanning tree is used for this.
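
The per-MLID tree construction described above can be sketched as a recursive depth-first walk. This is a minimal illustration, not the code in osm_torus.c: `struct tree_node`, `MAX_CHILDREN` and `build_group_tree()` are hypothetical names, and keeping the intermediate switches on the path from the root to each member is an assumption of this sketch (it is what keeps the group tree connected and rooted at the master root).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 6	/* a 3D torus switch has at most 6 neighbours */

/* Hypothetical node of the master spanning tree (one per switch). */
struct tree_node {
	bool has_member;	/* switch hosts a port marked for this MLID */
	bool in_group_tree;	/* output: switch belongs to the group tree */
	size_t num_children;
	struct tree_node *child[MAX_CHILDREN];
};

/* Depth-first walk from the master-tree root.  A switch is kept in the
 * group tree if it hosts a member or lies on the path from the root to
 * one.  Returns true if the subtree rooted at n contains any member. */
static bool build_group_tree(struct tree_node *n)
{
	bool needed;
	size_t i;

	assert(n != NULL && n->num_children <= MAX_CHILDREN);
	needed = n->has_member;
	for (i = 0; i < n->num_children; i++)
		if (build_group_tree(n->child[i]))
			needed = true;
	n->in_group_tree = needed;
	return needed;
}
```

Because every group tree is pruned from the same master tree, all groups share the master root, at the cost of possibly longer paths for small groups, as noted above.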

Signed-off-by: Jim Schutt jasc...@sandia.gov
---

I've attached the patch as a compressed file, as otherwise
it is too large to make it through the list.

-- Jim

 opensm/opensm/Makefile.am |2 +-
 opensm/opensm/osm_torus.c | 9114 +
 2 files changed, 9115 insertions(+), 1 deletions(-)
 create mode 100644 opensm/opensm/osm_torus.c




0007-opensm-Add-torus-2QoS-routing-engine.patch.bz2
Description: application/bzip


Re: [PATCH v2 00/15] opensm: torus-2QoS example input files

2010-03-10 Thread Jim Schutt

The attached files can be used to test the torus-2QoS routing
engine using ibsim.

fabric-torus-5x5x5 contains a fabric description that ibsim can read.
Once ibsim is running, run opensm like this:

  opensm --config opensm.conf --torus_config torus-2QoS-5x5x5.conf
or 
  opensm --config opensm.conf --torus_config torus-2QoS-5x5x5.conf \
 -Q --qos_policy_file qos-policy-torus-5x5x5.conf

-- Jim



fabric-torus-5x5x5.bz2
Description: application/bzip

# Limit the maximal operational VLs
max_op_vls 8

# The number of seconds between subnet sweeps (0 disables it)
sweep_interval 10

# Routing engine
# Multiple routing engines can be specified separated by
# commas so that specific ordering of routing algorithms will
# be tried if earlier routing engines fail.
# Supported engines: minhop, updn, file, ftree, lash, dor
routing_engine torus-2QoS,no_fallback

# Use unicast routing cache (use FALSE if unsure)
use_ucast_cache TRUE

# Force flush of the log file after each log message
force_log_flush TRUE

# Log file to be used
log_file /dev/tty

# console [off|local|loopback|socket]
console loopback

# Telnet port for console (default 1)
console_port 1

# QoS default options
# Note that for OFED > 1.3, this information can also be in qos-policy.conf.
# However, it may be good to have it here also for torus-2QoS, as this will
# change the defaults even if not using QoS.
qos_max_vls 8
qos_high_limit 0
qos_vlarb_high 0:0,1:0,2:0,3:0,4:0,5:0,6:0,7:0,8:0
qos_vlarb_low 0:64,1:64,2:64,3:64,4:64,5:64,6:64,7:64,8:64
qos_sl2vl (null)

# This is a QoS configuration for the torus-2QoS routing engine.
# As it supports only 2 levels of QoS, via SL bit 3, we should configure
# only SLs 0 and 8.  Based on that torus-2QoS will pick the appropriate
# SL value to provide deadlock-free routing for both QoS levels.

port-groups
port-group
name: Service_nodes
port-name: H_0_0_0_0/P1   # E.g. admin
port-name: H_0_0_1_0/P1   # E.g. NFS server
port-name: H_0_0_2_0/P1   # E.g. boot server
port-name: H_0_0_3_0/P1   # E.g. login node
end-port-group

port-group
name: Lustre_nodes

port-name: H_0_0_4_0/P1   # E.g. MDS

port-name: H_0_1_0_0/P1   # E.g. OSS
port-name: H_0_1_1_0/P1   # E.g. OSS
port-name: H_0_1_2_0/P1   # E.g. OSS
port-name: H_0_1_3_0/P1   # E.g. OSS
port-name: H_0_1_4_0/P1   # E.g. OSS
end-port-group

port-group
name: Compute_nodes

port-name: H_0_2_0_0/P1
port-name: H_0_2_1_0/P1
port-name: H_0_2_2_0/P1
port-name: H_0_2_3_0/P1
port-name: H_0_2_4_0/P1

port-name: H_0_3_0_0/P1
port-name: H_0_3_1_0/P1
port-name: H_0_3_2_0/P1
port-name: H_0_3_3_0/P1
port-name: H_0_3_4_0/P1

port-name: H_0_4_0_0/P1
port-name: H_0_4_1_0/P1
port-name: H_0_4_2_0/P1
port-name: H_0_4_3_0/P1
port-name: H_0_4_4_0/P1

port-name: H_1_0_0_0/P1
port-name: H_1_0_1_0/P1
port-name: H_1_0_2_0/P1
port-name: H_1_0_3_0/P1
port-name: H_1_0_4_0/P1

port-name: H_1_1_0_0/P1
port-name: H_1_1_1_0/P1
port-name: H_1_1_2_0/P1
port-name: H_1_1_3_0/P1
port-name: H_1_1_4_0/P1

port-name: H_1_2_0_0/P1
port-name: H_1_2_1_0/P1
port-name: H_1_2_2_0/P1
port-name: H_1_2_3_0/P1
port-name: H_1_2_4_0/P1

port-name: H_1_3_0_0/P1
port-name: H_1_3_1_0/P1
port-name: H_1_3_2_0/P1
port-name: H_1_3_3_0/P1
port-name: H_1_3_4_0/P1

port-name: H_1_4_0_0/P1
port-name: H_1_4_1_0/P1
port-name: H_1_4_2_0/P1
port-name: H_1_4_3_0/P1
port-name: H_1_4_4_0/P1

port-name: H_2_0_0_0/P1
port-name: H_2_0_1_0/P1
port-name: H_2_0_2_0/P1
port-name: H_2_0_3_0/P1
port-name: H_2_0_4_0/P1

port-name: H_2_1_0_0/P1
port-name: H_2_1_1_0/P1
port-name: H_2_1_2_0/P1
port-name: H_2_1_3_0/P1
port-name: H_2_1_4_0/P1

port-name: H_2_2_0_0/P1
port-name: H_2_2_1_0/P1
port-name: H_2_2_2_0/P1
port-name: H_2_2_3_0/P1
port-name: H_2_2_4_0/P1

port-name: H_2_3_0_0/P1
port-name: H_2_3_1_0/P1
port-name: H_2_3_2_0/P1
port-name: H_2_3_3_0/P1
port-name: H_2_3_4_0/P1

port-name: H_2_4_0_0/P1
port-name: H_2_4_1_0/P1
port-name: H_2_4_2_0/P1
port-name: H_2_4_3_0/P1
port-name: H_2_4_4_0/P1

port-name: H_3_0_0_0/P1
port-name: H_3_0_1_0/P1
port-name: H_3_0_2_0/P1
port-name: H_3_0_3_0/P1
port-name: H_3_0_4_0/P1

port-name: H_3_1_0_0/P1
port-name: H_3_1_1_0/P1
port-name: H_3_1_2_0/P1
port-name: H_3_1_3_0/P1
port-name: H_3_1_4_0/P1

port-name: H_3_2_0_0/P1
port-name: 

Re: [PATCH 09/11] opensm: Make it possible to configure no fallback routing engine.

2010-03-04 Thread Jim Schutt

On Thu, 2010-03-04 at 07:35 -0700, Yevgeny Kliteynik wrote:
> Hi Jim,
>
> On 20/Nov/09 21:15, Jim Schutt wrote:
> > For a fabric that requires routing with an engine with special properties,
> > say avoiding credit loops via making use of SLs in routing, it might
> > be preferable to not fall back to minhop if the configured routing engine
> > fails.
> >
> > E.g. the torus-2QoS routing engine uses both SL2VL maps and path SL values
> > to provide routing free of credit loops, but cannot route fabrics for
> > some patterns of failed switches.  Should a switch fail that creates such
> > a pattern, it may be preferable to keep the previous routing information
> > loaded in the switches until a switch can be replaced that restores
> > torus-2QoS's ability to route the fabric.
> >
> > The alternative, having some other engine route the fabric, will immediately
> > introduce credit loops.
>
> This is a great idea.
> Regarding the implementation: I would prefer seeing this
> as a purely OpenSM option and not as a new routing engine
> keyword.
> I think it would be cleaner to leave the list of routing
> engines w/o special keys, and have a general option
> that would prevent SM from falling back.

That seems right to me, now.

> Actually, the
> fall-back itself is not bad, as it is defined by the list
> of routing engines, and SM should try them one by one.
> The problem is with using default routing that is not
> specified in the routing engines list.

I agree.  If a user explicitly configures which
routing engines to try, only those should be used,
and a notification logged if they all fail.
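
The behaviour being agreed on here can be sketched as follows. This is only an illustration of the policy, with hypothetical names throughout (`route_fn`, `route_fabric()`, `always_fails()`, `always_works()`); it is not how osm_ucast_mgr.c is structured.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*route_fn)(void);	/* returns 0 on success */

/* Try each configured engine in order.  Returns the index of the engine
 * that succeeded, n if only the default routing succeeded, or -1 if
 * nothing routed the fabric (the caller would log this and leave the
 * previously programmed tables in place). */
static int route_fabric(const route_fn engines[], size_t n,
			int use_default_routing, route_fn default_routing)
{
	size_t i;

	assert(engines || n == 0);
	for (i = 0; i < n; i++)
		if (engines[i]() == 0)
			return (int)i;
	if (use_default_routing && default_routing && default_routing() == 0)
		return (int)n;
	return -1;
}

/* Toy engines for illustration. */
static int always_fails(void) { return -1; }
static int always_works(void) { return 0; }
```

With use_default_routing cleared, a fabric that torus-2QoS cannot route is left alone rather than handed to an engine that could introduce credit loops.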

>
> Here's the patch that implements OSM option
> use_default_routing, and a command line parameter
> no_default_routing to control this option.

This looks good to me.

>
> I'll write the patch that adds this option to the
> OSM trunk and send it to Sasha shortly.

OK, thanks.

-- Jim

 
 Signed-off-by: Yevgeny Kliteynik klit...@dev.mellanox.co.il
 ---
   opensm/include/opensm/osm_subnet.h |2 +-
   opensm/opensm/main.c   |9 +
   opensm/opensm/osm_opensm.c |   10 --
   opensm/opensm/osm_subnet.c |8 
   opensm/opensm/osm_ucast_mgr.c  |7 +--
   5 files changed, 27 insertions(+), 9 deletions(-)
 
 diff --git a/opensm/include/opensm/osm_subnet.h 
 b/opensm/include/opensm/osm_subnet.h
 index a4133a0..905f64d 100644
 --- a/opensm/include/opensm/osm_subnet.h
 +++ b/opensm/include/opensm/osm_subnet.h
 @@ -190,6 +190,7 @@ typedef struct osm_subn_opt {
   boolean_t sweep_on_trap;
   char *routing_engine_names;
   boolean_t use_ucast_cache;
 + boolean_t use_default_routing;
   boolean_t connect_roots;
   char *lid_matrix_dump_file;
   char *lfts_file;
 @@ -215,7 +216,6 @@ typedef struct osm_subn_opt {
   osm_qos_options_t qos_rtr_options;
   boolean_t enable_quirks;
   boolean_t no_clients_rereg;
 - boolean_t no_fallback_routing_engine;
   #ifdef ENABLE_OSM_PERF_MGR
   boolean_t perfmgr;
   boolean_t perfmgr_redir;
 diff --git a/opensm/opensm/main.c b/opensm/opensm/main.c
 index 096bf5f..47075a2 100644
 --- a/opensm/opensm/main.c
 +++ b/opensm/opensm/main.c
 @@ -175,6 +175,10 @@ static void show_usage(void)
          "          separated by commas so that specific ordering of routing\n"
          "          algorithms will be tried if earlier routing engines fail.\n"
          "          Supported engines: updn, file, ftree, lash, dor, torus-2QoS\n\n");
 + printf("--no_default_routing\n"
 +        "          This option prevents OpenSM from falling back to default\n"
 +        "          routing if none of the provided engines was able to\n"
 +        "          configure the subnet.\n\n");
   printf("--do_mesh_analysis\n"
          "          This option enables additional analysis for the lash\n"
          "          routing engine to precondition switch port assignments\n"
 @@ -612,6 +616,7 @@ int main(int argc, char *argv[])
   {"sm_sl", 1, NULL, 7},
   {"retries", 1, NULL, 8},
   {"torus_config", 1, NULL, 9},
 + {"no_default_routing", 0, NULL, 10},
   {NULL, 0, NULL, 0}  /* Required at the end of the array */
   };
   
 @@ -993,6 +998,10 @@ int main(int argc, char *argv[])
   case 9:
   SET_STR_OPT(opt.torus_conf_file, optarg);
   break;
 + case 10:
 + opt.use_default_routing = FALSE;
 + printf(" No fall back to default routing\n");
 + break;
   case 'h':
   case '?':
   case ':':
 diff --git a/opensm/opensm/osm_opensm.c b/opensm/opensm/osm_opensm.c
 index e7ef55c..d153be5 100644
 --- a/opensm/opensm/osm_opensm.c
 +++ b/opensm/opensm/osm_opensm.c
 @@ -159,11 +159,6 @@ static struct osm_routing_engine 
 *setup_routing_engine(osm_opensm_t *osm,
   struct osm_routing_engine *re;
   const struct

Re: opensm: Status of torus-2QoS patchset?

2010-02-22 Thread Jim Schutt

On Sun, 2010-02-21 at 04:43 -0700, Sasha Khapyorsky wrote:
 Hi Jim,
 
 On 13:34 Tue 16 Feb , Jim Schutt wrote:
  
  Do you have any feedback regarding my patches to add
  a new routing module specialized for 2D/3D torus topologies?
 
 I've started to look at this. There is a lot of code, so it takes a
 time. I will comment over patches.

Thanks - I know it's a lot of code, and I appreciate that
you don't want to add code until you understand how it
works.

 
  I was hoping there was some chance this work might make it
  into the OFED 1.6 release.
 
 1.6 is realistic.

Great.

Thanks again for taking a look.

-- Jim

 
 Sasha


Re: opensm: Status of torus-2QoS patchset?

2010-02-22 Thread Jim Schutt

On Sun, 2010-02-21 at 04:58 -0700, Sasha Khapyorsky wrote:
 On 13:43 Sun 21 Feb , Sasha Khapyorsky wrote:
  
  I've started to look at this. There is a lot of code, so it takes a
  time. I will comment over patches.
 
 BTW, assuming that now we have three (0/11, 0/12 and 0/3) subsequent
 patch series (basic + reworks + fixes) wouldn't it be simpler for review
 and apply to merge everything into single final patch series?

I've been wondering which you prefer - I wasn't sure if
you wanted to preserve some of the patch history, particularly
for the bug fixes.

I'm happy to merge everything into a new series, and
rebase to latest development head if you would prefer that.

Please let me know.

-- Jim

 
 Sasha


[PATCH 1/3] opensm: Use local variables when searching for torus-2QoS master spanning tree root.

2010-02-16 Thread Jim Schutt
Otherwise 1) presence of the wrong switches is checked; and 2) the y-loop
in good_xy_ring() can segfault on an out-of-bounds switch array x index.
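
To make the failure mode concrete: when the function parameters double
as loop counters, the x loop can finish with x equal to the x size, and
the y loop then indexes the switch array one past its end.  Below is a
simplified 2D sketch of the corrected pattern, with separate loop
variables.  The fixed-size grid, X_SZ/Y_SZ, and the function name are
assumptions made for illustration; this is not the torus-2QoS code.

```c
#include <assert.h>
#include <stdbool.h>

#define X_SZ 3
#define Y_SZ 2

/*
 * True if every switch in the row through (x, y) and every switch in
 * the column through (x, y) is present.  x and y are never modified,
 * so both loops test the rings that the caller asked about.
 */
bool good_xy_ring_2d(bool present[X_SZ][Y_SZ], const int x, const int y)
{
    bool good_ring = true;
    int x_tst, y_tst;

    assert(x >= 0 && x < X_SZ && y >= 0 && y < Y_SZ);

    for (x_tst = 0; x_tst < X_SZ && good_ring; x_tst++)
        good_ring = present[x_tst][y];

    for (y_tst = 0; y_tst < Y_SZ && good_ring; y_tst++)
        good_ring = present[x][y_tst];

    return good_ring;
}
```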

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/opensm/osm_ucast_torus.c |   13 +++--
 1 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/opensm/opensm/osm_ucast_torus.c b/opensm/opensm/osm_ucast_torus.c
index e2eb324..728e56c 100644
--- a/opensm/opensm/osm_ucast_torus.c
+++ b/opensm/opensm/osm_ucast_torus.c
@@ -8751,22 +8751,23 @@ ib_api_status_t torus_mcast_stree(void *context, osm_mgrp_box_t *mgb)
 }
 
 static
-bool good_xy_ring(struct torus *t, int x, int y, int z)
+bool good_xy_ring(struct torus *t, const int x, const int y, const int z)
 {
 	struct t_switch ****sw = t->sw;
 	bool good_ring = true;
+	int x_tst, y_tst;
 
-	for (x = 0; x < t->x_sz && good_ring; x++)
-		good_ring = sw[x][y][z];
+	for (x_tst = 0; x_tst < t->x_sz && good_ring; x_tst++)
+		good_ring = sw[x_tst][y][z];
 
-	for (y = 0; y < t->y_sz && good_ring; y++)
-		good_ring = sw[x][y][z];
+	for (y_tst = 0; y_tst < t->y_sz && good_ring; y_tst++)
+		good_ring = sw[x][y_tst][z];
 
 	return good_ring;
 }
 
 static
-struct t_switch *find_plane_mid(struct torus *t, int z)
+struct t_switch *find_plane_mid(struct torus *t, const int z)
 {
 	int x, dx, xm = t->x_sz / 2;
 	int y, dy, ym = t->y_sz / 2;
-- 
1.5.6.GIT




[PATCH 0/3] opensm: Bug fixes for torus-2QoS patchset

2010-02-16 Thread Jim Schutt
These patches fix bugs discovered during further testing of the
torus-2QoS routing module for OpenSM.

(See http://www.spinics.net/lists/linux-rdma/msg01438.html
and http://www.spinics.net/lists/linux-rdma/msg01938.html)


Jim Schutt (3):
  opensm: Use local variables when searching for torus-2QoS master
spanning tree root.
  opensm: Fix handling of torus-2QoS topology discovery for radix 4
torus dimensions.
  opensm: Avoid havoc in dump_ucast_routes() caused by torus-2QoS
persistent use of osm_port_t:priv.

 opensm/include/opensm/osm_switch.h |   12 +
 opensm/opensm/osm_dump.c   |2 +-
 opensm/opensm/osm_switch.c |7 +-
 opensm/opensm/osm_ucast_mgr.c  |1 +
 opensm/opensm/osm_ucast_torus.c|  418 +++-
 5 files changed, 193 insertions(+), 247 deletions(-)




[PATCH 2/3] opensm: Fix handling of torus-2QoS topology discovery for radix 4 torus dimensions.

2010-02-16 Thread Jim Schutt
Torus-2QoS finds the torus topology in a fabric using an algorithm that
looks for 8 adjacent switches which form the corners of a cube, by looking
for 4 adjacent switches which form the corners of a face on that cube.

When a torus dimension has radix 4 (e.g. the y dimension in a 5x4x8 torus),
1-D rings which span that dimension cannot be distinguished topologically
from the faces the algorithm is trying to construct.

Code that prevents that situation from arising should only be applied in
cases where a torus dimension has radix 4, but due to a missing test, it
could be applied inappropriately.

This commit fixes the bug by adding the missing test.  It also restructures
the code in question to remove code duplication by adding helper functions.
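
The radix-4 ambiguity comes down to modular arithmetic: on a ring of
four switches, the switch two hops away is the same in either
direction, so the ring closes after four links just as a face does.
The sketch below checks that property.  The canonicalize() here is an
assumed stand-in that wraps a coordinate into [0, vmax); the helper
two_hops_coincide() is hypothetical and exists only for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed behavior: wrap coordinate v into the range [0, vmax). */
int canonicalize(int v, int vmax)
{
    assert(vmax > 0);
    v %= vmax;
    return v < 0 ? v + vmax : v;
}

/*
 * True if the switches two hops from n in the minus and plus
 * directions of a ring with the given radix are the same switch.
 */
bool two_hops_coincide(int n, int radix)
{
    return canonicalize(n - 2, radix) == canonicalize(n + 2, radix);
}
```

For radix 4 this holds at every position on the ring; for radix 5 or 6
it does not, which is why only radix-4 dimensions need the extra test.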

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/opensm/osm_ucast_torus.c |  405 ---
 1 files changed, 168 insertions(+), 237 deletions(-)

diff --git a/opensm/opensm/osm_ucast_torus.c b/opensm/opensm/osm_ucast_torus.c
index 728e56c..ab0e6a6 100644
--- a/opensm/opensm/osm_ucast_torus.c
+++ b/opensm/opensm/osm_ucast_torus.c
@@ -1956,38 +1956,16 @@ struct f_switch *tfind_2d_perpendicular(struct t_switch *tsw0,
 	return ffind_2d_perpendicular(tsw0->tmp, tsw1->tmp, tsw2->tmp);
 }
 
-/*
- * These functions return true when it safe to call
- * tfind_3d_perpendicular()/ffind_3d_perpendicular().
- */
 static
-bool safe_x_perpendicular(struct torus *t, int i, int j, int k)
+bool safe_x_ring(struct torus *t, int i, int j, int k)
 {
-   int jm1, jp1, jp2, km1, kp1, kp2;
-
-   /*
-* If the dimensions perpendicular to the search direction are
-* not radix 4 torus dimensions, it is always safe to search for
-* a perpendicular.
-*/
-	if ((t->y_sz != 4 && t->z_sz != 4) ||
-	    (t->flags & Y_MESH && t->flags & Z_MESH) ||
-	    (t->y_sz != 4 && (t->flags & Z_MESH)) ||
-	    (t->z_sz != 4 && (t->flags & Y_MESH)))
-		return true;
-
-   jm1 = canonicalize(j - 1, t-y_sz);
-   jp1 = canonicalize(j + 1, t-y_sz);
-   jp2 = canonicalize(j + 2, t-y_sz);
-
-   km1 = canonicalize(k - 1, t-z_sz);
-   kp1 = canonicalize(k + 1, t-z_sz);
-   kp2 = canonicalize(k + 2, t-z_sz);
+   int im1, ip1, ip2;
+   bool success = true;
 
/*
-* Here we are checking for enough appropriate links having been
-* installed into the torus to prevent an incorrect link from being
-* considered as a perpendicular candidate.
+* If this x-direction radix-4 ring has at least two links
+* already installed into the torus,  then this ring does not
+* prevent us from looking for y or z direction perpendiculars.
 *
 * It is easier to check for the appropriate switches being installed
 * into the torus than it is to check for the links, so force the
@@ -1995,93 +1973,111 @@ bool safe_x_perpendicular(struct torus *t, int i, int 
j, int k)
 *
 * Recall that canonicalize(n - 2, 4) == canonicalize(n + 2, 4).
 */
-	if (((!!t->sw[i][jm1][k] +
-	      !!t->sw[i][jp1][k] + !!t->sw[i][jp2][k] >= 2) &&
-	     (!!t->sw[i][j][km1] +
-	      !!t->sw[i][j][kp1] + !!t->sw[i][j][kp2] >= 2))) {
-
-		bool success = true;
-
-		if (t->sw[i][jp2][k] && t->sw[i][jm1][k])
-			success = link_tswitches(t, 1,
-						 t->sw[i][jp2][k],
-						 t->sw[i][jm1][k])
-				&& success;
-
-		if (t->sw[i][jm1][k] && t->sw[i][j][k])
-			success = link_tswitches(t, 1,
-						 t->sw[i][jm1][k],
-						 t->sw[i][j][k])
-				&& success;
-
-		if (t->sw[i][j][k] && t->sw[i][jp1][k])
-			success = link_tswitches(t, 1,
-						 t->sw[i][j][k],
-						 t->sw[i][jp1][k])
-				&& success;
-
-		if (t->sw[i][jp1][k] && t->sw[i][jp2][k])
-			success = link_tswitches(t, 1,
-						 t->sw[i][jp1][k],
-						 t->sw[i][jp2][k])
-				&& success;
-
-		if (t->sw[i][j][kp2] && t->sw[i][j][km1])
-			success = link_tswitches(t, 2,
-						 t->sw[i][j][kp2],
-						 t->sw[i][j][km1])
-				&& success;
-
-		if (t->sw[i][j][km1] && t->sw[i][j][k])
-			success = link_tswitches(t, 2,
-						 t->sw[i][j][km1],
-						 t->sw[i][j][k

opensm: Status of torus-2QoS patchset?

2010-02-16 Thread Jim Schutt
Hi Sasha,

Do you have any feedback regarding my patches to add
a new routing module specialized for 2D/3D torus topologies?
I was hoping there was some chance this work might make it
into the OFED 1.6 release.

Thanks -- Jim







Re: [PATCH 03/11] opensm: Allow the routing engine to participate in path SL calculations.

2010-01-18 Thread Jim Schutt

Hi Yevgeny,

On Thu, 2010-01-14 at 09:24 -0700, Yevgeny Kliteynik wrote:
 Jim,
 
 On 20/Nov/09 21:15, Jim Schutt wrote:
  LASH already does this, in a hard-coded fashion.
 
  Generalize this by adding a callback to struct osm_routing_engine that
  computes a path SL value, and fix up LASH to use it.
 
  This patchset causes the requested or QoS-computed SL value to be passed
  to the routing engine path SL computation as a hint.  In the event the
  routing engine's use of SLs allows it to support more than one QoS level,
  it may be able to make use of the SL hint to do so.
 
  For now, LASH just ignores the hint.
 
  Note that before this change, if LASH was configured and a specific path
  SL value was requested that differed from what LASH needed to route the
  fabric without credit loops, the path SL lookup would fail.  Now LASH's
  SL value is always used.
 
  Possibly the choice between failing a path SL request when it conflicts
  with routing, vs. always providing an SL value that gives a credit-loop-
  free routing, should be user-configurable?
 
 SL can come from the following places:
   - user requested specific SL in PathRecord query
   - QoS policy configuration
   - SL specified in partition parameters
   - basic QoS (no policies, only SL2VL table)
   - routing engine
 
 Except for QoS policy being able to override SL that is specified in
 the partition parameters (with an error message in the log), IMHO if
 there's a conflict between SLs coming from different constraints
 PathRecord should fail to find a satisfiable path, or at least we
 should see some error message in the log that the selected SL
 conflicts with other OSM configurations, but will be used anyway.
 
 [snip...]
 
 
  @@ -725,6 +707,14 @@ static ib_api_status_t pr_rcv_get_path_parms(IN 
  osm_sa_t * sa,
  goto Exit;
  }
 
  +   /*
  +* If the routing engine wants to have a say in path SL selection,
  +* send the currently computed SL value as a hint and let the routing
  +* engine override it.
  +*/
  +   if (p_re && p_re->path_sl)
  +   sl = p_re->path_sl(p_re->context, sl, p_src_port, p_dest_port);
 
 
 In addition to error message if routing engine overrides the provided
 hint, need to check whether the returned SL is valid - check the
 corresponding bit in valid_sl_mask. It might be irrelevant for torus-2QoS
 routing (not sure yet, need to read more patches :-) ), but it's
 probably needed in general case.
 
 Also, perhaps it would be better to provide the bitmask of available
 SLs as a hint if there are more than one suitable SL?
 
 I mean something like this (didn't try it, didn't even compile it,
 need corresponding change in the p_re-path_sl callback, it's just
 to illustrate what I mean):

Your suggestion below won't accomplish what I was trying to 
accomplish.

Torus-2QoS needs to encode global path information into the
SL value in order to provide routing free of credit loops.

But it only needs 3 bits of SL to do this, leaving one free.
So, it uses that bit to provide two levels of quality of
service.

This usage of SL clashes with the QoS policy engine, which
uses each SL value to provide up to 16 levels of quality
of service.  So to the QoS policy engine, every SL value
is distinct, but to torus-2QoS, SL values 0-7 are all the
same wrt. QoS level, and SL values 8-15 are also all the
same wrt. a second QoS level.

I wanted to use the QoS policy engine to configure QoS
level in torus-2QoS, so I used this hint idea.
What torus-2QoS' path_sl() does is append the high-order
bit from the SL hint, as computed by the QoS policy engine,
onto the 3 low-order bits that it computes are needed 
to avoid deadlock.
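
In code, that combination is roughly the following.  This is only a
sketch of the bit manipulation described above: combine_path_sl() and
its arguments are hypothetical names, and the real path_sl() callback
also has to compute the three routing bits from the path itself.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Keep the high-order SL bit (bit 3) from the QoS policy hint as the
 * QoS level, and take the three low-order bits from the routing
 * engine, which needs them to avoid credit loops.
 */
uint8_t combine_path_sl(uint8_t sl_hint, uint8_t routing_bits)
{
    assert(routing_bits < 8);
    return (uint8_t)((sl_hint & 0x8) | (routing_bits & 0x7));
}
```

So hints 0-7 all select the first QoS level and hints 8-15 all select
the second, regardless of which low-order bits the QoS policy engine
happened to pick.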

Does that help explain what I'm after?

-- Jim

 
 ---
   opensm/opensm/osm_sa_path_record.c |   47 
 ++-
   1 files changed, 29 insertions(+), 18 deletions(-)
 
 diff --git a/opensm/opensm/osm_sa_path_record.c 
 b/opensm/opensm/osm_sa_path_record.c
 index 7120d65..6de8979 100644
 --- a/opensm/opensm/osm_sa_path_record.c
 +++ b/opensm/opensm/osm_sa_path_record.c
 @@ -171,7 +171,7 @@ static ib_api_status_t pr_rcv_get_path_parms(IN 
 osm_sa_t * sa,
   uint8_t required_mtu;
   uint8_t required_rate;
   uint8_t required_pkt_life;
 -uint8_t sl;
 +uint8_t sl = OSM_DEFAULT_SL;
   uint8_t in_port_num;
   ib_net16_t dest_lid;
   uint8_t i;
 @@ -688,33 +688,44 @@ static ib_api_status_t pr_rcv_get_path_parms(IN 
 osm_sa_t * sa,
   cl_ntoh16(pkey), sl);
   } else
   sl = p_prtn-sl;
 -} else if (sa->p_subn->opt.qos) {
 +}
 +
 +/*
 + * If the routing engine wants to have a say in path SL selection,
 + * send the currently computed SL value as a hint and let the routing
 + * engine override it.
 + */
 +if (p_re && p_re->path_sl)
 +sl = p_re->path_sl(p_re->context, valid_sl_mask, p_src_port, p_dest_port);
 +
 +if (sa->p_subn->opt.qos && !(valid_sl_mask & (1

[PATCH 01/12] opensm: Make error message for torus-2QoS dateline specification match code check.

2009-12-18 Thread Jim Schutt

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/opensm/osm_ucast_torus.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/opensm/opensm/osm_ucast_torus.c b/opensm/opensm/osm_ucast_torus.c
index 8eb2880..7108394 100644
--- a/opensm/opensm/osm_ucast_torus.c
+++ b/opensm/opensm/osm_ucast_torus.c
@@ -954,7 +954,7 @@ bool parse_dir_dateline(int c_dir, struct torus *t, const char *parse_sep)
 	if ((*dl < 0 && *dl <= -max_dl) || *dl >= max_dl)
 		OSM_LOG(&t->osm->log, OSM_LOG_ERROR,
 			"Error: dateline value for coordinate direction %d "
-			"must be %d <= dl <= %d\n",
+			"must be %d < dl < %d\n",
 			c_dir, -max_dl, max_dl);
 	else
 		success = true;
-- 
1.5.6.GIT




[PATCH 10/12] opensm: Implement master spanning tree for torus-2QoS multicast support.

2009-12-18 Thread Jim Schutt
In order to route a 2D/3D torus without credit loops while providing
support for two QoS levels, torus-2QoS needs to use 3 VL bits and
all 4 available SL bits.  This means that multicast traffic must
share SL values with unicast traffic, which in turn means that
multicast routing must be compatible with unicast routing to prevent
credit loops.

Torus-2QoS unicast routing is based on DOR, and it turns out to
be possible to construct spanning trees so that when multicast
and unicast traffic are overlaid, credit loops are not possible.

Here is a 2D example of such a spanning tree, where x is the
root switch, and each + is a non-root switch:

   +  +  +  +  +
   |  |  |  |  |
   +  +  +  +  +
   |  |  |  |  |
   +--+--x--+--+
   |  |  |  |  |
   +  +  +  +  +

For multicast traffic routed from root to tip, every turn in the
above spanning tree is a legal DOR turn.

For traffic routed from tip to root, and traffic routed through
the root, turns are not legal DOR turns.  However, to construct
a credit loop, the union of multicast routing on this spanning
tree with DOR unicast routing can only provide 3 of the 4 turns
needed for the loop.

In addition, if none of the above spanning tree branches crosses
a dateline used for unicast credit loop avoidance on a torus,
and multicast traffic is confined to SL 0 or SL 8 (recall that
torus-2QoS uses SL bit 3 to differentiate QoS level), then
multicast traffic also cannot contribute to the ring credit
loops that are otherwise possible in a torus.

Torus-2QoS uses these ideas to create a master spanning tree.
Every multicast group spanning tree will be constructed as a
subset of the master tree, with the same root as the master
tree.

Such multicast group spanning trees will in general not be
optimal for groups which are a subset of the full fabric.
However, this compromise must be made to enable support for
two QoS levels on a torus while preventing credit loops.
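
For the 2D example above, the shape of the tree can be described by a
"next hop toward the root" rule: move along y until the root's row is
reached, then along x.  Read in reverse (root to tip) that is x first,
then y, which is the DOR order.  The sketch below encodes only that
rule for a mesh without wraparound links; struct coord and both
function names are assumptions for illustration, not the patch's data
structures.

```c
#include <assert.h>

struct coord {
    int x, y;
};

/* Next switch on the path toward the root; the root maps to itself. */
struct coord to_stree_root(struct coord sw, struct coord root)
{
    if (sw.y != root.y)
        sw.y += (sw.y < root.y) ? 1 : -1;   /* walk down the column */
    else if (sw.x != root.x)
        sw.x += (sw.x < root.x) ? 1 : -1;   /* then along the root row */
    return sw;
}

/* Number of tree links between a switch and the root. */
int hops_to_root(struct coord sw, struct coord root)
{
    int hops = 0;

    assert(sw.x >= 0 && sw.y >= 0 && root.x >= 0 && root.y >= 0);

    while (sw.x != root.x || sw.y != root.y) {
        sw = to_stree_root(sw, root);
        hops++;
    }
    return hops;
}
```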

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/opensm/osm_ucast_torus.c |  267 +++
 1 files changed, 267 insertions(+), 0 deletions(-)

diff --git a/opensm/opensm/osm_ucast_torus.c b/opensm/opensm/osm_ucast_torus.c
index 61e0bf3..082fcf5 100644
--- a/opensm/opensm/osm_ucast_torus.c
+++ b/opensm/opensm/osm_ucast_torus.c
@@ -154,6 +154,19 @@ struct link {
  * type.  Furthermore, if that type is PASSTHRU, then the connected links:
  *   1) are parallel to a given coordinate direction
  *   2) share the same two switches as endpoints.
+ *
+ * Torus-2QoS uses one master spanning tree for multicast, of which every
+ * multicast group spanning tree is a subtree.  to_stree_root is a pointer
+ * to the next port_grp on the path to the master spanning tree root.
+ * to_stree_tip is a pointer to the next port_grp on the path to a master
+ * spanning tree branch tip.
+ *
+ * Each t_switch can have at most one port_grp with a non-NULL to_stree_root.
+ * Exactly one t_switch in the fabric will have all port_grp objects with
+ * to_stree_root NULL; it is the master spanning tree root.
+ *
+ * A t_switch with all port_grp objects where to_stree_tip is NULL is at a
+ * master spanning tree branch tip.
  */
 struct port_grp {
enum endpt_type type;
@@ -163,6 +176,8 @@ struct port_grp {
unsigned sw_dlid_cnt;   /* switch dlids routed through this group */
unsigned ca_dlid_cnt;   /* CA dlids routed through this group */
struct t_switch *sw;/* what switch we're attached to */
+   struct port_grp *to_stree_root;
+   struct port_grp *to_stree_tip;
struct endpoint **port;
 };
 
@@ -8499,6 +8514,256 @@ bool torus_lft(struct torus *t, struct t_switch *sw)
return success;
 }
 
+static
+bool good_xy_ring(struct torus *t, int x, int y, int z)
+{
+	struct t_switch ****sw = t->sw;
+	bool good_ring = true;
+
+	for (x = 0; x < t->x_sz && good_ring; x++)
+		good_ring = sw[x][y][z];
+
+	for (y = 0; y < t->y_sz && good_ring; y++)
+		good_ring = sw[x][y][z];
+
+	return good_ring;
+}
+
+static
+struct t_switch *find_plane_mid(struct torus *t, int z)
+{
+	int x, dx, xm = t->x_sz / 2;
+	int y, dy, ym = t->y_sz / 2;
+	struct t_switch ****sw = t->sw;
+
+	if (good_xy_ring(t, xm, ym, z))
+		return sw[xm][ym][z];
+
+	for (dx = 1, dy = 1; dx <= xm && dy <= ym; dx++, dy++) {
+
+		x = canonicalize(xm - dx, t->x_sz);
+		y = canonicalize(ym - dy, t->y_sz);
+		if (good_xy_ring(t, x, y, z))
+			return sw[x][y][z];
+
+		x = canonicalize(xm + dx, t->x_sz);
+		y = canonicalize(ym + dy, t->y_sz);
+		if (good_xy_ring(t, x, y, z))
+			return sw[x][y][z];
+	}
+	return NULL;
+}
+
+static
+struct t_switch *find_stree_root(struct torus *t)
+{
+	int x, y, z, dz, zm = t->z_sz / 2;
+	struct t_switch ****sw = t->sw;
+	struct t_switch *root

[PATCH 11/12] opensm: Implement multicast support for torus-2QoS.

2009-12-18 Thread Jim Schutt
Every multicast spanning tree used by torus-2QoS is a subset
of the master spanning tree built when unicast routing is
computed.  This is required because when QoS is enabled,
torus-2QoS needs to use the same SLs for unicast and multicast.
Thus, the multicast spanning trees must have special properties
to avoid credit loops between unicast and multicast traffic.

To build a spanning tree for a particular MLID, torus-2QoS just
needs to mark all the ports that participate in that multicast
group, then walk the master spanning tree and add switches
hosting the marked ports to the multicast group spanning tree.
Use a depth-first search of the master spanning tree for this.
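
The walk can be pictured as a pruning pass over the master tree: a
switch stays in the group tree if it hosts a marked port or if any of
its downstream branches does.  The sketch below shows just that idea on
a toy tree.  struct node, MAX_CHILDREN, and build_group_tree() are
hypothetical; the real code walks port groups and fills in OpenSM
multicast tables as it goes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 4

struct node {
    bool has_member;                  /* a marked port is on this switch */
    bool in_group_tree;               /* result, set by the walk */
    struct node *child[MAX_CHILDREN]; /* branches toward the tips */
};

/* Depth-first walk; returns true if this subtree holds any member. */
static bool walk(struct node *n)
{
    bool keep;
    int i;

    if (!n)
        return false;

    keep = n->has_member;
    for (i = 0; i < MAX_CHILDREN; i++)
        if (walk(n->child[i]))
            keep = true;

    n->in_group_tree = keep;
    return keep;
}

/* Build the group tree as a subset of the master tree at root. */
bool build_group_tree(struct node *root)
{
    assert(root != NULL);
    return walk(root);
}
```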

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/opensm/osm_ucast_torus.c |  250 +--
 1 files changed, 239 insertions(+), 11 deletions(-)

diff --git a/opensm/opensm/osm_ucast_torus.c b/opensm/opensm/osm_ucast_torus.c
index 082fcf5..e2eb324 100644
--- a/opensm/opensm/osm_ucast_torus.c
+++ b/opensm/opensm/osm_ucast_torus.c
@@ -300,6 +300,7 @@ struct torus {
 
struct coord_dirs *origin;
 	struct t_switch ****sw;
+   struct t_switch *master_stree_root;
 
unsigned flags;
int debug;
@@ -8515,6 +8516,241 @@ bool torus_lft(struct torus *t, struct t_switch *sw)
 }
 
 static
+osm_mtree_node_t *mcast_stree_branch(struct t_switch *sw, osm_switch_t *osm_sw,
+osm_mgrp_box_t *mgb, unsigned depth,
+unsigned *port_cnt, unsigned *max_depth)
+{
+   osm_mtree_node_t *mtn = NULL;
+   osm_mcast_tbl_t *mcast_tbl, *ds_mcast_tbl;
+   osm_node_t *ds_node;
+   struct t_switch *ds_sw;
+   struct port_grp *ptgrp;
+   struct link *link;
+   struct endpoint *port;
+   unsigned g, p;
+   unsigned mcast_fwd_ports = 0, mcast_end_ports = 0;
+
+   depth++;
+
+	if (osm_sw->priv != sw) {
+		OSM_LOG(&sw->torus->osm->log, OSM_LOG_INFO,
+			"Error: osm_sw (GUID 0x%04llx) "
+			"not in our fabric description\n",
+			ntohllu(osm_node_get_node_guid(osm_sw->p_node)));
+		goto out;
+	}
+	if (!osm_switch_supports_mcast(osm_sw)) {
+		OSM_LOG(&sw->torus->osm->log, OSM_LOG_ERROR,
+			"Error: osm_sw (GUID 0x%04llx) "
+			"does not support multicast\n",
+			ntohllu(osm_node_get_node_guid(osm_sw->p_node)));
+		goto out;
+	}
+	mtn = osm_mtree_node_new(osm_sw);
+	if (!mtn) {
+		OSM_LOG(&sw->torus->osm->log, OSM_LOG_ERROR,
+			"Insufficient memory to build multicast tree\n");
+		goto out;
+	}
+   mcast_tbl = osm_switch_get_mcast_tbl_ptr(osm_sw);
+   /*
+* Recurse to downstream switches, i.e. those closer to master
+* spanning tree branch tips.
+*
+* Note that if there are multiple ports in this port group, i.e.,
+* multiple parallel links, we can pick any one of them to use for
+* any individual MLID without causing loops.  Pick one based on MLID
+* for now, until someone turns up evidence we need to be smarter.
+*
+* Also, it might be we got called in a window between a switch getting
+* removed from the fabric, and torus-2QoS getting to rebuild its
+* fabric representation.  If that were to happen, our next hop
+* osm_switch pointer might be stale.  Look it up via opensm's fabric
+* description to be sure it's not.
+*/
+	for (g = 0; g < 2 * TORUS_MAX_DIM; g++) {
+		ptgrp = &sw->ptgrp[g];
+		if (!ptgrp->to_stree_tip)
+			continue;
+
+		p = mgb->mlid % ptgrp->port_cnt;	/* port # in port group */
+		p = ptgrp->port[p]->port;	/* now port # in switch */
+
+		ds_node = osm_node_get_remote_node(osm_sw->p_node, p, NULL);
+		ds_sw = ptgrp->to_stree_tip->sw;
+
+		if (!(ds_node && ds_node->sw &&
+		      ds_sw->osm_switch == ds_node->sw)) {
+			OSM_LOG(&sw->torus->osm->log, OSM_LOG_ERROR,
+				"Error: stale pointer to osm_sw "
+				"(GUID 0x%04llx)\n", ntohllu(ds_sw->n_id));
+			continue;
+		}
+		mtn->child_array[p] =
+			mcast_stree_branch(ds_sw, ds_node->sw, mgb,
+					   depth, port_cnt, max_depth);
+		if (!mtn->child_array[p])
+			continue;
+
+		osm_mcast_tbl_set(mcast_tbl, mgb->mlid, p);
+   mcast_fwd_ports++;
+   /*
+* Since we forward traffic for this multicast group on this
+* port, cause the switch on the other end of the link
+* to forward traffic back to us.  Do it now

[PATCH 09/12] opensm: Make mcast_mgr_purge_tree() available outside osm_mcast_mgr.c.

2009-12-18 Thread Jim Schutt
A routing engine that needs to compute multicast spanning trees with
special properties will need to delete old trees.  There's already
a function that does this: mcast_mgr_purge_tree().

Make it available outside osm_mcast_mgr.c, and change the name
to follow the naming convention (osm_ prefix) for global functions.

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/include/opensm/osm_multicast.h |   33 +
 opensm/opensm/osm_mcast_mgr.c |4 ++--
 2 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/opensm/include/opensm/osm_multicast.h 
b/opensm/include/opensm/osm_multicast.h
index 1da575d..df6ac6c 100644
--- a/opensm/include/opensm/osm_multicast.h
+++ b/opensm/include/opensm/osm_multicast.h
@@ -53,6 +53,7 @@
 #include <opensm/osm_mcm_port.h>
 #include <opensm/osm_subnet.h>
 #include <opensm/osm_log.h>
+#include <opensm/osm_sm.h>
 
 #ifdef __cplusplus
 #  define BEGIN_C_DECLS extern "C" {
@@ -193,6 +194,38 @@ osm_mgrp_t *osm_mgrp_new(IN osm_subn_t * subn, IN ib_net16_t mlid,
 *  Multicast Group, osm_mgrp_delete
 */
 
+/*
+ * Need a forward declaration to work around include loop:
+ * osm_sm.h <-> osm_multicast.h
+ */
+struct osm_sm;
+
+/****f* OpenSM: Multicast Tree/osm_purge_mtree
+* NAME
+*  osm_purge_mtree
+*
+* DESCRIPTION
+*  Frees all the nodes in a multicast spanning tree
+*
+* SYNOPSIS
+*/
+void osm_purge_mtree(IN struct osm_sm * sm, IN osm_mgrp_box_t * mgb);
+/*
+* PARAMETERS
+*  sm
+*  [in] Pointer to osm_sm_t object.
+*  mgb
+*  [in] Pointer to an osm_mgrp_box_t object.
+*
+* RETURN VALUES
+*  None.
+*
+*
+* NOTES
+*
+* SEE ALSO
+*/
+
 /****f* OpenSM: Multicast Group/osm_mgrp_is_guid
 * NAME
 *  osm_mgrp_is_guid
diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c
index e65e459..11a10ce 100644
--- a/opensm/opensm/osm_mcast_mgr.c
+++ b/opensm/opensm/osm_mcast_mgr.c
@@ -146,7 +146,7 @@ static void mcast_mgr_purge_tree_node(IN osm_mtree_node_t * 
p_mtn)
free(p_mtn);
 }
 
-static void mcast_mgr_purge_tree(osm_sm_t * sm, IN osm_mgrp_box_t * mbox)
+void osm_purge_mtree(osm_sm_t * sm, IN osm_mgrp_box_t * mbox)
 {
 	OSM_LOG_ENTER(sm->p_log);
 
@@ -695,7 +695,7 @@ static ib_api_status_t 
mcast_mgr_build_spanning_tree(osm_sm_t * sm,
   on multicast forwarding table information if the user wants to
   preserve existing multicast routes.
 */
-   mcast_mgr_purge_tree(sm, mbox);
+   osm_purge_mtree(sm, mbox);
 
/* build the first subset containing all member ports */
if (make_port_list(port_list, mbox)) {
-- 
1.5.6.GIT




[PATCH 12/12] opensm: Update documentation to describe torus-2QoS multicast support.

2009-12-18 Thread Jim Schutt

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/doc/current-routing.txt |  121 +++-
 1 files changed, 118 insertions(+), 3 deletions(-)

diff --git a/opensm/doc/current-routing.txt b/opensm/doc/current-routing.txt
index 141d793..78a2e01 100644
--- a/opensm/doc/current-routing.txt
+++ b/opensm/doc/current-routing.txt
@@ -400,8 +400,18 @@ Torus-2QoS Routing Algorithm
 
 
 Torus-2QoS is routing algorithm designed for large-scale 2D/3D torus fabrics.
-
-It is a DOR-based algorithm that avoids deadlocks that would otherwise
+The torus-2QoS routing engine can provide the following functionality on
+a 2D/3D torus:
+- routing that is free of credit loops
+- two levels of QoS, assuming switches support 8 data VLs
+- ability to route around a single failed switch, and/or multiple failed
+  links, without
+  - introducing credit loops
+  - changing path SL values
+- very short run times, with good scaling properties as fabric size
+  increases
+
+Torus-2QoS is a DOR-based algorithm that avoids deadlocks that would otherwise
 occur in a torus using the concept of a dateline for each torus dimension.
 It encodes into a path SL which datelines the path crosses as follows:
 
@@ -424,7 +434,7 @@ ports as follows:
 sl2vl(iport,oport,sl) = 0x1 & (sl >> cdir(oport));
 
 Thus torus-2QoS consumes 8 SL values (SL bits 0-2) and 2 VL values (VL bit 0)
- per QoS level to provide deadlock-free routing on a 3D torus.
+per QoS level to provide deadlock-free routing on a 3D torus.
 
 Torus-2QoS routes around link failure by taking the long way around any
 1D ring interrupted by a link failure.  For example, consider the 2D 6x5
@@ -538,3 +548,108 @@ path S-n-I-q-r-D, with illegal turn at switch I, and with 
hop I-q using a
 VL with bit 1 set.  In contrast to the earlier examples, the second hop
 after the illegal turn, q-r, can be used to construct a credit loop
 encircling the failed switches.
+
+Since torus-2QoS uses all four available SL bits, and the three data VL
+bits that are typically available in current switches, there is no way
+to use SL/VL values to separate multicast traffic from unicast traffic.
+Thus, torus-2QoS must generate multicast routing such that credit loops
+cannot arise from a combination of multicast and unicast path segments.
+
+It turns out that it is possible to construct spanning trees for multicast
+routing that have that property.  For the 2D 6x5 torus example above, here
+is the full-fabric spanning tree that torus-2QoS will construct, where x
+is the root switch and each + is a non-root switch:
+
+   4    +    +    +    +    +    +
+        |    |    |    |    |    |
+   3    +    +    +    +    +    +
+        |    |    |    |    |    |
+   2    +----+----+----x----+----+
+        |    |    |    |    |    |
+   1    +    +    +    +    +    +
+        |    |    |    |    |    |
+ y=0    +    +    +    +    +    +
+
+      x=0    1    2    3    4    5
+
+For multicast traffic routed from root to tip, every turn in the above
+spanning tree is a legal DOR turn.
+
+For traffic routed from tip to root, and some traffic routed through the
+root, turns are not legal DOR turns.  However, to construct a credit loop,
+the union of multicast routing on this spanning tree with DOR unicast
+routing can only provide 3 of the 4 turns needed for the loop.
+
+In addition, if none of the above spanning tree branches crosses a dateline
+used for unicast credit loop avoidance on a torus, and if multicast traffic
+is confined to SL 0 or SL 8 (recall that torus-2QoS uses SL bit 3 to
+differentiate QoS level), then multicast traffic also cannot contribute to
+the ring credit loops that are otherwise possible in a torus.
+
+Torus-2QoS uses these ideas to create a master spanning tree.  Every
+multicast group spanning tree will be constructed as a subset of the master
+tree, with the same root as the master tree.
+
+Such multicast group spanning trees will in general not be optimal for
+groups which are a subset of the full fabric. However, this compromise must
+be made to enable support for two QoS levels on a torus while preventing
+credit loops.
+
+In the presence of link or switch failures that result in a fabric for
+which torus-2QoS can generate credit-loop-free unicast routes, it is also
+possible to generate a master spanning tree for multicast that retains the
+required properties.  For example, consider that same 2D 6x5 torus, with
+the link from (2,2) to (3,2) failed.  Torus-2QoS will generate the following
+master spanning tree:
+
+   4    +     +     +     +     +     +
+        |     |     |     |     |     |
+   3    +     +     +     +     +     +
+        |     |     |     |     |     |
+   2  --+-----+-----+     x-----+-----+--
+        |     |     |     |     |     |
+   1    +     +     +     +     +     +
+        |     |     |     |     |     |
+ y=0    +     +     +     +     +     +
+
+  x=    0     1     2     3     4     5
+
+Two things are notable about this master

[PATCH 07/12] opensm: Make torus-2QoS always use OSM_LOG_INFO, never LOG_INFO.

2009-12-18 Thread Jim Schutt

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/opensm/osm_ucast_torus.c |   10 +-
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/opensm/opensm/osm_ucast_torus.c b/opensm/opensm/osm_ucast_torus.c
index 0306af9..61e0bf3 100644
--- a/opensm/opensm/osm_ucast_torus.c
+++ b/opensm/opensm/osm_ucast_torus.c
@@ -7971,7 +7971,7 @@ void torus_update_osm_sl2vl(void *context, osm_port_t *sw_mgmt_port,
guid_t guid;
 
guid = osm_node_get_node_guid(sw_mgmt_port->p_node);
-   OSM_LOG(log, LOG_INFO,
+   OSM_LOG(log, OSM_LOG_INFO,
"Error: osm_port (GUID 0x%04llx) "
"not in our fabric description\n", ntohllu(guid));
return;
@@ -8527,7 +8527,7 @@ uint8_t torus_path_sl(void *context, uint8_t path_sl_hint,
sport = osm_port_relink_endpoint(osm_sport);
if (!sport) {
guid = osm_node_get_node_guid(osm_sport->p_node);
-   OSM_LOG(log, LOG_INFO,
+   OSM_LOG(log, OSM_LOG_INFO,
"Error: osm_sport (GUID 0x%04llx) "
"not in our fabric description\n",
ntohllu(guid));
@@ -8539,7 +8539,7 @@ uint8_t torus_path_sl(void *context, uint8_t path_sl_hint,
dport = osm_port_relink_endpoint(osm_dport);
if (!dport) {
guid = osm_node_get_node_guid(osm_dport->p_node);
-   OSM_LOG(log, LOG_INFO,
+   OSM_LOG(log, OSM_LOG_INFO,
"Error: osm_dport (GUID 0x%04llx) "
"not in our fabric description\n",
ntohllu(guid));
@@ -8552,14 +8552,14 @@ uint8_t torus_path_sl(void *context, uint8_t path_sl_hint,
 */
if (sport->type != SRCSINK) {
guid = osm_node_get_node_guid(osm_sport->p_node);
-   OSM_LOG(log, LOG_INFO,
+   OSM_LOG(log, OSM_LOG_INFO,
"Error: osm_sport (GUID 0x%04llx) "
"not a data src/sink port\n", ntohllu(guid));
goto out;
}
if (dport->type != SRCSINK) {
guid = osm_node_get_node_guid(osm_dport->p_node);
-   OSM_LOG(log, LOG_INFO,
+   OSM_LOG(log, OSM_LOG_INFO,
"Error: osm_dport (GUID 0x%04llx) "
"not a data src/sink port\n", ntohllu(guid));
goto out;
-- 
1.5.6.GIT


--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 06/12] opensm: Remove redundant function names in torus-2QoS logging.

2009-12-18 Thread Jim Schutt

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/opensm/osm_ucast_torus.c |   14 +++---
 1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/opensm/opensm/osm_ucast_torus.c b/opensm/opensm/osm_ucast_torus.c
index b740f93..0306af9 100644
--- a/opensm/opensm/osm_ucast_torus.c
+++ b/opensm/opensm/osm_ucast_torus.c
@@ -8437,7 +8437,7 @@ bool get_lid(struct port_grp *pg, unsigned p,
osm_port = ep->osm_port;
if (!(osm_port && osm_port->priv == ep)) {
OSM_LOG(pg->sw->torus->osm->log, OSM_LOG_ERROR,
-   "Error: get_lid: ep->osm_port->priv != ep "
+   "Error: ep->osm_port->priv != ep "
"for sw 0x%04llu port %d\n",
ntohllu(((struct t_switch *)(ep->sw))->n_id), ep->port);
return false;
@@ -8528,8 +8528,8 @@ uint8_t torus_path_sl(void *context, uint8_t path_sl_hint,
if (!sport) {
guid = osm_node_get_node_guid(osm_sport->p_node);
OSM_LOG(log, LOG_INFO,
-   "Error: get_torus_sl: osm_sport (GUID "
-   "0x%04llx) not in our fabric description\n",
+   "Error: osm_sport (GUID 0x%04llx) "
+   "not in our fabric description\n",
ntohllu(guid));
goto out;
}
@@ -8540,8 +8540,8 @@ uint8_t torus_path_sl(void *context, uint8_t path_sl_hint,
if (!dport) {
guid = osm_node_get_node_guid(osm_dport->p_node);
OSM_LOG(log, LOG_INFO,
-   "Error: get_torus_sl: osm_dport (GUID "
-   "0x%04llx) not in our fabric description\n",
+   "Error: osm_dport (GUID 0x%04llx) "
+   "not in our fabric description\n",
ntohllu(guid));
goto out;
}
@@ -8553,14 +8553,14 @@ uint8_t torus_path_sl(void *context, uint8_t path_sl_hint,
if (sport->type != SRCSINK) {
guid = osm_node_get_node_guid(osm_sport->p_node);
OSM_LOG(log, LOG_INFO,
-   "Error: get_torus_sl: osm_sport (GUID 0x%04llx) "
+   "Error: osm_sport (GUID 0x%04llx) "
"not a data src/sink port\n", ntohllu(guid));
goto out;
}
if (dport->type != SRCSINK) {
guid = osm_node_get_node_guid(osm_dport->p_node);
OSM_LOG(log, LOG_INFO,
-   "Error: get_torus_sl: osm_dport (GUID 0x%04llx) "
+   "Error: osm_dport (GUID 0x%04llx) "
"not a data src/sink port\n", ntohllu(guid));
goto out;
}
-- 
1.5.6.GIT




[PATCH 05/12] opensm: Enforce torus-2QoS link ordering convention.

2009-12-18 Thread Jim Schutt

The function ring_next_sw() used by torus-2QoS to build LFTs relies
on the ordering convention that the 1 end of a link is in the
positive coordinate direction WRT the 0 end.  Previously the links
were always built this way, but nothing enforced the convention.

This commit adds code to enforce the convention, including code
needed to label switches as they are installed into the torus,
rather than after all the torus switches are found.
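
As a rough illustration of the convention (not the code in the attached
patch), the sketch below orders the two ends of a link between adjacent
switches on a ring of n switches so that end 1 lies in the positive
coordinate direction from end 0, including across the wraparound link.
The names struct link_ends and order_link() are hypothetical and do not
appear in osm_ucast_torus.c.

```c
#include <assert.h>

/* Hypothetical stand-in for the two ends of a torus link in one dimension. */
struct link_ends {
	int end0;	/* coordinate of the "0" end */
	int end1;	/* coordinate of the "1" end, one step in + direction */
};

/*
 * Order ring coordinates a and b (adjacent on a ring of n switches) so
 * that end1 == (end0 + 1) mod n.  Rings of fewer than 3 switches are
 * rejected because the direction would be ambiguous.
 */
static struct link_ends order_link(int a, int b, int n)
{
	struct link_ends l;

	assert(n > 2);
	assert(a >= 0 && a < n && b >= 0 && b < n);
	assert((a + 1) % n == b || (b + 1) % n == a);

	if ((a + 1) % n == b) {
		l.end0 = a;
		l.end1 = b;
	} else {
		l.end0 = b;
		l.end1 = a;
	}
	return l;
}
```

For example, on a ring of 5 switches the wraparound link between
coordinates 4 and 0 comes out with end0 = 4 and end1 = 0, whichever
order the two switches were discovered in.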

Signed-off-by: Jim Schutt jasc...@sandia.gov
---

I've attached the patch as a compressed file, as otherwise
it is too large to make it through the list.

-- Jim

 opensm/opensm/osm_ucast_torus.c |  433 +--
 1 files changed, 237 insertions(+), 196 deletions(-)




0005-opensm-Enforce-torus-2QoS-link-ordering-convention.patch.bz2
Description: application/bzip


Re: [PATCH 03/12] opensm: Remove unused port specification from torus-2QoS config file parsing.

2009-12-18 Thread Jim Schutt

The patch this email replies to changes the format of the torus-2QoS.conf
file.  The attached file works with the new format, and replaces the
example file sent in reply to the original torus-2QoS patch series.

-- Jim

# We want the torus routing engine to attempt to find a
# 5x5x5 torus in the fabric:
torus 5 5 5

# We need to tell the routing engine what directions we
# want the torus coordinate directions to be, by specifying
# the endpoints (switch GUID only) of a link in each
# direction.  Here we specify positive coordinate directions:
xp_link 0x20  0x200019   # S_0_0_0 - S_1_0_0
yp_link 0x20  0x25   # S_0_0_0 - S_0_1_0
zp_link 0x20  0x21   # S_0_0_0 - S_0_0_1

# If one of the above switches were to fail, the routing
# engine would not have sufficient information to locate the
# torus in the fabric.  Specify a backup origin here:

next_origin
xp_link 0x20001f  0x200038   # S_1_1_1 - S_2_1_1
yp_link 0x20001f  0x200024   # S_1_1_1 - S_1_2_1
zp_link 0x20001f  0x200020   # S_1_1_1 - S_1_1_2

# The torus routing engine uses the concept of a dateline,
# where a coordinate wraps from its maximum back to zero,
# in order to compute path SL values that provide routing
# that is free from credit loops.
#
# If it is forced by a failed switch to use the backup
# origin specification, that would cause the datelines
# to move, which would change many path SL values, which
# defeats one of the main benefits of this routing engine.
# So, describe the position of the original datelines
# relative to the backup origin as follows:
x_dateline -1
y_dateline -1
z_dateline -1

# You can specify as many backup origins as you like, but
# in practice, the torus routing engine is only guaranteed
# to be able to route around a single failed switch without
# introducing credit loops, so one backup origin is enough.
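
To make the dateline idea in the comments above concrete, here is a
small sketch of how a path SL could be derived from dateline crossings.
It is an illustration under stated assumptions, not the torus-2QoS
code: it assumes one SL bit per dimension (bit 0 for x, bit 1 for y,
bit 2 for z) plus bit 3 for the QoS level, that a path takes the shorter
way around each ring (ties go in the positive direction), and that a
dateline value d places the boundary between coordinates d-1 and d.
The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Normalize v into the range [0, n). */
static int ring_mod(int v, int n)
{
	return ((v % n) + n) % n;
}

/*
 * True if the shorter walk from src to dst on a ring of n switches
 * crosses the boundary between coordinates dateline-1 and dateline.
 */
static bool crosses_dateline(int src, int dst, int n, int dateline)
{
	int fwd, step, d, c, next;

	assert(n > 0 && src >= 0 && src < n && dst >= 0 && dst < n);

	fwd = ring_mod(dst - src, n);		/* hops in the + direction */
	step = (fwd <= n - fwd) ? 1 : -1;	/* assumed tie-break: + */
	d = ring_mod(dateline, n);

	for (c = src; c != dst; c = next) {
		next = ring_mod(c + step, n);
		if ((step == 1 && next == d) || (step == -1 && c == d))
			return true;
	}
	return false;
}

/* Assumed SL layout: bits 0-2 = x/y/z dateline crossing, bit 3 = QoS. */
static unsigned sketch_path_sl(const int src[3], const int dst[3],
			       const int n[3], const int dateline[3],
			       unsigned qos)
{
	unsigned sl = (qos & 1U) << 3;
	int i;

	for (i = 0; i < 3; i++)
		if (crosses_dateline(src[i], dst[i], n[i], dateline[i]))
			sl |= 1U << i;
	return sl;
}
```

Under these assumptions, moving the origin by one switch and setting
the dateline to -1 leaves the boundary on the same physical link, which
is why the path SL values need not change when the backup origin is
used.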


[PATCH 00/11] Add new torus routing engine: torus-2QoS

2009-11-20 Thread Jim Schutt
 introduced
into the path that are legal in the sense that no credit loops
can be constructed using them.

The path hop after the turn at switch I has VL bit 1 set, which marks
it as a hop after an illegal turn.

I've used the latest development version of ibdmchk, because it can use
path SL values and SL2VL tables, to check for credit loops in cases like 
the above routed with torus-2QoS, and it finds none.

I've also looked for credit loops in a torus with multiple failed switches
routed with torus-2QoS, and learned that if and only if the failed switches
are adjacent in the last DOR dimension, there will be no credit loops.

Since torus-2QoS makes use of all available SL values when supporting
2 QoS levels, there are none left over on which to confine multicast.
It turns out there is a way to construct a spanning tree which can 
overlay a DOR-routed mesh, so that multicast and unicast can coexist
on the same SL/VL without causing credit loops.  I'm working on that but
don't have it implemented yet.
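
For illustration only, one shape such a tree could take on a 2D mesh
with no failures and no wraparound links is sketched below: every switch
off the root's row steps in y toward that row, and switches on the
root's row step in x toward the root, so traffic flowing from the root
outward moves in x first and then in y.  The 6x5 size and the names
tree_parent() and hops_to_root() are assumptions made for the example;
this is not the torus-2QoS implementation.

```c
#include <assert.h>

#define NX 6	/* example mesh size in x */
#define NY 5	/* example mesh size in y */

struct coord {
	int x, y;
};

/*
 * Parent of switch (x, y) in a tree rooted at (rx, ry).  Switches off
 * the root row move one step in y toward it; switches on the root row
 * move one step in x toward the root.  The root is its own parent.
 */
static struct coord tree_parent(int x, int y, int rx, int ry)
{
	struct coord p = { x, y };

	assert(x >= 0 && x < NX && y >= 0 && y < NY);
	assert(rx >= 0 && rx < NX && ry >= 0 && ry < NY);

	if (y != ry)
		p.y += (y < ry) ? 1 : -1;
	else if (x != rx)
		p.x += (x < rx) ? 1 : -1;
	return p;
}

/* Number of tree hops from (x, y) up to the root. */
static int hops_to_root(int x, int y, int rx, int ry)
{
	int hops = 0;

	while (x != rx || y != ry) {
		struct coord p = tree_parent(x, y, rx, ry);

		x = p.x;
		y = p.y;
		hops++;
	}
	return hops;
}
```

With the root at (3,2), the switch at (0,0) has parent (0,1) and reaches
the root in 5 hops.  Every non-root switch has exactly one parent link,
so the 30 switches are joined by 29 links, as a spanning tree requires.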

In the meantime, if you do not request QoS using opensm -Q, then
torus-2QoS will only use SLs 8-15, and thus VLs 4-7, leaving SL0/VL0
free for multicast.


Jim Schutt (11):
  opensm: Prepare for routing engine input to path record SL lookup and
SL2VL map setup.
  opensm: Allow the routing engine to influence SL2VL calculations.
  opensm: Allow the routing engine to participate in path SL
calculations.
  opensm: Track the minimum value in the fabric of data VLs supported.
  opensm: Add torus-2QoS routing engine.
  opensm: Enable torus-2QoS routing engine.
  opensm: Add opensm option to specify file name for extra torus-2QoS
configuration information.
  opensm: Do not require -Q option for torus-2QoS routing engine.
  opensm: Make it possible to configure no fallback routing engine.
  opensm:  Avoid havoc in minhop caused by torus-2QoS persistent use of
osm_port_t:priv.
  opensm: Update documentation to describe torus-2QoS.

 opensm/doc/current-routing.txt |  154 +-
 opensm/include/opensm/osm_base.h   |   18 +
 opensm/include/opensm/osm_opensm.h |   24 +-
 opensm/include/opensm/osm_subnet.h |7 +
 opensm/include/opensm/osm_ucast_lash.h |3 -
 opensm/man/opensm.8.in |9 +-
 opensm/opensm/Makefile.am  |2 +-
 opensm/opensm/main.c   |8 +
 opensm/opensm/osm_console.c|   10 +-
 opensm/opensm/osm_dump.c   |3 +-
 opensm/opensm/osm_link_mgr.c   |   16 +-
 opensm/opensm/osm_opensm.c |   54 +-
 opensm/opensm/osm_port_info_rcv.c  |   13 +-
 opensm/opensm/osm_qos.c|   26 +-
 opensm/opensm/osm_sa_path_record.c |   33 +-
 opensm/opensm/osm_state_mgr.c  |   10 +-
 opensm/opensm/osm_subnet.c |   20 +-
 opensm/opensm/osm_ucast_lash.c |   11 +-
 opensm/opensm/osm_ucast_mgr.c  |   44 +-
 opensm/opensm/osm_ucast_torus.c| 8665 
 20 files changed, 9038 insertions(+), 92 deletions(-)
 create mode 100644 opensm/opensm/osm_ucast_torus.c




[PATCH 04/11] opensm: Track the minimum value in the fabric of data VLs supported.

2009-11-20 Thread Jim Schutt
A routing engine that wants to make contributions to SL2VL maps in support
of routing free from credit loops may need to know the minimum number
of supported data VLs in the fabric.

This code tracks that value.
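
A standalone sketch of the calculation follows, for readers who want to
see the arithmetic outside of opensm.  The PortInfo OperationalVLs field
is an encoded value (1 = VL0, 2 = VL0-1, 3 = VL0-3, 4 = VL0-7,
5 = VL0-14), so the data VL count is 1 << (code - 1), capped at 15.
MAX_NUM_VLS is a stand-in for IB_MAX_NUM_VLS and is assumed here to be
16; the function names are hypothetical.

```c
#include <assert.h>

#define MAX_NUM_VLS 16	/* assumed value of IB_MAX_NUM_VLS */

/* Convert an OperationalVLs code (1..5) to a count of data VLs. */
static unsigned data_vls_from_op_vls(unsigned op_vls)
{
	unsigned data_vls;

	assert(op_vls >= 1 && op_vls <= 5);

	data_vls = 1U << (op_vls - 1);
	if (data_vls >= MAX_NUM_VLS)	/* code 5 means VL0-14, i.e. 15 */
		data_vls = MAX_NUM_VLS - 1;
	return data_vls;
}

/* Minimum data VL count over a set of ports' OperationalVLs codes. */
static unsigned min_data_vls(const unsigned *op_vls, int count)
{
	unsigned min = MAX_NUM_VLS - 1;	/* same starting value as the patch */
	int i;

	for (i = 0; i < count; i++) {
		unsigned v = data_vls_from_op_vls(op_vls[i]);

		if (v < min)
			min = v;
	}
	return min;
}
```

A fabric whose endports report codes 4, 5 and 3 would therefore have a
minimum of 4 data VLs.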

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/include/opensm/osm_subnet.h |1 +
 opensm/opensm/osm_port_info_rcv.c  |   13 -
 opensm/opensm/osm_state_mgr.c  |6 ++
 opensm/opensm/osm_subnet.c |1 +
 4 files changed, 20 insertions(+), 1 deletions(-)

diff --git a/opensm/include/opensm/osm_subnet.h b/opensm/include/opensm/osm_subnet.h
index 0302f91..c303e86 100644
--- a/opensm/include/opensm/osm_subnet.h
+++ b/opensm/include/opensm/osm_subnet.h
@@ -509,6 +509,7 @@ typedef struct osm_subn {
uint16_t max_mcast_lid_ho;
uint8_t min_ca_mtu;
uint8_t min_ca_rate;
+   uint8_t min_data_vls;
boolean_t ignore_existing_lfts;
boolean_t subnet_initialization_error;
boolean_t force_heavy_sweep;
diff --git a/opensm/opensm/osm_port_info_rcv.c b/opensm/opensm/osm_port_info_rcv.c
index 8a99064..b0d54c8 100644
--- a/opensm/opensm/osm_port_info_rcv.c
+++ b/opensm/opensm/osm_port_info_rcv.c
@@ -82,6 +82,7 @@ static void pi_rcv_process_endport(IN osm_sm_t * sm, IN osm_physp_t * p_physp,
ib_api_status_t status;
ib_net64_t port_guid;
uint8_t rate, mtu;
+   unsigned data_vls;
cl_qmap_t *p_sm_tbl;
osm_remote_sm_t *p_sm;
 
@@ -91,7 +92,7 @@ static void pi_rcv_process_endport(IN osm_sm_t * sm, IN osm_physp_t * p_physp,
 
/* HACK extended port 0 should be handled too! */
if (osm_physp_get_port_num(p_physp) != 0) {
-   /* track the minimal endport MTU and rate */
+   /* track the minimal endport MTU, rate, and operational VLs */
mtu = ib_port_info_get_mtu_cap(p_pi);
if (mtu < sm->p_subn->min_ca_mtu) {
OSM_LOG(sm->p_log, OSM_LOG_VERBOSE,
@@ -107,6 +108,16 @@ static void pi_rcv_process_endport(IN osm_sm_t * sm, IN osm_physp_t * p_physp,
PRIx64 "\n", rate, cl_ntoh64(port_guid));
sm->p_subn->min_ca_rate = rate;
}
+
+   data_vls = 1U << (ib_port_info_get_op_vls(p_pi) - 1);
+   if (data_vls >= IB_MAX_NUM_VLS)
+   data_vls = IB_MAX_NUM_VLS - 1;
+   if ((uint8_t)data_vls < sm->p_subn->min_data_vls) {
+   OSM_LOG(sm->p_log, OSM_LOG_VERBOSE,
+   "Setting endport minimal data VLs to:%u defined by port:0x%"
+   PRIx64 "\n", data_vls, cl_ntoh64(port_guid));
+   sm->p_subn->min_data_vls = data_vls;
+   }
}
 
if (port_guid != sm->p_subn->sm_port_guid) {
diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c
index c3f49dc..b6c41a6 100644
--- a/opensm/opensm/osm_state_mgr.c
+++ b/opensm/opensm/osm_state_mgr.c
@@ -1132,6 +1132,12 @@ repeat_discovery:
sm->p_subn->force_reroute = FALSE;
sm->p_subn->subnet_initialization_error = FALSE;
 
+   /* Reset tracking values in case limiting component got removed
+    * from fabric. */
+   sm->p_subn->min_ca_mtu = IB_MAX_MTU;
+   sm->p_subn->min_ca_rate = IB_MAX_RATE;
+   sm->p_subn->min_data_vls = IB_MAX_NUM_VLS - 1;
+
/* rescan configuration updates */
if (!config_parsed && osm_subn_rescan_conf_files(sm->p_subn) < 0)
OSM_LOG(sm->p_log, OSM_LOG_ERROR, "ERR 331A: "
diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c
index 2cfcbe6..19ba730 100644
--- a/opensm/opensm/osm_subnet.c
+++ b/opensm/opensm/osm_subnet.c
@@ -526,6 +526,7 @@ ib_api_status_t osm_subn_init(IN osm_subn_t * p_subn, IN osm_opensm_t * p_osm,
p_subn->max_mcast_lid_ho = IB_LID_MCAST_END_HO;
p_subn->min_ca_mtu = IB_MAX_MTU;
p_subn->min_ca_rate = IB_MAX_RATE;
+   p_subn->min_data_vls = IB_MAX_NUM_VLS - 1;
p_subn->ignore_existing_lfts = TRUE;
 
/* we assume master by default - so we only need to set it true if STANDBY */
-- 
1.5.6.GIT




[PATCH 06/11] opensm: Enable torus-2QoS routing engine.

2009-11-20 Thread Jim Schutt

Signed-off-by: Jim Schutt jasc...@sandia.gov
---
 opensm/include/opensm/osm_opensm.h |1 +
 opensm/opensm/osm_opensm.c |6 ++
 2 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/opensm/include/opensm/osm_opensm.h b/opensm/include/opensm/osm_opensm.h
index ef9d4e1..90c6c0f 100644
--- a/opensm/include/opensm/osm_opensm.h
+++ b/opensm/include/opensm/osm_opensm.h
@@ -105,6 +105,7 @@ typedef enum _osm_routing_engine_type {
OSM_ROUTING_ENGINE_TYPE_FTREE,
OSM_ROUTING_ENGINE_TYPE_LASH,
OSM_ROUTING_ENGINE_TYPE_DOR,
+   OSM_ROUTING_ENGINE_TYPE_TORUS_2QOS,
OSM_ROUTING_ENGINE_TYPE_UNKNOWN
 } osm_routing_engine_type_t;
 /***/
diff --git a/opensm/opensm/osm_opensm.c b/opensm/opensm/osm_opensm.c
index 9cd254e..7052d49 100644
--- a/opensm/opensm/osm_opensm.c
+++ b/opensm/opensm/osm_opensm.c
@@ -70,6 +70,7 @@ extern int osm_ucast_file_setup(struct osm_routing_engine *, osm_opensm_t *);
 extern int osm_ucast_ftree_setup(struct osm_routing_engine *, osm_opensm_t *);
 extern int osm_ucast_lash_setup(struct osm_routing_engine *, osm_opensm_t *);
 extern int osm_ucast_dor_setup(struct osm_routing_engine *, osm_opensm_t *);
+extern int osm_ucast_torus2QoS_setup(struct osm_routing_engine *, osm_opensm_t *);
 
 const static struct routing_engine_module routing_modules[] = {
{"minhop", osm_ucast_minhop_setup},
@@ -78,6 +79,7 @@ const static struct routing_engine_module routing_modules[] = {
{"ftree", osm_ucast_ftree_setup},
{"lash", osm_ucast_lash_setup},
{"dor", osm_ucast_dor_setup},
+   {"torus-2QoS", osm_ucast_torus2QoS_setup},
{NULL, NULL}
 };
 
@@ -98,6 +100,8 @@ const char *osm_routing_engine_type_str(IN osm_routing_engine_type_t type)
return "lash";
case OSM_ROUTING_ENGINE_TYPE_DOR:
return "dor";
+   case OSM_ROUTING_ENGINE_TYPE_TORUS_2QOS:
+   return "torus-2QoS";
default:
break;
}
@@ -124,6 +128,8 @@ osm_routing_engine_type_t osm_routing_engine_type(IN const char *str)
return OSM_ROUTING_ENGINE_TYPE_LASH;
else if (!strcasecmp(str, "dor"))
return OSM_ROUTING_ENGINE_TYPE_DOR;
+   else if (!strcasecmp(str, "torus-2QoS"))
+   return OSM_ROUTING_ENGINE_TYPE_TORUS_2QOS;
else
return OSM_ROUTING_ENGINE_TYPE_UNKNOWN;
 }
-- 
1.5.6.GIT



