Signed-off-by: Jim Schutt <jasc...@sandia.gov>
---
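Reviewer note (sits after the "---" separator, so it is not part of the commit):
the two pseudocode fragments under UNICAST ROUTING in torus-2QoS.8.in can be
read together as the minimal C sketch below. It is an illustration, not the
OpenSM implementation. The names torus_path_sl() and torus_sl2vl() are
hypothetical. Placing the QoS level in VL bit 2 is an assumption, inferred
only from the man page's statement that VLs 0-3 serve one QoS level and
VLs 4-7 the other.

```c
#include <stdint.h>

/* Hypothetical helper: build a path SL from per-dimension dateline
 * crossings (each 0 or 1), with the QoS level (0 or 1) in SL bit 3. */
static uint8_t torus_path_sl(const int crosses[], int dims, int qos_level)
{
    uint8_t sl = 0;

    for (int d = 0; d < dims; d++)
        sl |= (uint8_t)((crosses[d] & 1) << d);
    return (uint8_t)(sl | ((qos_level & 1) << 3));
}

/* Hypothetical helper: VL chosen on an inter-switch output port that
 * points in coordinate direction cdir (0, 1, or 2).
 *   VL bit 0: the dateline bit of the SL for direction cdir
 *   VL bit 1: set for the hop after a turn that is illegal under DOR
 *   VL bit 2: QoS level copied from SL bit 3 (assumed mapping) */
static uint8_t torus_sl2vl(uint8_t sl, int cdir, int illegal_turn)
{
    uint8_t vl = (uint8_t)((sl >> cdir) & 0x1);

    vl |= (uint8_t)((illegal_turn & 1) << 1);
    vl |= (uint8_t)(((sl >> 3) & 0x1) << 2);
    return vl;
}
```

For a path that crosses the x and z datelines at the second QoS level, the
sketch yields SL 0xD; an output port pointing in x then selects a VL with
bit 0 set, while a port pointing in y leaves bit 0 clear.
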
 opensm/Makefile.am              |    2 +-
 opensm/configure.in             |    6 +-
 opensm/man/torus-2QoS.8.in      |  476 +++++++++++++++++++++++++++++++++++++++
 opensm/man/torus-2QoS.conf.5.in |  184 +++++++++++++++
 4 files changed, 666 insertions(+), 2 deletions(-)
 create mode 100644 opensm/man/torus-2QoS.8.in
 create mode 100644 opensm/man/torus-2QoS.conf.5.in

diff --git a/opensm/Makefile.am b/opensm/Makefile.am
index 88ff9da..58a682b 100644
--- a/opensm/Makefile.am
+++ b/opensm/Makefile.am
@@ -12,7 +12,7 @@ install-exec-hook:
        chmod 755 $(DESTDIR)/$(sysconfdir)/init.d/opensmd
 
 
-man_MANS = man/opensm.8 man/osmtest.8
+man_MANS = man/opensm.8 man/osmtest.8 man/torus-2QoS.8 man/torus-2QoS.conf.5
 
 various_scripts = $(wildcard scripts/*)
 docs = doc/performance-manager-HOWTO.txt doc/QoS_management_in_OpenSM.txt \
diff --git a/opensm/configure.in b/opensm/configure.in
index 8695965..aaad999 100644
--- a/opensm/configure.in
+++ b/opensm/configure.in
@@ -196,6 +196,10 @@ AC_DEFINE_UNQUOTED(HAVE_DEFAULT_QOS_POLICY_FILE,
        [Define a QOS policy config file])
 AC_SUBST(QOS_POLICY_FILE)
 
+dnl For now, this does not need to be configurable
+TORUS2QOS_CONF_FILE=torus-2QoS.conf
+AC_SUBST(TORUS2QOS_CONF_FILE)
+
 dnl Check for a different prefix-routes file
 PREFIX_ROUTES_FILE=prefix-routes.conf
 AC_MSG_CHECKING(for --with-prefix-routes-conf)
@@ -226,7 +230,7 @@ dnl Checks for headers and libraries
 OPENIB_APP_OSMV_CHECK_HEADER
 OPENIB_APP_OSMV_CHECK_LIB
 
-AC_CONFIG_FILES([man/opensm.8 scripts/opensm.init scripts/redhat-opensm.init scripts/sldd.sh])
+AC_CONFIG_FILES([man/opensm.8 man/torus-2QoS.8 man/torus-2QoS.conf.5 scripts/opensm.init scripts/redhat-opensm.init scripts/sldd.sh])
 
 dnl Create the following Makefiles
 AC_OUTPUT([include/opensm/osm_version.h Makefile include/Makefile complib/Makefile libvendor/Makefile opensm/Makefile osmeventplugin/Makefile osmtest/Makefile opensm.spec])
diff --git a/opensm/man/torus-2QoS.8.in b/opensm/man/torus-2QoS.8.in
new file mode 100644
index 0000000..68e2bce
--- /dev/null
+++ b/opensm/man/torus-2QoS.8.in
@@ -0,0 +1,476 @@
+.TH TORUS\-2QOS 8 "November 10, 2010" "OpenIB" "OpenIB Management"
+.
+.SH NAME
+torus\-2QoS \- Routing engine for OpenSM subnet manager
+.
+.SH DESCRIPTION
+.
+Torus-2QoS is a routing algorithm designed for large-scale 2D/3D torus fabrics.
+The torus-2QoS routing engine can provide the following functionality on
+a 2D/3D torus:
+.br
+\" roff illiteracy leads to following brain-dead list implementation
+\"
+.na  \" otherwise line space adjustment can add spaces between dash and text
+.in +2m
+\[en]
+'in +2m
+Routing that is free of credit loops.
+.in
+\[en]
+'in +2m
+Two levels of Quality of Service (QoS), assuming switches and channel
+adapters support eight data VLs.
+.in
+\[en]
+'in +2m
+The ability to route around a single failed switch, and/or multiple failed
+links, without
+.in
+.in +2m
+\[en]
+'in +2m
+introducing credit loops, or
+.in
+\[en]
+'in +2m
+changing path SL values.
+.in -4m
+\[en]
+'in +2m
+Very short run times, with good scaling properties as fabric size increases.
+.ad
+.
+.SH UNICAST ROUTING
+.
+Unicast routing in torus-2QoS is based on Dimension Order Routing (DOR).
+It avoids the deadlocks that would otherwise occur in a DOR-routed
+torus using the concept of a dateline for each torus dimension.
+It encodes into a path SL which datelines the path crosses, as follows:
+\f(CR
+.P
+.nf
+    sl = 0;
+    for (d = 0; d < torus_dimensions; d++) {
+        /* path_crosses_dateline(d) returns 0 or 1 */
+        sl |= path_crosses_dateline(d) << d;
+    }
+.fi
+\fR
+.P
+On a 3D torus this consumes three SL bits, leaving one SL bit unused.
+Torus-2QoS uses this SL bit to implement two QoS levels.
+.P
+Torus-2QoS also makes use of the output port
+dependence of switch SL2VL maps to encode into one VL bit the
+information encoded in three SL bits.
+It computes in which torus coordinate direction each inter-switch link
+"points", and writes SL2VL maps for such ports as follows:
+\f(CR
+.P
+.nf
+    for (sl = 0; sl < 16; sl++) {
+        /* cdir(port) computes which torus coordinate direction
+         * a switch port "points" in; returns 0, 1, or 2
+         */
+        sl2vl(iport,oport,sl) = 0x1 & (sl >> cdir(oport));
+    }
+.fi
+\fR
+.P
+Thus, on a pristine 3D torus,
+\fIi.e.\fR,
+in the absence of failed fabric switches,
+torus-2QoS consumes eight SL values (SL bits 0-2) and
+two VL values (VL bit 0) per QoS level to provide deadlock-free routing.
+.P
+Torus-2QoS routes around link failure by "taking the long way around" any
+1D ring interrupted by link failure.  For example, consider the 2D 6x5
+torus below, where switches are denoted by [+a-zA-Z]:
+.
+.
+\# define macros to start and end ascii art, assuming Roman font.
+\# the start macro takes an argument which is the width in ems of
+\# the ascii art, and is used to center it.
+\#
+.de ascii_art
+.nop \f(CR
+.nr indent_in_ems ((((\\n[.ll] - \\n[.i]) / \\w'm') - \\$1)/2)
+.in +\\n[indent_in_ems]m
+.nf
+..
+.de end_ascii_art
+.fi
+.in
+.nop \fR
+..
+\# end of macro definitions
+.
+.
+.ascii_art 36
+       |    |    |    |    |    |
+  4  --+----+----+----+----+----+--
+       |    |    |    |    |    |
+  3  --+----+----+----D----+----+--
+       |    |    |    |    |    |
+  2  --+----+----I----r----+----+--
+       |    |    |    |    |    |
+  1  --m----S----n----T----o----p--
+       |    |    |    |    |    |
+y=0  --+----+----+----+----+----+--
+       |    |    |    |    |    |
+
+     x=0    1    2    3    4    5
+.end_ascii_art
+.P
+For a pristine fabric the path from S to D would be S-n-T-r-D.
+In the event that either link S-n or n-T has failed, torus-2QoS would
+use the path S-m-p-o-T-r-D.
+Note that it can do this without changing the path SL
+value; once the 1D ring m-S-n-T-o-p-m has been broken by failure, path
+segments using it cannot contribute to deadlock, and the x-direction
+dateline (between, say, x=5 and x=0) can be ignored for path segments on
+that ring.
+.P
+One result of this is that torus-2QoS can route around many simultaneous
+link failures, as long as no 1D ring is broken into disjoint segments.
+For example, if links n-T and T-o have both failed, that ring has been broken
+into two disjoint segments, T and o-p-m-S-n.
+Torus-2QoS checks for such
+issues, reports if they are found, and refuses to route such fabrics.
+.P
+Note that in the case where there are multiple parallel links between a
+pair of switches, torus-2QoS will allocate routes across such links
+in a round-robin fashion, based on ports at the path destination switch that
+are active and not used for inter-switch links.
+Should a link that is one of several such parallel links fail, routes
+are redistributed across the remaining links.
+When the last of such a set of parallel links fails, traffic is rerouted
+as described above.
+.P
+Handling a failed switch under DOR requires introducing into a path at
+least one turn that would be otherwise "illegal",
+\fIi.e.\fR,
+not allowed by DOR rules.
+Torus-2QoS will introduce such a turn as close as possible to the
+failed switch in order to route around it.
+.P
+In the above example, suppose switch T has failed, and consider the path
+from S to D.
+Torus-2QoS will produce the path S-n-I-r-D, rather than the
+S-n-T-r-D path for a pristine torus, by introducing an early turn at n.
+Normal DOR rules will cause traffic arriving at switch I to be forwarded
+to switch r; for traffic arriving from I due to the "early" turn at n,
+this will generate an "illegal" turn at I.
+.P
+Torus-2QoS will also use the input port dependence of SL2VL maps to set VL
+bit 1 (which would be otherwise unused) for y-x, z-x, and z-y turns,
+\fIi.e.\fR,
+those turns that are illegal under DOR.
+This causes the first hop after any such turn to use a separate set of
+VL values, and prevents deadlock in the presence of a single failed switch.
+.P
+For any given path, only the hops after a turn that is illegal under DOR
+can contribute to a credit loop that leads to deadlock.  So in the example
+above with failed switch T, the location of the illegal turn at I in the
+path from S to D requires that any credit loop caused by that turn must
+encircle the failed switch at T.  Thus the second and later hops after the
+illegal turn at I (\fIi.e.\fR, hop r-D) cannot contribute to a credit loop
+because they cannot be used to construct a loop encircling T.  The hop I-r
+uses a separate VL, so it cannot contribute to a credit loop encircling T.
+.P
+Extending this argument shows that in addition to being capable of routing
+around a single switch failure without introducing deadlock, torus-2QoS can
+also route around multiple failed switches on the condition they are
+adjacent in the last dimension routed by DOR.  For example, consider the
+following case on a 6x6 2D torus:
+.
+.ascii_art 36
+       |    |    |    |    |    |
+  5  --+----+----+----+----+----+--
+       |    |    |    |    |    |
+  4  --+----+----+----D----+----+--
+       |    |    |    |    |    |
+  3  --+----+----I----u----+----+--
+       |    |    |    |    |    |
+  2  --+----+----q----R----+----+--
+       |    |    |    |    |    |
+  1  --m----S----n----T----o----p--
+       |    |    |    |    |    |
+y=0  --+----+----+----+----+----+--
+       |    |    |    |    |    |
+
+     x=0    1    2    3    4    5
+.end_ascii_art
+.P
+Suppose switches T and R have failed, and consider the path from S to D.
+Torus-2QoS will generate the path S-n-q-I-u-D, with an illegal turn at
+switch I, and with hop I-u using a VL with bit 1 set.
+.P
+As a further example, consider a case that torus-2QoS cannot route without
+deadlock: two failed switches adjacent in a dimension that is not the last
+dimension routed by DOR; here the failed switches are O and T:
+.
+.ascii_art 36
+       |    |    |    |    |    |
+  5  --+----+----+----+----+----+--
+       |    |    |    |    |    |
+  4  --+----+----+----+----+----+--
+       |    |    |    |    |    |
+  3  --+----+----+----+----D----+--
+       |    |    |    |    |    |
+  2  --+----+----I----q----r----+--
+       |    |    |    |    |    |
+  1  --m----S----n----O----T----p--
+       |    |    |    |    |    |
+y=0  --+----+----+----+----+----+--
+       |    |    |    |    |    |
+
+     x=0    1    2    3    4    5
+.end_ascii_art
+.P
+In a pristine fabric, torus-2QoS would generate the path from S to D as
+S-n-O-T-r-D.  With failed switches O and T, torus-2QoS will generate the
+path S-n-I-q-r-D, with illegal turn at switch I, and with hop I-q using a
+VL with bit 1 set.  In contrast to the earlier examples, the second hop
+after the illegal turn, q-r, can be used to construct a credit loop
+encircling the failed switches.
+.
+.SH MULTICAST ROUTING
+.
+Since torus-2QoS uses all four available SL bits, and the three data VL
+bits that are typically available in current switches, there is no way
+to use SL/VL values to separate multicast traffic from unicast traffic.
+Thus, torus-2QoS must generate multicast routing such that credit loops
+cannot arise from a combination of multicast and unicast path segments.
+.P
+It turns out that it is possible to construct spanning trees for multicast
+routing that have that property.  For the 2D 6x5 torus example above, here
+is the full-fabric spanning tree that torus-2QoS will construct, where "x"
+is the root switch and each "+" is a non-root switch:
+.
+.ascii_art 36
+  4    +    +    +    +    +    +
+       |    |    |    |    |    |
+  3    +    +    +    +    +    +
+       |    |    |    |    |    |
+  2    +----+----+----x----+----+
+       |    |    |    |    |    |
+  1    +    +    +    +    +    +
+       |    |    |    |    |    |
+y=0    +    +    +    +    +    +
+
+     x=0    1    2    3    4    5
+.end_ascii_art
+.P
+For multicast traffic routed from root to tip, every turn in the above
+spanning tree is a legal DOR turn.
+.P
+For traffic routed from tip to root, and some traffic routed through the
+root, turns are not legal DOR turns.  However, to construct a credit loop,
+the union of multicast routing on this spanning tree with DOR unicast
+routing can only provide 3 of the 4 turns needed for the loop.
+.P
+In addition, if none of the above spanning tree branches crosses a dateline
+used for unicast credit loop avoidance on a torus, and if multicast traffic
+is confined to SL 0 or SL 8 (recall that torus-2QoS uses SL bit 3 to
+differentiate QoS level), then multicast traffic also cannot contribute to
+the "ring" credit loops that are otherwise possible in a torus.
+.P
+Torus-2QoS uses these ideas to create a master spanning tree.  Every
+multicast group spanning tree will be constructed as a subset of the master
+tree, with the same root as the master tree.
+.P
+Such multicast group spanning trees will in general not be optimal for
+groups which are a subset of the full fabric. However, this compromise must
+be made to enable support for two QoS levels on a torus while preventing
+credit loops.
+.P
+In the presence of link or switch failures that result in a fabric for
+which torus-2QoS can generate credit-loop-free unicast routes, it is also
+possible to generate a master spanning tree for multicast that retains the
+required properties.  For example, consider that same 2D 6x5 torus, with
+the link from (2,2) to (3,2) failed.  Torus-2QoS will generate the following
+master spanning tree:
+.
+.ascii_art 36
+  4    +    +    +    +    +    +
+       |    |    |    |    |    |
+  3    +    +    +    +    +    +
+       |    |    |    |    |    |
+  2  --+----+----+    x----+----+--
+       |    |    |    |    |    |
+  1    +    +    +    +    +    +
+       |    |    |    |    |    |
+y=0    +    +    +    +    +    +
+
+     x=0    1    2    3    4    5
+.end_ascii_art
+.P
+Two things are notable about this master spanning tree.  First, assuming
+the x dateline was between x=5 and x=0, this spanning tree has a branch
+that crosses the dateline.  However, just as for unicast, crossing a
+dateline on a 1D ring (here, the ring for y=2) that is broken by a failure
+cannot contribute to a torus credit loop.
+.P
+Second, this spanning tree is no longer optimal even for multicast groups
+that encompass the entire fabric.  That, unfortunately, is a compromise that
+must be made to retain the other desirable properties of torus-2QoS routing.
+.P
+In the event that a single switch fails, torus-2QoS will generate a master
+spanning tree that has no "extra" turns by appropriately selecting a root
+switch.
+In the 2D 6x5 torus example, assume now that the switch at (3,2),
+\fIi.e.\fR, the root for a pristine fabric, fails.
+Torus-2QoS will generate the
+following master spanning tree for that case:
+.
+.ascii_art 36
+                      |
+  4    +    +    +    +    +    +
+       |    |    |    |    |    |
+  3    +    +    +    +    +    +
+       |    |    |         |    |
+  2    +    +    +         +    +
+       |    |    |         |    |
+  1    +----+----x----+----+----+
+       |    |    |    |    |    |
+y=0    +    +    +    +    +    +
+                      |
+
+     x=0    1    2    3    4    5
+.end_ascii_art
+.P
+Assuming the y dateline was between y=4 and y=0, this spanning tree has
+a branch that crosses a dateline.  However, again this cannot contribute
+to credit loops as it occurs on a 1D ring (the ring for x=3) that is
+broken by a failure, as in the above example.
+.
+.SH TORUS TOPOLOGY DISCOVERY
+.
+The algorithm used by torus-2QoS to construct the torus topology from
+the undirected graph representing the fabric requires that the radix of
+each dimension be configured via torus-2QoS.conf.
+It also requires that the torus topology be "seeded"; for a 3D torus this
+requires configuring four switches that define the three coordinate
+directions of the torus.
+.P
+Given this starting information, the algorithm is to examine the
+cube formed by the eight switch locations bounded by the corners
+(x,y,z) and (x+1,y+1,z+1).
+Based on switches already placed into the torus topology at some of these
+locations, the algorithm examines 4-loops of inter-switch links to find the
+one that is consistent with a face of the cube of switch locations,
+and adds its switches to the discovered topology in the correct locations.
+.P
+Because the algorithm is based on examining the topology of 4-loops of links,
+a torus with one or more radix-4 dimensions requires extra initial
+seed configuration.
+See torus-2QoS.conf(5) for details.
+Torus-2QoS will detect and report when it has insufficient configuration
+for a torus with radix-4 dimensions.
+.P
+In the event the torus is significantly degraded, \fIi.e.\fR, there are
+many missing switches or links, it may happen that torus-2QoS is unable
+to place into the torus some switches and/or links that were discovered
+in the fabric, and will generate a warning in that case.
+A similar condition occurs if torus-2QoS is misconfigured, \fIi.e.\fR,
+the radix of a torus dimension as configured does not match the radix
+of that torus dimension as wired, and many switches/links in the fabric
+will not be placed into the torus.
+.
+.SH QUALITY OF SERVICE CONFIGURATION
+.
+OpenSM will not program switches and channel adapters with
+SL2VL maps or VL arbitration configuration unless it is invoked with -Q.
+Since torus-2QoS depends on such functionality for correct operation,
+always invoke OpenSM with -Q when torus-2QoS is in the list of routing
+engines.
+.P
+Any quality of service configuration method supported by OpenSM will
+work with torus-2QoS, subject to the following limitations and
+considerations.
+.P
+For all routing engines supported by OpenSM except torus-2QoS,
+there is a one-to-one correspondence between QoS level and SL.
+Torus-2QoS can only support two quality of service levels, so only
+the high-order bit of any SL value used for unicast QoS configuration
+will be honored by torus-2QoS.
+.P
+For multicast QoS configuration, only SL values 0 and 8 should be used
+with torus-2QoS.
+.P
+Since SL to VL map configuration must be under the complete control of
+torus-2QoS, any configuration via qos_sl2vl, qos_swe_sl2vl,
+\fIetc.\fR, must be and will be ignored, and a warning will be generated.
+.P
+Torus-2QoS uses VL values 0-3 to implement one of its supported QoS
+levels, and VL values 4-7 to implement the other.  Hard-to-diagnose
+application issues may arise if traffic is not delivered fairly
+across each of these two VL ranges.
+Torus-2QoS will detect and warn if VL arbitration is configured
+unfairly across VLs in the range 0-3, and also in the range 4-7.
+Note that the default OpenSM VL arbitration configuration
+does not meet this constraint, so all torus-2QoS users should
+configure VL arbitration via qos_vlarb_high, qos_vlarb_low, \fIetc.\fR
+.
+.SH OPERATIONAL CONSIDERATIONS
+.
+Any routing algorithm for a torus IB fabric must employ path
+SL values to avoid credit loops.
+As a result, all applications run over such fabrics must perform a
+path record query to obtain the correct path SL for connection setup.
+Applications that use \fBrdma_cm\fR for connection setup will automatically
+meet this requirement.
+.P
+If a change in fabric topology causes changes in path SL values required
+to route without credit loops, in general all applications would need
+to repath to avoid message deadlock.  Since torus-2QoS has the ability
+to reroute after a single switch failure without changing path SL values,
+repathing by running applications is not required when the fabric
+is routed with torus-2QoS.
+.P
+Torus-2QoS can provide unchanging path SL values in the presence of
+subnet manager failover provided that all OpenSM instances have the
+same idea of dateline location.  See torus-2QoS.conf(5) for details.
+.P
+Torus-2QoS will detect configurations of failed switches and links
+that prevent routing that is free of credit loops, and will
+log warnings and refuse to route.  If "no_fallback" was configured in the
+list of OpenSM routing engines, then no other routing engine
+will attempt to route the fabric.  In that case all paths that
+do not transit the failed components will continue to work, and
+the subset of paths that are still operational will continue to remain
+free of credit loops.
+OpenSM will continue to attempt to route the fabric after every sweep
+interval, and after any change (such as a link up) in the fabric topology.
+When the fabric components are repaired, full functionality will be
+restored.
+.P
+In the event OpenSM was configured to allow some other engine to
+route the fabric if torus-2QoS fails, then credit loops and message
+deadlock are likely if torus-2QoS had previously routed
+the fabric successfully.
+Even if the other engine is capable of routing a torus
+without credit loops, applications that built connections with
+path SL values granted under torus-2QoS will likely experience
+message deadlock under routing generated by a different engine,
+unless they repath.
+.P
+To verify that a torus fabric is routed free of credit loops,
+use \fBibdmchk\fR to analyze data collected via \fBibdiagnet -vlr\fR.
+.
+.SH FILES
+.TP
+.B @OPENSM_CONFIG_DIR@/@OPENSM_CONFIG_FILE@
+default OpenSM config file.
+.TP
+.B @OPENSM_CONFIG_DIR@/@QOS_POLICY_FILE@
+default QoS policy config file.
+.TP
+.B @OPENSM_CONFIG_DIR@/@TORUS2QOS_CONF_FILE@
+default torus-2QoS config file.
+.
+.SH SEE ALSO
+.
+opensm(8), torus-2QoS.conf(5), ibdiagnet(1), ibdmchk(1), rdma_cm(7).
diff --git a/opensm/man/torus-2QoS.conf.5.in b/opensm/man/torus-2QoS.conf.5.in
new file mode 100644
index 0000000..147a7b1
--- /dev/null
+++ b/opensm/man/torus-2QoS.conf.5.in
@@ -0,0 +1,184 @@
+.TH TORUS\-2QOS.CONF 5 "November 11, 2010" "OpenIB" "OpenIB Management"
+.
+.SH NAME
+torus\-2QoS.conf \- Torus-2QoS configuration for OpenSM subnet manager
+.
+.SH DESCRIPTION
+.
+The file
+.B torus-2QoS.conf
+contains configuration information that is specific to the OpenSM
+routing engine torus-2QoS.
+Blank lines and lines where the first non-whitespace character is
+"#" are ignored.
+A token is any contiguous group of non-whitespace characters.
+Any tokens on a line following the recognized configuration tokens described
+below are ignored.
+.
+.P
+\fR[\fBtorus\fR|\fBmesh\fR]
+\fIx_radix\fR[\fBm\fR|\fBM\fR|\fBt\fR|\fBT\fR]
+\fIy_radix\fR[\fBm\fR|\fBM\fR|\fBt\fR|\fBT\fR]
+\fIz_radix\fR[\fBm\fR|\fBM\fR|\fBt\fR|\fBT\fR]
+.RS
+Either \fBtorus\fR or \fBmesh\fR must be the first keyword in the
+configuration, and sets the topology
+that torus-2QoS will try to construct.
+A 2D topology can be configured by specifying one of
+\fIx_radix\fR, \fIy_radix\fR, or \fIz_radix\fR as 1.
+An individual dimension can be configured as mesh (open) or torus
+(looped) by suffixing its radix specification with one of
+\fBm\fR, \fBM\fR, \fBt\fR, or \fBT\fR.  Thus, "mesh 3T 4 5" and
+"torus 3 4M 5M" both specify the same topology.
+.P
+Note that although torus-2QoS can route mesh fabrics, its ability to
+route around failed components is severely compromised on such fabrics.
+A failed fabric component is very likely to cause a disjoint ring;
+see \fBUNICAST ROUTING\fR in torus-2QoS(8).
+.RE
+.
+.P
+\fBxp_link
+\fIsw0_GUID sw1_GUID
+.br
+.ns
+\fByp_link
+\fIsw0_GUID sw1_GUID
+.br
+.ns
+\fBzp_link
+\fIsw0_GUID sw1_GUID
+.br
+.ns
+\fBxm_link
+\fIsw0_GUID sw1_GUID
+.br
+.ns
+\fBym_link
+\fIsw0_GUID sw1_GUID
+.br
+.ns
+\fBzm_link
+\fIsw0_GUID sw1_GUID
+\fR
+.RS
+These keywords are used to seed the torus/mesh topology.
+For example, "xp_link 0x2000 0x2001" specifies that a link from
+the switch with node GUID 0x2000 to the switch with node GUID 0x2001
+would point in the positive x direction,
+while "xm_link 0x2000 0x2001" specifies that a link from
+the switch with node GUID 0x2000 to the switch with node GUID 0x2001
+would point in the negative x direction.  All the link keywords for
+a given seed must specify the same "from" switch.
+.P
+In general, it is not necessary to configure both the positive and
+negative directions for a given coordinate; either is sufficient.
+However, the algorithm used for topology discovery needs extra information
+for torus dimensions of radix four (see \fBTORUS TOPOLOGY DISCOVERY\fR in
+torus-2QoS(8)).  For such cases both the positive and negative coordinate
+directions must be specified.
+.P
+Based on the topology specified via the \fBtorus\fR/\fBmesh\fR keyword,
+torus-2QoS will detect and log when it has insufficient seed configuration.
+.RE
+.
+.P
+\fBx_dateline
+\fIposition
+.br
+.ns
+\fBy_dateline
+\fIposition
+.br
+.ns
+\fBz_dateline
+\fIposition
+\fR
+.RS
+In order for torus-2QoS to provide the guarantee that path SL values
+do not change under any conditions for which it can still route the fabric,
+its idea of dateline position must not change relative to physical switch
+locations.  The dateline keywords provide the means to configure such
+behavior.
+.P
+The dateline for a torus dimension is always between the switch with
+coordinate 0 and the switch with coordinate radix-1 for that dimension.
+By default, the common switch in a torus seed is taken as the origin of
+the coordinate system used to describe switch location.
+The \fIposition\fR parameter for a dateline keyword moves the origin
+(and hence the dateline) the specified amount relative to the common
+switch in a torus seed.
+.RE
+.
+.P
+\fBnext_seed
+\fR
+.RS
+If any of the switches used to specify a seed were to fail, torus-2QoS
+would be unable to complete topology discovery successfully.
+The \fBnext_seed\fR keyword specifies that the following link and dateline
+keywords apply to a new seed specification.
+.P
+For maximum resiliency, no seed specification should share a switch
+with any other seed specification.
+Multiple seed specifications should use dateline configuration to
+ensure that torus-2QoS can grant path SL values that are constant,
+regardless of which seed was used to initiate topology discovery.
+.RE
+.
+.P
+\fBportgroup_max_ports
+\fImax_ports
+\fR
+.RS
+This keyword specifies the maximum number of parallel inter-switch
+links, and also the maximum number of host ports per switch, that
+torus-2QoS can accommodate.
+The default value is 16.
+Torus-2QoS will log an error message during topology discovery if this
+parameter needs to be increased.
+If this keyword appears multiple times, the last instance prevails.
+.RE
+.
+.SH EXAMPLE
+.
+\f(CR
+.nf
+# Look for a 2D (since x radix is one) 4x5 torus.
+torus 1 4 5
+
+# y is radix-4 torus dimension, need both
+# ym_link and yp_link configuration.
+yp_link 0x200000 0x200005  # sw @ y=0,z=0 -> sw @ y=1,z=0
+ym_link 0x200000 0x20000f  # sw @ y=0,z=0 -> sw @ y=3,z=0
+
+# z is not radix-4 torus dimension, only need one of
+# zm_link or zp_link configuration.
+zp_link 0x200000 0x200001  # sw @ y=0,z=0 -> sw @ y=0,z=1
+
+next_seed
+
+yp_link 0x20000b 0x200010  # sw @ y=2,z=1 -> sw @ y=3,z=1
+ym_link 0x20000b 0x200006  # sw @ y=2,z=1 -> sw @ y=1,z=1
+zp_link 0x20000b 0x20000c  # sw @ y=2,z=1 -> sw @ y=2,z=2
+
+y_dateline -2  # Move the dateline for this seed
+z_dateline -1  # back to its original position.
+
+# If OpenSM failover is configured, for maximum resiliency
+# one instance should run on a host attached to a switch
+# from the first seed, and another instance should run
+# on a host attached to a switch from the second seed.
+# Both instances should use this torus-2QoS.conf to ensure
+# path SL values do not change in the event of SM failover.
+.fi
+\fR
+.
+.SH FILES
+.TP
+.B @OPENSM_CONFIG_DIR@/@TORUS2QOS_CONF_FILE@
+Default torus-2QoS config file.
+.
+.SH SEE ALSO
+.
+opensm(8), torus-2QoS(8).
-- 
1.6.2.2

