Re: user SA notifications, redux

2010-10-28 Thread Hal Rosenstock

On 10/27/2010 2:10 PM, Mike Heinz wrote:



-Original Message-

Is this intended to handle multiple applications subscribing/unsubscribing for 
the same report ?


Yes. This is based on Sean's old patches to the ib_sa which added support for 
multiplexing traps/notices to multiple client apps and which provided ib_usa 
to expose this capability to user space. I had originally submitted his code as-is for 
acceptance into OFED and the upstream kernel, but we got hung up on what the application 
API would be. These headers are meant to address that part of the conversation. Once we 
get that out of the way we can decide what needs to change in ib_sa and ib_usa to meet 
the API needs.



Why aren't traps 144 and 145 also defined ?


Only because the existing ib_usa patch doesn't support them and I forgot to add 
them. That can be done.


Nit: If trap_number is in network byte order, shouldn't it be ib_net16_t below ?


Done.


Shouldn't there be more granularity in this API in terms of what can be
subscribed for ?  IMO trap number is insufficient for registration and
this API should contain a trap specific variable with a component mask
indicating what fields are valid in that variable.


I don't think I understand what you're getting at. You subscribe to the individual traps 
that you are interested in. When a trap is generated, you get an event which 
contains the entire message.
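
To make that concrete, here is a purely illustrative user-space sketch of the multiplexing idea; none of these names exist in ib_sa, ib_usa, or the proposed headers, it just shows the per-trap dispatch model:

#include <stdint.h>

#define MAX_SUBS 8

struct trap_sub {
	uint16_t trap_number;                           /* e.g. 64 = GID now in service */
	void (*cb)(uint16_t trap, const void *notice);  /* callback sees the whole notice */
};

static struct trap_sub subs[MAX_SUBS];
static int nsubs;

/* Multiple applications may subscribe to the same trap number. */
static int subscribe(uint16_t trap, void (*cb)(uint16_t, const void *))
{
	if (nsubs >= MAX_SUBS)
		return -1;
	subs[nsubs].trap_number = trap;
	subs[nsubs].cb = cb;
	return nsubs++;
}

/* Called when a notice arrives: every subscriber to that trap gets the event. */
static void deliver(uint16_t trap, const void *notice)
{
	int i;

	for (i = 0; i < nsubs; i++)
		if (subs[i].trap_number == trap)
			subs[i].cb(trap, notice);
}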


How about support for the other (than TrapNumber) InformInfo fields ? Of 
particular interest are GID or LID range as not all apps want all the 
ones for the entire subnet. I'm not sure about whether support for any 
other fields is needed or not.



Also, this API should be able to support registering for all traps.


That capability is already present:

IBV_SA_SM_TRAP_ALL = __constant_cpu_to_be16(0xFFFF)



What does releasing SA registration mean exactly ? Is it purely a local
operation or does it also do the deregistration with the SA ? I can't
tell for sure from the description above but suspect this API causes the
SA deregistration to occur as well.


No, this is strictly local. Registering and deregistering with the SM depends on 
the patched ib_sa module, and I don't think it unregisters with the SA until the 
module unloads.


Shouldn't there be an API to deregister from the SA (without unloading) 
? I think to do that subscription reference counting would be needed 
which I don't think is supported in ib_usa.


-- Hal


Re: [PATCH] Add exponential backoff + random delay to MADs when retrying after timeout.

2010-10-28 Thread Hal Rosenstock

On 10/26/2010 2:33 PM, Mike Heinz wrote:

Resending. Didn't get any reply after the last posting.

-Original Message-
From: Mike Heinz
Sent: Monday, October 11, 2010 11:34 AM
To: 'linux-rdma@vger.kernel.org'
Subject: [PATCH] Add exponential backoff + random delay to MADs when retrying 
after timeout.

This patch builds upon a discussion we had earlier this year on adding a 
backoff function when retrying MAD sends after a timeout.


Agreed that the fixed timer retry strategy is b0rken for some scenarios 
(e.g. BUSY handling). I think this background/motivation should be added to 
the patch description. You had previously written:


The current behavior is to simply return the BUSY to the client or ULP, 
which  is either treated as a permanent error or causes an immediate 
retry. This can be a big problem with, for example, ipoib which sets 
retries to 15 and (as I understand it) immediately retries to connect 
when getting an error response from the SA. Other ulps have similar 
settings. Without some kind of delay, starting up ipoib on a large 
fabric (at boot time, for example) can cause a real packet storm.


By treating BUSY replies identically to timeouts, this patch at least 
introduces a delay between attempts. In the case of the ULPs, the delay 
is typically 4 seconds.


This approach encourages applications to adjust their timeouts 
appropriately by treating BUSY responses as non-events and forcing the 
applications to wait for their request to time out.


Depending on the application developers to take BUSY responses into 
account seems to be asking for trouble - it allows one rogue app to 
bring the SA to its knees, for example. By enforcing this timeout model 
in the kernel, we guarantee that there will be at least some delay 
between each message when the SA is reporting a busy status.


Maybe some shorter version of the above should be part of this and/or 
your subsequent busy handling patch.



This patch does NOT implement the ABI/API changes that would be needed to take 
advantage of the new features, but it lays the groundwork for doing so. In 
addition, it provides a new module parameter that allows the administrator to 
coerce existing code into using the new capability.

First, I've added a new field called randomized_wait to the ib_mad_send_buf structure. If 
this parameter is set, each time the WR times out, the timeout for the next retry is set to 
(send_wr->timeout_ms + 511<<(send_wr->retries) - random32()&511). In other words, on the 
first retry, the randomization code will add between 0 and 1/2 second to the timeout. On the second 
retry, it will add between 1 and 1.5 seconds to the timeout, on the 3rd, between 2 and 2.5 seconds, on 
the 4th, between 4 and 4.5, et cetera.
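
For clarity, a stand-alone user-space sketch of that computation (a sketch only, with the
grouping made explicit; random32() is stood in for by a caller-supplied 32-bit value):

#include <stdint.h>

#define MAD_RAND_TIMEOUT_MS 511

/* Next retry timeout: the original timeout plus an exponentially growing
 * pad, jittered downward by up to MAD_RAND_TIMEOUT_MS milliseconds.
 * 'retries' is the number of retries attempted so far.
 */
static unsigned int next_retry_timeout_ms(unsigned int timeout_ms,
                                          unsigned int retries,
                                          uint32_t rand32)
{
        return timeout_ms + (MAD_RAND_TIMEOUT_MS << retries)
                          - (rand32 & MAD_RAND_TIMEOUT_MS);
}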


What experience/confidence is there in this (specific) randomization 
policy ? On what (how large) IB cluster sizes has this policy been tried 
? Is this specific policy modeled from other policies in use elsewhere ?


Also, is this randomized timeout used on RMPP packets if this parameter 
is not 0 ?



In addition, a new field, total_timeout, has been added to the ib_mad_send_wr_private 
structure and is initialized to (send_wr->timeout * send_wr->max_retries). Retries cannot 
exceed this total time, even though that will mean a lower number of retry attempts.
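
A minimal sketch of how that cap is meant to interact with the retry count (illustration
only; the names mirror the description above, not the actual patch code):

/* A retry is allowed only while time remains in the overall budget of
 * timeout * max_retries, even if retries_left has not reached zero.
 */
static int may_retry(unsigned long elapsed_ms, unsigned long next_timeout_ms,
                     unsigned long total_timeout_ms, int retries_left)
{
        return retries_left > 0 &&
               elapsed_ms + next_timeout_ms <= total_timeout_ms;
}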


Why is the total timeout more important than number of retries in terms 
of terminating the transaction ?


Shouldn't there be another parameter as to whether this limits the 
retries or whether the number of retries should be the limiting factor ? 
I'm not sure about reducing the number of retries on a large fabric. I 
think the typical number used is 3 so this would be at most 2 depending 
on the per retry timeout. I think the default policy should be to 
preserve the number of retries rather than the total timeout.



Finally, I've added a module parameter to coerce all mad work requests to use 
this feature if desired.


On one hand, I don't want to introduce unneeded parameters/complexity 
but I'm wondering whether more granularity is useful on which requests 
(classes ?) this applies to. For example, should SM requests be 
randomized ? This feature is primarily an SA thing; although BUSY can be 
used for other management classes, its use is mainly GS related.



parm:   randomized_wait:When true, use a randomized backoff algorithm 
to control retries for timeouts. (int)



diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index ef1304f..3b03f1c 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -42,6 +42,11 @@
  #include "smi.h"
  #include "agent.h"

+#include <linux/random.h>
+
+#define MAD_MIN_TIMEOUT_MS 511
+#define MAD_RAND_TIMEOUT_MS 511
+
  MODULE_LICENSE("Dual BSD/GPL");
  MODULE_DESCRIPTION("kernel IB MAD API");
  MODULE_AUTHOR("Hal Rosenstock");
@@ -55,6 +60,10 @@ MODULE_PARM_DESC(send_queue_size, "Size of send queue in number of work requests"

Re: [PATCH] Handling BUSY responses from the SM.

2010-10-28 Thread Hal Rosenstock

On 10/26/2010 2:33 PM, Mike Heinz wrote:

Resending. Didn't get any reply after sending the last time.

-Original Message-
From: Mike Heinz
Sent: Monday, October 11, 2010 1:24 PM
To: 'linux-rdma@vger.kernel.org'; 'Hefty, Sean'
Cc: Todd Rimmer
Subject: [PATCH] Handling BUSY responses from the SM.

This patch builds upon feedback received earlier this year to add a "treat BUSY as 
timeout" feature to ib_mad. It does NOT implement the ABI/API changes that would be 
needed in user space to take advantage of the new feature, but it lays the groundwork for 
doing so. In addition, it provides a new module parameter that allows the administrator to 
coerce existing code into using the new capability.



The patch builds upon the randomization/backoff patch I sent earlier today to 
add a random factor to timeouts to prevent synchronized storms of MAD queries. 
I chose to build upon the existing timeout handling because it seemed the best 
way to add the functionality without

Initially, I had tried to completely separate BUSY retries from timeout 
handling, but that seemed difficult due to the way the timeout code is 
structured. As a result, true timeouts and busy handling still use the same 
timeout values, but I was still able to address the idea of randomizing the 
retry timeout if desired.

By default, the behavior of ib_mad with respect to BUSY responses is unchanged. If, 
however, a send work request is provided that has the new busy_wait parameter 
set, ib_mad will ignore BUSY responses to that WR, allowing it to timeout and retry as if 
no response had been received.
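
For illustration, a kernel client might opt in per work request roughly like this (sketch
only; timeout_ms and retries are existing ib_mad_send_buf fields, and the flag name is taken
from the diff below, where the prose above calls it busy_wait):

/* send_buf previously obtained from ib_create_send_mad() */
send_buf->timeout_ms   = 4000;  /* per-attempt timeout */
send_buf->retries      = 3;
send_buf->wait_on_busy = 1;     /* new: let BUSY replies fall through to the timeout path */
int ret = ib_post_send_mad(send_buf, NULL);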


Is setting this useful without the randomization also set for this (or 
all) transaction (request/response) WRs ?



Finally, I've added a module parameter to coerce all mad work requests to use 
this new feature:

parm:   treat_busy_as_timeout:When true, treat BUSY responses as if 
they were timeouts. (int)

As I mentioned in the past, this change solves a problem we see in the real world all the 
time (the SM being pounded by unintelligent queries) so I strongly hope this 
meets your concerns.



diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 3b03f1c..9e5e566 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -60,6 +60,10 @@ MODULE_PARM_DESC(send_queue_size, "Size of send queue in number of work requests"
  module_param_named(recv_queue_size, mad_recvq_size, int, 0444);
  MODULE_PARM_DESC(recv_queue_size, "Size of receive queue in number of work requests");

+int mad_wait_on_busy = 0;
+module_param_named(treat_busy_as_timeout, mad_wait_on_busy, int, 0444);
+MODULE_PARM_DESC(treat_busy_as_timeout, "When true, treat BUSY responses as if they were timeouts.");
+
  int mad_randomized_wait = 0;
  module_param_named(randomized_wait, mad_randomized_wait, int, 0444);
  MODULE_PARM_DESC(randomized_wait, "When true, use a randomized backoff algorithm to control retries for timeouts.");
@@ -1120,6 +1124,7 @@ int ib_post_send_mad(struct ib_mad_send_buf *send_buf,

mad_send_wr->max_retries = send_buf->retries;
mad_send_wr->retries_left = send_buf->retries;
+   mad_send_wr->wait_on_busy = send_buf->wait_on_busy || mad_wait_on_busy;

send_buf->retries = 0;

@@ -1819,6 +1824,8 @@ static void ib_mad_complete_recv(struct ib_mad_agent_private *mad_agent_priv,

/* Complete corresponding request */
if (ib_response_mad(mad_recv_wc->recv_buf.mad)) {
+   u16 busy = __be16_to_cpu(mad_recv_wc->recv_buf.mad->mad_hdr.status) &
+   IB_MGMT_MAD_STATUS_BUSY;


Should be16_to_cpu just be used here for consistency ?

Nit: the definition of IB_MGMT_MAD_STATUS_BUSY should be part of this 
patch rather than the previous one.




spin_lock_irqsave(&mad_agent_priv->lock, flags);
mad_send_wr = ib_find_send_mad(mad_agent_priv, mad_recv_wc);
@@ -1829,6 +1836,17 @@ static void ib_mad_complete_recv(struct ib_mad_agent_private *mad_agent_priv,
return;
}

+   printk(KERN_DEBUG PFX "Completing recv %p: busy = %d, retries_left = %d, wait_on_busy = %d\n",
+   mad_send_wr, busy, mad_send_wr->retries_left, mad_send_wr->wait_on_busy);
+   if (busy && mad_send_wr->retries_left && mad_send_wr->wait_on_busy) {


Nit: formatting: if (busy && mad_send_wr->retries_left &&
mad_send_wr->wait_on_busy) {



+   /* Just let the query timeout and have it requeued later */
+   spin_unlock_irqrestore(&mad_agent_priv->lock, flags);
+   ib_free_recv_mad(mad_recv_wc);
+   deref_mad_agent(mad_agent_priv);
+   printk(KERN_INFO PFX "SA/SM responded MAD_STATUS_BUSY. Allowing request to time out.\n");


Do we need this printk ? Won't this spam the kernel log ?

-- Hal


+

Re: [PATCH] Handling BUSY responses from the SM.

2010-10-28 Thread Hal Rosenstock

On 10/28/2010 9:14 AM, Hal Rosenstock wrote:

A couple more things I missed in my previous post on this:


On 10/26/2010 2:33 PM, Mike Heinz wrote:

Resending. Didn't get any reply after sending the last time.

-Original Message-
From: Mike Heinz
Sent: Monday, October 11, 2010 1:24 PM
To: 'linux-rdma@vger.kernel.org'; 'Hefty, Sean'
Cc: Todd Rimmer
Subject: [PATCH] Handling BUSY responses from the SM.

This patch builds upon feedback received earlier this year to add a
"treat BUSY as timeout" feature to ib_mad. It does NOT implement the
ABI/API changes that would be needed in user space to take advantage
of the new feature, but it lays the groundwork for doing so. In
addition, it provides a new module parameter that allows the
administrator to coerce existing code into using the new capability.

 

The patch builds upon the randomization/backoff patch I sent earlier
today to add a random factor to timeouts to prevent synchronized
storms of MAD queries. I chose to build upon the existing timeout
handling because it seemed the best way to add the functionality without


incomplete sentence: without what ?



Initially, I had tried to completely separate BUSY retries from
timeout handling, but that seemed difficult due to the way the timeout
code is structured. As a result, true timeouts and busy handling still
use the same timeout values, but I was still able to address the idea
of randomizing the retry timeout if desired.

By default, the behavior of ib_mad with respect to BUSY responses is
unchanged. If, however, a send work request is provided that has the
new busy_wait parameter set, ib_mad will ignore BUSY responses to
that WR, allowing it to timeout and retry as if no response had been
received.


Is setting this useful without the randomization also set for this (or
all) transaction (request/response) WRs ?


Finally, I've added a module parameter to coerce all mad work requests
to use this new feature:

parm: treat_busy_as_timeout:When true, treat BUSY responses as if they
were timeouts. (int)

As I mentioned in the past, this change solves a problem we see in the
real world all the time (the SM being pounded by unintelligent
queries) so I strongly hope this meets your concerns.



diff --git a/drivers/infiniband/core/mad.c
b/drivers/infiniband/core/mad.c
index 3b03f1c..9e5e566 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -60,6 +60,10 @@ MODULE_PARM_DESC(send_queue_size, "Size of send queue in number of work requests"
module_param_named(recv_queue_size, mad_recvq_size, int, 0444);
MODULE_PARM_DESC(recv_queue_size, "Size of receive queue in number of work requests");

+int mad_wait_on_busy = 0;
+module_param_named(treat_busy_as_timeout, mad_wait_on_busy, int, 0444);
+MODULE_PARM_DESC(treat_busy_as_timeout, "When true, treat BUSY responses as if they were timeouts.");
+
int mad_randomized_wait = 0;
module_param_named(randomized_wait, mad_randomized_wait, int, 0444);
MODULE_PARM_DESC(randomized_wait, "When true, use a randomized backoff algorithm to control retries for timeouts.");
@@ -1120,6 +1124,7 @@ int ib_post_send_mad(struct ib_mad_send_buf *send_buf,

mad_send_wr->max_retries = send_buf->retries;
mad_send_wr->retries_left = send_buf->retries;
+ mad_send_wr->wait_on_busy = send_buf->wait_on_busy || mad_wait_on_busy;

send_buf->retries = 0;

@@ -1819,6 +1824,8 @@ static void ib_mad_complete_recv(struct ib_mad_agent_private *mad_agent_priv,

/* Complete corresponding request */
if (ib_response_mad(mad_recv_wc->recv_buf.mad)) {
+ u16 busy = __be16_to_cpu(mad_recv_wc->recv_buf.mad->mad_hdr.status) &
+ IB_MGMT_MAD_STATUS_BUSY;


Should be16_to_cpu just be used here for consistency ?

Nit: the definition of IB_MGMT_MAD_STATUS_BUSY should be part of this
patch rather than the previous one.



spin_lock_irqsave(&mad_agent_priv->lock, flags);
mad_send_wr = ib_find_send_mad(mad_agent_priv, mad_recv_wc);
@@ -1829,6 +1836,17 @@ static void ib_mad_complete_recv(struct ib_mad_agent_private *mad_agent_priv,
return;
}

+ printk(KERN_DEBUG PFX "Completing recv %p: busy = %d, retries_left = %d, wait_on_busy = %d\n",
+ mad_send_wr, busy, mad_send_wr->retries_left, mad_send_wr->wait_on_busy);
+ if (busy && mad_send_wr->retries_left && mad_send_wr->wait_on_busy) {


This appears to include trap represses (as determined by 
ib_response_mad). Shouldn't busy be ignored for that case ? I don't 
think that would be used (e.g. trap repress sent w/ busy) but it seems 
safer to me. I think we previously discussed this way back in June.
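
One way to do that (sketch only, not part of the posted patch; IB_MGMT_METHOD_TRAP_REPRESS 
is the existing method constant in ib_mad.h) would be:

	if (busy && mad_send_wr->retries_left && mad_send_wr->wait_on_busy &&
	    mad_recv_wc->recv_buf.mad->mad_hdr.method != IB_MGMT_METHOD_TRAP_REPRESS) {
		/* ... existing "treat BUSY as timeout" path ... */
	}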


-- Hal


Nit: formatting: if (busy && mad_send_wr->retries_left &&
mad_send_wr->wait_on_busy) {


+ /* Just let the query timeout and have it requeued later */
+ spin_unlock_irqrestore(&mad_agent_priv->lock, flags);
+ ib_free_recv_mad(mad_recv_wc);
+ deref_mad_agent(mad_agent_priv);
+ printk(KERN_INFO PFX "SA/SM responded MAD_STATUS_BUSY. Allowing request to time out.\n");


Do we need this printk ? Won't this spam the kernel log ?

-- Hal


+ return;
+ }
+

Re: opensm: Add new torus routing engine: torus-2QoS

2010-10-28 Thread Albino A. Aveleda
Dear Jim,

I have compiled and installed the opensm with torus-2qos.
There is a comment below in the config file.
---
# We need to tell the routing engine what directions we
# want the torus coordinate directions to be, by specifing
# the endpoints (switch GUID + port) of a link in each
# direction. These links need to share a common switch,
# which we call the torus seed.
# Here we specify positive coordinate directions:
xp_link 0x20  0x200019   # S_0_0_0 -> S_1_0_0
yp_link 0x20  0x25   # S_0_0_0 -> S_0_1_0
zp_link 0x20  0x21   # S_0_0_0 -> S_0_0_1
---

How do I get the xp, yp and zp_link addresses?

Best regards,
Albino

- Jim Schutt jasc...@sandia.gov wrote:

 
 This posting http://www.spinics.net/lists/linux-rdma/msg02967.html
 has some example input for a 5x5x5 torus.
 
 You'll want to configure your torus (via opensm --torus_config
 file)
 so that the intra-NEM links are z-direction links.  This will allow
 you to swap a QNEM and keep the fabric routable during the process.
 
 Please look over the torus-2QoS section in
 opensm/doc/current-routing.txt
 to see why this is so, and to help understand why the info
 in torus-2QoS.conf is required.
 

-- 

__
Albino A. Aveleda   b...@nacad.ufrj.br
System Engineer   +55 21 2562-8080
NACAD-COPPE/UFRJ
Federal University of Rio de Janeiro (UFRJ) 


Re: opensm: Add new torus routing engine: torus-2QoS

2010-10-28 Thread Jim Schutt

On Thu, 2010-10-28 at 07:42 -0600, Albino A. Aveleda wrote:
 Dear Jim,
 
 I have compiled and installed the opensm with torus-2qos.
 There is comment below in config file.
 ---
 # We need to tell the routing engine what directions we
 # want the torus coordinate directions to be, by specifying
 # the endpoints (switch GUID + port) of a link in each
 # direction. These links need to share a common switch,
 # which we call the torus seed.
 # Here we specify positive coordinate directions:
 xp_link 0x20  0x200019   # S_0_0_0 -> S_1_0_0
 yp_link 0x20  0x25   # S_0_0_0 -> S_0_1_0
 zp_link 0x20  0x21   # S_0_0_0 -> S_0_0_1
 ---
 
 How do I get xp, yp and zp_link address?

You can bring up your fabric once with some other routing
engine, say minhop, and run ibnetdiscover.

This will tell you the node GUIDs for all your switches.

Then you need to pick a switch to be the seed.  If you know,
for example, that your fabric is wired such that port 1
connects to the switch in the direction you want to be +x,
look in your ibnetdiscover output to find the node GUID for
the switch connected to that port of your seed switch.

For maximum resiliency, pick the switch that your
opensm host connects to as the seed.
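
As a made-up illustration (these GUIDs are not from any real fabric), if ibnetdiscover
showed your seed switch and its +x, +y, +z neighbors as 0x0002c9000000000a through
0x0002c9000000000d, the torus_config lines would follow the same pattern as in your file:

xp_link 0x0002c9000000000a 0x0002c9000000000b   # seed -> +x neighbor
yp_link 0x0002c9000000000a 0x0002c9000000000c   # seed -> +y neighbor
zp_link 0x0002c9000000000a 0x0002c9000000000d   # seed -> +z neighbor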

-- Jim

 
 Best regards,
 Albino
 
 - Jim Schutt jasc...@sandia.gov wrote:
 
  
  This posting http://www.spinics.net/lists/linux-rdma/msg02967.html
  has some example input for a 5x5x5 torus.
  
  You'll want to configure your torus (via opensm --torus_config
  file)
  so that the intra-NEM links are z-direction links.  This will allow
  you to swap a QNEM and keep the fabric routable during the process.
  
  Please look over the torus-2QoS section in
  opensm/doc/current-routing.txt
  to see why this is so, and to help understand why the info
  in torus-2QoS.conf is required.
  
 




Re: opensm: Add new torus routing engine: torus-2QoS

2010-10-28 Thread Jim Schutt

On Thu, 2010-10-28 at 09:48 -0600, Hal Rosenstock wrote:
 On Thu, Oct 28, 2010 at 11:05 AM, Jim Schutt jasc...@sandia.gov wrote:
 
  On Thu, 2010-10-28 at 07:42 -0600, Albino A. Aveleda wrote:
  Dear Jim,
 
  I have compiled and installed the opensm with torus-2qos.
  There is comment below in config file.
  ---
  # We need to tell the routing engine what directions we
  # want the torus coordinate directions to be, by specifying
  # the endpoints (switch GUID + port) of a link in each
  # direction. These links need to share a common switch,
  # which we call the torus seed.
  # Here we specify positive coordinate directions:
  xp_link 0x20  0x200019   # S_0_0_0 -> S_1_0_0
  yp_link 0x20  0x25   # S_0_0_0 -> S_0_1_0
  zp_link 0x20  0x21   # S_0_0_0 -> S_0_0_1
  ---
 
  How do I get xp, yp and zp_link address?
 
  You can bring up your fabric once with some other routing
  engine, say minhop, and run ibnetdiscover.
 
 
  This will tell you the node GUIDs for all your switches.
 
 A minor clarification to the above: you don't need to run OpenSM at
 all to run ibnetdiscover to get the switch GUIDs.

D'oh!!  Thanks, Hal.

-- Jim

 
 -- Hal
 
  Then you need to pick a switch to be the seed.  If you
  know that, e.g. your fabric is wired such that port 1,
  say, connects to the switch in the direction you want to
  be +x, look in your ibnetdiscover output to find the
  node GUID for the switch connected to that port of your
  seed switch.
 
  For maximum resiliency, pick the switch that your
  opensm host connects to as the seed.
 
  -- Jim
 
 
  Best regards,
  Albino
 
  - Jim Schutt jasc...@sandia.gov wrote:
 
  
   This posting http://www.spinics.net/lists/linux-rdma/msg02967.html
   has some example input for a 5x5x5 torus.
  
   You'll want to configure your torus (via opensm --torus_config
   file)
   so that the intra-NEM links are z-direction links.  This will allow
   you to swap a QNEM and keep the fabric routable during the process.
  
   Please look over the torus-2QoS section in
   opensm/doc/current-routing.txt
   to see why this is so, and to help understand why the info
   in torus-2QoS.conf is required.
  
 
 
 
 
 




Re: opensm: Add new torus routing engine: torus-2QoS

2010-10-28 Thread Albino A. Aveleda
Thanks everybody.


- Jim Schutt jasc...@sandia.gov wrote:

 On Thu, 2010-10-28 at 09:48 -0600, Hal Rosenstock wrote:
  On Thu, Oct 28, 2010 at 11:05 AM, Jim Schutt jasc...@sandia.gov
 wrote:
  
   On Thu, 2010-10-28 at 07:42 -0600, Albino A. Aveleda wrote:
   Dear Jim,
  
   I have compiled and installed the opensm with torus-2qos.
   There is comment below in config file.
   ---
   # We need to tell the routing engine what directions we
   # want the torus coordinate directions to be, by specifying
   # the endpoints (switch GUID + port) of a link in each
   # direction. These links need to share a common switch,
   # which we call the torus seed.
   # Here we specify positive coordinate directions:
   xp_link 0x20  0x200019   # S_0_0_0 -> S_1_0_0
   yp_link 0x20  0x25   # S_0_0_0 -> S_0_1_0
   zp_link 0x20  0x21   # S_0_0_0 -> S_0_0_1
   ---
  
   How do I get xp, yp and zp_link address?
  
   You can bring up your fabric once with some other routing
   engine, say minhop, and run ibnetdiscover.
  
  
   This will tell you the node GUIDs for all your switches.
  
  A minor clarification to the above: you don't need to run OpenSM at
  all to run ibnetdiscover to get the switch GUIDs.
 
 D'oh!!  Thanks, Hal.
 
 -- Jim
 
  
  -- Hal
  
   Then you need to pick a switch to be the seed.  If you
   know that, e.g. your fabric is wired such that port 1,
   say, connects to the switch in the direction you want to
   be +x, look in your ibnetdiscover output to find the
   node GUID for the switch connected to that port of your
   seed switch.
  
   For maximum resiliency, pick the switch that your
   opensm host connects to as the seed.
  
   -- Jim
  
  
   Best regards,
   Albino
  
   - Jim Schutt jasc...@sandia.gov wrote:
  
   
This posting
 http://www.spinics.net/lists/linux-rdma/msg02967.html
has some example input for a 5x5x5 torus.
   
You'll want to configure your torus (via opensm --torus_config
file)
so that the intra-NEM links are z-direction links.  This will
 allow
you to swap a QNEM and keep the fabric routable during the
 process.
   
Please look over the torus-2QoS section in
opensm/doc/current-routing.txt
to see why this is so, and to help understand why the info
in torus-2QoS.conf is required.
   
  
  
  
  
 

-- 

__
Albino A. Aveleda   b...@nacad.ufrj.br
System Engineer   +55 21 2562-8080
NACAD-COPPE/UFRJ
Federal University of Rio de Janeiro (UFRJ) 


[PATCH] DAPL v2.0: common: print out errors on free build and not just debug builds

2010-10-28 Thread Davis, Arlin R

Signed-off-by: Arlin Davis arlin.r.da...@intel.com
---
 dapl/openib_common/dapl_ib_common.h |4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/dapl/openib_common/dapl_ib_common.h 
b/dapl/openib_common/dapl_ib_common.h
index 10b5d22..3cb1fe3 100644
--- a/dapl/openib_common/dapl_ib_common.h
+++ b/dapl/openib_common/dapl_ib_common.h
@@ -336,10 +336,8 @@ dapl_convert_errno( IN int err, IN const char *str )
 {
 if (!err)  return DAT_SUCCESS;

-#if DAPL_DBG
 if ((err != EAGAIN) && (err != ETIMEDOUT))
-   dapl_dbg_log (DAPL_DBG_TYPE_ERR, "%s %s\n", str, strerror(err));
-#endif 
+   dapl_log (DAPL_DBG_TYPE_ERR, "%s %s\n", str, strerror(err));
 
 switch( err )
 {
-- 
1.7.3


