Re: Handling busy responses from the SA

2010-06-17 Thread Hal Rosenstock
Mike,

On Wed, Jun 16, 2010 at 3:57 PM, Mike Heinz michael.he...@qlogic.com wrote:
 Hal,

 But if the original trap had retries > 0, wouldn't resending the trap be what
 the issuer intended?

I suppose so, as there's nothing in the IBA spec that precludes using busy
on TrapRepress, although I'd be hard pressed to rationalize using
that, particularly for SMP traps.

-- Hal

 I guess I'm confused why treating BUSY as similar to simply never getting a 
 response at all is a bad thing. In my mind, receiving a BUSY response is like 
 getting a busy signal when you call someone on the phone - a sign you need to 
 wait a bit then try again. Similarly, if I call someone and never get an 
 answer my strategy is going to be to wait, then try again.

 -----Original Message-----
 From: Hal Rosenstock [mailto:hal.rosenst...@gmail.com]
 Sent: Tuesday, June 08, 2010 8:16 PM
 To: Mike Heinz
 Cc: Hefty, Sean; linux-rdma@vger.kernel.org
 Subject: Re: Handling busy responses from the SA

 Mike,

 I'm referring to the receipt of the TrapRepress with busy status.
 Wouldn't your patch cause the original Trap to be resent when retries
 > 0? TrapRepress is essentially a response to Trap and classified as
 such by ib_response_mad. Your proposed patch treats a busy as a
 timeout and can cause retry of the original sent Trap.

 -- Hal

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Handling busy responses from the SA

2010-06-17 Thread Mike Heinz
To be honest, we haven't been able to think of a case where a sender would use 
retries on a trap or a busy on a repress, but I don't think it would hurt to 
omit represses from the busy handling.

Would it be acceptable to everyone to alter the patch so that BUSY trap repress 
MADs pass through?
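
As a rough sketch of what that exclusion could look like (this is not taken
from the patch; the struct and function names are made up for illustration,
and the constants are assumed to match the MAD header values, 0x07 for the
TrapRepress method and bit 0 of the status field for busy):

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, mirroring the MAD header encoding; names are illustrative. */
#define METHOD_TRAP_REPRESS 0x07
#define MAD_STATUS_BUSY     0x0001

/* Hypothetical, trimmed-down view of a received MAD header. */
struct mad_hdr_sketch {
    uint8_t  method;
    uint16_t status;   /* kept in host byte order to keep the sketch short */
};

/* Returns true when a received response should be handled like a timeout. */
bool treat_busy_as_timeout(const struct mad_hdr_sketch *hdr)
{
    if (!(hdr->status & MAD_STATUS_BUSY))
        return false;                 /* not busy: normal response handling */
    if (hdr->method == METHOD_TRAP_REPRESS)
        return false;                 /* BUSY TrapRepress passes through */
    return true;                      /* any other BUSY response: retry path */
}
```

With a check like this, a busy TrapRepress would be delivered to the client
unchanged instead of triggering a resend of the original Trap.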

-----Original Message-----
From: linux-rdma-ow...@vger.kernel.org 
[mailto:linux-rdma-ow...@vger.kernel.org] On Behalf Of Hal Rosenstock
Sent: Thursday, June 17, 2010 9:30 AM
To: Mike Heinz
Cc: Hefty, Sean; linux-rdma@vger.kernel.org; Todd Rimmer
Subject: Re: Handling busy responses from the SA

Mike,

On Wed, Jun 16, 2010 at 3:57 PM, Mike Heinz michael.he...@qlogic.com wrote:
 Hal,

 But if the original trap had retries > 0, wouldn't resending the trap be what
 the issuer intended?

I suppose so, as there's nothing in the IBA spec that precludes using busy
on TrapRepress, although I'd be hard pressed to rationalize using
that, particularly for SMP traps.

-- Hal

 I guess I'm confused why treating BUSY as similar to simply never getting a 
 response at all is a bad thing. In my mind, receiving a BUSY response is like 
 getting a busy signal when you call someone on the phone - a sign you need to 
 wait a bit then try again. Similarly, if I call someone and never get an 
 answer my strategy is going to be to wait, then try again.

 -----Original Message-----
 From: Hal Rosenstock [mailto:hal.rosenst...@gmail.com]
 Sent: Tuesday, June 08, 2010 8:16 PM
 To: Mike Heinz
 Cc: Hefty, Sean; linux-rdma@vger.kernel.org
 Subject: Re: Handling busy responses from the SA

 Mike,

 I'm referring to the receipt of the TrapRepress with busy status.
 Wouldn't your patch cause the original Trap to be resent when retries
 > 0? TrapRepress is essentially a response to Trap and classified as
 such by ib_response_mad. Your proposed patch treats a busy as a
 timeout and can cause retry of the original sent Trap.

 -- Hal



RE: Handling busy responses from the SA

2010-06-16 Thread Mike Heinz
Hal,

But if the original trap had retries > 0, wouldn't resending the trap be what 
the issuer intended?

I guess I'm confused why treating BUSY as similar to simply never getting a 
response at all is a bad thing. In my mind, receiving a BUSY response is like 
getting a busy signal when you call someone on the phone - a sign you need to 
wait a bit then try again. Similarly, if I call someone and never get an answer 
my strategy is going to be to wait, then try again. 

-----Original Message-----
From: Hal Rosenstock [mailto:hal.rosenst...@gmail.com] 
Sent: Tuesday, June 08, 2010 8:16 PM
To: Mike Heinz
Cc: Hefty, Sean; linux-rdma@vger.kernel.org
Subject: Re: Handling busy responses from the SA

Mike,

I'm referring to the receipt of the TrapRepress with busy status.
Wouldn't your patch cause the original Trap to be resent when retries
> 0? TrapRepress is essentially a response to Trap and classified as
such by ib_response_mad. Your proposed patch treats a busy as a
timeout and can cause retry of the original sent Trap.

-- Hal


Re: Handling busy responses from the SA

2010-06-08 Thread Hal Rosenstock
Mike,

On Mon, Jun 7, 2010 at 12:00 PM, Mike Heinz michael.he...@qlogic.com wrote:
 Hal said:
 Should a busy be retried at all at the mad layer ? Is a special (longer) 
 timeout policy for busy needed ?

 Also, should this be done for all MADs classified by ib_response_mad (e.g. 
 trap represses) ?

 Hal,

 The idea of processing BUSY responses in the MAD layer is to treat BUSY responses 
 like timeouts - which are currently handled by the MAD layer. Right now there 
 is an issue where various apps and ULPs either treat BUSY as a cause to 
 immediately retry or as a permanent error. This doesn't seem to affect users 
 of the OpenSM so much because (as I understand it) the OpenSM seems to 
 discard requests when it gets too busy - but for other SA/SMs, it can cause a 
 major packet storm or, worse, a simple loss of connectivity where MPI jobs or 
 kernel ULPs simply assume the SA is broken because they got a BUSY reply.

 By treating the BUSY reply as a timeout, we're actually simplifying matters 
 by fitting into existing practice.

Understood. Timing these out makes sense to me but still does not
preclude the client from potentially handling this if the retries
fail.

 As for needing a longer timeout - in our old proprietary stack, QLogic did 
 have a longer timeout for retrying busy replies than for normal timeouts

How much longer ? What are the two timeouts used ?

 - but we should try to get this in now so we can get some relief before we 
 begin the long term discussion of the best way to handle this issue overall.

All I was getting at here was: does retrying when busy work ? If not,
why retry at all at the MAD layer (regardless of retries requested)
and perhaps use a longer timeout for this. If it does work, maybe the
timeout on the subsequent retries should be extended.

I think my two other comments on details are relevant to an updated patch.

-- Hal


RE: Handling busy responses from the SA

2010-06-08 Thread Hefty, Sean
 As for needing a longer timeout - in our old proprietary stack, QLogic did
 have a longer timeout for retrying busy replies than for normal timeouts -
 but we should try to get this in now so we can get some relief before we
 begin the long term discussion of the best way to handle this issue
 overall.

Because applications may handle BUSY replies differently, we shouldn't simply 
start hiding them from the user.  I would much rather agree on the longer term 
plan, so that the ABI can reflect the proper semantics.  I don't see any issue 
with changing the current behavior for kernel clients, however.

- Sean


RE: Handling busy responses from the SA

2010-06-08 Thread Mike Heinz
Sean said,

 Because applications may handle BUSY replies differently, we shouldn't simply 
 start hiding them from the user.  

Sean - remember that this patch will still return a BUSY status to the caller: 
if retries are exhausted and the last return code was BUSY, then that's what 
the caller will get. Thus, code which sets retries to zero will not be affected 
by this patch at all.
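
To make that behavior concrete, here is a small model of it (a sketch only,
not the patch itself; the enum and function are hypothetical, and the
per-attempt outcomes are scripted in an array rather than coming from a real
send path):

```c
#include <stddef.h>

enum outcome { OUT_OK, OUT_TIMEOUT, OUT_BUSY };

/*
 * outcomes[i] is what attempt i produced; n is the number of scripted
 * attempts. One initial send plus up to `retries` resends are made.
 * BUSY and TIMEOUT both consume a retry; whatever the last attempt
 * produced is what the caller sees.
 */
enum outcome send_with_retries(const enum outcome *outcomes, size_t n,
                               unsigned retries)
{
    enum outcome last = OUT_TIMEOUT;   /* nothing received yet */

    for (size_t attempt = 0; attempt <= retries && attempt < n; attempt++) {
        last = outcomes[attempt];
        if (last == OUT_OK)
            return OUT_OK;             /* real response: stop retrying */
    }
    return last;                       /* BUSY only if the last try was BUSY */
}
```

In this model, a caller with retries set to zero that receives a BUSY gets
BUSY back, exactly as it would without the retry handling. A BUSY followed by
a plain timeout on the final attempt is reported as a timeout.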

Hal said,

 All I was getting at here was: does retrying when busy work ? If not,
 why retry at all at the MAD layer (regardless of retries requested)
 and perhaps use a longer timeout for this. If it does work, maybe the
 timeout on the subsequent retries should be extended.

Personally, I think it's been extremely helpful - we've been using busy status 
to tell compute nodes to slow down since our old proprietary stack and we've 
seen a significant improvement in overall traffic congestion when we added this 
patch to OFED clusters using our SM. In addition, use of the BUSY return code 
simplifies debugging traffic congestion problems (since it allows you to 
immediately differentiate between SA overload and other traffic issues) and it 
paves the way for more sophisticated back-off strategies in the future.

As to that, and your question, our old stack used two different timeout values 
specified by the client. One value was for actual timeouts and one for busy 
responses. In the case of busy responses, we added a randomization factor to 
spread out the traffic.

The issue with adapting that to the Linux-RDMA stack is that it's an API 
change. What I would suggest, personally, is something like this:

1. Take either the timeout passed by the caller OR a predefined constant, 
whichever is larger. I would suggest setting the predefined constant to 
something moderate, say 2 seconds.
2. Add a randomization factor - say between -250 and +250 ms?
3. Update the packet timeout with this new value.
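
The three steps above could be sketched roughly as follows. The constant and
function names are hypothetical, the 2 second floor and the +/-250 ms window
are just the values suggested above, and the random source is passed in by
the caller so the arithmetic stays easy to check:

```c
/* Hypothetical tunables: the values suggested above, nothing measured. */
#define BUSY_MIN_TIMEOUT_MS 2000
#define BUSY_JITTER_MS      250

/* Map an arbitrary random value onto [-BUSY_JITTER_MS, +BUSY_JITTER_MS]. */
int busy_jitter_ms(unsigned r)
{
    return (int)(r % (2 * BUSY_JITTER_MS + 1)) - BUSY_JITTER_MS;
}

/*
 * Step 1: take the larger of the caller's timeout and the constant.
 * Step 2: add the randomization factor.
 * Step 3: the result is the new timeout for the retried packet.
 */
int busy_retry_timeout_ms(int caller_timeout_ms, unsigned r)
{
    int base = caller_timeout_ms > BUSY_MIN_TIMEOUT_MS
                   ? caller_timeout_ms
                   : BUSY_MIN_TIMEOUT_MS;

    return base + busy_jitter_ms(r);
}
```

A user-space caller could pass something like (unsigned)rand() for r. A
caller timeout of 500 ms then yields a retry timeout somewhere between 1750
and 2250 ms, which is what spreads the retries out.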



RE: Handling busy responses from the SA

2010-06-08 Thread Mike Heinz
Anyone know why my messages are being appended with interesting garbage?

-----Original Message-----
From: linux-rdma-ow...@vger.kernel.org 
[mailto:linux-rdma-ow...@vger.kernel.org] On Behalf Of Mike Heinz
Sent: Tuesday, June 08, 2010 11:49 AM
To: Hal Rosenstock
Cc: linux-rdma@vger.kernel.org
Subject: RE: Handling busy responses from the SA

N�r��y���b�X��ǧv�^�)޺{.n�+{��ٚ�{ay�ʇڙ�,j
��f���h���z��w���
���j:+v���w�j�m
zZ+�ݢj��!�i


RE: Handling busy responses from the SA

2010-06-08 Thread Hefty, Sean
 Anyone know why my messages are being appended with interesting garbage?

I get that too.  I first noticed it a couple of weeks ago.  It eventually went 
back to the normal 'To unsubscribe from this list' message.


RE: Handling busy responses from the SA

2010-06-08 Thread Hefty, Sean
 Sean - remember that this patch will still return a BUSY status to the
 caller, if retries are exhausted and the last return code was BUSY, then
 that's what the caller will get. Thus, code which sets retries to zero will
 not be affected by this patch at all.

It looks like it only returns the BUSY response if that matches with the last 
retry, otherwise, the BUSY response is dropped.  It also looks like it applies 
to all MADs, including vendor specific ones, and not just those from the SA.

- Sean


RE: Handling busy responses from the SA

2010-06-08 Thread Mike Heinz
Right. Effectively this is similar to the I/O resolution timeout policy laid 
out in the spec.

-----Original Message-----
From: Hefty, Sean [mailto:sean.he...@intel.com] 
Sent: Tuesday, June 08, 2010 12:27 PM
To: Mike Heinz; Hal Rosenstock
Cc: linux-rdma@vger.kernel.org
Subject: RE: Handling busy responses from the SA

 Sean - remember that this patch will still return a BUSY status to the
 caller, if retries are exhausted and the last return code was BUSY, then
 that's what the caller will get. Thus, code which sets retries to zero will
 not be affected by this patch at all.

It looks like it only returns the BUSY response if that matches with the last 
retry, otherwise, the BUSY response is dropped.  It also looks like it applies 
to all MADs, including vendor specific ones, and not just those from the SA.

- Sean


RE: Handling busy responses from the SA

2010-06-08 Thread Mike Heinz
Sean -

Is there a case where we would ever want to treat BUSY responses differently 
from timeouts?



-----Original Message-----
From: Hefty, Sean [mailto:sean.he...@intel.com] 
Sent: Tuesday, June 08, 2010 12:27 PM
To: Mike Heinz; Hal Rosenstock
Cc: linux-rdma@vger.kernel.org
Subject: RE: Handling busy responses from the SA

 Sean - remember that this patch will still return a BUSY status to the
 caller, if retries are exhausted and the last return code was BUSY, then
 that's what the caller will get. Thus, code which sets retries to zero will
 not be affected by this patch at all.

It looks like it only returns the BUSY response if that matches with the last 
retry, otherwise, the BUSY response is dropped.  It also looks like it applies 
to all MADs, including vendor specific ones, and not just those from the SA.

- Sean


RE: Handling busy responses from the SA

2010-06-08 Thread Hefty, Sean
 Is there a case where we would ever want to treat BUSY responses differently
 from timeouts?

I doubt it for a single MAD, but I can't say what people may have implemented.  
The main difference I can think of is that a busy response requires a retry, 
whereas a timeout does not.  This affects the retry policy when multiple MADs 
are outstanding.  E.g. if there are 10 requests outstanding and the first times 
out, we may only resend the first request and increase the timeouts of the 
other 9.  If the 10 requests all receive a busy, then they must all be retried.
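
A rough model of that difference, assuming a simple array of outstanding
requests (the types, the function name, and the extend_ms parameter are all
hypothetical and only illustrate the policy described above):

```c
#include <stdbool.h>
#include <stddef.h>

enum mad_event { EV_NONE, EV_TIMEOUT, EV_BUSY };

struct pending_req {
    enum mad_event ev;      /* what happened to this request */
    unsigned timeout_ms;    /* its current timeout */
    bool resend;            /* set by plan_retries() */
};

/*
 * Every request that received a busy must be resent. For timeouts, only
 * the first timed-out request is resent, and the requests that are not
 * resent get their timeouts extended by extend_ms.
 * Returns the number of requests marked for resend.
 */
size_t plan_retries(struct pending_req *reqs, size_t n, unsigned extend_ms)
{
    size_t resends = 0;
    bool timeout_resent = false;

    for (size_t i = 0; i < n; i++) {
        if (reqs[i].ev == EV_BUSY) {
            reqs[i].resend = true;
        } else if (reqs[i].ev == EV_TIMEOUT && !timeout_resent) {
            reqs[i].resend = true;
            timeout_resent = true;
        } else {
            reqs[i].resend = false;
        }
        if (reqs[i].resend)
            resends++;
    }

    if (timeout_resent) {
        for (size_t i = 0; i < n; i++)
            if (!reqs[i].resend)
                reqs[i].timeout_ms += extend_ms;
    }
    return resends;
}
```

With 10 outstanding requests, a timeout on the first produces one resend and
nine extended timeouts, while a busy on all 10 produces 10 resends.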

To me, it looks like it makes more sense to never send busy, except maybe when 
receive buffer space is fully consumed, but implement a more intelligent 
timeout/retry mechanism on the sender side.  The SA almost needs some sort of 
MRA-like message.

- Sean


Re: Handling busy responses from the SA

2010-06-08 Thread Hal Rosenstock
On Tue, Jun 8, 2010 at 12:27 PM, Hefty, Sean sean.he...@intel.com wrote:
 Sean - remember that this patch will still return a BUSY status to the
 caller, if retries are exhausted and the last return code was BUSY, then
 that's what the caller will get. Thus, code which sets retries to zero will
 not be affected by this patch at all.

 It looks like it only returns the BUSY response if that matches with the last 
 retry, otherwise, the BUSY response is dropped.  It also looks like it 
 applies to all MADs, including vendor specific ones, and not just those from 
 the SA.

Per the proposed patch, it currently includes trap represses (as
determined by ib_response_mad). Shouldn't busy be ignored for that
case ? I don't think that would be used but it seems safer to me.

-- Hal


 - Sean



RE: Handling busy responses from the SA

2010-06-08 Thread Mike Heinz
Hal,

I may be confused - but I thought the spec said there was no valid response to 
a trap repress. I interpreted

o14-3.a4: The SMA shall not send any message in response to a valid 
SubnTrapRepress() message

to mean that the SMA isn't allowed to respond with a BUSY status for a trap 
repress.

-----Original Message-----
From: Hal Rosenstock [mailto:hal.rosenst...@gmail.com] 
Sent: Tuesday, June 08, 2010 3:09 PM
To: Hefty, Sean
Cc: Mike Heinz; linux-rdma@vger.kernel.org
Subject: Re: Handling busy responses from the SA

On Tue, Jun 8, 2010 at 12:27 PM, Hefty, Sean sean.he...@intel.com wrote:
 Sean - remember that this patch will still return a BUSY status to the
 caller, if retries are exhausted and the last return code was BUSY, then
 that's what the caller will get. Thus, code which sets retries to zero will
 not be affected by this patch at all.

 It looks like it only returns the BUSY response if that matches with the last 
 retry, otherwise, the BUSY response is dropped.  It also looks like it 
 applies to all MADs, including vendor specific ones, and not just those from 
 the SA.

Per the proposed patch, it currently includes trap represses (as
determined by ib_response_mad). Shouldn't busy be ignored for that
case ? I don't think that would be used but it seems safer to me.

-- Hal


 - Sean


Re: Handling busy responses from the SA

2010-06-08 Thread Roland Dreier
  Is there a case where we would ever want to treat BUSY responses
  differently from timeouts?

If there isn't then it's silly for the SA to ever send a BUSY response.

 - R.
-- 
Roland Dreier rola...@cisco.com || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html


Re: Handling busy responses from the SA

2010-06-08 Thread Hal Rosenstock
Mike,

On Tue, Jun 8, 2010 at 3:59 PM, Mike Heinz michael.he...@qlogic.com wrote:
 Hal,

 I may be confused - but I thought the spec said there was no valid response 
 to a trap repress. I interpreted

 o14-3.a4: The SMA shall not send any message in response to a valid 
 SubnTrapRepress() message

 to mean that the SMA isn't allowed to respond with a BUSY status for a trap 
 repress.

I'm referring to the receipt of the TrapRepress with busy status.
Wouldn't your patch cause the original Trap to be resent when retries
> 0? TrapRepress is essentially a response to Trap and classified as
such by ib_response_mad. Your proposed patch treats a busy as a
timeout and can cause retry of the original sent Trap.

-- Hal


 -----Original Message-----
 From: Hal Rosenstock [mailto:hal.rosenst...@gmail.com]
 Sent: Tuesday, June 08, 2010 3:09 PM
 To: Hefty, Sean
 Cc: Mike Heinz; linux-rdma@vger.kernel.org
 Subject: Re: Handling busy responses from the SA

 On Tue, Jun 8, 2010 at 12:27 PM, Hefty, Sean sean.he...@intel.com wrote:
 Sean - remember that this patch will still return a BUSY status to the
 caller, if retries are exhausted and the last return code was BUSY, then
 that's what the caller will get. Thus, code which sets retries to zero will
 not be affected by this patch at all.

 It looks like it only returns the BUSY response if that matches with the 
 last retry, otherwise, the BUSY response is dropped.  It also looks like it 
 applies to all MADs, including vendor specific ones, and not just those from 
 the SA.

 Per the proposed patch, it currently includes trap represses (as
 determined by ib_response_mad). Shouldn't busy be ignored for that
 case ? I don't think that would be used but it seems safer to me.

 -- Hal


 - Sean




RE: Handling busy responses from the SA

2010-06-07 Thread Mike Heinz
Hal said:
Should a busy be retried at all at the mad layer ? Is a special (longer) 
timeout policy for busy needed ?

Also, should this be done for all MADs classified by ib_response_mad (e.g. trap 
represses) ?

Hal, 

The idea of processing BUSY responses in the MAD layer is to treat BUSY responses 
like timeouts - which are currently handled by the MAD layer. Right now there 
is an issue where various apps and ULPs either treat BUSY as a cause to 
immediately retry or as a permanent error. This doesn't seem to affect users of 
the OpenSM so much because (as I understand it) the OpenSM seems to discard 
requests when it gets too busy - but for other SA/SMs, it can cause a major 
packet storm or, worse, a simple loss of connectivity where MPI jobs or kernel 
ULPs simply assume the SA is broken because they got a BUSY reply.

By treating the BUSY reply as a timeout, we're actually simplifying matters by 
fitting into existing practice.

As for needing a longer timeout - in our old proprietary stack, QLogic did have 
a longer timeout for retrying busy replies than for normal timeouts - but we 
should try to get this in now so we can get some relief before we begin the 
long term discussion of the best way to handle this issue overall.