RE: [openib-general] [PATCHv2][RFC] kDAPL: use cm timers insteadof own

2005-06-02 Thread James Lentini



On Tue, 31 May 2005, Tom Duffy wrote:

> On Tue, 2005-05-31 at 14:34 -0700, Sean Hefty wrote:
> > > Sean,
> > >
> > > Is there any way of requesting an infinite number of retries?
> >
> > There is not, but nothing prevents a user from simply re-issuing a request
> > after it times out.
>
> Infinite retries inside the kernel does not sound like a good idea.  How
> would you break it?  At least we should have some sort of exponential
> backoff to prevent flooding the network.


We want a value that the consumer can pass as the timeout that will 
give the protocol enough time to connect regardless of network 
topology. If the value is a fixed period of time, say 500 us, there 
will be some network configuration that it won't work on.


The infinite value allows a user to give the CM protocol all the time 
it needs to work. If it fails for some other reason (like the remote 
node is down) then the connection attempt should fail.


So I think we should do a few things:

- rename this constant
- define its value as 0
- upon seeing it, assign a CM timeout value that allows the protocol to
  complete
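
A minimal sketch of the mapping proposed in the list above, written in Python for brevity. The names TIMEOUT_UNBOUNDED, CM_MAX_TIMEOUT_EXPONENT and cm_exponent_for are hypothetical, and the sketch assumes the CM encodes timeouts as 4.096 us * 2^n in a 5-bit field (so n is at most 31); neither detail is stated in this thread.

```python
# Hypothetical names; assumes CM timeouts are encoded as 4.096 us * 2^n.
TIMEOUT_UNBOUNDED = 0          # proposed sentinel value for the renamed constant
CM_MAX_TIMEOUT_EXPONENT = 31   # assumed largest value of a 5-bit timeout field
CM_TIMEOUT_BASE_NS = 4096      # 4.096 us expressed in nanoseconds


def cm_exponent_for(timeout_us: int) -> int:
    """Map a consumer timeout in microseconds to an encoded CM timeout."""
    if timeout_us == TIMEOUT_UNBOUNDED:
        # Sentinel: give the protocol the longest timeout the field can hold.
        return CM_MAX_TIMEOUT_EXPONENT
    timeout_ns = timeout_us * 1000
    # Round up to the smallest encoded value that is no shorter than requested.
    for exponent in range(CM_MAX_TIMEOUT_EXPONENT + 1):
        if (CM_TIMEOUT_BASE_NS << exponent) >= timeout_ns:
            return exponent
    return CM_MAX_TIMEOUT_EXPONENT
```

Under these assumptions, the 500 us example maps to exponent 7 (about 524 us per attempt), while the sentinel maps to exponent 31 (about 8796 seconds per attempt), which is long but still bounded.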

We don't need to do all that on this first pass though. Let's just get 
the timeout value working in the general case and then we can worry 
about DAT_TIMEOUT_INFINITE and what it means.


james
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


RE: [openib-general] [PATCHv2][RFC] kDAPL: use cm timers insteadof own

2005-05-31 Thread James Lentini


Tom,

We should attempt to connect for no less than dat_ep_connect's timeout 
value. We don't need to guarantee that the connection attempts will 
last for exactly a specific time.
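
One way to read "no less than" is to round up: pick the smallest per-attempt CM timeout whose total over all attempts covers the requested value. The Python sketch below assumes the total wait is (retries + 1) times the per-attempt timeout, as Sean describes in this thread, and that the CM encodes timeouts as 4.096 us * 2^n with n at most 31. The function name min_exponent_for is hypothetical.

```python
CM_TIMEOUT_BASE_S = 4.096e-6   # assumed encoding unit: 4.096 microseconds


def min_exponent_for(total_seconds: float, retries: int,
                     max_exponent: int = 31) -> int:
    """Smallest encoded timeout n such that all attempts together
    last at least total_seconds."""
    attempts = retries + 1     # the initial request plus each retry
    for exponent in range(max_exponent + 1):
        per_attempt = CM_TIMEOUT_BASE_S * (1 << exponent)
        if per_attempt * attempts >= total_seconds:
            return exponent
    # The request exceeds what the field can express; use the maximum.
    return max_exponent
```

With 2 retries, a requested total of 12 seconds yields n = 20 (about 4.29 s per attempt, 12.9 s overall), so the attempts last slightly longer than requested rather than exactly as long.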


Sean,

Is there any way of requesting an infinite number of retries?

On Fri, 27 May 2005, Sean Hefty wrote:


> > > From a CM perspective, this sounds fine.  Note that the CM timeout will not
> > > occur until the number of retries has been met.  So I don't know if the
> > > timeout passed to dapl_ep_connect() should convert directly into the
> > > remote_cm_response_timeout, or needs to be divided by the number of retries.
> >
> > So, are you saying that if you have a timeout of 4 seconds (you pass in
> > 20) and you have retries set to 2, that it will fail after 8 seconds?
> >
> > James, what does the timeout value passed into dapl_ep_connect mean, the
> > total timeout time?  Or how much for each retry?
>
> If you pass in a timeout of 4 seconds with retries set to 2, the call will
> time out in 12 seconds.  The request will be sent 3 times (2 retries).  I
> should also note that the CM timeout includes the packet lifetime (round
> trip time) in its timeout calculation, but this should be small.
>
> - Sean




RE: [openib-general] [PATCHv2][RFC] kDAPL: use cm timers insteadof own

2005-05-31 Thread Sean Hefty
> Sean,
>
> Is there any way of requesting an infinite number of retries?

There is not, but nothing prevents a user from simply re-issuing a request
after it times out.

- Sean




RE: [openib-general] [PATCHv2][RFC] kDAPL: use cm timers insteadof own

2005-05-31 Thread Tom Duffy
On Tue, 2005-05-31 at 14:34 -0700, Sean Hefty wrote:
> > Sean,
> >
> > Is there any way of requesting an infinite number of retries?
>
> There is not, but nothing prevents a user from simply re-issuing a request
> after it times out.

Infinite retries inside the kernel does not sound like a good idea.  How
would you break it?  At least we should have some sort of exponential
backoff to prevent flooding the network.
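
A minimal sketch of what bounded re-issuing with exponential backoff could look like, in Python for brevity. The function connect_with_backoff and its parameters are hypothetical, as are the default delays; try_connect stands in for whatever call issues one connection request and reports whether it succeeded.

```python
import time
from typing import Callable


def connect_with_backoff(try_connect: Callable[[], bool],
                         max_attempts: int = 5,
                         initial_delay: float = 0.5,
                         max_delay: float = 30.0,
                         sleep: Callable[[float], None] = time.sleep) -> bool:
    """Re-issue a connection request, doubling the wait after each failure."""
    delay = initial_delay
    for attempt in range(max_attempts):
        if try_connect():
            return True
        # Do not wait after the final failed attempt.
        if attempt + 1 < max_attempts:
            sleep(delay)
            delay = min(delay * 2, max_delay)  # cap the backoff
    return False
```

The attempt limit answers the "how would you break it" question by construction, and the doubling delay keeps a long-running caller from flooding the network with requests.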

-tduffy



RE: [openib-general] [PATCHv2][RFC] kDAPL: use cm timers insteadof own

2005-05-27 Thread Sean Hefty
> > From a CM perspective, this sounds fine.  Note that the CM timeout will not
> > occur until the number of retries has been met.  So I don't know if the
> > timeout passed to dapl_ep_connect() should convert directly into the
> > remote_cm_response_timeout, or needs to be divided by the number of retries.
>
> So, are you saying that if you have a timeout of 4 seconds (you pass in
> 20) and you have retries set to 2, that it will fail after 8 seconds?
>
> James, what does the timeout value passed into dapl_ep_connect mean, the
> total timeout time?  Or how much for each retry?

If you pass in a timeout of 4 seconds with retries set to 2, the call will
time out in 12 seconds.  The request will be sent 3 times (2 retries).  I
should also note that the CM timeout includes the packet lifetime (round
trip time) in its timeout calculation, but this should be small.
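
The arithmetic above can be sketched in a few lines of Python. The helper names are hypothetical. The sketch also assumes the CM encodes timeouts as 4.096 us * 2^n, which would explain why passing in 20 corresponds to roughly 4 seconds; that encoding is not spelled out in this thread.

```python
CM_TIMEOUT_BASE_US = 4.096  # assumed encoding unit, in microseconds


def encoded_timeout_seconds(exponent: int) -> float:
    """Seconds represented by an encoded timeout value n (4.096 us * 2^n)."""
    return CM_TIMEOUT_BASE_US * (1 << exponent) / 1_000_000


def worst_case_seconds(per_attempt_seconds: float, retries: int) -> float:
    """Time until the call fails: the first request plus every retry
    must each time out. Packet lifetime is ignored here."""
    return per_attempt_seconds * (retries + 1)
```

For example, worst_case_seconds(4, 2) gives 12, matching the figure above, and encoded_timeout_seconds(20) gives about 4.29 seconds.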

- Sean
