RE: [openib-general] [PATCHv2][RFC] kDAPL: use cm timers instead of own
On Tue, 31 May 2005, Tom Duffy wrote:

> On Tue, 2005-05-31 at 14:34 -0700, Sean Hefty wrote:
>
>>> Sean, Is there any way of requesting an infinite number of retries?
>>
>> There is not, but nothing prevents a user from simply re-issuing a request after it times out.
>
> Infinite retries inside the kernel does not sound like a good idea. How would you break it? At least we should have some sort of exponential backoff to prevent flooding the network.

We want a value that the consumer can pass as the timeout that will give the protocol enough time to connect regardless of network topology. If the value is a fixed period of time, say 500 us, there will be some network configuration that it won't work on. The infinite value allows a user to give the CM protocol all the time it needs to work. If it fails for some other reason (like the remote node is down) then the connection attempt should fail.

So I think we should do a few things:

- rename this constant
- define its value as 0
- upon seeing it, assign a CM timeout value that allows the protocol to complete

We don't need to do all that on this first pass though. Let's just get the timeout value working in the general case and then we can worry about DAT_TIMEOUT_INFINITE and what it means.

james

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
RE: [openib-general] [PATCHv2][RFC] kDAPL: use cm timers instead of own
Tom,

We should attempt to connect for no less than dat_ep_connect's timeout value. We don't need to guarantee that the connection attempts will last for exactly a specific time.

Sean,

Is there any way of requesting an infinite number of retries?

On Fri, 27 May 2005, Sean Hefty wrote:

>>> From a CM perspective, this sounds fine. Note that the CM timeout will not occur until the number of retries has been met. So I don't know if the timeout passed to dapl_ep_connect() should convert directly into the remote_cm_response_timeout, or needs to be divided by the number of retries.
>>
>> So, are you saying that if you have a timeout of 4 seconds (you pass in 20) and you have retries set to 2, that it will fail after 8 seconds?
>>
>> James, what does the timeout value passed into dapl_ep_connect mean, the total timeout time? Or how much for each retry?
>
> If you pass in a timeout of 4 seconds with retries set to 2, the call will time out in 12 seconds. The request will be sent 3 times (2 retries).
>
> I should also note that the CM timeout includes the packet lifetime (round trip time) in its timeout calculation, but this should be small.
>
> - Sean
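The "no less than the requested timeout" rule above can be turned into a per-attempt CM timeout by picking the smallest encoded value whose total across all sends covers the request. A minimal Python sketch follows, assuming the usual InfiniBand encoding of 4.096 us * 2^n (which is why passing 20 gives roughly 4 seconds) and assuming the field is 5 bits wide, so n tops out at 31. The function and constant names are hypothetical and are not part of kDAPL or the CM. Packet lifetime is ignored.

```python
IB_TIMEOUT_BASE_US = 4.096   # IB timeouts are encoded as 4.096 us * 2**n
MAX_EXPONENT = 31            # assumption: the timeout field is 5 bits wide


def min_timeout_exponent(total_us, retries):
    """Smallest exponent n such that all sends together last at least total_us.

    The request is sent once plus `retries` more times, so the overall
    connect time is (retries + 1) * 4.096 us * 2**n. Rounding up gives
    "no less than" the requested time. Clamps to MAX_EXPONENT if the
    request cannot be covered.
    """
    attempts = retries + 1
    for exponent in range(MAX_EXPONENT + 1):
        if attempts * IB_TIMEOUT_BASE_US * (2 ** exponent) >= total_us:
            return exponent
    return MAX_EXPONENT
```

With a 12 second request and 2 retries this picks 20, which matches the figures in the quoted exchange: three sends of about 4.29 seconds each. Rounding up rather than down means the attempt may run somewhat longer than requested but never shorter.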
RE: [openib-general] [PATCHv2][RFC] kDAPL: use cm timers instead of own
> Sean, Is there any way of requesting an infinite number of retries?

There is not, but nothing prevents a user from simply re-issuing a request after it times out.

- Sean
RE: [openib-general] [PATCHv2][RFC] kDAPL: use cm timers instead of own
On Tue, 2005-05-31 at 14:34 -0700, Sean Hefty wrote:

>> Sean, Is there any way of requesting an infinite number of retries?
>
> There is not, but nothing prevents a user from simply re-issuing a request after it times out.

Infinite retries inside the kernel does not sound like a good idea. How would you break it? At least we should have some sort of exponential backoff to prevent flooding the network.

-tduffy
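Re-issuing a request after it times out, combined with the exponential backoff asked for above, could look roughly like this on the consumer side. This is only a Python sketch of the idea: `try_connect` is a hypothetical callable standing in for one complete connect attempt (returning True on success), and the attempt limit and delays are made-up defaults, not values from kDAPL or the CM. The bounded attempt count gives the caller a way to break out.

```python
import time


def connect_with_backoff(try_connect, max_attempts=5,
                         initial_delay_s=0.5, max_delay_s=30.0,
                         sleep=time.sleep):
    """Re-issue a connect attempt, doubling the wait after each failure.

    Returns the 1-based attempt number that succeeded, or None if every
    attempt failed. `sleep` is injectable so the loop can be tested
    without actually waiting.
    """
    delay = initial_delay_s
    for attempt in range(1, max_attempts + 1):
        if try_connect():
            return attempt
        if attempt < max_attempts:
            sleep(delay)                          # back off before re-issuing
            delay = min(delay * 2, max_delay_s)   # double, up to a ceiling
    return None
```

Keeping the loop outside the kernel leaves the policy (how many attempts, how long to wait) with the consumer, and the doubling delay keeps a failing peer from being flooded with requests.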
RE: [openib-general] [PATCHv2][RFC] kDAPL: use cm timers instead of own
>> From a CM perspective, this sounds fine. Note that the CM timeout will not occur until the number of retries has been met. So I don't know if the timeout passed to dapl_ep_connect() should convert directly into the remote_cm_response_timeout, or needs to be divided by the number of retries.
>
> So, are you saying that if you have a timeout of 4 seconds (you pass in 20) and you have retries set to 2, that it will fail after 8 seconds?
>
> James, what does the timeout value passed into dapl_ep_connect mean, the total timeout time? Or how much for each retry?

If you pass in a timeout of 4 seconds with retries set to 2, the call will time out in 12 seconds. The request will be sent 3 times (2 retries).

I should also note that the CM timeout includes the packet lifetime (round trip time) in its timeout calculation, but this should be small.

- Sean
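The arithmetic in this reply can be written out as a short Python sketch. It assumes the standard InfiniBand timeout encoding of 4.096 us * 2^n, under which an encoded value of 20 is about 4.29 seconds (the "4 seconds" above), and it leaves out the packet lifetime term Sean mentions. The function names are illustrative and are not CM API calls.

```python
IB_TIMEOUT_BASE_US = 4.096  # IB timeouts are encoded as 4.096 us * 2**n


def decode_timeout_s(exponent):
    """Convert an encoded IB timeout exponent to seconds."""
    return IB_TIMEOUT_BASE_US * (2 ** exponent) / 1_000_000


def total_connect_time_s(per_attempt_s, retries):
    """Time until the connect fails: the first send plus each retry
    waits one full per-attempt timeout."""
    return per_attempt_s * (retries + 1)
```

`total_connect_time_s(4.0, 2)` evaluates to 12.0, not 8.0, because the retry count does not include the initial send.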