Re: [openib-general] [ANNOUNCE] Contribute RDS(ReliableDatagramSockets) to OpenIB

Nitin Hande Fri, 11 Nov 2005 13:01:47 -0800

Michael Krause wrote:

At 10:28 AM 11/9/2005, Rick Frank wrote:
Yes, the application is responsible for detecting lost msgs at the application level - the transport can not do this.

RDS does not guarantee that a message has been delivered to the application - just that once the transport has accepted a msg it will deliver the msg to the remote node in order without duplication - dealing with retransmissions, etc. due to sporadic / intermittent msg loss over the interconnect. If after accepting the send - the current path fails - then RDS will transparently fail over to another path - and if required will resend / send any already queued msgs to the remote node - again ensuring that no msg is duplicated and they are in order. This is no different than APM - with the exception that RDS can do this across HCAs.

The application - Oracle in this case - will deal with detecting a catastrophic path failure - either due to a send that does not arrive and/or a timed-out response or send failure returned from the transport. If there is no network path to a remote node - it is required that we remove the remote node from the operating cluster to avoid what is commonly termed as a "split brain" condition - otherwise known as a "partition in time".

BTW - in our case - the application failure domain logic is the same whether we are using UDP / uDAPL / iTAPI / TCP / SCTP / etc. Basically, if we can not talk to a remote node - after some defined period of time - we will remove the remote node from the cluster. In this case the database will recover all the interesting state that may have been maintained on the removed node - allowing the remaining nodes to continue. If later on, communication to the remote node is restored - it will be allowed to rejoin the cluster and take on application load.
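
The timeout-based removal and later rejoin described above could be sketched roughly as follows. This is only an illustration: the class name `Membership`, its methods, and the idea of tracking a "last heard" timestamp per node are assumptions made for the example, not anything taken from Oracle's cluster code or from the RDS document.

```python
class Membership:
    """Hypothetical cluster view: a node stays a member while we keep hearing from it."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_heard = {}  # node name -> time of last successful communication

    def heard_from(self, node, now):
        # Any successful exchange refreshes the node; for an unknown or
        # previously removed node this also acts as a (re)join.
        self.last_heard[node] = now

    def evict_silent(self, now):
        # Remove every node that has been silent longer than the timeout
        # and report which ones were removed.
        dead = [n for n, t in self.last_heard.items() if now - t > self.timeout_s]
        for n in dead:
            del self.last_heard[n]
        return dead

    def members(self):
        return sorted(self.last_heard)
```

The point of the sketch is that this logic lives entirely in the application and does not depend on which transport carried the messages.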
Please clarify the following which was in the document provided by Oracle.
On page 3 of the RDS document, under the section "RDP Interface", the 2nd and 3rd paragraphs state:
* RDP does not guarantee that a datagram is delivered to the remote application.
* It is up to the RDP client to deal with datagrams lost due to transport failure or remote application failure.
The HCA is still a fault domain with RDS - it does not address flushing data out of the HCA fault domain, nor does it sound like it ensures that CQE loss is recoverable.
I do believe RDS will replay all of the sendmsg's that it believes are pending, but it has no way to determine if already sent sendmsgs were actually successfully delivered to the remote application unless it provides some level of resync of the outstanding sends not completed from an application's perspective as well as any state updated via RDMA operations which may occur without an explicit send operation to flush to a known state.
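
To make the resync idea concrete, here is a minimal sketch of one way it could work, assuming per-message sequence numbers and a receiver that reports the last message it delivered in order. The `Sender` and `Receiver` classes and their methods are hypothetical and are not part of RDS; the sketch also covers only sends, not state changed by RDMA operations.

```python
from collections import OrderedDict


class Sender:
    """Hypothetical sender state: every message is kept until known delivered."""

    def __init__(self):
        self.next_seq = 1
        self.pending = OrderedDict()  # seq -> payload, in send order

    def send(self, payload):
        seq = self.next_seq
        self.next_seq += 1
        self.pending[seq] = payload
        return seq, payload

    def resync(self, last_delivered):
        # Forget everything the receiver reports as delivered,
        # then return what is left so it can be replayed in order.
        for seq in [s for s in self.pending if s <= last_delivered]:
            del self.pending[seq]
        return list(self.pending.items())


class Receiver:
    """Hypothetical receiver state: delivers strictly in sequence order."""

    def __init__(self):
        self.last_delivered = 0
        self.delivered = []

    def receive(self, seq, payload):
        # Accept only the next expected message; duplicates and gaps are dropped.
        if seq != self.last_delivered + 1:
            return False
        self.delivered.append(payload)
        self.last_delivered = seq
        return True


def demo():
    tx, rx = Sender(), Receiver()
    msgs = [tx.send(p) for p in ("a", "b", "c", "d")]
    # Only the first two messages arrive before the path fails.
    for seq, payload in msgs[:2]:
        rx.receive(seq, payload)
    # After failover the receiver reports its position and the sender replays the rest.
    replay = tx.resync(rx.last_delivered)
    for seq, payload in replay:
        rx.receive(seq, payload)
    return replay, rx.delivered
```

Without the `resync` step the sender would have to replay all four messages and rely on the receiver to drop the first two; with it, only the undelivered tail is resent.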

If RDS could define a mechanism that the application could use to inform the sender to resync and replay on catastrophic failure, is that a correct understanding of your suggestion?


I'm still trying to ascertain whether RDS completely recovers from HCA failure (assuming there is another HCA / path available) between the two endnodes.

Reading the doc and the thread, it looks like we need src/dst ports for multiplexing connections, seq/ack numbers for resyncing, and some kind of window availability for flow control. Aren't we very close to a TCP header?
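
As a rough illustration of that list of fields, a header carrying them could be packed like this. The layout, field widths, and function names are assumptions chosen for the example; they are not the RDS wire format.

```python
import struct

# Hypothetical layout in network byte order:
#   src port (16 bits), dst port (16 bits),
#   sequence number (32 bits), ack number (32 bits),
#   advertised window (16 bits)
HEADER = struct.Struct("!HHIIH")


def pack_header(src_port, dst_port, seq, ack, window):
    """Serialize the five fields into a fixed-size byte string."""
    return HEADER.pack(src_port, dst_port, seq, ack, window)


def unpack_header(data):
    """Parse the leading header bytes back into a tuple of five fields."""
    return HEADER.unpack(data[:HEADER.size])
```

These five fields occupy 14 bytes. They line up with the source port, destination port, sequence number, acknowledgment number, and window fields of a TCP header, which is 20 bytes without options.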


Nitin



Mike


------------------------------------------------------------------------

_______________________________________________
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


