Just realised that I sent this to the SIPWG and not SIP-Implementors. Resending.
Paul D Smith
Network Protocols Group
Data Connection Ltd (DCL)
Tel: +44 20 8366 1177
Email: [EMAIL PROTECTED]
Fax: +44 20 8363 1039
Web: http://www.dataconnection.com

-----Original Message-----
From: Paul D.Smith
Sent: 02 September 2004 15:53
To: SIP Public Folder (E-mail)
Subject: ACKs with invalid branch-ids

All,

We have encountered various implementations, both at SIPits and in the field, that fail to generate the branch-id in an ACK correctly. Specifically, we have observed ACKs to negative responses with _new_ branch-ids and ACKs to positive responses with the same branch-id as the original INVITE, both of which violate RFC3261.

This has led us to be lenient with such partners and attempt to process the ACK anyway. Unfortunately, we have become aware of some scenarios, encountered in the field, where this leniency leads to other interoperability problems, so we believe we can no longer be lenient with badly formed ACKs. We therefore believe that manufacturers MUST correct their ACK generation code so that branch-ids are generated correctly, in order to ensure interoperability under these circumstances.

_Overview_

The problem we have encountered is that there are various scenarios under which a UAS receives two similar INVITEs as a result of forking or failover of proxies in the SIP network. The UAS accepts the first INVITE and then correctly rejects the second INVITE, which means that the UAS sends a positive response and a negative response to very similar INVITEs, i.e. same Call-Id, same CSeq, same From tag, same To tag, but different branch-ids.

The resulting ACKs sent to the UAS can then only be distinguished by their branch-ids. If the partner implementation does not generate the branch-id on the ACKs correctly, the ACKs are totally indistinguishable. Depending on the detail of the implementation, this can result in INVITEs failing to complete, responses being retransmitted because the ACK does not quench retransmissions, and similar issues.
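
The rule at stake can be sketched in a few lines. This is only an illustration, not code from any real stack: `new_branch` and `ack_branch` are hypothetical names, and the sketch assumes the sender still holds the branch-id it placed on the INVITE. Under RFC3261, an ACK to a non-2xx final response reuses the INVITE's branch-id, while an ACK to a 2xx response is a separate transaction and carries a fresh branch-id.

```python
import secrets

# RFC3261 "magic cookie" that prefixes every compliant branch-id.
MAGIC_COOKIE = "z9hG4bK"


def new_branch() -> str:
    """Return a fresh branch-id (hypothetical helper for this sketch)."""
    return MAGIC_COOKIE + secrets.token_hex(8)


def ack_branch(invite_branch: str, status_code: int) -> str:
    """Choose the branch-id for an ACK, given the INVITE's branch-id
    and the status code of the final response being acknowledged."""
    if not 200 <= status_code <= 699:
        raise ValueError("ACK is only sent for final responses")
    if status_code < 300:
        # ACK to a 2xx is its own transaction: use a new branch-id.
        return new_branch()
    # ACK to 3xx-6xx belongs to the INVITE transaction: reuse its branch-id.
    return invite_branch
```

For example, `ack_branch(b, 482)` returns `b` unchanged, whereas `ack_branch(b, 200)` returns a newly generated value. The two faulty behaviours described above correspond to swapping these two cases.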
Below, we present the problem in detail. We believe that fixing non-compliant implementations to strictly follow the ACK branch-id rules in RFC3261 is the only way to achieve reliable interoperability. We therefore urge people to check their own implementations' behavior.

We also propose that the next version of the SIP specification contains the following clarifying text in sections 13 and/or 17.1.1.3.

"The generation of the correct branch-id for an ACK is essential for correct processing under merged request conditions. Under these conditions, a UAS may generate multiple final responses as a result of receiving multiple variants of an INVITE as a result of request forking and merging within the SIP network. The UAS then receives multiple ACKs to these final responses and correct ACK branch-ids are necessary for the UAS to correctly match the ACKs to the correct final responses."

_Detail_

The following sections describe the problem in detail, focusing on Merged Requests, where the problem is most commonly observed, and another scenario resulting from a failed and restarted proxy during an established dialog. We believe they show that implementations cannot be lenient regarding the branch-ids of received ACKs without introducing other interoperability problems.

_Merged Requests_

The problem is that after a merged request is detected and a 482 (Loop Detected) response is sent, an ACK is received which cannot be unambiguously matched to the negative response, as opposed to the probably positive response to the initially received INVITE.

The following flow clarifies the problem, indicating branch-ids as b1, b2 etc. and To tags as t1.

UAC                      Proxy1                  Proxy2                  UAS
-INV, b1 --------------->-INV, b1,b2------------------------------------>  [1]
                         |
                         -INV, b1,b3-+
                                      \                                    [2]
                                       \
<-180,b1,t1-------------<-------------------------180,b1,b2,t1 ----------  [3]
                                        \
                                         +------>-INV, b1,b3,b4--------->  [4]
                         <-482,b1,b3,t1----------<-482,b1,b3,b4,t1-------- [5]
                                                 -ACK,b4,t1------------->  [6]
                         -ACK,b3,t1------------->
<-200,b1,t1-------------<-200,b1,b2,t1-----------------------------------  [7]
-ACK,b5,t1-------------------------------------------------------------->  [8]

[1] The INVITE is forked by Proxy1.
[2] 100 responses are omitted for clarity.
[3] The UAS responds to the first fork using 180, establishing an early dialog.
[4] Network configuration results in Proxy2 forwarding the second fork to the same UAS as the first fork from Proxy1. The UAS must therefore detect and handle request merging.
[5] The UAS responds to the second fork using 482, within the context of the same early dialog (same To tag, t1).
[6] Proxy2 sends an ACK to the negative response.
[7] The UAS responds to the first fork using a 200, again using the context of the dialog, t1.
[8] The UAC sends an ACK to the positive response.

Clearly the first ACK is an ACK to the second INVITE because the branch, b4, matches, and the second ACK is an ACK to the first INVITE because it has a new, unique branch-id.

But if we consider the case where we cannot rely on Proxy2 and/or the UAC correctly indicating branch b4, then ACK matching becomes more problematic. The ACK marked "b4,t1" and the ACK marked "b5,t1" are effectively indistinguishable, and we cannot tell to which transaction each relates.

Aside: Note that the Call-Id, the From tag, the CSeq and the To tag are all identical in the two ACKs "b4,t1" and "b5,t1". The only difference is the branch, which we have observed cannot be relied on to be correct!

One solution might be to send the 482 response with a different To tag from the 200, but this does not solve the following, very similar, problem.
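
To make the ambiguity in the merged-request flow concrete, here is a minimal sketch of UAS-side ACK matching. It is a simplified model, not any real stack's logic: all class and function names are hypothetical, the branch labels (b2, b4, b5) are those used in the flow above, and `strict_match` assumes that an ACK whose branch-id matches no rejected INVITE is the ACK for the 2xx in the same dialog.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DialogKey:
    """Fields that are identical on both ACKs in the merged-request flow."""
    call_id: str
    from_tag: str
    to_tag: str
    cseq: int


@dataclass
class PendingResponse:
    """A final response the UAS has sent and is waiting to have ACKed."""
    key: DialogKey
    branch: str  # topmost Via branch-id of the INVITE being answered
    status: int


@dataclass
class Ack:
    key: DialogKey
    branch: str


def strict_match(ack: Ack, pending: List[PendingResponse]) -> Optional[PendingResponse]:
    """Match an ACK assuming the sender follows the RFC3261 branch-id rules."""
    # An ACK to a non-2xx response carries the branch-id of its INVITE.
    for p in pending:
        if p.status >= 300 and p.key == ack.key and p.branch == ack.branch:
            return p
    # Otherwise treat it as the ACK to a 2xx, which carries a new branch-id.
    for p in pending:
        if p.status < 300 and p.key == ack.key and p.branch != ack.branch:
            return p
    return None


def lenient_candidates(ack: Ack, pending: List[PendingResponse]) -> List[PendingResponse]:
    """Lenient matching that ignores the branch-id entirely."""
    return [p for p in pending if p.key == ack.key]
```

With a 200 pending for the INVITE that arrived with branch b2 and a 482 pending for the INVITE that arrived with branch b4, `strict_match` pairs the ACK carrying b4 with the 482 and the ACK carrying b5 with the 200. `lenient_candidates` returns both pending responses for either ACK, which is exactly the situation in which the UAS cannot tell which retransmissions to stop.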

_Restarted Proxies_

A similar problem may occur within an existing dialog as a result of a stateful proxy failing and then reverting to stateless behavior after restart. In this case, we cannot manipulate the To tag as suggested above, because we already have a valid To tag (we are within an existing dialog). This is an unusual situation, but there may be others that can result in similar situations occurring.

UAC                      Proxy1                  Proxy2                  UAS
-INV, b1,t1------------->-INV, b1,b2,t1--------->-INV, b1,b2,b3,t1------>  [1]
                         Proxy1 fails                                      [2]
                                                                           [3]
                                              +-<-200,b1,b2,b3,t1--------  [4]
                                             /
-INV, b1,t1------------->-INV, b1,b4,t1--------------------------------->  [5]
                                           /
                                          /
<-200,b1,t1-------------<-200,b1,b2,t1---+  +----------------------------  [6]
                                           /
-ACK,b5,t1-------------------------------------------------------------->  [7]
                                         /
<-500,b1,t1-------------<-500,b1,b4,t1--+
                         -ACK,b3,t1-------------------------------------->  [7]
UAC will probably ignore second final response                             [8]

[1] It is assumed that during initial dialog creation, Proxy1 identified 2 forks and chose one of these to establish the dialog.
[2] Proxy1 fails. Either the 100 to the UAC is never sent or it is lost en route.
[3] 100 responses from UAS and Proxy2 omitted for clarity.
[4] UAS responds to INVITE with 200.
[5] After failover, Proxy1 is stateless and forwards the request via a different route from that chosen whilst stateful. UAS therefore receives two similar INVITEs and must reject the second INVITE.
[6] UAS rejects second INVITE and Proxy1 passes this back to UAC because Proxy1 is now stateless.
[7] Unless branch-ids can be relied upon, ACKs to 200 and 500 cannot be distinguished by UAS.
[8] UAC probably just ignores second final response to completed INVITE.

_RFC2543_

The key to distinguishing the ACKs is the correct use of unique branch-ids. However, RFC2543 does not mandate that branch-ids be unique.
This means that in a predominantly RFC2543-compliant network it may be impossible to handle the scenarios given above correctly, because the UAS has no way to distinguish the various ACKs received. We are not aware of a possible solution to the problem under these conditions.

_Conclusion_

There appears to be no way for implementations to be lenient towards badly behaved systems that generate incorrect ACK branch-ids. Given this conclusion, all implementers must act to ensure that their systems generate ACK branch-ids according to RFC3261.
