Just realised that I sent this to the SIPWG and not SIP-Implementors.
Resending.

Paul D Smith
Network Protocols Group 
Data Connection Ltd (DCL)
Tel: +44 20 8366 1177  Email: [EMAIL PROTECTED] 
Fax: +44 20 8363 1039  Web:   http://www.dataconnection.com

-----Original Message-----
From: Paul D.Smith 
Sent: 02 September 2004 15:53
To: SIP Public Folder (E-mail)
Subject: ACKs with invalid branch-ids


All,

We have encountered various implementations, both at SIPits and in the field,
that fail to generate the branch-id in an ACK correctly.  Specifically, we have
observed ACKs to negative responses with _new_ branch-ids and ACKs to positive
responses with the same branch-id as the original INVITE, both of which are in
violation of RFC3261.  This has led us to be lenient with such partners and
attempt to process the ACK anyway.  Unfortunately we have become aware of some
scenarios, encountered in the field, where this leads to other interoperability
problems so we believe we can no longer be lenient with badly formed ACKs.

We therefore believe that manufacturers MUST correct ACK generation code to
ensure that branch-ids are generated correctly in order to ensure
interoperability under these circumstances.
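
For illustration only, the rule in question (RFC3261 sections 17.1.1.3 and
13.2.2.4) can be sketched in Python as below.  The function names are
hypothetical and not taken from any real stack; only the "z9hG4bK" branch
prefix comes from RFC3261.

```python
import secrets

MAGIC_COOKIE = "z9hG4bK"  # branch prefix defined by RFC3261


def new_branch() -> str:
    """Return a fresh branch-id (hypothetical helper)."""
    return MAGIC_COOKIE + secrets.token_hex(8)


def ack_branch(invite_branch: str, status_code: int) -> str:
    """Choose the top Via branch for an ACK to a final response.

    Non-2xx: the ACK belongs to the INVITE transaction, so reuse its branch.
    2xx: the ACK is a separate transaction, so generate a new branch.
    """
    if not 200 <= status_code <= 699:
        raise ValueError("ACK is only sent for final responses")
    if status_code < 300:
        return new_branch()
    return invite_branch
```

The two faults described above correspond to taking the wrong arm of the
final "if": a new branch for a non-2xx response, or the INVITE's branch for
a 2xx response.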

_Overview_

The problem we have encountered is that there are various scenarios under which
a UAS receives two similar INVITEs as a result of forking or failover of
proxies in the SIP network.  The UAS accepts the first INVITE and then
correctly rejects the second INVITE, which means that the UAS sends a positive
response and a negative response to very similar INVITEs i.e. same Call-Id,
same CSeq, same From tag, same To tag, but different branch-ids.

The resulting ACKs sent to the UAS can then only be distinguished based on the
branch-id but if the partner implementation does not generate the branch-id on
the ACKs correctly, then the ACKs are totally indistinguishable.  Depending on
the detail of the implementation, this can result in INVITEs failing to
complete, responses being retransmitted because the ACK does not quench
retransmissions, and similar issues.

Below, we present the problem in detail.  We believe that fixing non-compliant
implementations to strictly follow the ACK branch-id rules in RFC3261 is the
only way to achieve reliable interoperability.  We therefore urge people to
check their own implementations' behavior.  We also propose that the next
version of the SIP specification contains the following clarifying text in
sections 13 and/or 17.1.1.3.

  "The generation of the correct branch-id for an ACK is essential for correct
   processing under merged request conditions.  Under these conditions,
   a UAS may generate multiple final responses as a result of receiving
   multiple variants of an INVITE as a result of request forking and merging
   within the SIP network.  The UAS then receives multiple ACKs to these final
   responses and correct ACK branch-ids are necessary for the UAS to correctly
   match the ACKs to the correct final responses."

_Detail_

The following sections describe the problem in detail, focusing on Merged
Requests, where the problem is most commonly observed, and another scenario
resulting from a failed and restarted proxy during an established dialog.  We
believe they show that implementations cannot be lenient regarding the
branch-ids of received ACKs without introducing other interoperability
problems.

_Merged Requests_

The problem is that after a merged request is detected and a 482 (Loop Detected)
response is sent, an ACK is received which cannot be unambiguously matched to
the negative response, as opposed to the probably positive response to the
initially received INVITE.  The following flow clarifies the problem,
indicating branch-ids as b1, b2 etc. and To tags as t1.

UAC                     Proxy1                  Proxy2                  UAS

-INV, b1 --------------->-INV, b1,b2------------------------------------> [1]
                        |
                        -INV, b1,b3-+
                                     \                                    [2]
                                      \
<-180,b1,t1-------------<-------------------------180,b1,b2,t1 ---------- [3]
                                        \
                                         +------>-INV, b1,b3,b4---------> [4]

                        <-482,b1,b3,t1----------<-482,b1,b3,b4,t1-------- [5]

                                                 -ACK,b4,t1-------------> [6]

                        -ACK,b3,t1------------->

<-200,b1,t1-------------<-200,b1,b2,t1----------------------------------- [7]

-ACK,b5,t1--------------------------------------------------------------> [8]

[1] The INVITE is forked by Proxy1.
[2] 100 responses are omitted for clarity.
[3] The UAS responds to the first fork using 180, establishing an early dialog.
[4] Network configuration results in Proxy2 forwarding the second fork to the
    same UAS as the first fork from Proxy1.  The UAS must therefore detect and
    handle request merging.
[5] The UAS responds to the second fork using 482, within the context of the
    same early dialog (same To tag, t1).
[6] Proxy2 sends an ACK to the negative response.
[7] The UAS responds to the first fork using a 200, again using the context of
    the dialog, t1.
[8] UAC sends an ACK to the positive response.

Clearly the first ACK is an ACK to the second INVITE because the branch, b4,
matches and the second ACK is an ACK to the first INVITE because it has a new
unique branch-id.

But if we consider the case where we cannot rely on Proxy2 and/or the UAC
correctly indicating the branch, then ACK matching becomes more problematic.
The ACK marked "b4,t1" and the ACK marked "b5,t1" are effectively
indistinguishable and we cannot tell to which transaction each relates.

Aside: Note that the Call-Id, the From tag, the CSeq and the To tag are all
  identical in the two ACKs "b4,t1" and "b5,t1".  The only difference is the
  branch, which we have observed cannot be relied on to be correct!
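
As a rough sketch of why this matters to the UAS in the flow above, the
following Python fragment matches ACKs strictly by branch.  All names and
field values (call-1, f1 and so on) are made up for illustration, and real
RFC3261 matching also considers the sent-by value of the top Via, which is
omitted here.

```python
from typing import NamedTuple


class Ack(NamedTuple):
    call_id: str
    from_tag: str
    to_tag: str
    cseq: int
    branch: str  # branch parameter of the topmost Via


def dialog_key(ack: Ack) -> tuple:
    """Everything except the branch: identical for both ACKs in the flow."""
    return (ack.call_id, ack.from_tag, ack.to_tag, ack.cseq)


def match_ack(ack: Ack, non_2xx_branches: set) -> str:
    """Strict matching: an ACK whose branch equals that of an INVITE we
    rejected acknowledges the rejection; any other ACK is for the 200."""
    return "482" if ack.branch in non_2xx_branches else "200"


rejected = {"b4"}  # branch of the merged INVITE that was answered with 482

ack_to_482 = Ack("call-1", "f1", "t1", 1, "b4")      # compliant: reuses b4
ack_to_200 = Ack("call-1", "f1", "t1", 1, "b5")      # compliant: new branch
bad_ack_to_482 = Ack("call-1", "f1", "t1", 1, "b6")  # non-compliant: new branch

same_dialog_fields = dialog_key(ack_to_482) == dialog_key(ack_to_200)
results = [match_ack(a, rejected)
           for a in (ack_to_482, ack_to_200, bad_ack_to_482)]
```

The two compliant ACKs are matched to the 482 and the 200 respectively.  The
non-compliant ACK is attributed to the 200, so in this sketch the 482 would
never be acknowledged and would continue to be retransmitted.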

One solution might be to send the 482 response with a different To tag to the
200, but this does not solve the following, very similar, problem.

_Restarted Proxies_

A similar problem may occur within an existing dialog as a result of a stateful
proxy failing and then reverting to stateless behavior after restart.  In this
case, we cannot manipulate the To tag as suggested above, because we already
have a valid To tag (because we are within an existing dialog).

This is an unusual situation but there may be others that can result in
similar situations occurring.

UAC                     Proxy1                  Proxy2                  UAS

-INV, b1,t1------------->-INV, b1,b2,t1--------->-INV, b1,b2,b3,t1------> [1]

                        Proxy1 fails                                      [2]
                                                                          [3]
                                              +-<-200,b1,b2,b3,t1-------- [4]
                                             /
-INV, b1,t1------------->-INV, b1,b4,t1---------------------------------> [5]
                                           /
                                          /
<-200,b1,t1-------------<-200,b1,b2,t1---+

                                            +---------------------------- [6]
                                           /
-ACK,b5,t1--------------------------------------------------------------> [7]
                                         /
<-500,b1,t1-------------<-500,b1,b4,t1--+


                        -ACK,b3,t1--------------------------------------> [7]

UAC will probably ignore second final response                            [8]

[1] It is assumed that during initial dialog creation, Proxy1 identified 2
    forks and chose one of these to establish the dialog.
[2] Proxy1 fails.  Either 100 to UAC is never sent or is lost en route.
[3] 100 responses from UAS and Proxy2 omitted for clarity.
[4] UAS responds to INVITE with 200.
[5] After failover, Proxy1 is stateless and forwards the request via a
    different route from that chosen whilst stateful. UAS therefore
    receives two similar INVITEs and must reject the second INVITE.
[6] UAS rejects second INVITE and Proxy1 passes this back to UAC because
    Proxy1 is now stateless.
[7] Unless branch-ids can be relied upon, ACKs to 200 and 500 cannot be
    distinguished by UAS.
[8] UAC probably just ignores second final response to completed INVITE.

_RFC2543_

The key to distinguishing the ACKs is the correct use of unique branch-ids.
However, RFC2543 does not mandate that branch-ids be unique.  This means that
it is possible in a predominantly RFC2543 compliant network that the scenarios
given above cannot be handled correctly because there is no way in which the
UAS can distinguish the various ACKs received.  We are not aware of a possible
solution to the problem under these conditions.
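
A UAS can at least tell which case it is in.  RFC3261 (section 17.2.3) uses
the "z9hG4bK" prefix on the branch to identify requests from RFC3261
compliant elements, whose branch-ids are required to be unique.  A minimal
Python sketch of that check follows; the function name is hypothetical.

```python
from typing import Optional

MAGIC_COOKIE = "z9hG4bK"  # branch prefix defined by RFC3261


def branch_is_rfc3261(branch: Optional[str]) -> bool:
    """True if the top Via branch claims RFC3261-style uniqueness.

    A missing branch, or one without the prefix, suggests an RFC2543
    element, for which branch-based ACK matching cannot be relied upon.
    """
    return branch is not None and branch.startswith(MAGIC_COOKIE)
```

Note that the prefix only indicates what the sender claims to implement.  It
does not show that the branch on a given ACK was chosen correctly.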

_Conclusion_

There appears to be no manner in which implementations can be lenient towards
badly behaved systems that generate incorrect ACK branch-ids.  Given this
conclusion, all implementers must act to ensure that their systems generate
ACK branch-ids according to RFC3261.

Paul D Smith