On Thu, Dec 25, 2025 at 7:14 PM Aijun Wang <[email protected]>
wrote:
> Hi, Eric and Martin:
>
>
>
> During my presentation of this service affinity proposal at the IETF 124
> meeting, you raised questions that can be summarized as follows:
>
> 1) How can we ensure that no application data is lost while the switch
> takes place?
>
> 2) How can we keep the application unaware of the switchover, given that
> the underlying TCP connection is switched?
>
>
>
> Regarding question 1), after discussing with some experts offline,
> we think:
> 1.1) TLS runs over TCP, and the TCP layer can ensure, through the TCP ACK
> mechanism, that application data is transferred successfully to the peer.
> 1.2) The client can decide when to switch over, based on the peer’s TCP ACK.
> 1.3) If application data arrives at the client during the switchover,
> the client can store it temporarily. This is a local implementation
> consideration and does not need a protocol extension.
>

I'm not sure this quite captures the concerns MT and I were raising,
so let me try to clarify.

Consider a simple HTTP REST API.

Client                                                  Server
POST /make-payment ----------------------------------------->
                                             [Process payment]
<------------------------------------------------------ 200 OK

When the client receives a 200 OK, it knows that the request
has been processed. If it doesn't, it has to assume it has not.

For instance, consider the following case:

Client                                                  Server
POST /make-payment ----------------------------------------->
                                             [Process payment]
                                              [Server crashes]
[TCP retransmit] -------------------------------------------->
<--------------------------------------------------------- RST

In this case, the client doesn't know what has happened. You need
mechanisms either at the HTTP layer--or more typically at the REST API
layer--to do the right thing, which might be an idempotency layer
combined with client-side retransmit. This is all just a
straightforward application of the end-to-end argument, and there's no
real way around it as long as systems might asynchronously fail. But
it's also a source of defects (think about how many times sites tell
you not to press the submit button twice) because these mechanisms may
not have been exercised or tested. For instance, if the server is
highly reliable and the client just assumes that anything it sent
was processed, that will be good enough a very large fraction of the
time, but not if the server has a high failure rate.
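
As a rough illustration of the idempotency-plus-retransmit pattern
mentioned above, here is a minimal Python sketch. Everything in it
(the PaymentServer class, the idempotency key, the drop_next_response
flag) is a hypothetical name invented for the illustration, and the
lost response stands in for the crash/RST case, assuming the server's
record of processed keys survives the failure.

```python
import uuid


class PaymentServer:
    """Toy in-memory server (hypothetical): remembers results by idempotency key."""

    def __init__(self):
        self.results = {}                # idempotency key -> stored response
        self.charges = 0                 # how many times a payment was really processed
        self.drop_next_response = False  # simulate "processed, but response lost"

    def post_make_payment(self, key, amount):
        # Process the payment only the first time this key is seen.
        if key not in self.results:
            self.charges += 1
            self.results[key] = f"200 OK (charged {amount})"
        if self.drop_next_response:
            self.drop_next_response = False
            raise ConnectionResetError("response lost after processing")
        return self.results[key]


def pay_with_retry(server, amount, attempts=3):
    """Client side: reuse one key across retries so a replay is harmless."""
    key = str(uuid.uuid4())
    for _ in range(attempts):
        try:
            return server.post_make_payment(key, amount)
        except ConnectionResetError:
            continue  # outcome unknown: retry with the SAME key
    raise RuntimeError("gave up after repeated failures")


server = PaymentServer()
server.drop_next_response = True   # first attempt is processed, then "RST"
response = pay_with_retry(server, 10)
```

The client still cannot tell what happened on the first attempt; it
only becomes safe to retry because both sides cooperate at the
application layer.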

This brings us to your proposed mechanism, in which the server sends
an indication that the client should switch over, and then the client
does so. The base case looks something like the following:

Client                                                  Server
<------------------ First TCP/TLS Connection ---------------->
POST /make-payment ----------------------------------------->
                                             [Process payment]
<------------------------------------------------------ 200 OK
<----------------------------------------------- Switch servers
<-------------------- New TCP/TLS Connection ---------------->
POST /something-else ---------------------------------------->
<------------------------------------------------------ 200 OK

This is fine, but now consider what happens if the server sends its
request to switch right away, so that the client's request and the
server's request to switch cross in transit, like so:

Client                                                  Server
<------------------ First TCP/TLS Connection ---------------->
POST /make-payment ---------\ /---------------- Switch servers
                             X
<---------------------------/ \------------------------------>

Until the client receives the 200 OK, it has no way of knowing that
the server has processed the request, so the right thing to do is
to wait until that 200 OK is received. Otherwise you have
created exactly the issue discussed above, where you're counting on
the application retry logic to be right on both the client and the
server, and because this would happen a lot, we're risking a lot of
new errors.

Unfortunately, these transaction semantics only exist at the HTTP
layer, not the TLS layer, so the TLS layer has no way of knowing to
wait for the 200 OK. It just knows that the client sent some data, but
not whether that data reflects an outstanding request or something
else. Recall that TLS doesn't even know about the HTTP request/response
semantics, because it's just a dumb pipe.
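
To make the gap concrete, here is a small Python sketch contrasting
what a byte-oriented layer can observe with what the HTTP layer
knows. The ConnectionView class and its counters are hypothetical
bookkeeping invented for this example; they do not correspond to any
real TLS or TCP API.

```python
from dataclasses import dataclass


@dataclass
class ConnectionView:
    # What a byte-stream layer could observe (hypothetical counters).
    bytes_sent: int = 0
    bytes_acked: int = 0
    # What only the HTTP layer knows.
    requests_sent: int = 0
    responses_received: int = 0

    def transport_idle(self) -> bool:
        """Every byte written so far has been ACKed by the peer's TCP stack."""
        return self.bytes_acked == self.bytes_sent

    def application_idle(self) -> bool:
        """Every request sent so far has received its response."""
        return self.responses_received == self.requests_sent


view = ConnectionView()
request = b"POST /make-payment HTTP/1.1\r\nContent-Length: 0\r\n\r\n"

# The client writes the request and the peer's TCP stack ACKs all of it.
view.bytes_sent += len(request)
view.requests_sent += 1
view.bytes_acked = view.bytes_sent
after_ack = (view.transport_idle(), view.application_idle())

# Only when the 200 OK arrives is the transaction actually complete.
view.responses_received += 1
after_response = (view.transport_idle(), view.application_idle())
```

After the ACK the transport looks idle while the request is still
outstanding, which is exactly the state a TLS-layer switchover cannot
distinguish from a quiet connection.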

In your email, you suggest that the client ought to:

1. Wait for the server's TCP ACK of all transmitted data, with the
   implied semantics being that once the message is ACKed it will be
   reliably delivered to the server, not just to the TCP stack.
2. Buffer any data it receives from the client application while
   waiting for the ACK and retransmit it on the new connection.

This is a pretty big layering violation, and I don't believe that it
will work either, because we don't know that the client's flight was
complete. Suppose instead that the client's request is in two parts,
and the first part and the server's request to switch cross in
transmission. In that case, we might get this:

Client                                                  Server
<------------------ First TCP/TLS Connection ---------------->
POST /make-payment (1/2) ---\ /---------------- Switch servers
                             X
<---------------------------/ \------------------------------>
[Buffer /make-payment (2/2)]
<-------------------------------------------------------- ACK
<-------------------- New TCP/TLS Connection ---------------->
/make-payment (2/2) ----------------------------------------->

In this case, the server (probably) won't process the first half of
the request, and the second half of the request isn't even validly
formed HTTP, so we've got a mess. The only thing that will work
here is for the client to retry the POST in its entirety, but then
we're back to the situation above, and in neither case do we want to
buffer and re-send the second half of the request.
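
The "not even validly formed HTTP" point can be checked with a few
lines of Python. This is only a sketch: the request contents, the
split point, and the simplified request-line pattern are assumptions
made for the illustration, not a full HTTP parser.

```python
import re

# Simplified HTTP/1.x request-line check (illustrative, not a full parser).
REQUEST_LINE = re.compile(rb"^[A-Z]+ \S+ HTTP/\d\.\d\r\n")


def starts_valid_request(data: bytes) -> bool:
    """True if the bytes begin with something shaped like a request line."""
    return REQUEST_LINE.match(data) is not None


body = b'{"amount": 10, "to": "alice"}'
request = (
    b"POST /make-payment HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Type: application/json\r\n"
    b"Content-Length: " + str(len(body)).encode() + b"\r\n\r\n" + body
)

# Split a few bytes into the body, as if the request went out in two writes.
cut = request.index(b"\r\n\r\n") + 4 + 5
part1, part2 = request[:cut], request[cut:]

first_ok = starts_valid_request(part1)    # headers present: looks like a request
second_ok = starts_valid_request(part2)   # bare body fragment: does not
```

A server on the new connection that receives only part2 has nothing
it can parse as a request, so replaying buffered bytes does not help.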

Here too, the problem is that the semantics are only known at the HTTP
layer, and the TLS layer doesn't know them, so it can't do the right
thing. By contrast, if you do the switchover at the application layer,
the application can pick an appropriate point in time where there is
no ambiguity about the client and server state and proceed
appropriately.
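
One way an application-layer client might pick such a point is
sketched below in Python. The AppLayerClient class and its method
names are hypothetical, and the endpoints "A1" and "A2" simply reuse
the address labels from earlier in this thread; the sketch assumes
the application issues no new requests while a switch is pending.

```python
class AppLayerClient:
    """Hypothetical client that defers a server switch until it is quiescent."""

    def __init__(self, endpoint):
        self.endpoint = endpoint      # where new requests are sent
        self.pending = set()          # requests still awaiting a response
        self.switch_target = None     # endpoint requested by the server, if any
        self.log = []                 # (endpoint, request) pairs, for inspection

    def send(self, request_id):
        self.pending.add(request_id)
        self.log.append((self.endpoint, request_id))

    def on_switch_requested(self, new_endpoint):
        self.switch_target = new_endpoint
        self._maybe_switch()

    def on_response(self, request_id):
        self.pending.discard(request_id)
        self._maybe_switch()

    def _maybe_switch(self):
        # Switch only when no request is outstanding, so neither side is
        # left guessing whether something was processed.
        if self.switch_target is not None and not self.pending:
            self.endpoint = self.switch_target
            self.switch_target = None


client = AppLayerClient("A1")
client.send("make-payment")            # request and switch indication cross
client.on_switch_requested("A2")
endpoint_while_pending = client.endpoint
client.on_response("make-payment")     # 200 OK arrives; now it is safe
client.send("something-else")
```

The switch indication is honored, but only after the 200 OK for the
in-flight request has arrived, which is information the TLS layer
does not have.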

-Ekr

> Regarding question 2), we think:
> 2.1) The application runs on top of the TLS layer. If the TLS layer doesn’t
> notify the application during the switchover, the application need not be
> aware of the underlying switchover.
> 2.2) The TLS layer should just make the switchover as quick as possible, for
> example by reusing the previously negotiated shared key, as we proposed in
> the draft.
> 2.3) The application just needs to send the data to the same TLS instance.
>
> Do you have any concerns regarding the above explanation?
>
> Should we include these considerations in the document for further
> clarification, if they are reasonable?
>
>
>
> Happy holiday to you all!
>
>
>
> Best Regards
>
>
>
> Aijun Wang
>
> China Telecom
>
>
>
> *From:* [email protected] [mailto:[email protected]]
> *On Behalf Of *Eric Rescorla
> *Sent:* Tuesday, October 28, 2025 4:13 AM
> *To:* Mohit Sahni <[email protected]>
> *Cc:* Aijun Wang <[email protected]>; [email protected];
> [email protected]; Aijun Wang <
> [email protected]>
> *Subject:* [TLS] Re: FW: New Version Notification for
> draft-wang-tls-service-affinity-00.txt
>
>
>
>
>
>
>
> On Mon, Oct 27, 2025 at 1:02 PM Mohit Sahni <[email protected]>
> wrote:
>
> Hi Eric,
>
> One concern regarding using HTTP Alt Svc is that this limits the solution
> to HTTP based application, however TLS based solution helps with other
> application protocols too e.g. FTP or SMTP or any other protocol that uses
> STARTTLS construct.
>
>
>
> Yes, I'm aware of that. However, for the reasons indicated in my response
> upstream, I think it's a feature that it happen at the application layer in
> a protocol-specific fashion.
>
>
>
> -Ekr
>
>
>
>
>
> -Mohit
>
>
>
> On Sun, Oct 26, 2025 at 7:55 PM Aijun Wang <[email protected]>
> wrote:
>
> Hi, Eric:
>
> Thanks for your comments.
> Your understanding of the overall procedure for this proposal is correct.
>
> But, as indicated by Usama and replied by Mohit, the detail procedures in
> Figure 2 of this document should be based on TLS 1.3
> https://datatracker.ietf.org/doc/html/rfc8446#section-2
> If there is any misunderstanding due to the above ignorance, let's discuss
> further based on our future update based on TLS 1.3
>
> Anyway, I try to explain our considerations in more detail inline below.
>
> Best Regards
>
> Aijun Wang
> China Telecom
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> On Behalf Of [External account] Eric Rescorla
> Sent: Saturday, October 25, 2025 1:24 AM
> To: Aijun Wang <[email protected]>
> Cc: [email protected]; [email protected]
> Subject: Re: [TLS] FW: New Version Notification for
> draft-wang-tls-service-affinity-00.txt
>
> Document: draft-wang-tls-service-affinity-00.txt
>
> I'm a little confused about the requirements driving this design.
>
> At a high level, it seems to me that you have the following set of
> events:
>
> 1. The client connects to the server using TLS via an anycast address
> A1.
> 2. The server tells the client that it can/should be reached
> at a new non-anycast address A2.
> 3. The client reconnects to the server at A2.
> 【WAJ】Yes
>
> I would make several points.
>
> First, the mechanism you propose seems heavyweight for this purpose.
> In particular, I don't understand why you need any authentication at all
> for the new address indication (the MigrationToken) because the client is
> going to authenticate to the server via normal TLS mechanisms. Recall that
> TLS is designed for a Dolev-Yao style attacker and doesn't trust the
> network at all, including the binding of DNS name to IP address; even if
> the client were provided with a completely false IP address for the server
> this would not allow impersonation of the server.
> 【WAJ】The "Migration_Token" is mainly used to bind the new connection to
> the previous session.
>
> Second, I don't understand why you need the server to validate the
> MigrationToken. What properties are being bound to this token? It seems
> better to just bind whatever properties those are into the session ticket
> and treat this as a new connection.
> 【WAJ】The main properties in "Migration_Token" is the session_id, which
> can be used to lookup the previous negotiated PSK. Such design can
> eradicate the new PSK negotiation procedure.
>
> Third, I'm skeptical that the TLS layer is the right place to do this kind
> of migration, because you have race conditions where one side initiates a
> migration and the other side has outstanding data which will never be
> processed. These kinds of issues need to be resolved at the application
> layer, which is also a more convenient layer to initiate migration.
> 【WAJ】The initial purpose is to switch the address ASAP. There may be some
> race conditions(would you like to illustrate some?) and extra signal may be
> necessary later to refine the switchover.
>
> Overall, ISTM that a better design would be to just use something like
> HTTP Alt-Svc to steer the client to a different address, rather than doing
> this at the TLS layer. If you disagree, I think it would be helpful to
> explain the requirements that lead to this design.
> 【WAJ】Before proposing the switchover at TLS layer, we have analyzed the
> other possible solutions, for example, via application load balance, http
> redirection and DNS redirection(please review
> https://datatracker.ietf.org/doc/html/draft-wang-tls-service-affinity-00#name-introduction
> ).
> The reason that we propose the switchover at TLS layer, due to the
> optimization selection decision is made at the network itself(together with
> the availability of server resource), not at the application layer. The
> application is difficult to know which is the best server that can match
> the client's QoS requirements.(we call it the combination optimization
> process, which is the goal of the CATS WG).
> And, actually, QUIC has also such migration process:
> https://datatracker.ietf.org/doc/html/rfc9000#name-connection-migration
>
>
> -Ekr
>
> On Tue, Oct 21, 2025 at 2:10 AM Aijun Wang <[email protected]>
> wrote:
>
> > Hi, All:
> >
> > We have submitted one new draft regarding to the service affinity
> > function for TLS based application.
> > We are also applying the time slot for the presentation on the coming
> > IETF
> > 124 meeting.
> >
> > Wish to get your comments/suggestions on this topic before the
> > meeting, and we can also discuss further during the on-site meeting.
> >
> > Best Regards
> >
> > Aijun Wang
> > China Telecom
> >
> > -----Original Message-----
> > From: [email protected] [mailto:[email protected]]
> > Sent: Friday, October 17, 2025 4:34 PM
> > To: Aijun Wang <[email protected]>; Ketul Sheth <
> > [email protected]>; Mohit Sahni
> > <[email protected]>; Wei Wang <[email protected]>
> > Subject: New Version Notification for
> > draft-wang-tls-service-affinity-00.txt
> >
> > A new version of Internet-Draft draft-wang-tls-service-affinity-00.txt
> > has been successfully submitted by Wei Wang and posted to the IETF
> repository=
>
>
_______________________________________________
TLS mailing list -- [email protected]
To unsubscribe send an email to [email protected]