Over the past week or so I've sent a few messages here on 0RTT, but I've been muddling together several concerns. I'd like to split them apart and treat each one separately, with some concrete suggestions.

*Resumption and Forward Secrecy*

One of the reasons I'm excited about TLS1.3 is the prospect of greater deployment of forward secrecy. Compromise of a credential - like an RSA or ECDSA key - shouldn't crack open a trove of collected data.

But a pitfall here has been session tickets and session caches. If an attacker compromises a session ticket encryption key, they can decrypt any sessions which used that key. If an attacker compromises a session cache, they can decrypt any sessions contained in the cache.

In the real world, the former is much worse than the latter. A cache is bounded in size and has a real cost associated with storing entries; it's also more likely to be sharded, with "local" caches relying on routing affinity. On the other hand it is cheap and convenient to use a single session ticket encryption key. Widely deployed software (e.g. Apache, nginx ... ) has poor support for key rotation - for example no schedule for rotating keys, no tracking of how many times each key is used, no support for multiple keys overlapping in time. It is not surprising to see configurations where the same ticket encryption key has been in use for years, outlasting RSA key lifetimes.

Put bluntly: session tickets as deployed in the real world defeat the point of forward secrecy. A compromise of a single credential leads to a catastrophic loss in security for previously collected sessions. Worst of all, users have no way to audit how these keys are being managed. At least it's possible to observe how long an RSA/ECDSA key is in use.

An alternative way to go is to restructure session resumption as single-use session resumption IDs. Here's how it would work:

* TLS client asks server for 1 ... N session resumption IDs. Both the client and server iterate their PRF/KDFs N times ... deriving the same keys (without exchanging them). The server then nominates N small (e.g. 8 byte) IDs, one for each set of resumption state.
* Each ID is valid for use on a future connection. TLS clients are advised that they MUST use each ID just once, discarding and erasing it upon use during the resumption handshake.

Now this balloons the already high cost of session caches by a factor of N. And strangely, that's the point - it structures things so that the cache implementor is incentivized to evict and replace entries. With eviction in place, the regular security model of TLS is also restored. A compromise of the cache only puts future connections at risk, and future connections are generally always at risk due to server compromise.

I've built and operated a large CDN, and we use tickets (though we rotate the keys!), and it would be an increase in costs to implement a session cache, though I'd guess it's manageable. But that does suck. My argument here is squarely user-security-centric; TLS tickets are a dangerous sharp edge that implementors keep screwing up. I think they should be blunted, even at the expense of increasing costs for people like me.

One might reasonably say that this cost is too much, that forward secrecy isn't worth losing tickets (which are cheap) over. But then, if we are willing to sacrifice FS, why not go back to just encrypting using the server key? That's a lot simpler. And that's how 0RTT is defined, which brings me to ...

*0RTT and Safety*

0RTT has the potential to lower the latency for a lot of web users, which is an awesome goal. Though I'd like to point out that the benefits don't look that great to me compared to keeping connections alive for very long periods of time. Despite the name, 0RTT still requires 1RTT (the TCP SYN -> SYN|ACK exchange) before any data can be sent.
A long-lived connection doesn't have this problem: in response to web sockets, IoT, and other shifts, the technology to keep millions of connections open for long periods of time on the server side (and even move live connections between machines) is improving, along with battery-conserving improvements for long-lived connections on mobile. A connection that you keep open is "really" 0RTT; the socket is primed for immediate I/O.

I see at least three different challenges with 0RTT as defined.

The first is a general and high level one: we seem too willing to accept a "lower" level of security for 0RTT data (e.g. no FS, even if the rest of the session has it). Why? What is it we think is special about this data that it is "less" worth protecting? Surely there are very sensitive things in URLs, surely there are potential oracles and other things in there too? It just seems super strange to me.

The second challenge is that the replayability of the 0RTT data poses a cryptographic safety challenge. Take Lucky13, which is a brilliant attack and is stunningly effective against DTLS because it is so easy to replay over and over, barely changing any parameters, and let the server do the work. 0RTT looks very similar. It doesn't seem wise to let ciphertext manipulators take as many cracks of the whip as they'd like.

The third challenge is that the 0RTT plaintext data itself may not be safe to replay; that is, it might trigger some kind of non-idempotent action. Idempotence is really, really hard; it isn't safe to simply plug a replayable section into existing protocols. There's also a huge difference between being tolerant to a small number of replays and a large unbounded number. For example, a large unbounded number may be used to mount DoS attacks against throttles and quotas.

*Tying things together*

Short of some kind of transactional locking protocol during TLS handshakes, I don't think there is a scheme that can perfectly prevent replay.
Bill Cox's analysis is a really good one here. But I'd like to observe that the sort of single-use-session-id cache outlined above has a nice property: it makes for a sort of strike register. Since the server-side implementor is incentivized to evict entries, or at least mark them as used, so that the slot is available for re-use, that can double up as a "we've seen this already" signal. This reduces the replay window to the time period for that signal to propagate (e.g. for an eviction to happen from the cache).

So 0RTT data could be encrypted under the resumption session id. That creates the challenge that the session might not be there any more, so the server may not be able to decrypt the 0RTT data. I actually think this is a plus, and it lines up with a separate important change I think is necessary - the 0RTT data shouldn't be application data. It should be a separate, optional stream. I find it helpful to think of it as a hint, so it could be called "replayable_hint". Instead of breaking apart an existing protocol and putting some of it in the early data and some in the application data transparently (a disaster in waiting), the client and server would have to formally agree on the kind of data that could be in a "replayable_hint". This goes a long way to mitigating many protocol level idempotency concerns, and has no impact on the kind of pre-fetching people want to do for HTTP and other protocols. At a bare minimum, I think we should make this change.

Lastly, and this is a little crazy but I haven't let that stop me before ... to guard against the smaller replay window and idempotency problems at the application level, clients should occasionally send duplicate and unrelated hints, just opportunistically. This keeps the server side application "on notice" that that kind of craziness can occur, and it is better to have it happen a little all of the time in a controlled way than rarely by attackers.
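
To make the single-use ID cache and its strike-register behaviour a bit more concrete, here is a rough Python sketch. It is only an illustration under assumptions, not a proposed wire format or a real TLS implementation: the names derive_resumption_keys and SingleUseCache, the "resumption" label, and the choice of HMAC-SHA256 with a counter as the KDF are all hypothetical stand-ins for whatever PRF/KDF the handshake would actually use.

```python
import hashlib
import hmac
import os
from typing import Dict, List, Optional


def derive_resumption_keys(master_secret: bytes, n: int) -> List[bytes]:
    """Derive n resumption keys from a shared secret.

    Client and server both run this locally, so the keys themselves
    never cross the wire. HMAC-SHA256 over a label plus a counter is
    an assumption made for this sketch.
    """
    return [
        hmac.new(master_secret, b"resumption" + i.to_bytes(4, "big"),
                 hashlib.sha256).digest()
        for i in range(n)
    ]


class SingleUseCache:
    """Hypothetical server-side cache of single-use resumption IDs."""

    def __init__(self) -> None:
        self._entries: Dict[bytes, bytes] = {}

    def issue(self, master_secret: bytes, n: int) -> List[bytes]:
        """Store n derived keys, each under a fresh random 8-byte ID."""
        ids = []
        for key in derive_resumption_keys(master_secret, n):
            session_id = os.urandom(8)
            self._entries[session_id] = key
            ids.append(session_id)
        return ids

    def redeem(self, session_id: bytes) -> Optional[bytes]:
        """Return the key for an ID and evict it in the same step.

        Eviction doubles as the "we've seen this already" signal:
        a replayed or unknown ID finds nothing and yields None.
        """
        return self._entries.pop(session_id, None)
```

In this sketch a server would call redeem() when a client presents an ID. If it returns None, whether because the ID was already used or because the entry was evicted, the server cannot decrypt the hint and simply ignores it, falling back to a full handshake. A single in-process dict obviously sidesteps the propagation delay between cache nodes, which is where the remaining replay window would come from.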

*Summary*

A common theme in the above is that it makes things more expensive for server-side implementors, and that sucks - but I don't see another way to avoid some of the pitfalls here, and I'm unhappy with the state of tickets today. If I'm on my own on that, I'd be interested in what kinds of data people might find convincing. My own impressions come from being an Apache httpd developer, assisting people with configurations, and running workshops at conferences. It's not scientific, but the prevalence of non-rotation is so severe in my sample set that I'm convinced it's the norm.

--
Colm
_______________________________________________ TLS mailing list TLS@ietf.org https://www.ietf.org/mailman/listinfo/tls