Re: [TLS] Maximum Fragment Length negotiation
Hi Thomas,

We encountered the same issue and suggested something similar in [1] -- although not at the same level of detail as you do below. I like your proposal, but I'm not convinced that overloading the semantics of an already existing extension when it is used with a specific protocol version is necessarily the best strategy. Besides, I'd like to be able to deploy a similar mechanism in 1.2. So, why not simply allocate a new code point for an extension with the semantics you describe and make it available across protocol versions?

Cheers, t

[1] https://tools.ietf.org/html/draft-fossati-tls-iot-optimizations-00#section-6

On 24/11/2016 19:50, "TLS on behalf of Thomas Pornin" wrote:

> Hello,
>
> I know that I am a bit late to the party, but I have a suggestion for the upcoming TLS 1.3.
>
> Context: I am interested in TLS support in constrained architectures, specifically those which have very little RAM. I recently published a first version of an implementation of TLS 1.0 to 1.2 that primarily targets that kind of system ( https://www.bearssl.org/ ); a fully functional TLS server can then run in as little as 25 kB of RAM (and even less of ROM, for the code itself).
>
> Out of these 25 kB, 16 kB are used for the buffer for incoming records, because encrypted records cannot be processed until fully received (data could be obtained from a partial record, but we must wait for the MAC before actually acting on the data) and TLS specifies that records can have up to 16384 bytes of plaintext. Thus, about 2/3 of the RAM usage is directly related to that maximum fragment length.
>
> There is a defined extension (in RFC 6066) that allows a client to negotiate a smaller maximum fragment length. That extension is simple to implement, but it has two problems that prevent it from being really usable:
>
>  1. It is optional, so any implementation is free not to implement it, and in practice many do not (e.g., last time I checked, OpenSSL did not support it).
>
>  2. It is one-sided: the client may ask for a smaller fragment, but the server has no choice other than to accept the value sent by the client. In situations where the constrained system is the server, the extension is not useful (e.g., the embedded system runs a minimal HTTPS server for a Web-based configuration interface; the client is a Web browser and won't ask for a smaller maximum fragment length).
>
> I suggest fixing these issues in TLS 1.3. My proposal is the following:
>
>  - Make Max Fragment Length extension support mandatory (right now, draft 18 makes it "recommended" only).
>
>  - Extend the extension semantics **when used in TLS 1.3** in the following ways:
>
>    * When an implementation supports a given maximum fragment length, it MUST also support all smaller lengths (in the list of lengths indicated in the extension: 512, 1024, 2048, 4096 and 16384).
>
>    * When the server receives the extension for maximum length N, it may respond with the extension with any length N' <= N (in the list above).
>
>    * If the client does not send the extension, then this is equivalent to sending it with a maximum length of 16384 bytes (so the server may still send the extension, even if the client did not).
>
>  The semantics of the extension in TLS 1.2 and earlier are unchanged.
>
> With these changes, RAM-constrained clients and servers can negotiate a maximum length for record plaintext that they both support, and such an implementation can use a small record buffer with the guarantee that all TLS-1.3-aware peers will refrain from sending larger records. With, for instance, a 2048-byte buffer, per-record overhead is still small (about 1%), and overall RAM usage is halved, which is far from negligible.
>
> RAM-constrained full TLS 1.3 is likely to be challenging (I envision issues with, for instance, cookies, since they can be up to 64 kB in length), but a guaranteed flexible negotiation for maximum fragment length would be a step in the right direction.
>
> Any comments / suggestions?
>
> Thanks,
>
>   --Thomas Pornin

_______________________________________________
TLS mailing list
TLS@ietf.org
https://www.ietf.org/mailman/listinfo/tls
[TLS] Maximum Fragment Length negotiation
Hello,

I know that I am a bit late to the party, but I have a suggestion for the upcoming TLS 1.3.

Context: I am interested in TLS support in constrained architectures, specifically those which have very little RAM. I recently published a first version of an implementation of TLS 1.0 to 1.2 that primarily targets that kind of system ( https://www.bearssl.org/ ); a fully functional TLS server can then run in as little as 25 kB of RAM (and even less of ROM, for the code itself).

Out of these 25 kB, 16 kB are used for the buffer for incoming records, because encrypted records cannot be processed until fully received (data could be obtained from a partial record, but we must wait for the MAC before actually acting on the data) and TLS specifies that records can have up to 16384 bytes of plaintext. Thus, about 2/3 of the RAM usage is directly related to that maximum fragment length.

There is a defined extension (in RFC 6066) that allows a client to negotiate a smaller maximum fragment length. That extension is simple to implement, but it has two problems that prevent it from being really usable:

 1. It is optional, so any implementation is free not to implement it, and in practice many do not (e.g., last time I checked, OpenSSL did not support it).

 2. It is one-sided: the client may ask for a smaller fragment, but the server has no choice other than to accept the value sent by the client. In situations where the constrained system is the server, the extension is not useful (e.g., the embedded system runs a minimal HTTPS server for a Web-based configuration interface; the client is a Web browser and won't ask for a smaller maximum fragment length).

I suggest fixing these issues in TLS 1.3. My proposal is the following:

 - Make Max Fragment Length extension support mandatory (right now, draft 18 makes it "recommended" only).

 - Extend the extension semantics **when used in TLS 1.3** in the following ways:

   * When an implementation supports a given maximum fragment length, it MUST also support all smaller lengths (in the list of lengths indicated in the extension: 512, 1024, 2048, 4096 and 16384).

   * When the server receives the extension for maximum length N, it may respond with the extension with any length N' <= N (in the list above).

   * If the client does not send the extension, then this is equivalent to sending it with a maximum length of 16384 bytes (so the server may still send the extension, even if the client did not).

 The semantics of the extension in TLS 1.2 and earlier are unchanged.

With these changes, RAM-constrained clients and servers can negotiate a maximum length for record plaintext that they both support, and such an implementation can use a small record buffer with the guarantee that all TLS-1.3-aware peers will refrain from sending larger records. With, for instance, a 2048-byte buffer, per-record overhead is still small (about 1%), and overall RAM usage is halved, which is far from negligible.

RAM-constrained full TLS 1.3 is likely to be challenging (I envision issues with, for instance, cookies, since they can be up to 64 kB in length), but a guaranteed flexible negotiation for maximum fragment length would be a step in the right direction.

Any comments / suggestions?

Thanks,

  --Thomas Pornin
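The proposed negotiation rule above can be sketched in a few lines (an editor's illustration, not part of the proposal text; note that RFC 6066 only assigns extension code points to the 512..4096 lengths, with 16384 being the protocol default when the extension is absent, which is how the proposal treats an omitted extension):

```python
from typing import Optional

# Lengths the proposal allows; 16384 is the implicit value when the
# client sends no extension at all.
ALLOWED = [512, 1024, 2048, 4096, 16384]

def negotiate(client_max: Optional[int], server_max: int) -> int:
    """Return the record-plaintext limit both sides must honor.

    client_max is None when the client omitted the extension, which
    the proposal treats as equivalent to asking for 16384.
    """
    if client_max is None:
        client_max = 16384
    if client_max not in ALLOWED or server_max not in ALLOWED:
        raise ValueError("length must be one of %s" % ALLOWED)
    # The server may answer with any N' <= N; a RAM-constrained
    # server simply picks the smaller of the two limits.
    return min(client_max, server_max)
```

For example, a browser that sends no extension talking to an embedded server with a 2048-byte buffer ends up at `negotiate(None, 2048) == 2048`, which is exactly the case the proposal fixes relative to RFC 6066.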
Re: [TLS] record layer limits of TLS1.3
A) OpenSSL does not measure the actual TLS performance (including nonce construction, additional data, etc.), but rather just the speed of the main encryption loop.

B) Still, I agree with Yoav. From my experience, the difference in throughput between 16K records and 64K records is negligible, as is the network overhead. On the other hand, using larger records increases the risk of head-of-line (HoL) blocking.

Cheers, Vlad

> On Nov 24, 2016, at 6:16 AM, Yoav Nir wrote:
>
>> On 24 Nov 2016, at 15:47, Hubert Kario wrote:
>>
>> On Wednesday, 23 November 2016 10:50:37 CET Yoav Nir wrote:
>>> On 23 Nov 2016, at 10:30, Nikos Mavrogiannopoulos wrote:
>>>> On Wed, 2016-11-23 at 10:05 +0200, Yoav Nir wrote:
>>>>> Hi, Nikos
>>>>>
>>>>> On 23 Nov 2016, at 9:06, Nikos Mavrogiannopoulos
>>>> That to my understanding is a way to reduce latency in contrast to cpu costs. An increase to packet size targets bandwidth rather than latency (speed).
>>>
>>> Sure, but running ‘openssl speed’ on either aes-128-cbc or hmac or sha256 (there’s no test for AES-GCM or ChaCha-poly) you get smallish differences in terms of kilobytes per second between 1024-byte buffers and 8192-byte buffers. And the difference is going to be even smaller going to 16KB buffers, let alone 64KB buffers.
>>
>> This is not a valid comparison: openssl speed doesn't use the hardware-accelerated codepath.
>>
>> You need to use `openssl speed -evp aes-128-gcm` to see it (and yes, aes-gcm and chacha20-poly1305 are supported then).
>>
>> What I see is nearly a 1 GB/s throughput increase between 1024- and 8192-byte blocks for AES-GCM:
>>
>> type               16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
>> aes-128-gcm      614979.91k  1388369.31k  2702645.76k  3997320.76k  4932512.79k
>>
>> While indeed, for chacha20 there's little to no difference at the high end:
>>
>> type                16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
>> chacha20-poly1305 242518.50k   514356.72k  1035220.57k  1868933.46k  1993609.50k  1997438.98k
>>
>> (aes-128-gcm performance from openssl-1.0.2j-1.fc24.x86_64, chacha20-poly1305 from openssl master, both on an Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz)
>
> Cool. So you got a 23% improvement, and I got an 18% improvement for AES-GCM. I still claim (but cannot prove without modifying openssl code; maybe I’ll do that over the weekend) that the jump from 16KB to 64KB will be far, far less pronounced.
>
> Yoav
Re: [TLS] Additional warnings on 0-RTT data
On Wed, Nov 23, 2016 at 10:44 PM, Christian Huitema wrote:

>> On Wednesday, November 23, 2016 7:20 PM, Colm MacCárthaigh wrote:
>>
>> Prior to TLS 1.3, replay is not possible, so the risks are new, but the end-to-end designers may not realize to update their threat model and just what is required. I'd like to spell that out more than what's there at present.
>
> Uh? Replay was always possible, at the application level. Someone might for example click twice on the same URL, opening two tabs, closing one at random. And that's without counting on deliberate mischief.

TLS is used by much more than browsers, and carries more than HTTP. There are many web service APIs that rely on TLS for anti-replay and do not simply retry requests. Transaction and commit protocols, for example, will usually have unique IDs for each attempt. But even if this were not the case, there are other material differences that are still relevant even to browsers. Firstly, an attacker can replay 0-RTT data at a vastly higher rate than they could ever cause a browser to do anything. Second, they can replay 0-RTT data to arbitrary nodes beyond what the browser may select. Together these open new attacks, like the third example I provided.

-- Colm
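The unique per-attempt transaction IDs mentioned above amount to application-level replay suppression. A minimal sketch of that idea (a hypothetical in-process helper for illustration only; a real deployment would need a shared, bounded store across server nodes, which is exactly what makes 0-RTT replay to arbitrary nodes hard to defend against):

```python
import time

class ReplayFilter:
    """Reject requests whose unique ID was already seen within a time window.

    Toy illustration of the anti-replay that unique transaction IDs give
    an application protocol on top of TLS.
    """

    def __init__(self, window_seconds: float = 10.0):
        self.window = window_seconds
        self.seen = {}  # request_id -> first-seen timestamp

    def accept(self, request_id: str, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        # Expire entries outside the window to bound memory use.
        self.seen = {r: t for r, t in self.seen.items()
                     if now - t < self.window}
        if request_id in self.seen:
            return False  # replay: same ID already seen
        self.seen[request_id] = now
        return True
```

The `now` parameter exists only to make the sketch testable; in production code the filter would read the clock itself.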
Re: [TLS] record layer limits of TLS1.3
> On 24 Nov 2016, at 15:47, Hubert Kario wrote:
>
> On Wednesday, 23 November 2016 10:50:37 CET Yoav Nir wrote:
>> On 23 Nov 2016, at 10:30, Nikos Mavrogiannopoulos wrote:
>>> On Wed, 2016-11-23 at 10:05 +0200, Yoav Nir wrote:
>>>> Hi, Nikos
>>>>
>>>> On 23 Nov 2016, at 9:06, Nikos Mavrogiannopoulos
>>> That to my understanding is a way to reduce latency in contrast to cpu costs. An increase to packet size targets bandwidth rather than latency (speed).
>>
>> Sure, but running ‘openssl speed’ on either aes-128-cbc or hmac or sha256 (there’s no test for AES-GCM or ChaCha-poly) you get smallish differences in terms of kilobytes per second between 1024-byte buffers and 8192-byte buffers. And the difference is going to be even smaller going to 16KB buffers, let alone 64KB buffers.
>
> This is not a valid comparison: openssl speed doesn't use the hardware-accelerated codepath.
>
> You need to use `openssl speed -evp aes-128-gcm` to see it (and yes, aes-gcm and chacha20-poly1305 are supported then).
>
> What I see is nearly a 1 GB/s throughput increase between 1024- and 8192-byte blocks for AES-GCM:
>
> type               16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
> aes-128-gcm      614979.91k  1388369.31k  2702645.76k  3997320.76k  4932512.79k
>
> While indeed, for chacha20 there's little to no difference at the high end:
>
> type                16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
> chacha20-poly1305 242518.50k   514356.72k  1035220.57k  1868933.46k  1993609.50k  1997438.98k
>
> (aes-128-gcm performance from openssl-1.0.2j-1.fc24.x86_64, chacha20-poly1305 from openssl master, both on an Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz)

Cool. So you got a 23% improvement, and I got an 18% improvement for AES-GCM. I still claim (but cannot prove without modifying openssl code; maybe I’ll do that over the weekend) that the jump from 16KB to 64KB will be far, far less pronounced.

Yoav
Re: [TLS] record layer limits of TLS1.3
On Wednesday, 23 November 2016 10:50:37 CET Yoav Nir wrote:
> On 23 Nov 2016, at 10:30, Nikos Mavrogiannopoulos wrote:
>> On Wed, 2016-11-23 at 10:05 +0200, Yoav Nir wrote:
>>> Hi, Nikos
>>>
>>> On 23 Nov 2016, at 9:06, Nikos Mavrogiannopoulos
>> That to my understanding is a way to reduce latency in contrast to cpu costs. An increase to packet size targets bandwidth rather than latency (speed).
>
> Sure, but running ‘openssl speed’ on either aes-128-cbc or hmac or sha256 (there’s no test for AES-GCM or ChaCha-poly) you get smallish differences in terms of kilobytes per second between 1024-byte buffers and 8192-byte buffers. And the difference is going to be even smaller going to 16KB buffers, let alone 64KB buffers.

This is not a valid comparison: openssl speed doesn't use the hardware-accelerated codepath.

You need to use `openssl speed -evp aes-128-gcm` to see it (and yes, aes-gcm and chacha20-poly1305 are supported then).

What I see is nearly a 1 GB/s throughput increase between 1024- and 8192-byte blocks for AES-GCM:

type               16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-gcm      614979.91k  1388369.31k  2702645.76k  3997320.76k  4932512.79k

While indeed, for chacha20 there's little to no difference at the high end:

type                16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
chacha20-poly1305 242518.50k   514356.72k  1035220.57k  1868933.46k  1993609.50k  1997438.98k

(aes-128-gcm performance from openssl-1.0.2j-1.fc24.x86_64, chacha20-poly1305 from openssl master, both on an Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz)

-- 
Regards,
Hubert Kario
Senior Quality Engineer, QE BaseOS Security team
Web: www.cz.redhat.com
Red Hat Czech s.r.o., Purkyňova 99/71, 612 45, Brno, Czech Republic
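The framing-overhead side of this record-size tradeoff is easy to quantify (a back-of-the-envelope sketch, separate from the CPU-throughput numbers above; the 22-byte figure assumes a TLS 1.3 record with a 5-byte header, a 1-byte inner content type, and a 16-byte AEAD tag):

```python
# Per-record framing overhead for a TLS 1.3 record protected with an
# AES-GCM cipher suite: 5-byte header + 1-byte inner content type
# + 16-byte AEAD tag.
OVERHEAD = 5 + 1 + 16  # 22 bytes per record

def overhead_pct(plaintext_len: int) -> float:
    """Wire overhead as a percentage of the plaintext carried."""
    return 100.0 * OVERHEAD / plaintext_len

for size in (512, 2048, 16384, 65536):
    print(f"{size:>6}-byte records: {overhead_pct(size):.2f}% overhead")
```

This lines up with both sides of the thread: 2048-byte records cost about 1% (Pornin's figure), while going from 16 KB to 64 KB records only shaves the framing overhead from roughly 0.13% to 0.03%, so any throughput gain must come from the cipher loop, not the wire.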
[TLS] [Errata Verified] RFC5288 (4694)
The following errata report has been verified for RFC5288, "AES Galois Counter Mode (GCM) Cipher Suites for TLS".

You may review the report below and at:
http://www.rfc-editor.org/errata_search.php?rfc=5288&eid=4694

Status: Verified
Type: Technical
Reported by: Aaron Zauner
Date Reported: 2016-05-14
Verified by: Stephen Farrell (IESG)

Section: 6.1

Original Text
-------------
AES-GCM security requires that the counter is never reused. The IV construction in Section 3 is designed to prevent counter reuse. Implementers should also understand the practical considerations of IV handling outlined in Section 9 of [GCM].

Corrected Text
--------------
Security of AES-GCM requires that the "nonce" (number used once) is never reused. The IV construction in Section 3 does not prevent implementers from reusing the nonce by mistake. It is paramount that the implementer be aware of the security implications when a nonce is reused even once. Nonce reuse in AES-GCM allows for the recovery of the authentication key, resulting in complete failure of the mode's authenticity. Hence, TLS sessions can be effectively attacked through forgery by an adversary. This enables an attacker to inject data into the TLS, allowing for XSS and other attack vectors.

Notes
-----
Obviously the original wording is so ambiguous that implementers got it wrong in the real world. Related to:
https://www.blackhat.com/us-16/briefings.html#nonce-disrespecting-adversaries-practical-forgery-attacks-on-gcm-in-tls

It may be worth adding a reference to [JOUX] http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/...38.../GCM/Joux_comments.pdf and maybe the paper we're intending to release on the actual HTTPS forgery/injection attack. I'd actually like to change the nonce construction to that of the ChaCha20/Poly1305 document, but I figure this will cause massive breakage for already deployed implementations. TLS 1.3 fixes this issue per design.

RFC5288 (draft-ietf-tls-rsa-aes-gcm-03)
---------------------------------------
Title: AES Galois Counter Mode (GCM) Cipher Suites for TLS
Publication Date: August 2008
Author(s): J. Salowey, A. Choudhury, D. McGrew
Category: PROPOSED STANDARD
Source: Transport Layer Security
Area: Security
Stream: IETF
Verifying Party: IESG
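The confidentiality half of the nonce-reuse failure described in the corrected text can be illustrated without real AES-GCM (a toy hash-based keystream stands in for the AES-CTR keystream inside GCM; this shows only the two-time-pad leak, not the recovery of the GHASH authentication key that enables forgery):

```python
import hashlib
from itertools import count

def toy_keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    """Hash-based stand-in for a CTR-mode keystream.

    Toy construction for illustration only; not a real cipher.
    """
    out = b""
    for block in count():
        out += hashlib.sha256(key + nonce + block.to_bytes(4, "big")).digest()
        if len(out) >= n:
            return out[:n]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

key, nonce = b"k" * 16, b"n" * 12          # same (key, nonce) used twice
p1 = b"GET /account HTTP/1.1"
p2 = b"GET /transfer HTTP/1."
c1 = xor(p1, toy_keystream(key, nonce, len(p1)))
c2 = xor(p2, toy_keystream(key, nonce, len(p2)))

# With a repeated nonce the keystreams cancel: c1 XOR c2 == p1 XOR p2,
# so known plaintext in one record exposes the other outright.
assert xor(c1, c2) == xor(p1, p2)
assert xor(xor(c1, c2), p2) == p1
```

This is exactly why the errata insists the IV construction "does not prevent implementers from reusing the nonce by mistake": any stream-cipher-style mode degrades to a two-time pad the moment a (key, nonce) pair repeats.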