Re: [TLS] Maximum Fragment Length negotiation

2016-11-24 Thread Fossati, Thomas (Nokia - GB)
Hi Thomas,

We encountered the same issue and suggested something similar in [1] --
although not at the same level of detail as you do below.

I like your proposal, but I'm not convinced that overloading the semantics
of an already existing extension when used in combination with a specific
version of the protocol is necessarily the best strategy.  Besides, I'd
like to be able to deploy a similar mechanism in 1.2.

So, why not simply allocate a new code point for an extension with the
semantics you describe and make it available across different protocol
versions?

Cheers, t

[1] https://tools.ietf.org/html/draft-fossati-tls-iot-optimizations-00#section-6

On 24/11/2016 19:50, "TLS on behalf of Thomas Pornin" wrote:
>Hello,
>
>I know that I am a bit late to the party, but I have a suggestion for
>the upcoming TLS 1.3.
>
>Context: I am interested in TLS support in constrained architectures,
>specifically those which have very little RAM. I recently published a
>first version of an implementation of TLS 1.0 to 1.2 that primarily
>targets that kind of system ( https://www.bearssl.org/ ); a fully
>functional TLS server can then run in as little as 25 kB of RAM (and
>even less of ROM, for the code itself).
>
>Out of these 25 kB, 16 kB are used for the buffer for incoming records,
>because encrypted records cannot be processed until fully received (data
>could be obtained from a partial record, but we must wait for the MAC
>before actually acting on the data) and TLS specifies that records can
>have up to 16384 bytes of plaintext. Thus, about 2/3 of the RAM usage is
>directly related to that maximum fragment length.
>
>There is a defined extension (in RFC 6066) that allows a client to
>negotiate a smaller maximum fragment length. That extension is simple
>to implement, but it has two problems that prevent it from being
>really usable:
>
> 1. It is optional, so any implementation is free not to implement it,
>    and in practice many do not (e.g. last time I checked, OpenSSL did
>    not support it).
>
> 2. It is one-sided: the client may ask for a smaller fragment, but
>    the server has no choice but to accept the value sent by the client.
>    In situations where the constrained system is the server, the
>    extension is not useful (e.g. the embedded system runs a minimal
>    HTTPS server, for a Web-based configuration interface; the client is
>    a Web browser and won't ask for a smaller maximum fragment length).
>
>
>I suggest fixing these issues in TLS 1.3. My proposal is the following:
>
> - Make Max Fragment Length extension support mandatory (right now,
>   draft 18 makes it "recommended" only).
>
> - Extend the extension semantics **when used in TLS 1.3** in the
>   following ways:
>
>   * When an implementation supports a given maximum fragment length, it
>     MUST also support all smaller lengths (in the list of lengths
>     indicated in the extension: 512, 1024, 2048, 4096 and 16384).
>
>   * When the server receives the extension for maximum length N, it
>     may respond with the extension with any length N' <= N (in the
>     list above).
>
>   * If the client does not send the extension, then this is equivalent
>     to sending it with a maximum length of 16384 bytes (so the server
>     may still send the extension, even if the client did not).
>
>   The semantics of the extension in TLS 1.2 and earlier are unchanged.
>
>With these changes, RAM-constrained clients and servers can negotiate a
>maximum length for record plaintext that they both support, and such an
>implementation can use a small record buffer with the guarantee that all
>TLS-1.3-aware peers will refrain from sending larger records. With, for
>instance, a 2048-byte buffer, per-record overhead is still small (about
>1%), and overall RAM usage is halved, which is far from negligible.
>
>
>RAM-constrained full TLS 1.3 is likely to be challenging (I envision
>issues with, for instance, cookies, since they can be up to 64 kB in
>length), but a guaranteed flexible negotiation for maximum fragment
>length would be a step in the right direction.
>
>Any comments / suggestions?
>
>Thanks,
>
>
>   --Thomas Pornin
>


[TLS] Maximum Fragment Length negotiation

2016-11-24 Thread Thomas Pornin
Hello,

I know that I am a bit late to the party, but I have a suggestion for
the upcoming TLS 1.3.

Context: I am interested in TLS support in constrained architectures,
specifically those which have very little RAM. I recently published a
first version of an implementation of TLS 1.0 to 1.2 that primarily
targets that kind of system ( https://www.bearssl.org/ ); a fully
functional TLS server can then run in as little as 25 kB of RAM (and
even less of ROM, for the code itself).

Out of these 25 kB, 16 kB are used for the buffer for incoming records,
because encrypted records cannot be processed until fully received (data
could be obtained from a partial record, but we must wait for the MAC
before actually acting on the data) and TLS specifies that records can
have up to 16384 bytes of plaintext. Thus, about 2/3 of the RAM usage is
directly related to that maximum fragment length.
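
As a back-of-the-envelope sketch of why the full 16 kB has to be buffered
(hypothetical names and sizes, not BearSSL's actual layout):

    #include <stddef.h>
    #include <stdint.h>

    /* A record's MAC/tag can only be verified once the whole record has
       arrived, so the receive buffer must hold a complete record. */
    #define MAX_PLAINTEXT  16384   /* TLS maximum fragment length */
    #define REC_HEADER         5   /* type + version + length */
    #define MAX_EXPANSION   2048   /* worst-case ciphertext expansion (RFC 5246) */

    typedef struct {
        uint8_t buf[REC_HEADER + MAX_PLAINTEXT + MAX_EXPANSION];
        size_t  len;               /* bytes received so far */
    } record_buffer;               /* ~18 kB, dominated by MAX_PLAINTEXT */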

There is a defined extension (in RFC 6066) that allows a client to
negotiate a smaller maximum fragment length. That extension is simple
to implement, but it has two problems that prevent it from being
really usable:

 1. It is optional, so any implementation is free not to implement it,
    and in practice many do not (e.g. last time I checked, OpenSSL did
    not support it).

 2. It is one-sided: the client may ask for a smaller fragment, but
    the server has no choice but to accept the value sent by the client.
    In situations where the constrained system is the server, the
    extension is not useful (e.g. the embedded system runs a minimal
    HTTPS server, for a Web-based configuration interface; the client is
    a Web browser and won't ask for a smaller maximum fragment length).
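
For reference, the RFC 6066 extension is a single byte on the wire
(codes 1..4 for 512..4096). A minimal sketch of emitting it in a
ClientHello (helper name is hypothetical):

    #include <stddef.h>
    #include <stdint.h>

    /* RFC 6066 max_fragment_length: extension type 1, one-byte body.
       Codes: 1 = 512, 2 = 1024, 3 = 2048, 4 = 4096. */
    static size_t write_max_frag_ext(uint8_t *out, uint8_t code)
    {
        out[0] = 0x00; out[1] = 0x01;  /* extension_type = max_fragment_length */
        out[2] = 0x00; out[3] = 0x01;  /* extension_data length = 1 */
        out[4] = code;                 /* e.g. 1 to request 512-byte fragments */
        return 5;
    }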


I suggest fixing these issues in TLS 1.3. My proposal is the following:

 - Make Max Fragment Length extension support mandatory (right now,
   draft 18 makes it "recommended" only).

 - Extend the extension semantics **when used in TLS 1.3** in the following
   ways:

   * When an implementation supports a given maximum fragment length, it
     MUST also support all smaller lengths (in the list of lengths
     indicated in the extension: 512, 1024, 2048, 4096 and 16384).

   * When the server receives the extension for maximum length N, it
     may respond with the extension with any length N' <= N (in the
     list above).

   * If the client does not send the extension, then this is equivalent
     to sending it with a maximum length of 16384 bytes (so the server
     may still send the extension, even if the client did not).

   The semantics of the extension in TLS 1.2 and earlier are unchanged.
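
Under the proposed semantics (a sketch of the proposal above, not current
draft-18 behaviour), server-side selection collapses to taking the smaller
of the client's offer and the server's own cap:

    /* Proposed TLS 1.3 semantics: an absent client extension counts as
       an offer of 16384; the server answers with any supported value
       N' <= N.  The natural choice is the minimum. */
    static unsigned negotiate_max_fragment(unsigned client_max,  /* 16384 if absent */
                                           unsigned server_max)  /* local RAM budget */
    {
        return client_max < server_max ? client_max : server_max;
    }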

With these changes, RAM-constrained clients and servers can negotiate a
maximum length for record plaintext that they both support, and such an
implementation can use a small record buffer with the guarantee that all
TLS-1.3-aware peers will refrain from sending larger records. With, for
instance, a 2048-byte buffer, per-record overhead is still small (about
1%), and overall RAM usage is halved, which is far from negligible.
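
The "about 1%" figure is easy to check, assuming TLS 1.3 AEAD framing
(5-byte record header, 1-byte inner content type, 16-byte tag; my
arithmetic, not the draft's):

    #include <stdio.h>

    int main(void)
    {
        /* Per-record overhead with a 2048-byte plaintext limit. */
        printf("%.2f%%\n", 100.0 * (5 + 1 + 16) / 2048.0);  /* ~1.07% */
        return 0;
    }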


RAM-constrained full TLS 1.3 is likely to be challenging (I envision
issues with, for instance, cookies, since they can be up to 64 kB in
length), but a guaranteed flexible negotiation for maximum fragment
length would be a step in the right direction.

Any comments / suggestions?

Thanks,


--Thomas Pornin



Re: [TLS] record layer limits of TLS1.3

2016-11-24 Thread Vlad Krasnov
A) OpenSSL's `openssl speed` does not measure actual TLS performance
(including nonce construction, additional data, etc.), but rather just the
speed of the main encryption loop.

B) Still, I agree with Yoav. From my experience, the difference in throughput
between 16K records and 64K records is negligible, as is the network overhead.
On the other hand, using larger records increases the risk of head-of-line
(HoL) blocking.
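
To illustrate point A: a benchmark that wants to approximate per-record TLS
cost has to include the per-record nonce setup and the additional data,
roughly as in this sketch (standard OpenSSL EVP calls; function name
hypothetical, all error handling omitted):

    #include <openssl/evp.h>

    int seal_record(EVP_CIPHER_CTX *ctx, const unsigned char *key,
                    const unsigned char *nonce,            /* 12 bytes */
                    const unsigned char *aad, int aadlen,  /* seq_num || header */
                    const unsigned char *pt, int ptlen,
                    unsigned char *ct, unsigned char *tag)
    {
        int len;
        EVP_EncryptInit_ex(ctx, EVP_aes_128_gcm(), NULL, NULL, NULL);
        EVP_CIPHER_CTX_ctrl(ctx, EVP_CTRL_GCM_SET_IVLEN, 12, NULL);
        EVP_EncryptInit_ex(ctx, NULL, NULL, key, nonce);   /* per-record IV */
        EVP_EncryptUpdate(ctx, NULL, &len, aad, aadlen);   /* feed AAD */
        EVP_EncryptUpdate(ctx, ct, &len, pt, ptlen);
        EVP_EncryptFinal_ex(ctx, ct + len, &len);
        EVP_CIPHER_CTX_ctrl(ctx, EVP_CTRL_GCM_GET_TAG, 16, tag);
        return 0;
    }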

Cheers,
Vlad

> On Nov 24, 2016, at 6:16 AM, Yoav Nir  wrote:
> 
> 
>> On 24 Nov 2016, at 15:47, Hubert Kario  wrote:
>> 
>> On Wednesday, 23 November 2016 10:50:37 CET Yoav Nir wrote:
>>> On 23 Nov 2016, at 10:30, Nikos Mavrogiannopoulos  wrote:
>>>> On Wed, 2016-11-23 at 10:05 +0200, Yoav Nir wrote:
>>>>> Hi, Nikos
>>>>> 
>>>>> On 23 Nov 2016, at 9:06, Nikos Mavrogiannopoulos 
>>>> That to my understanding is a way to reduce
>>>> latency in contrast to cpu costs. An increase to packet size targets
>>>> bandwidth rather than latency (speed).
>>> 
>>> Sure, but running ‘openssl speed’ on either aes-128-cbc or hmac or sha256
>>> (there’s no test for AES-GCM or ChaCha-poly) you get smallish differences
>>> in terms of kilobytes per second between 1024-byte buffers and 8192-byte
>>> buffers. And the difference is going to be even smaller going to 16KB buffers,
>>> let alone 64KB buffers.
>> 
>> this is not a valid comparison: openssl speed doesn't use the
>> hardware-accelerated codepath
>> 
>> you need to use `openssl speed -evp aes-128-gcm` to see it (and yes,
>> aes-gcm and chacha20-poly1305 are supported then)
>> 
>> What I see is nearly a 1GB/s throughput increase between 1024 and 8192 byte 
>> blocks for AES-GCM:
>> 
>> type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
>> aes-128-gcm    614979.91k  1388369.31k  2702645.76k  3997320.76k  4932512.79k
>> 
>> While indeed, for chacha20 there's little to no difference at the high end:
>> type                 16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
>> chacha20-poly1305  242518.50k   514356.72k  1035220.57k  1868933.46k  1993609.50k  1997438.98k
>> 
>> (aes-128-gcm performance from openssl-1.0.2j-1.fc24.x86_64, 
>> chacha20-poly1305 from openssl master, both on 
>> Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz)
> 
> Cool. So you got a 23% improvement, and I got an 18% improvement for AES-GCM.
> I still claim (but cannot prove without modifying openssl code; maybe I’ll do
> that over the weekend) that the jump from 16KB to 64KB will be far, far less
> pronounced.
> 
> Yoav
> 


Re: [TLS] Additional warnings on 0-RTT data

2016-11-24 Thread Colm MacCárthaigh
On Wed, Nov 23, 2016 at 10:44 PM, Christian Huitema wrote:

> On Wednesday, November 23, 2016 7:20 PM, Colm MacCárthaigh wrote:
> >
> > Prior to TLS1.3, replay is not possible, so the risks are new, but the
> > end-to-end designers may not realize they need to update their threat
> > model, and just what is required. I'd like to spell that out more than
> > what's there at present.
>
> Uh? Replay was always possible, at the application level. Someone might
> for example click twice on the same URL, opening two tabs, closing one at
> random. And that's without counting on deliberate mischief.
>

TLS is used by much more than browsers, and by much more than HTTP. There
are many web service APIs that rely on TLS for anti-replay, and do not
simply retry requests. Transaction and commit protocols, for example, will
usually have unique IDs for each attempt.

But even if this were not the case, there are other material differences
that are still relevant even to browsers. First, an attacker can replay
0-RTT data at a vastly higher rate than they could ever cause a browser to
do anything. Second, they can replay 0-RTT data to arbitrary nodes beyond
what the browser may select. Together these open new attacks, like the
third example I provided.
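
As an illustration of the application-level defence such APIs lean on (a toy
sketch with hypothetical names; a real design needs expiry and cross-node
state, which is exactly what high-rate, multi-node 0-RTT replay stresses):

    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    /* Toy replay guard: remember recent transaction IDs, reject repeats. */
    #define SEEN_MAX 1024
    static char seen[SEEN_MAX][64];
    static int  seen_n;

    bool accept_once(const char *txn_id)
    {
        for (int i = 0; i < seen_n; i++)
            if (strcmp(seen[i], txn_id) == 0)
                return false;                  /* replayed attempt */
        if (seen_n < SEEN_MAX)
            snprintf(seen[seen_n++], sizeof seen[0], "%s", txn_id);
        return true;
    }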

-- 
Colm


Re: [TLS] record layer limits of TLS1.3

2016-11-24 Thread Yoav Nir

> On 24 Nov 2016, at 15:47, Hubert Kario  wrote:
> 
> On Wednesday, 23 November 2016 10:50:37 CET Yoav Nir wrote:
>> On 23 Nov 2016, at 10:30, Nikos Mavrogiannopoulos  wrote:
>>> On Wed, 2016-11-23 at 10:05 +0200, Yoav Nir wrote:
>>>> Hi, Nikos
>>>> 
>>>> On 23 Nov 2016, at 9:06, Nikos Mavrogiannopoulos 
>>> That to my understanding is a way to reduce
>>> latency in contrast to cpu costs. An increase to packet size targets
>>> bandwidth rather than latency (speed).
>> 
>> Sure, but running ‘openssl speed’ on either aes-128-cbc or hmac or sha256
>> (there’s no test for AES-GCM or ChaCha-poly) you get smallish differences
>> in terms of kilobytes per second between 1024-byte buffers and 8192-byte
>> buffers. And the difference is going to be even smaller going to 16KB buffers,
>> let alone 64KB buffers.
> 
> this is not a valid comparison: openssl speed doesn't use the
> hardware-accelerated codepath
> 
> you need to use `openssl speed -evp aes-128-gcm` to see it (and yes,
> aes-gcm and chacha20-poly1305 are supported then)
> 
> What I see is nearly a 1GB/s throughput increase between 1024 and 8192 byte 
> blocks for AES-GCM:
> 
> type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
> aes-128-gcm    614979.91k  1388369.31k  2702645.76k  3997320.76k  4932512.79k
> 
> While indeed, for chacha20 there's little to no difference at the high end:
> type                 16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
> chacha20-poly1305  242518.50k   514356.72k  1035220.57k  1868933.46k  1993609.50k  1997438.98k
> 
> (aes-128-gcm performance from openssl-1.0.2j-1.fc24.x86_64, chacha20-poly1305 
> from openssl master, both on 
> Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz)

Cool. So you got a 23% improvement, and I got an 18% improvement for AES-GCM.
I still claim (but cannot prove without modifying openssl code; maybe I’ll do
that over the weekend) that the jump from 16KB to 64KB will be far, far less
pronounced.
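
(The 23% falls straight out of Hubert's aes-128-gcm row; a quick check:)

    #include <stdio.h>

    int main(void)
    {
        /* Hubert's aes-128-gcm throughput: 1024-byte vs 8192-byte blocks. */
        const double kb1024 = 3997320.76, kb8192 = 4932512.79;
        printf("%.0f%%\n", 100.0 * (kb8192 / kb1024 - 1.0));  /* ~23% */
        return 0;
    }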

Yoav



Re: [TLS] record layer limits of TLS1.3

2016-11-24 Thread Hubert Kario
On Wednesday, 23 November 2016 10:50:37 CET Yoav Nir wrote:
> On 23 Nov 2016, at 10:30, Nikos Mavrogiannopoulos  wrote:
> > On Wed, 2016-11-23 at 10:05 +0200, Yoav Nir wrote:
> >> Hi, Nikos
> >> 
> >> On 23 Nov 2016, at 9:06, Nikos Mavrogiannopoulos 
> > That to my understanding is a way to reduce
> > latency in contrast to cpu costs. An increase to packet size targets
> > bandwidth rather than latency (speed).
> 
> Sure, but running ‘openssl speed’ on either aes-128-cbc or hmac or sha256
> (there’s no test for AES-GCM or ChaCha-poly) you get smallish differences
> in terms of kilobytes per second between 1024-byte buffers and 8192-byte
> buffers. And the difference is going to be even smaller going to 16KB buffers,
> let alone 64KB buffers.

this is not a valid comparison: openssl speed doesn't use the
hardware-accelerated codepath

you need to use `openssl speed -evp aes-128-gcm` to see it (and yes,
aes-gcm and chacha20-poly1305 are supported then)

What I see is nearly a 1GB/s throughput increase between 1024 and 8192 byte 
blocks for AES-GCM:

type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-gcm    614979.91k  1388369.31k  2702645.76k  3997320.76k  4932512.79k

While indeed, for chacha20 there's little to no difference at the high end:
type                 16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
chacha20-poly1305  242518.50k   514356.72k  1035220.57k  1868933.46k  1993609.50k  1997438.98k

(aes-128-gcm performance from openssl-1.0.2j-1.fc24.x86_64, chacha20-poly1305 
from openssl master, both on 
Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz)
-- 
Regards,
Hubert Kario
Senior Quality Engineer, QE BaseOS Security team
Web: www.cz.redhat.com
Red Hat Czech s.r.o., Purkyňova 99/71, 612 45, Brno, Czech Republic



[TLS] [Errata Verified] RFC5288 (4694)

2016-11-24 Thread RFC Errata System
The following errata report has been verified for RFC5288,
"AES Galois Counter Mode (GCM) Cipher Suites for TLS". 

--------------------------------------
You may review the report below and at:
http://www.rfc-editor.org/errata_search.php?rfc=5288&eid=4694

--------------------------------------
Status: Verified
Type: Technical

Reported by: Aaron Zauner 
Date Reported: 2016-05-14
Verified by: Stephen Farrell (IESG)

Section: 6.1

Original Text
-------------
   AES-GCM security requires that the counter is never reused.  The IV
   construction in Section 3 is designed to prevent counter reuse.

   Implementers should also understand the practical considerations of
   IV handling outlined in Section 9 of [GCM].

Corrected Text
--------------
   Security of AES-GCM requires that the "nonce" (number used once) is
   never reused.  The IV construction in Section 3 does not prevent 
   implementers from reusing the nonce by mistake.  It is paramount that 
   the implementer be aware of the security implications when a nonce 
   is reused even once. 

   Nonce reuse in AES-GCM allows for the recovery of the authentication key,
   resulting in complete failure of the mode's authenticity.  Hence, TLS
   sessions can be effectively attacked through forgery by an adversary.
   This enables an attacker to inject data into the TLS stream, allowing for
   XSS and other attack vectors.

Notes
-----
Obviously the original wording is so ambiguous that implementers got it wrong 
in the real world. Related to: 
https://www.blackhat.com/us-16/briefings.html#nonce-disrespecting-adversaries-practical-forgery-attacks-on-gcm-in-tls

It may be worth adding a reference to [JOUX] 
http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/...38.../GCM/Joux_comments.pdf
 and maybe the paper we're intending to release on the actual HTTPS 
forgery/injection attack.

I'd actually like to change the nonce construction to that of the 
ChaCha20/Poly1305 document, but I figure this will cause massive breakage for 
already deployed implementations. TLS 1.3 fixes this issue per design.
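
For context, the RFC 5288 nonce construction that makes accidental reuse
possible is just a concatenation (a sketch of Section 3 of the RFC):

    #include <stdint.h>
    #include <string.h>

    /* RFC 5288, Section 3: GCM nonce = salt (4 bytes, implicit, from the
       key block) || nonce_explicit (8 bytes, carried in each record).
       Nothing here stops a stack from sending the same nonce_explicit
       twice -- which, per the erratum, surrenders the authentication key. */
    void make_gcm_nonce(uint8_t out[12], const uint8_t salt[4],
                        const uint8_t nonce_explicit[8])
    {
        memcpy(out, salt, 4);
        memcpy(out + 4, nonce_explicit, 8);
    }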

--------------------------------------
RFC5288 (draft-ietf-tls-rsa-aes-gcm-03)
--------------------------------------
Title   : AES Galois Counter Mode (GCM) Cipher Suites for TLS
Publication Date: August 2008
Author(s)   : J. Salowey, A. Choudhury, D. McGrew
Category: PROPOSED STANDARD
Source  : Transport Layer Security
Area: Security
Stream  : IETF
Verifying Party : IESG
