Re: TLS for localhost connections
ok, thanks.

Yep -- the realization of the simple fact that we should not protect
against super-user on a local machine came to me after I sent that e-mail.
Sorry for the noise.

On Fri, Feb 10, 2017 at 10:30 AM, Todd Lipcon wrote:

> On Fri, Feb 10, 2017 at 10:14 AM, Alexey Serbin wrote:
>
> > Hi Todd,
> >
> > Thank you for sharing the perf stats you observed. I'm curious: during
> > those s_client/s_server tests, was the TLS/SSL compression on or off? I
> > don't think it would change the result a lot but it's interesting to
> > know.
>
> Compression was off:
>
> todd@todd-ThinkPad-T540p:~/sw/openssl-1.0.1f$ perf stat bash -c 'dd if=/dev/zero bs=1M count=5000 | openssl s_client -cipher ADH-AES128-SHA'
> CONNECTED(00000003)
> ---
> no peer certificate available
> ---
> No client certificate CA names sent
> Server Temp Key: DH, 2048 bits
> ---
> SSL handshake has read 850 bytes and written 441 bytes
> ---
> New, TLSv1/SSLv3, Cipher is ADH-AES128-SHA
> Secure Renegotiation IS supported
> Compression: NONE
> Expansion: NONE
> No ALPN negotiated
> SSL-Session:
>     Protocol  : TLSv1.2
>     Cipher    : ADH-AES128-SHA
>     Session-ID: 5FE2AA31BC78C5578DE5FE95D3380E4D7094B1040A7D6E9C6A5EC15929F04564
>     Session-ID-ctx:
>     Master-Key: AE0FA5291957492495B7B3424CD4283FEA113727919D393AA19318516827E9AB074BCBC2A445584FE5C01DC59424B6F3
>     Key-Arg   : None
>     PSK identity: None
>     PSK identity hint: None
>     SRP username: None
>     TLS session ticket lifetime hint: 300 (seconds)
>     TLS session ticket:
>     0000 - 8f 7e 92 27 06 5f 24 7c-3c a0 20 5d 7e a3 f8 d1   .~.'._$|<. ]~...
>     0010 - 4f 49 ad fc 52 30 e3 89-e0 a8 3a 53 29 e1 07 d4   OI..R0....:S)...
>     0020 - 22 01 4b 95 40 5d 27 77-cf 6c b5 77 41 97 3a 88   ".K.@]'w.l.wA.:.
>     0030 - 35 23 6e c4 c7 66 36 0b-aa b5 ef d5 eb d8 3e cf   5#n..f6.......>.
>     0040 - 34 c3 38 2a 0d b3 f9 26-1c a2 49 fe bc 27 b1 74   4.8*...&..I..'.t
>     0050 - 89 96 42 69 af 11 c9 6c-da 3d 65 bc 85 dd 64 d7   ..Bi...l.=e...d.
>     0060 - 39 0f 78 34 6a c6 27 7e-57 37 b3 eb 60 cc c0 2d   9.x4j.'~W7..`..-
>     0070 - 3a a2 12 bc e6 d6 85 8e-ba 9d 7a 9e e2 e7 a0 ab   :.........z.....
>     0080 - 47 1a d9 67 ec be 78 2a-d4 91 57 75 93 e1 28 a3   G..g..x*..Wu..(.
>     0090 - 30 24 c9 8f d1 37 bd e1-69 4b 18 43 85 f6 7e 63   0$...7..iK.C..~c
>
>     Start Time: 1486707067
>     Timeout   : 300 (sec)
>     Verify return code: 0 (ok)
>
> > I think that from performance perspective dropping TLS wrapping around
> > the connection just after authentication is the best solution.
> >
> > From the other side, I think dropping TLS opens a door for localhost MITM
> > attacks if an attacker can control access to ipfilter (fiddling with data
> > like rewriting traffic?).
>
> I think the assumption we're going on is that we can't protect against root
> on the same machine. (if you're root you could also just read the process's
> memory, or edit the process, or dump the WAL, etc)
>
> > BTW, if dropping encryption, are we concerned about leaking authz tokens
> > when they are introduced?
>
> Same answer as above -- I don't think we're attempting to protect against
> local root in our threat model.
>
> -Todd
>
> > On Thu, Feb 9, 2017 at 10:22 PM, Todd Lipcon wrote:
> >
> > > Hey folks,
> > >
> > > For those not following along, we're very close to the point where we'll be
> > > enabling TLS for all wire communication done by a Kudu cluster (at least
> > > when security features are enabled). One thing we've decided is important
> > > is to preserve good performance for applications like Spark and Impala
> > > which typically schedule tasks local to the data on the tablet servers, and
> > > we think that enabling TLS for these localhost connections will have an
> > > unacceptable performance hit.
> > >
> > > Our thinking was to continue to use TLS *authentication* to prevent MITM
> > > attacks (possible because we typically don't bind to low ports). But, we
> > > don't need TLS *encryption*.
> > >
> > > This is possible using the various TLS "NULL" ciphers -- we can have both
> > > the client and server notice that the remote peer is local and enable the
> > > NULL cipher suite. However, I did some research this evening and it looks
> > > like the NULL ciphers disable encryption but don't disable the MAC
> > > integrity portion of TLS. Best I can tell, there is no API to do so.
> > >
> > > I did some brief checks using openssl s_client and s_server on my laptop
> > > (openssl 1.0.2g, haswell), and got the following numbers for transferring
> > > 5GB:
> > >
> > > ADH-AES128-SHA
> > > Client: 42.2M cycles
> > > Server: 35.3M cycles
> > >
> > > AECDH-NULL-SHA: (closest NULL I could find to the above)
> > > Client: 36.2M cycles
> > > Server: 28.6M cycles
> > >
> > > no TLS at all (using netcat to a local TCP port):
> > > Client:
Re: TLS for localhost connections
On Fri, Feb 10, 2017 at 10:29 AM, Dan Burkert wrote:

> On Fri, Feb 10, 2017 at 10:02 AM, Todd Lipcon wrote:
>
> > Yea, but still the best number here is 685MB/sec. Assuming 2ghz, that's
> > around 3 cycles/byte (~25x slower than crc32). According to Intel, AES
> > encryption with AESNI can be around 1.3 cycles/byte:
> > https://software.intel.com/sites/default/files/m/d/4/1/d/8/10TB24_Breakthrough_AES_Performance_with_Intel_AES_New_Instructions.final.secure.pdf
>
> Sorry I was a little opaque there, but my point was we should expect the
> integrity overhead to be even *higher* on Centos6, since it ships with an
> older OpenSSL than you tested on (although not as old as my numbers from
> 0.9.8).

Ah, I see. Yes, that's true -- older OSes are missing a lot of the more
recent Intel optimizations (both in terms of SHA and in terms of AESNI).

-Todd
--
Todd Lipcon
Software Engineer, Cloudera
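The cycles-per-byte arithmetic in this exchange can be checked with a few lines of Python. This is only a sketch: the 2 GHz clock is the assumption Todd states, the 685 MB/sec, 0.12 cycles/byte (CRC32) and 1.3 cycles/byte (AES-NI) figures are the ones quoted in the thread, and the function and constant names are made up for illustration.

```python
# Sanity check of the cycles/byte figures quoted in the thread.
# All names here are illustrative; the numbers come from the emails.

ASSUMED_CLOCK_HZ = 2e9          # "Assuming 2ghz" (an assumption, not a measurement)
SHA1_BYTES_PER_SEC = 685e6      # "the best number here is 685MB/sec"
CRC32_CYCLES_PER_BYTE = 0.12    # hardware-accelerated CRC32, as quoted
AESNI_CYCLES_PER_BYTE = 1.3     # Intel's AES-NI figure, as quoted


def cycles_per_byte(throughput_bytes_per_sec: float, clock_hz: float) -> float:
    """Convert a throughput into CPU cycles spent per byte at a given clock."""
    return clock_hz / throughput_bytes_per_sec


sha1_cpb = cycles_per_byte(SHA1_BYTES_PER_SEC, ASSUMED_CLOCK_HZ)

print(f"SHA-1:           {sha1_cpb:.2f} cycles/byte")
print(f"SHA-1 vs CRC32:  {sha1_cpb / CRC32_CYCLES_PER_BYTE:.1f}x slower")
print(f"SHA-1 vs AES-NI: {sha1_cpb / AESNI_CYCLES_PER_BYTE:.1f}x slower")
```

Under those assumptions SHA-1 lands a little under 3 cycles/byte, which is consistent with the "around 3 cycles/byte (~25x slower than crc32)" estimate and with the MAC, rather than AES, being the larger cost.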
Re: TLS for localhost connections
On Fri, Feb 10, 2017 at 10:02 AM, Todd Lipcon wrote:

> Yea, I figured that both sides would check the remote peer, and if they
> both agree that the other side is local, they'd offer TLS_NEGOTIATION_ONLY
> or somesuch. This could also be used in a scenario where a user needs
> authentication but fully trusts their network and therefore doesn't
> need/want encryption, so I'd call it NEGOTIATION_ONLY or AUTH_ONLY rather
> than LOCAL.

SGTM.

> Yea, but still the best number here is 685MB/sec. Assuming 2ghz, that's
> around 3 cycles/byte (~25x slower than crc32). According to Intel, AES
> encryption with AESNI can be around 1.3 cycles/byte:
> https://software.intel.com/sites/default/files/m/d/4/1/d/8/10TB24_Breakthrough_AES_Performance_with_Intel_AES_New_Instructions.final.secure.pdf

Sorry I was a little opaque there, but my point was we should expect the
integrity overhead to be even *higher* on Centos6, since it ships with an
older OpenSSL than you tested on (although not as old as my numbers from
0.9.8).

> -Todd
>
> > On Thu, Feb 9, 2017 at 10:22 PM, Todd Lipcon wrote:
> >
> > > Hey folks,
> > >
> > > For those not following along, we're very close to the point where we'll be
> > > enabling TLS for all wire communication done by a Kudu cluster (at least
> > > when security features are enabled). One thing we've decided is important
> > > is to preserve good performance for applications like Spark and Impala
> > > which typically schedule tasks local to the data on the tablet servers, and
> > > we think that enabling TLS for these localhost connections will have an
> > > unacceptable performance hit.
> > >
> > > Our thinking was to continue to use TLS *authentication* to prevent MITM
> > > attacks (possible because we typically don't bind to low ports). But, we
> > > don't need TLS *encryption*.
> > >
> > > This is possible using the various TLS "NULL" ciphers -- we can have both
> > > the client and server notice that the remote peer is local and enable the
> > > NULL cipher suite. However, I did some research this evening and it looks
> > > like the NULL ciphers disable encryption but don't disable the MAC
> > > integrity portion of TLS. Best I can tell, there is no API to do so.
> > >
> > > I did some brief checks using openssl s_client and s_server on my laptop
> > > (openssl 1.0.2g, haswell), and got the following numbers for transferring
> > > 5GB:
> > >
> > > ADH-AES128-SHA
> > > Client: 42.2M cycles
> > > Server: 35.3M cycles
> > >
> > > AECDH-NULL-SHA: (closest NULL I could find to the above)
> > > Client: 36.2M cycles
> > > Server: 28.6M cycles
> > >
> > > no TLS at all (using netcat to a local TCP port):
> > > Client: 20.8M cycles
> > > Server: 10.0M cycles
> > >
> > > baseline: iperf -n 5000M localhost
> > > Client: 2.3M cycles
> > > Server: 1.8M cycles
> > > [not sure why this is so much faster than netcat - I guess because with
> > > netcat I was piping to /dev/null which still requires more syscalls?]
> > >
> > > (note that the client in all of these cases includes the 'dd' command to
> > > generate the data, which probably explains why it's 7-10M cycles more than
> > > the server in every case)
> > >
> > > To summarize, just disabling encryption has not much improvement, given
> > > that Intel chips now optimize AES. The checksumming itself adds more
> > > significant overhead than the encryption. This agrees with numbers I've
> > > seen around the web that crypto-strength checksums only go 1GB/sec or so
> > > max, typically much slower.
> > >
> > > Thinking about the best solution here, I think we should consider using TLS
> > > during negotiation, and then just completely dropping the TLS (i.e not
> > > wrapping the sockets in TlsSockets). I think this still gives us the
> > > protection against the localhost MITM (because the handshake would fail)
> > > and be trivially zero-overhead. Am I missing any big issues with this idea?
> > > Anyone got a better one?
> > >
> > > -Todd
> > > --
> > > Todd Lipcon
> > > Software Engineer, Cloudera
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
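The "both sides check the remote peer, and only drop TLS if both agree" rule discussed in this message could look roughly like the sketch below. It is not Kudu code: `TlsMode`, `offers_auth_only` and `negotiate` are hypothetical names, and treating "local" as "loopback address" is a simplifying assumption (a real check might also compare against the machine's own interface addresses). Only the AUTH_ONLY name comes from the thread.

```python
import ipaddress
from enum import Enum


class TlsMode(Enum):
    """Hypothetical outcome of the negotiation."""
    FULL = "handshake and keep the socket wrapped in TLS"
    AUTH_ONLY = "handshake for authentication, then drop the TLS wrapping"


def offers_auth_only(peer_addr: str) -> bool:
    """One side offers AUTH_ONLY only if the peer it sees is a loopback address.

    Assumption: 'local' means loopback (127.0.0.0/8 or ::1).
    """
    return ipaddress.ip_address(peer_addr).is_loopback


def negotiate(peer_seen_by_client: str, peer_seen_by_server: str) -> TlsMode:
    """AUTH_ONLY is chosen only when BOTH sides independently offer it."""
    client_offers = offers_auth_only(peer_seen_by_client)
    server_offers = offers_auth_only(peer_seen_by_server)
    if client_offers and server_offers:
        return TlsMode.AUTH_ONLY
    return TlsMode.FULL


if __name__ == "__main__":
    print(negotiate("127.0.0.1", "::1"))        # both sides see a loopback peer
    print(negotiate("127.0.0.1", "10.0.0.5"))   # server sees a remote peer
```

Requiring agreement from both ends means a single side that does not recognize the connection as local (or does not support the flag) falls back to full TLS rather than silently sending plaintext.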
Re: TLS for localhost connections
Hi Todd,

Thank you for sharing the perf stats you observed. I'm curious: during
those s_client/s_server tests, was the TLS/SSL compression on or off? I
don't think it would change the result a lot but it's interesting to know.

I think that from performance perspective dropping TLS wrapping around the
connection just after authentication is the best solution.

From the other side, I think dropping TLS opens a door for localhost MITM
attacks if an attacker can control access to ipfilter (fiddling with data
like rewriting traffic?).

BTW, if dropping encryption, are we concerned about leaking authz tokens
when they are introduced?

Best regards,

Alexey

On Thu, Feb 9, 2017 at 10:22 PM, Todd Lipcon wrote:

> Hey folks,
>
> For those not following along, we're very close to the point where we'll be
> enabling TLS for all wire communication done by a Kudu cluster (at least
> when security features are enabled). One thing we've decided is important
> is to preserve good performance for applications like Spark and Impala
> which typically schedule tasks local to the data on the tablet servers, and
> we think that enabling TLS for these localhost connections will have an
> unacceptable performance hit.
>
> Our thinking was to continue to use TLS *authentication* to prevent MITM
> attacks (possible because we typically don't bind to low ports). But, we
> don't need TLS *encryption*.
>
> This is possible using the various TLS "NULL" ciphers -- we can have both
> the client and server notice that the remote peer is local and enable the
> NULL cipher suite. However, I did some research this evening and it looks
> like the NULL ciphers disable encryption but don't disable the MAC
> integrity portion of TLS. Best I can tell, there is no API to do so.
>
> I did some brief checks using openssl s_client and s_server on my laptop
> (openssl 1.0.2g, haswell), and got the following numbers for transferring
> 5GB:
>
> ADH-AES128-SHA
> Client: 42.2M cycles
> Server: 35.3M cycles
>
> AECDH-NULL-SHA: (closest NULL I could find to the above)
> Client: 36.2M cycles
> Server: 28.6M cycles
>
> no TLS at all (using netcat to a local TCP port):
> Client: 20.8M cycles
> Server: 10.0M cycles
>
> baseline: iperf -n 5000M localhost
> Client: 2.3M cycles
> Server: 1.8M cycles
> [not sure why this is so much faster than netcat - I guess because with
> netcat I was piping to /dev/null which still requires more syscalls?]
>
> (note that the client in all of these cases includes the 'dd' command to
> generate the data, which probably explains why it's 7-10M cycles more than
> the server in every case)
>
> To summarize, just disabling encryption has not much improvement, given
> that Intel chips now optimize AES. The checksumming itself adds more
> significant overhead than the encryption. This agrees with numbers I've
> seen around the web that crypto-strength checksums only go 1GB/sec or so
> max, typically much slower.
>
> Thinking about the best solution here, I think we should consider using TLS
> during negotiation, and then just completely dropping the TLS (i.e not
> wrapping the sockets in TlsSockets). I think this still gives us the
> protection against the localhost MITM (because the handshake would fail)
> and be trivially zero-overhead. Am I missing any big issues with this idea?
> Anyone got a better one?
>
> -Todd
> --
> Todd Lipcon
> Software Engineer, Cloudera
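The claim that the checksum costs more than the encryption can be read straight off the quoted cycle counts by subtracting adjacent rows. The sketch below does that subtraction; the dictionary and function names are made up for illustration, and the "mac_and_framing" bucket is a rough attribution, since the NULL-cipher run still pays for TLS record handling and uses a different key exchange than the AES run.

```python
# Cycle counts as reported in the quoted message (units as written there).
CYCLES = {
    "ADH-AES128-SHA":  {"client": 42.2, "server": 35.3},
    "AECDH-NULL-SHA":  {"client": 36.2, "server": 28.6},
    "no TLS (netcat)": {"client": 20.8, "server": 10.0},
}


def breakdown(side: str) -> dict:
    """Split the TLS overhead for one side into two rough buckets.

    encryption      = AES run minus NULL-cipher run
    mac_and_framing = NULL-cipher run minus plain TCP run
    """
    aes = CYCLES["ADH-AES128-SHA"][side]
    null = CYCLES["AECDH-NULL-SHA"][side]
    plain = CYCLES["no TLS (netcat)"][side]
    return {
        "encryption": round(aes - null, 1),
        "mac_and_framing": round(null - plain, 1),
    }


for side in ("client", "server"):
    parts = breakdown(side)
    print(f"{side}: encryption={parts['encryption']}M, "
          f"mac_and_framing={parts['mac_and_framing']}M")
```

On both sides the second bucket is more than twice the first, which is why switching to a NULL cipher alone recovers only a small part of the gap to plain TCP.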
Re: TLS for localhost connections
On Thu, Feb 9, 2017 at 10:50 PM, Dan Burkert wrote:

> A couple thoughts:
>
> * You had to explicitly turn on the ADH and AECDH ciphers, right? We
> shouldn't be using those in any circumstances, but I don't think it would
> change the results of your test.

Right, I was just too lazy to set up a self-signed key, and using the
anonymous ciphers allowed me to use -nocert.

> * We've discussed adding checksums to the RPC system in the past, but
> punted since we would get it 'for free' with TLS. Perhaps we would want to
> turn off checksums in the localhost scenario anyway, though?

I think even if we add checksums, it would be based on CRC32 (which is
something like 0.12 cycles/byte given hardware acceleration). Doing a
cryptographic checksum is a bit expensive for this purpose. And yea, we
would probably disable them in localhost anyway -- if we're worried about
memory getting corrupted, the bigger risk would be all of the memory
sitting in MRS, block cache, etc.

> * The solution of not wrapping the socket sounds reasonable. We may need
> to add a new RPC feature flag (e.g. TLS_LOCAL) so that we can be sure that
> both sides recognize they are on a localhost connection. Or perhaps this
> isn't a problem and a connection can always be determined to be to the
> localhost?

Yea, I figured that both sides would check the remote peer, and if they
both agree that the other side is local, they'd offer TLS_NEGOTIATION_ONLY
or somesuch. This could also be used in a scenario where a user needs
authentication but fully trusts their network and therefore doesn't
need/want encryption, so I'd call it NEGOTIATION_ONLY or AUTH_ONLY rather
than LOCAL.

> * The overhead of crypto hashes is highly dependent on the implementation.
> For instance, some tests on my laptop (openssl speed md5 sha1 sha256
> sha512):
>
> OpenSSL 0.9.8zg
>
> The 'numbers' are in 1000s of bytes per second processed.
> type        16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
> md5        37815.63k   116197.14k   251336.60k   358376.77k   414424.95k
> sha1       44766.06k   130830.68k   275829.53k   392799.78k   452215.48k
> sha256     31386.52k    73018.50k   132622.47k   161953.92k   175492.62k
> sha512     20567.70k    84586.04k   160368.90k   240929.09k   273687.68k
>
> OpenSSL 1.0.2k
>
> The 'numbers' are in 1000s of bytes per second processed.
> type        16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
> md5        55295.46k   158955.86k   360768.81k   510628.52k   580591.62k
> sha1       66130.74k   188498.24k   443250.86k   685325.87k   824453.80k
> sha256     69510.92k   153928.53k   275164.76k   349643.09k   379147.13k
> sha512     44970.85k   175308.13k   304458.49k   448055.48k   524516.17k
>
> sha1 is almost 50% faster, and sha512 is almost 100% faster.

Yea, but still the best number here is 685MB/sec. Assuming 2ghz, that's
around 3 cycles/byte (~25x slower than crc32). According to Intel, AES
encryption with AESNI can be around 1.3 cycles/byte:
https://software.intel.com/sites/default/files/m/d/4/1/d/8/10TB24_Breakthrough_AES_Performance_with_Intel_AES_New_Instructions.final.secure.pdf

-Todd

> On Thu, Feb 9, 2017 at 10:22 PM, Todd Lipcon wrote:
>
> > Hey folks,
> >
> > For those not following along, we're very close to the point where we'll be
> > enabling TLS for all wire communication done by a Kudu cluster (at least
> > when security features are enabled). One thing we've decided is important
> > is to preserve good performance for applications like Spark and Impala
> > which typically schedule tasks local to the data on the tablet servers, and
> > we think that enabling TLS for these localhost connections will have an
> > unacceptable performance hit.
> >
> > Our thinking was to continue to use TLS *authentication* to prevent MITM
> > attacks (possible because we typically don't bind to low ports). But, we
> > don't need TLS *encryption*.
> >
> > This is possible using the various TLS "NULL" ciphers -- we can have both
> > the client and server notice that the remote peer is local and enable the
> > NULL cipher suite. However, I did some research this evening and it looks
> > like the NULL ciphers disable encryption but don't disable the MAC
> > integrity portion of TLS. Best I can tell, there is no API to do so.
> >
> > I did some brief checks using openssl s_client and s_server on my laptop
> > (openssl 1.0.2g, haswell), and got the following numbers for transferring
> > 5GB:
> >
> > ADH-AES128-SHA
> > Client: 42.2M cycles
> > Server: 35.3M cycles
> >
> > AECDH-NULL-SHA: (closest NULL I could find to the above)
> > Client: 36.2M cycles
> > Server: 28.6M cycles
> >
> > no TLS at all (using netcat to a local TCP port):
> > Client: 20.8M cycles
> > Server: 10.0M cycles
> >
> > baseline: iperf -n 5000M localhost
> > Client: 2.3M cycles
> > Server: 1.8M cycles
> > [not