On Thu, Feb 9, 2017 at 10:50 PM, Dan Burkert <d...@cloudera.com> wrote:
> A couple thoughts:
>
> * You had to explicitly turn on the ADH and AECDH ciphers, right? We
> shouldn't be using those in any circumstances, but I don't think it would
> change the results of your test.

Right, I was just too lazy to set up a self-signed key, and using the
anonymous ciphers allowed me to use -nocert.

> * We've discussed adding checksums to the RPC system in the past, but
> punted since we would get it 'for free' with TLS. Perhaps we would want to
> turn off checksums in the localhost scenario anyway, though?

I think even if we add checksums, it would be based on CRC32 (which is
something like 0.12 cycles/byte given hardware acceleration). Doing a
cryptographic checksum is a bit expensive for this purpose. And yea, we
would probably disable them in localhost anyway -- if we're worried about
memory getting corrupted, the bigger risk would be all of the memory
sitting in MRS, block cache, etc.

> * The solution of not wrapping the socket sounds reasonable. We may need
> to add a new RPC feature flag (e.g. TLS_LOCAL) so that we can be sure that
> both sides recognize they are on a localhost connection. Or perhaps this
> isn't a problem and a connection can always be determined to be to the
> localhost?

Yea, I figured that both sides would check the remote peer, and if they
both agree that the other side is local, they'd offer TLS_NEGOTIATION_ONLY
or somesuch. This could also be used in a scenario where a user needs
authentication but fully trusts their network and therefore doesn't
need/want encryption, so I'd call it NEGOTIATION_ONLY or AUTH_ONLY rather
than LOCAL.

> * The overhead of crypto hashes is highly dependent on the implementation.
> For instance, some tests on my laptop (openssl speed md5 sha1 sha256 sha512):
>
> OpenSSL 0.9.8zg
>
> The 'numbers' are in 1000s of bytes per second processed.
> type        16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
> md5        37815.63k   116197.14k   251336.60k   358376.77k   414424.95k
> sha1       44766.06k   130830.68k   275829.53k   392799.78k   452215.48k
> sha256     31386.52k    73018.50k   132622.47k   161953.92k   175492.62k
> sha512     20567.70k    84586.04k   160368.90k   240929.09k   273687.68k
>
> OpenSSL 1.0.2k
>
> The 'numbers' are in 1000s of bytes per second processed.
> type        16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
> md5        55295.46k   158955.86k   360768.81k   510628.52k   580591.62k
> sha1       66130.74k   188498.24k   443250.86k   685325.87k   824453.80k
> sha256     69510.92k   153928.53k   275164.76k   349643.09k   379147.13k
> sha512     44970.85k   175308.13k   304458.49k   448055.48k   524516.17k
>
> sha1 is almost 50% faster, and sha512 is almost 100% faster.

Yea, but still the best number here is 685MB/sec. Assuming 2ghz, that's
around 3 cycles/byte (~25x slower than crc32). According to Intel, AES
encryption with AESNI can be around 1.3 cycles/byte:
https://software.intel.com/sites/default/files/m/d/4/1/d/8/10TB24_Breakthrough_AES_Performance_with_Intel_AES_New_Instructions.final.secure.pdf

-Todd

> On Thu, Feb 9, 2017 at 10:22 PM, Todd Lipcon <t...@cloudera.com> wrote:
>
> > Hey folks,
> >
> > For those not following along, we're very close to the point where we'll
> > be enabling TLS for all wire communication done by a Kudu cluster (at
> > least when security features are enabled). One thing we've decided is
> > important is to preserve good performance for applications like Spark
> > and Impala which typically schedule tasks local to the data on the
> > tablet servers, and we think that enabling TLS for these localhost
> > connections will have an unacceptable performance hit.
> >
> > Our thinking was to continue to use TLS *authentication* to prevent MITM
> > attacks (possible because we typically don't bind to low ports). But, we
> > don't need TLS *encryption*.
> >
> > This is possible using the various TLS "NULL" ciphers -- we can have both
> > the client and server notice that the remote peer is local and enable the
> > NULL cipher suite. However, I did some research this evening and it looks
> > like the NULL ciphers disable encryption but don't disable the MAC
> > integrity portion of TLS. Best I can tell, there is no API to do so.
> >
> > I did some brief checks using openssl s_client and s_server on my laptop
> > (openssl 1.0.2g, haswell), and got the following numbers for transferring
> > 5GB:
> >
> > ADH-AES128-SHA
> > Client: 42.2M cycles
> > Server: 35.3M cycles
> >
> > AECDH-NULL-SHA: (closest NULL I could find to the above)
> > Client: 36.2M cycles
> > Server: 28.6M cycles
> >
> > no TLS at all (using netcat to a local TCP port):
> > Client: 20.8M cycles
> > Server: 10.0M cycles
> >
> > baseline: iperf -n 5000M localhost
> > Client: 2.3M cycles
> > Server: 1.8M cycles
> > [not sure why this is so much faster than netcat - I guess because with
> > netcat I was piping to /dev/null which still requires more syscalls?]
> >
> > (note that the client in all of these cases includes the 'dd' command to
> > generate the data, which probably explains why it's 7-10M cycles more
> > than the server in every case)
> >
> > To summarize, just disabling encryption has not much improvement, given
> > that Intel chips now optimize AES. The checksumming itself adds more
> > significant overhead than the encryption. This agrees with numbers I've
> > seen around the web that crypto-strength checksums only go 1GB/sec or so
> > max, typically much slower.
> >
> > Thinking about the best solution here, I think we should consider using
> > TLS during negotiation, and then just completely dropping the TLS (i.e.
> > not wrapping the sockets in TlsSockets). I think this still gives us the
> > protection against the localhost MITM (because the handshake would fail)
> > and be trivially zero-overhead.
> > Am I missing any big issues with this idea?
> >
> > Anyone got a better one?
> >
> > -Todd
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera

--
Todd Lipcon
Software Engineer, Cloudera
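
P.S. For anyone who wants to redo the cycles/byte arithmetic from the reply above, here is a minimal Python sketch. It assumes the 2 GHz clock used in that estimate and takes the sha1 figure for 1024-byte blocks from the OpenSSL 1.0.2k table; the CRC32 and AES-NI values are the rough estimates quoted in the thread, not measurements. The function and constant names are made up for illustration.

```python
def cycles_per_byte(throughput_bytes_per_sec: float, clock_hz: float) -> float:
    """CPU cycles spent per byte at a given throughput and clock rate."""
    return clock_hz / throughput_bytes_per_sec


CLOCK_HZ = 2e9                  # assumed 2 GHz clock, as in the estimate above
SHA1_BYTES_PER_SEC = 685325.87e3  # sha1, 1024-byte blocks, OpenSSL 1.0.2k table
CRC32_CPB = 0.12                # rough figure quoted for hardware-accelerated CRC32
AESNI_CPB = 1.3                 # figure quoted from the Intel AES-NI paper

sha1_cpb = cycles_per_byte(SHA1_BYTES_PER_SEC, CLOCK_HZ)

print(f"sha1:           {sha1_cpb:.2f} cycles/byte")
print(f"sha1 vs crc32:  {sha1_cpb / CRC32_CPB:.1f}x the cycles per byte")
print(f"sha1 vs AES-NI: {sha1_cpb / AESNI_CPB:.1f}x the cycles per byte")
```

Under these assumptions sha1 comes out a little under 3 cycles/byte, which is where the "~25x slower than crc32" estimate comes from. Plugging in other rows of the table is just a matter of changing SHA1_BYTES_PER_SEC.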