A couple of thoughts:

* You had to explicitly turn on the ADH and AECDH ciphers, right? We
shouldn't be using those under any circumstances, but I don't think it
would change the results of your test.
* We've discussed adding checksums to the RPC system in the past, but
punted since we would get it 'for free' with TLS. Perhaps we would want
to turn off checksums in the localhost scenario anyway, though?

* The solution of not wrapping the socket sounds reasonable. We may need
to add a new RPC feature flag (e.g. TLS_LOCAL) so that we can be sure
that both sides recognize they are on a localhost connection. Or perhaps
this isn't a problem and a connection can always be determined to be to
the localhost?

* The overhead of crypto hashes is highly dependent on the
implementation. For instance, some tests on my laptop
(openssl speed md5 sha1 sha256 sha512):

OpenSSL 0.9.8zg
The 'numbers' are in 1000s of bytes per second processed.
type        16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5        37815.63k   116197.14k   251336.60k   358376.77k   414424.95k
sha1       44766.06k   130830.68k   275829.53k   392799.78k   452215.48k
sha256     31386.52k    73018.50k   132622.47k   161953.92k   175492.62k
sha512     20567.70k    84586.04k   160368.90k   240929.09k   273687.68k

OpenSSL 1.0.2k
The 'numbers' are in 1000s of bytes per second processed.
type        16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5        55295.46k   158955.86k   360768.81k   510628.52k   580591.62k
sha1       66130.74k   188498.24k   443250.86k   685325.87k   824453.80k
sha256     69510.92k   153928.53k   275164.76k   349643.09k   379147.13k
sha512     44970.85k   175308.13k   304458.49k   448055.48k   524516.17k

sha1 is almost 50% faster, and sha512 is almost 100% faster.

On Thu, Feb 9, 2017 at 10:22 PM, Todd Lipcon <t...@cloudera.com> wrote:

> Hey folks,
>
> For those not following along, we're very close to the point where
> we'll be enabling TLS for all wire communication done by a Kudu
> cluster (at least when security features are enabled).
> One thing we've decided is important is to preserve good performance
> for applications like Spark and Impala, which typically schedule
> tasks local to the data on the tablet servers, and we think that
> enabling TLS for these localhost connections will have an
> unacceptable performance hit.
>
> Our thinking was to continue to use TLS *authentication* to prevent
> MITM attacks (possible because we typically don't bind to low ports).
> But we don't need TLS *encryption*.
>
> This is possible using the various TLS "NULL" ciphers -- we can have
> both the client and server notice that the remote peer is local and
> enable the NULL cipher suite. However, I did some research this
> evening, and it looks like the NULL ciphers disable encryption but
> don't disable the MAC integrity portion of TLS. Best I can tell,
> there is no API to do so.
>
> I did some brief checks using openssl s_client and s_server on my
> laptop (openssl 1.0.2g, Haswell), and got the following numbers for
> transferring 5GB:
>
> ADH-AES128-SHA:
>   Client: 42.2M cycles
>   Server: 35.3M cycles
>
> AECDH-NULL-SHA (closest NULL cipher I could find to the above):
>   Client: 36.2M cycles
>   Server: 28.6M cycles
>
> no TLS at all (using netcat to a local TCP port):
>   Client: 20.8M cycles
>   Server: 10.0M cycles
>
> baseline: iperf -n 5000M localhost
>   Client: 2.3M cycles
>   Server: 1.8M cycles
>   [not sure why this is so much faster than netcat - I guess because
>   with netcat I was piping to /dev/null, which still requires more
>   syscalls?]
>
> (Note that the client in all of these cases includes the 'dd' command
> used to generate the data, which probably explains why it's 7-10M
> cycles more than the server in every case.)
>
> To summarize, merely disabling encryption doesn't buy much, given
> that Intel chips now optimize AES. The checksumming itself adds more
> significant overhead than the encryption.
> This agrees with numbers I've seen around the web suggesting that
> crypto-strength checksums top out around 1GB/sec, and are typically
> much slower.
>
> Thinking about the best solution here, I think we should consider
> using TLS during negotiation and then completely dropping the TLS
> (i.e. not wrapping the sockets in TlsSockets). I think this still
> gives us protection against the localhost MITM (because the handshake
> would fail) and would be trivially zero-overhead. Am I missing any
> big issues with this idea? Anyone got a better one?
>
> -Todd
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
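
For anyone who wants to reproduce the hash-overhead comparison above
without the openssl CLI, here's a minimal sketch using Python's hashlib
(the digest_throughput helper and its parameters are mine, not anything
from Kudu). Absolute numbers will differ by machine and crypto library,
but the relative md5/sha1/sha256/sha512 ordering should be comparable:

```python
# Rough analogue of `openssl speed md5 sha1 sha256 sha512` using
# Python's hashlib. This is only a sketch for measuring relative
# digest throughput, not a rigorous benchmark.
import hashlib
import time

def digest_throughput(algo, block_size=8192, min_time=0.25):
    """Return approximate digest throughput in MB/s for one algorithm."""
    block = b"\x00" * block_size
    h = hashlib.new(algo)
    processed = 0
    start = time.perf_counter()
    # Hash fixed-size blocks until we've run for at least min_time seconds.
    while time.perf_counter() - start < min_time:
        h.update(block)
        processed += block_size
    elapsed = time.perf_counter() - start
    return processed / elapsed / 1e6

if __name__ == "__main__":
    for algo in ("md5", "sha1", "sha256", "sha512"):
        print(f"{algo:8s} {digest_throughput(algo):10.1f} MB/s")
```

On most builds hashlib delegates to the system OpenSSL, so for large
block sizes this roughly tracks what `openssl speed` reports.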
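
On the question of whether a connection can always be determined to be
to the localhost: one possible check is to classify the peer address of
the established socket. A minimal sketch in Python (illustrative only --
not Kudu's actual RPC code, and the is_loopback_connection helper is
hypothetical):

```python
# Sketch: decide whether an established TCP connection is local by
# checking whether the peer address is a loopback address
# (127.0.0.0/8 or ::1). A real implementation might also compare the
# peer address against the host's own interface addresses.
import ipaddress
import socket

def is_loopback_connection(sock):
    """True if the connected peer is a loopback address."""
    peer_host = sock.getpeername()[0]
    return ipaddress.ip_address(peer_host).is_loopback

if __name__ == "__main__":
    # Demo: loopback server + client on an ephemeral port.
    server = socket.socket()
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    client = socket.create_connection(server.getsockname())
    conn, _ = server.accept()
    print(is_loopback_connection(client), is_loopback_connection(conn))
    for s in (client, conn, server):
        s.close()
```

Note this only establishes that the peer is on the same machine; it
says nothing about *which* local process it is, which is why TLS
authentication during negotiation is still wanted.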