[ 
https://issues.apache.org/jira/browse/IMPALA-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil resolved IMPALA-6128.
-----------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 2.11.0

Commit in:
https://github.com/apache/incubator-impala/commit/fb4c3b01240d8f65fc2c45bf27b668ae9b1fa5d2

> Spill-to-disk Encryption(AES-CFB + SHA256) can be a performance bottleneck 
> while IO is getting faster
> -----------------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-6128
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6128
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>            Reporter: Xianda Ke
>              Labels: perf
>             Fix For: Impala 2.11.0
>
>
> Currently, Impala's encryption(AES-CFB + SHA256 - see 
> be/src/util/openssl-util.h) can be a bottleneck while IO is getting faster.
> The throughput of AES-CFB + SHA256 is about ~200~300MB/s, while nowadays' SSD 
> throughput can be up to GB/s. for instance, the read throughput is ~2600MB/s 
> in Intel's DC P3600, and write throughput is 1700MB/s. And the coming Intel's 
> Optance is getting more faster.  
> If the customers who care about security and turn on the flag.  Shuffle temp 
> file can be a performance bottleneck.  if we replace CFB+SHA256 with AES-GCM, 
>  Encryption/Decryption can be ~10x faster.
> h2. Brief introduction to AES-CTR & AES-GCM
> Confidentiality Modes: CFB & CTR
> * Both are Stream Ciphers
> * provable-security when use different nonce/IV for every message
> But, CTR has its advantages:
> * Hardware efficiency on an x86
> * Random-access
> * encryption/description
> The CTR mode can be parallelized in instruction level(ILP), it is about 4~6 
> times faster than CFB on x86 platform. its implementation is well-optimized 
> in OpenSSL or JVM on x86 platform.
> "It is hard to think of any modern, bulk-privacy application scenario where 
> any of the “original
> four” blockcipher modes—ECB, CBC, CFB, or OFB—make more sense than CTR."
> --Phillip Rogaway
> Confidentiality + Integrity
> AES-GCM is a relatively new standard (2008).  It is a combination of CTR and 
> GMAC. GCM has both encryption and message integrity. AES-GCM was fully 
> supported since OpenSSL 1.0.1d. Intel has added a carry-less-multiplication 
> instruction (PCLMULQDQ) since Westmere.
> *  GCM is already widely used.
> *  provable-security, it is fragile only if you re-use an IV like CTR/CFB.
> GCM is a very fast but arguably complex combination of CTR mode and GHASH. 
> Luckily, we don't have to implement it. The well-optimized 
> implementation(Prof. Shay Gueron's algorithm) with hardware acceleration(AES 
> & PCLMULQDQ) has been adopted in OpenSSL, Linux, go language...
> References:
> [AES-GCM for Efficient Authenticated Encryption –
> Ending the Reign of HMAC-SHA-1? 
> ](https://crypto.stanford.edu/RealWorldCrypto/slides/gueron.pdf)
> [Evaluation of Some Blockcipher Modes of 
> Operation](http://web.cs.ucdavis.edu/~rogaway/papers/modes.pdf)
> h2. mirco-benchmark
> Here is the mirco-benchmark on my desktop(Memory 16G, CPU: i5-4590 CPU @ 
> 3.30GHz):
> {code}
> OpenSSL 1.0.2g,
> OpenSSL CTR Encryption (Total=1024MB, key=256bits, Chunk= 16KB)       
> throughput= 3202.58MB/s.
> OpenSSL CTR Encryption (Total=1024MB, key=256bits, Chunk=  1MB)       
> throughput= 3241.76MB/s.
> OpenSSL CTR Decryption (Total=1024MB, key=256bits, Chunk= 16KB)       
> throughput= 3199.91MB/s.
> OpenSSL CTR Decryption (Total=1024MB, key=256bits, Chunk=  1MB)       
> throughput= 3231.22MB/s.
> OpenSSL CFB Encryption (Total=1024MB, key=256bits, Chunk= 16KB)       
> throughput= 427.07MB/s.
> OpenSSL CFB Encryption (Total=1024MB, key=256bits, Chunk=  1MB)       
> throughput= 423.92MB/s.
> OpenSSL CFB Decryption (Total=1024MB, key=256bits, Chunk= 16KB)       
> throughput= 425.87MB/s.
> OpenSSL CFB Decryption (Total=1024MB, key=256bits, Chunk=  1MB)       
> throughput= 423.44MB/s.
> OpenSSL SHA256 Encryption (Total=64MB, key=256bits, Chunk= 64KB)      
> throughput= 449.48MB/s.
> OpenSSL SHA256 Encryption (Total=1024MB, key=256bits, Chunk=  1MB)    
> throughput= 446.63MB/s.
> OpenSSL GCM Encryption (Total=1024MB, key=256bits, Chunk= 16KB)       
> throughput= 2340.80MB/s.
> OpenSSL GCM Encryption (Total=1024MB, key=256bits, Chunk=  1MB)       
> throughput= 2366.55MB/s.
> OpenSSL CFB+SHA256 Encryption (Total=1024MB, key=256bits, Chunk= 16KB)        
> throughput= 218.77MB/s.
> OpenSSL CFB+SHA256 Encryption (Total=1024MB, key=256bits, Chunk=  1MB)        
> throughput= 220.53MB/s.
> OpenSSL CFB+SHA256 Decryption (Total=1024MB, key=256bits, Chunk= 16KB)        
> throughput= 219.10MB/s.
> OpenSSL CFB+SHA256 Decryption (Total=1024MB, key=256bits, Chunk=  1MB)        
> throughput= 219.92MB/s.
> {code}
> We can see that GCM is *~10 times* faster than CFB+SHA256
> h2. Solutions
> Option A: if replace CFB+SHA256 with AES-GCM. Encryption/Decryption can be 
> ~10x faster.
> Option B: Just replace CFB with CTR, it is very simple, and ~70% performance 
> gain.
> folks, any comments? I will upload the patches soon.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to