Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/23254 )

Change subject: IMPALA-12108: Add support for LZ4 high compression levels
......................................................................

IMPALA-12108: Add support for LZ4 high compression levels

LZ4 has a high compression mode that gets higher compression ratios
(at the cost of higher compression time) while maintaining the fast
decompression speed. This type of compression would be useful for
workloads that write data once and read it many times.

This adds support for specifying a compression level for the
LZ4 codec. Compression level 1 is the current fast API. Compression
levels between LZ4HC_CLEVEL_MIN (3) and LZ4HC_CLEVEL_MAX (12) use
the high compression API. This matches the behavior of the lz4
command-line tool.
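
A minimal sketch of the level-to-API mapping described above, not the code
from this change. SelectLz4Api, Lz4Api and the k* constants are hypothetical
names used only for illustration; the constants mirror LZ4HC_CLEVEL_MIN and
LZ4HC_CLEVEL_MAX from lz4hc.h so that the sketch compiles without the lz4
headers. Treating every other level (including 2) as unsupported is an
assumption, since the description above does not say how such values are
handled.

```cpp
// Assumed constants: level 1 is the fast API; 3..12 mirror
// LZ4HC_CLEVEL_MIN / LZ4HC_CLEVEL_MAX from lz4hc.h.
constexpr int kLz4FastLevel = 1;
constexpr int kLz4HcLevelMin = 3;
constexpr int kLz4HcLevelMax = 12;

// Hypothetical enum naming which LZ4 entry point a level would use.
enum class Lz4Api { kFast, kHighCompression, kUnsupported };

// Maps a requested compression level to an LZ4 API family.
// In real code, kFast would correspond to LZ4_compress_default() and
// kHighCompression to LZ4_compress_HC(..., level).
Lz4Api SelectLz4Api(int level) {
  if (level == kLz4FastLevel) return Lz4Api::kFast;
  if (level >= kLz4HcLevelMin && level <= kLz4HcLevelMax) {
    return Lz4Api::kHighCompression;
  }
  // Assumption: anything else (0, 2, 13, negatives) is rejected.
  return Lz4Api::kUnsupported;
}
```

Keeping the mapping in one small function makes the boundary cases (2 and 13)
easy to cover in a unit test.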

TPC-H 42 scale comparison
Compression codec | Avg Time (s) | Geomean Time (s) | Lineitem Size (GB) | Compression time for lineitem (s)
------------------+--------------+------------------+--------------------+----------------------------------
Snappy            | 2.75         | 2.08             | 8.76               |   7.436
LZ4 level 1       | 2.58         | 1.91             | 9.1                |   6.864
LZ4 level 3       | 2.58         | 1.93             | 7.9                |  43.918
LZ4 level 9       | 2.68         | 1.98             | 7.6                | 125.0
Zstd level 3      | 3.03         | 2.31             | 6.36               |  17.274
Zstd level 6      | 3.10         | 2.38             | 6.33               |  44.955

LZ4 level 3 is about 10% smaller in data size while matching the query
times of regular LZ4 (level 1). Its compression time for lineitem is about
the same as that of Zstd level 6.

Testing:
 - Ran perf-AB-test with lz4 high compression levels
 - Added test cases to decompress-test

Change-Id: Ie7470ce38b8710c870cacebc80bc02cf5d022791
Reviewed-on: http://gerrit.cloudera.org:8080/23254
Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
---
M be/src/service/query-options-test.cc
M be/src/util/codec.cc
M be/src/util/compress.cc
M be/src/util/compress.h
M be/src/util/decompress-test.cc
5 files changed, 94 insertions(+), 13 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/23254

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ie7470ce38b8710c870cacebc80bc02cf5d022791
Gerrit-Change-Number: 23254
Gerrit-PatchSet: 10
Gerrit-Owner: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Surya Hebbar <sheb...@cloudera.com>