Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/23254 )
Change subject: IMPALA-12108: Add support for LZ4 high compression levels
......................................................................

IMPALA-12108: Add support for LZ4 high compression levels

LZ4 has a high compression mode that achieves higher compression
ratios (at the cost of longer compression time) while keeping
LZ4's fast decompression speed. This kind of compression is useful
for workloads that write data once and read it many times.

This adds support for specifying a compression level for the LZ4
codec. Compression level 1 is the current fast API. Compression
levels between LZ4HC_CLEVEL_MIN (3) and LZ4HC_CLEVEL_MAX (12) use
the high compression API. This matches the behavior of the lz4
command-line tool.

TPC-H 42 scale comparison:
Compression codec | Avg Time (s) | Geomean Time (s) | Lineitem Size (GB) | Compression time for lineitem (s)
------------------+--------------+------------------+--------------------+----------------------------------
Snappy            | 2.75         | 2.08             | 8.76               | 7.436
LZ4 level 1       | 2.58         | 1.91             | 9.1                | 6.864
LZ4 level 3       | 2.58         | 1.93             | 7.9                | 43.918
LZ4 level 9       | 2.68         | 1.98             | 7.6                | 125.0
Zstd level 3      | 3.03         | 2.31             | 6.36               | 17.274
Zstd level 6      | 3.10         | 2.38             | 6.33               | 44.955

With LZ4 level 3, lineitem is about 13% smaller than with LZ4 level 1
(7.9 GB vs. 9.1 GB), while query times are about the same as with
regular LZ4. Its compression time for lineitem is close to that of
Zstd level 6 (43.918 s vs. 44.955 s).
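
The level-to-API mapping described above can be sketched as follows.
This is a minimal illustration, not the code from this change: the
names SelectLz4Api, Lz4Api, and the k-prefixed constants are
hypothetical, the numeric bounds are copied from the commit message
rather than taken from <lz4hc.h>, and rejecting level 2 and
out-of-range levels is an assumption, since the message does not say
how those are handled.

```cpp
#include <cassert>
#include <optional>

// Values quoted in the commit message. A real build would take the
// high-compression bounds from <lz4hc.h>; they are mirrored here so the
// sketch has no external dependency.
constexpr int kLz4FastLevel = 1;    // level 1: existing fast API
constexpr int kLz4HcLevelMin = 3;   // mirrors LZ4HC_CLEVEL_MIN
constexpr int kLz4HcLevelMax = 12;  // mirrors LZ4HC_CLEVEL_MAX

// Which LZ4 API family a compression level selects (hypothetical type).
enum class Lz4Api { kFast, kHighCompression };

// Maps a requested level to an API family. Returns std::nullopt for any
// level the commit message does not describe (assumption: such levels
// are rejected rather than clamped).
constexpr std::optional<Lz4Api> SelectLz4Api(int level) {
  if (level == kLz4FastLevel) return Lz4Api::kFast;
  if (level >= kLz4HcLevelMin && level <= kLz4HcLevelMax) {
    return Lz4Api::kHighCompression;
  }
  return std::nullopt;
}

// Small self-check covering the boundaries named in the commit message.
inline void CheckLz4LevelMapping() {
  assert(SelectLz4Api(1) == Lz4Api::kFast);
  assert(SelectLz4Api(3) == Lz4Api::kHighCompression);
  assert(SelectLz4Api(12) == Lz4Api::kHighCompression);
  assert(!SelectLz4Api(2).has_value());   // not described in the message
  assert(!SelectLz4Api(13).has_value());  // above the quoted maximum
}
```

A caller would branch on the returned value to pick the fast or the
high compression code path, and report an error to the user when no
value is returned.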

Testing:
 - Ran perf-AB-test with lz4 high compression levels
 - Added test cases to decompress-test

Change-Id: Ie7470ce38b8710c870cacebc80bc02cf5d022791
Reviewed-on: http://gerrit.cloudera.org:8080/23254
Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
---
M be/src/service/query-options-test.cc
M be/src/util/codec.cc
M be/src/util/compress.cc
M be/src/util/compress.h
M be/src/util/decompress-test.cc
5 files changed, 94 insertions(+), 13 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/23254
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ie7470ce38b8710c870cacebc80bc02cf5d022791
Gerrit-Change-Number: 23254
Gerrit-PatchSet: 10
Gerrit-Owner: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Surya Hebbar <sheb...@cloudera.com>