[ https://issues.apache.org/jira/browse/IMPALA-11603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17629238#comment-17629238 ]
Joe McDonnell commented on IMPALA-11603:
----------------------------------------

Cloudflare zlib does have a nice performance boost over regular zlib for ORC with deflate compression:
{noformat}
+----------+-------------------+---------+------------+------------+----------------+
| Workload | File Format       | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+-------------------+---------+------------+------------+----------------+
| TPCH(42) | orc / def / block | 4.48    | -4.72%     | 3.63       | -4.86%         |
+----------+-------------------+---------+------------+------------+----------------+{noformat}
[https://jenkins.impala.io/job/perf-AB-test/375/artifact/Impala/perf_results/latest/performance_result.txt]

Diving into the profiles would get us the exact decompression speedup.

> Investigate using cloudflare's zlib library
> -------------------------------------------
>
>                 Key: IMPALA-11603
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11603
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>   Affects Versions: Impala 4.2.0
>            Reporter: Joe McDonnell
>            Priority: Major
>
> Amazon recommends the use of cloudflare's zlib implementation at
> [https://github.com/cloudflare/zlib]
> In a blog post, they claim pretty large performance boosts over the regular
> zlib implementation:
> [https://aws.amazon.com/blogs/opensource/improving-zlib-cloudflare-and-comparing-performance-with-other-zlib-forks/]
> {noformat}
> On Arm:
> Compression performance: ~90 percent faster than zlib-madler (original
> zlib).
> Decompression performance: ~52 percent faster than zlib-madler.
> On x86:
> Compression performance: ~113 percent faster than zlib-madler.
> Decompression performance: ~44 percent faster than zlib-madler.{noformat}
> The blog post is a year and a half old, so things may have changed since
> then, but it seems interesting. Amazon's guidebooks still recommend it.
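
As a rough way to look at deflate speed in isolation, outside a full query run, the sketch below times compress and decompress round trips through Python's standard zlib module, which uses whichever zlib build the interpreter is linked against. The names measure_deflate and percent_delta, the sample payload, and the definition of the delta as (candidate - baseline) / baseline are illustrative assumptions and are not taken from Impala's perf framework.

```python
import time
import zlib


def measure_deflate(payload: bytes, level: int = 6, repeats: int = 20) -> dict:
    """Time zlib compress/decompress over `payload` and report throughput."""
    compressed = zlib.compress(payload, level)

    start = time.perf_counter()
    for _ in range(repeats):
        zlib.compress(payload, level)
    compress_s = max(time.perf_counter() - start, 1e-9)

    restored = b""
    start = time.perf_counter()
    for _ in range(repeats):
        restored = zlib.decompress(compressed)
    decompress_s = max(time.perf_counter() - start, 1e-9)

    # Guard against measuring a broken round trip.
    assert restored == payload

    total_mb = len(payload) * repeats / 1e6  # uncompressed megabytes processed
    return {
        "zlib_runtime_version": zlib.ZLIB_RUNTIME_VERSION,
        "ratio": len(payload) / len(compressed),
        "compress_mb_s": total_mb / compress_s,
        "decompress_mb_s": total_mb / decompress_s,
    }


def percent_delta(baseline: float, candidate: float) -> float:
    """Relative change in percent; negative means the candidate is lower."""
    return 100.0 * (candidate - baseline) / baseline


if __name__ == "__main__":
    # Hypothetical payload; real ORC stripes will compress differently.
    sample = b"impala orc deflate block " * 200_000
    for key, value in measure_deflate(sample).items():
        print(f"{key}: {value}")
```

Running the same script under two interpreters (or environments) linked against different zlib builds and feeding the two timings to percent_delta gives a comparable relative figure, though a microbenchmark like this will not match the decompression time recorded in query profiles.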