[ https://issues.apache.org/jira/browse/IMPALA-11603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17629238#comment-17629238 ]

Joe McDonnell commented on IMPALA-11603:
----------------------------------------

Cloudflare zlib does give a nice end-to-end performance boost over regular 
zlib for ORC with deflate compression (a bit under 5% lower average and 
geomean query times on TPCH):
{noformat}
+----------+-------------------+---------+------------+------------+----------------+
| Workload | File Format       | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+-------------------+---------+------------+------------+----------------+
| TPCH(42) | orc / def / block | 4.48    | -4.72%     | 3.63       | -4.86%         |
+----------+-------------------+---------+------------+------------+----------------+
{noformat}
[https://jenkins.impala.io/job/perf-AB-test/375/artifact/Impala/perf_results/latest/performance_result.txt]
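
For readers unfamiliar with the Delta columns, here is a minimal sketch of how such figures are typically derived from per-query runtimes. It assumes the delta is the relative change of the new build against the baseline, so a negative value means faster; the exact formula used by the perf report is not stated in this comment. The function names and the sample timings are hypothetical and are not taken from the Jenkins run.

```python
import math


def mean(values):
    """Arithmetic mean of a non-empty list of runtimes (seconds)."""
    return sum(values) / len(values)


def geomean(values):
    """Geometric mean, computed via logs to avoid overflow on long lists."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def relative_delta(new, base):
    """Relative change in percent; negative means the new build is faster.

    Assumption: this mirrors what Delta(Avg) / Delta(GeoMean) report.
    """
    return (new - base) / base * 100.0


def summarize(base_times, new_times):
    """Summarize an A/B run in the same shape as the table columns."""
    return {
        "avg": mean(new_times),
        "delta_avg": relative_delta(mean(new_times), mean(base_times)),
        "geomean": geomean(new_times),
        "delta_geomean": relative_delta(geomean(new_times), geomean(base_times)),
    }


# Hypothetical per-query runtimes in seconds (not real measurements).
base_times = [2.0, 4.0, 8.0]
new_times = [1.9, 3.8, 7.6]
result = summarize(base_times, new_times)
print(result)
```

The geometric mean is less dominated by the slowest queries than the average, which is why the two deltas can differ slightly in a real run.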

These are end-to-end query times; diving into the query profiles would get us 
the exact decompression speedup.
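
As a rough illustration of isolating decompression cost outside of Impala, the sketch below times deflate decompression with Python's standard `zlib` module. It is not the Impala profile analysis mentioned above, and it measures whichever zlib build the Python interpreter is linked against; comparing stock zlib with the Cloudflare fork this way would require running it against each build, which is an assumption about your environment. The helper name and the sample payload are hypothetical.

```python
import time
import zlib


def decompress_throughput(payload: bytes, repeats: int = 20) -> float:
    """Return decompression throughput in MB/s of uncompressed output."""
    compressed = zlib.compress(payload, 6)
    # Sanity check: the round trip must reproduce the input exactly.
    assert zlib.decompress(compressed) == payload

    start = time.perf_counter()
    for _ in range(repeats):
        zlib.decompress(compressed)
    # Guard against a zero interval on very small inputs.
    elapsed = max(time.perf_counter() - start, 1e-9)

    return len(payload) * repeats / elapsed / 1e6


if __name__ == "__main__":
    # Hypothetical, mildly compressible payload of a few hundred KB.
    sample = b"".join(str(i).encode() for i in range(100_000))
    print(f"{decompress_throughput(sample):.1f} MB/s")
```

Running the same script against two zlib builds and comparing the printed throughput gives a decompression-only ratio, which is expected to be larger than the end-to-end query delta because decompression is only part of the query time.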

> Investigate using cloudflare's zlib library
> -------------------------------------------
>
>                 Key: IMPALA-11603
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11603
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 4.2.0
>            Reporter: Joe McDonnell
>            Priority: Major
>
> Amazon recommends the use of cloudflare's zlib implementation at 
> [https://github.com/cloudflare/zlib]
> In a blog post, they claim pretty large performance boosts over the regular 
> zlib implementation:
> [https://aws.amazon.com/blogs/opensource/improving-zlib-cloudflare-and-comparing-performance-with-other-zlib-forks/]
> {noformat}
> On Arm:
>   Compression performance: ~90 percent faster than zlib-madler (original zlib).
>   Decompression performance: ~52 percent faster than zlib-madler.
> On x86:
>   Compression performance: ~113 percent faster than zlib-madler.
>   Decompression performance: ~44 percent faster than zlib-madler.{noformat}
> The blog post is a year and a half old, so things may have changed since 
> then, but it seems interesting. Amazon's guidebooks still recommend it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
