[ https://issues.apache.org/jira/browse/IMPALA-12076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joe McDonnell resolved IMPALA-12076.
------------------------------------
    Fix Version/s: Impala 4.4.0
         Assignee: Joe McDonnell
       Resolution: Fixed

> Potential performance improvement using ZSTD's ZSTD_decompressDCtx interface
> ----------------------------------------------------------------------------
>
>                 Key: IMPALA-12076
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12076
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 4.3.0
>            Reporter: Joe McDonnell
>            Assignee: Joe McDonnell
>            Priority: Major
>             Fix For: Impala 4.4.0
>
>
> In ORC-639, they note that ZSTD's simple interface initializes the context on
> each call to ZSTD_decompress(). When calling ZSTD_decompress() many times, it
> is better to allocate the context once and use the ZSTD_decompressDCtx()
> interface to avoid the repeated initialization.
> The ZSTD code mentions that here:
>
> {noformat}
> /*= Decompression context
>  *  When decompressing many times,
>  *  it is recommended to allocate a context only once,
>  *  and re-use it for each successive compression operation.
>  *  This will make workload friendlier for system's memory.
>  *  Use one context per thread for parallel execution. */
> typedef struct ZSTD_DCtx_s ZSTD_DCtx;{noformat}
> We should investigate using this for decompress.h/.cc's
> ZstandardDecompressor. We already do that for the streaming decompression
> mode, but this should also apply to block decompression. Something similar is
> possible for compression as well.
>