[
https://issues.apache.org/jira/browse/LUCENE-9919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17319623#comment-17319623
]
Michael McCandless commented on LUCENE-9919:
--------------------------------------------
How about just building a custom {{Codec}} using a pure Java ZSTD implementation
instead of JDK DEFLATE, and then running our standard {{luceneutil}} benchmarks?
This is a great example use-case of why we built the {{Codec}} API in the first
place – to enable fun experimentation exactly like this!
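
A minimal sketch of what such an experiment could compare, using only the JDK so
it runs as-is. The {{BlockCompressor}} interface and {{DeflateCompressor}} class
are hypothetical names for illustration, not Lucene APIs; in Lucene itself the
corresponding extension points would be {{CompressionMode}}, {{Compressor}} and
{{Decompressor}}, wired in through a custom {{Codec}} (e.g. via {{FilterCodec}}).
A pure Java ZSTD implementation would be a second implementation of the same
interface, measured against the DEFLATE baseline below:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class CompressorExperiment {

    /** Hypothetical seam: a ZSTD-backed implementation would plug in here. */
    interface BlockCompressor {
        byte[] compress(byte[] input);
        byte[] decompress(byte[] compressed, int originalLength);
    }

    /** Baseline built on JDK DEFLATE (java.util.zip). */
    static final class DeflateCompressor implements BlockCompressor {
        private final int level;

        DeflateCompressor(int level) {
            this.level = level;
        }

        @Override
        public byte[] compress(byte[] input) {
            Deflater deflater = new Deflater(level);
            try {
                deflater.setInput(input);
                deflater.finish();
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                while (!deflater.finished()) {
                    int n = deflater.deflate(buf);
                    out.write(buf, 0, n);
                }
                return out.toByteArray();
            } finally {
                deflater.end(); // release native zlib resources
            }
        }

        @Override
        public byte[] decompress(byte[] compressed, int originalLength) {
            Inflater inflater = new Inflater();
            try {
                inflater.setInput(compressed);
                byte[] out = new byte[originalLength];
                int off = 0;
                while (off < originalLength && !inflater.finished()) {
                    int n = inflater.inflate(out, off, originalLength - off);
                    if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                        throw new IllegalStateException("truncated or unexpected input");
                    }
                    off += n;
                }
                return out;
            } catch (DataFormatException e) {
                throw new IllegalStateException(e);
            } finally {
                inflater.end();
            }
        }
    }

    /** Toy, highly repetitive input; real measurements need a real corpus. */
    static byte[] sampleData() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            sb.append("doc ").append(i).append(": stored field value for a Lucene document\n");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] data = sampleData();
        BlockCompressor baseline = new DeflateCompressor(Deflater.DEFAULT_COMPRESSION);
        byte[] packed = baseline.compress(data);
        byte[] restored = baseline.decompress(packed, data.length);
        System.out.println("original bytes:   " + data.length);
        System.out.println("compressed bytes: " + packed.length);
        System.out.println("round trip ok:    " + Arrays.equals(data, restored));
    }
}
```

This only checks round-trip correctness and compressed size on a toy input;
indexing and search numbers would still have to come from {{luceneutil}}.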
> ZSTD Compressor/Decompressor support in Lucene
> ----------------------------------------------
>
> Key: LUCENE-9919
> URL: https://issues.apache.org/jira/browse/LUCENE-9919
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/codecs
> Reporter: Praveen Nishchal
> Priority: Major
> Labels: compression, lucene, zstandard
>
> Lucene currently supports LZ4 and Zlib compression/decompression for
> StoredFieldsFormat, DocValuesFormat, TermVectorsFormat and PostingsFormat
> codecs. We propose Zstandard ([https://facebook.github.io/zstd/])
> compression/decompression for all of the codecs mentioned above, for the
> following reasons:
> * Zstandard is used in some of the most popular open source projects,
> such as Apache Cassandra, Hadoop and Kafka.
> * Zstandard, at its default level of 3, is expected to show substantial
> improvements in both compression and decompression speed while compressing
> at the same ratio as zlib, according to a study cited by Yann Collet at Facebook.
> * Zstandard currently offers 22 different compression levels, which enable
> flexible, granular trade-offs between compression speed and ratios for future
> data. For example, we can use level 1 if speed is most important and level 22
> if size is most important.
> * Zstandard is designed to scale with modern hardware.
> * Small data: Zstandard also has APIs for dictionary compression. With a
> dictionary, small-data compression can be anywhere from 2x to 5x better than
> compression without one.
> * Zstandard is being continuously improved by Facebook/Community.
>
> Please see the link below for more details:
> [https://engineering.fb.com/2016/08/31/core-data/smaller-and-faster-data-compression-with-zstandard/]
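
To make the dictionary point in the description concrete, here is a small
sketch of dictionary compression on a tiny record. It uses JDK DEFLATE's preset
dictionary ({{Deflater.setDictionary}} / {{Inflater.setDictionary}}) purely as a
stand-in, because the JDK ships no Zstandard; the class name, sample record and
dictionary are made up for illustration, and the sizes it prints say nothing
about what ZSTD dictionaries would achieve:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class DictionarySketch {

    /** Hypothetical dictionary: a record shaped like the ones we expect to compress. */
    static byte[] sampleDictionary() {
        return ("{\"project\":\"Lucene - Core\",\"type\":\"Bug\","
                + "\"component\":\"core/index\",\"priority\":\"Minor\"}")
                .getBytes(StandardCharsets.UTF_8);
    }

    /** Hypothetical small document sharing field names and some values with the dictionary. */
    static byte[] sampleDocument() {
        return ("{\"project\":\"Lucene - Core\",\"type\":\"Improvement\","
                + "\"component\":\"core/codecs\",\"priority\":\"Major\"}")
                .getBytes(StandardCharsets.UTF_8);
    }

    /** Compresses with DEFLATE; dict may be null to compress without a preset dictionary. */
    static byte[] compress(byte[] input, byte[] dict) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        try {
            if (dict != null) {
                deflater.setDictionary(dict); // must happen before any data is deflated
            }
            deflater.setInput(input);
            deflater.finish();
            byte[] buf = new byte[input.length + 64]; // ample for these small inputs
            int n = deflater.deflate(buf);
            if (!deflater.finished()) {
                throw new IllegalStateException("output buffer too small");
            }
            return Arrays.copyOf(buf, n);
        } finally {
            deflater.end();
        }
    }

    /** Decompresses, supplying the dictionary when the stream asks for one. */
    static byte[] decompress(byte[] compressed, byte[] dict, int originalLength) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            byte[] out = new byte[originalLength];
            int off = 0;
            while (off < originalLength) {
                int n = inflater.inflate(out, off, originalLength - off);
                if (n == 0) {
                    if (inflater.needsDictionary() && dict != null) {
                        inflater.setDictionary(dict);
                    } else {
                        throw new IllegalStateException("corrupt or truncated input");
                    }
                }
                off += n;
            }
            return out;
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] doc = sampleDocument();
        byte[] dict = sampleDictionary();
        byte[] plain = compress(doc, null);
        byte[] withDict = compress(doc, dict);
        System.out.println("original bytes:        " + doc.length);
        System.out.println("without dictionary:    " + plain.length);
        System.out.println("with dictionary:       " + withDict.length);
        System.out.println("round trip ok:         "
                + Arrays.equals(doc, decompress(withDict, dict, doc.length)));
    }
}
```

The same dictionary has to be available at decompression time, which is the
main design cost of this approach regardless of the compression library.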