[ https://issues.apache.org/jira/browse/LUCENE-8739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17468498#comment-17468498 ]
Adrien Grand commented on LUCENE-8739:
--------------------------------------

bq. Would such an increase even make sense or would this cause other issues?

It would require reading more data from disk. This read would be sequential, so I suspect it wouldn't hurt much, including on slower I/O. The main drawback is probably that it would trash a bit more of the filesystem cache.

That said, I agree with you that we should probably look into increasing the block size with ZStandard. I just did a run with 1.5x larger blocks and level=6; it slightly outperforms our current BEST_COMPRESSION mode across indexing time, disk usage and compression.

||Codec ||Indexing time (ms) ||Disk usage (MB) ||Retrieval time per 10k docs (ms) ||
| ZSTD dict level=6 1.5x larger blocks | 43228 | 57.455 | 1269.22127 |

bq. Or would 3 presets be too much choice?

IMO it would be too much, but I like the fact that ZSTD could help us have two options for compression that share the exact same read logic, e.g. if we replaced BEST_SPEED with what you suggested for BALANCED: low-level ZSTD compression with a small block size.

bq. Anyway I see potential for good tradeoffs here.

+1, ZSTD is quite great. I wouldn't use it in the Lucene default codec yet, because lucene-core shouldn't have dependencies and we don't want to use JNI in the lucene-core build. Maybe we can reconsider when Project Panama lands and it gets easier to interact with native libraries.

> ZSTD Compressor support in Lucene
> ---------------------------------
>
> Key: LUCENE-8739
> URL: https://issues.apache.org/jira/browse/LUCENE-8739
> Project: Lucene - Core
> Issue Type: New Feature
> Components: core/codecs
> Reporter: Sean Torres
> Priority: Minor
> Labels: features
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> ZStandard has a great speed and compression ratio tradeoff.
> ZStandard is open source compression from Facebook.
> More about ZSTD
> [https://github.com/facebook/zstd]
> [https://code.facebook.com/posts/1658392934479273/smaller-and-faster-data-compression-with-zstandard/]