[ https://issues.apache.org/jira/browse/LUCENE-9843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17338626#comment-17338626 ]
Jack Conradson commented on LUCENE-9843:
----------------------------------------
I have attached a patch ([^LUCENE-9843.patch]) that attempts to address this
issue using the suggestions given by [~rcmuir].
The patch does the following:
* Removes the best_speed/best_compression mode for doc values only from the
Lucene90Codec
* Makes terms dictionaries always use compression unless the values are below
the TERMS_DICT_BLOCK_COMPRESSION_THRESHOLD
* Makes binary fields never use compression
* Consolidates many tests into TestLucene90DocValuesFormat, as there is no
longer a need for separate tests for the different options
* Removes tests that relied on comparing best_speed and best_compression
against each other
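
To illustrate the terms dictionary rule in the list above, here is a minimal sketch of a size-gated compression decision. It is not the code from the patch: the class name, the threshold value of 32, and the reading of "below the threshold" as "fewer bytes than the threshold" are all assumptions made for illustration, and java.util.zip.Deflater is only a stand-in for whatever compressor the format actually uses.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

// Hypothetical helper, not part of Lucene.
final class TermsDictBlockSketch {
    // Hypothetical value for illustration only; see the patch for the real constant.
    static final int TERMS_DICT_BLOCK_COMPRESSION_THRESHOLD = 32;

    // Assumption: "below the threshold" means strictly fewer bytes than the threshold.
    static boolean shouldCompress(int blockLength) {
        return blockLength >= TERMS_DICT_BLOCK_COMPRESSION_THRESHOLD;
    }

    // Returns a copy of the block unchanged when it is too small to be worth
    // compressing, otherwise the Deflater-compressed bytes (stand-in compressor).
    static byte[] encodeBlock(byte[] block) {
        if (!shouldCompress(block.length)) {
            return block.clone();
        }
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(block);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[256];
            // Drain the compressor until all input has been consumed.
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }
}
```

A real format would also need to record whether each block was compressed so that readers can decode it; that bookkeeping is omitted here.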
> Remove compression option on doc values
> ---------------------------------------
>
> Key: LUCENE-9843
> URL: https://issues.apache.org/jira/browse/LUCENE-9843
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Priority: Blocker
> Attachments: LUCENE-9843.patch
>
>
> Options on file formats add complexity and put a big tax on
> backward-compatibility testing. I'm the one who introduced it in LUCENE-9378 but
> I would now like to think about what we can do to remove this option.
> For the record, compression was initially introduced because some binary
> fields have so much redundancy that it's wasteful not to compress them at
> all. But unfortunately, this slowed down some search workloads and we decided
> to introduce this option as a way to let users choose the trade-off they want.