[
https://issues.apache.org/jira/browse/LUCENE-9843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17321327#comment-17321327
]
Robert Muir commented on LUCENE-9843:
-------------------------------------
I moved this issue to a blocker for 9.0 because I've already seen multiple
instances where these compression settings are set inappropriately, and from a
back-compat perspective we need to stop the bleeding before we have to support
all these variants for a long time.
I'll summarize my proposal above again:
* Remove the option for SORTED term dictionaries and just compress always. This
does not impact the speed of per-doc ordinals.
* Remove the option for BINARY and don't compress. It is a catch-all and we
don't know the use-case. Supply a different codec if someone wants to do block
compression over binary, but avoid the back-compat hassle.
It seems the issue could easily be split into two tasks.
> Remove compression option on doc values
> ---------------------------------------
>
> Key: LUCENE-9843
> URL: https://issues.apache.org/jira/browse/LUCENE-9843
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Priority: Blocker
>
> Options on file formats add complexity and put a big tax on
> backward-compatibility testing. I'm the one who introduced it in LUCENE-9378,
> but I would now like to think about what we can do to remove this option.
> For the record, compression was initially introduced because some binary
> fields have so much redundancy that it's wasteful not to compress them at
> all. But unfortunately, this slowed down some search workloads and we decided
> to introduce this option as a way to let users choose the trade-off they want.