[
https://issues.apache.org/jira/browse/LUCENE-10170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17428077#comment-17428077
]
Adrien Grand commented on LUCENE-10170:
---------------------------------------
I attached a pull request that brings compression speed to a better level than
it was before LUCENE-7521. I'm seeing the same disk usage and retrieval times
as current main, with 18% faster indexing on geonames and 25% faster indexing
on enwiki.
It looks like none of the Lucene benchmarks caught it for three reasons: the
geo benchmarks don't use stored fields, the NYC Taxis benchmark only has a
couple of stored fields, which are not a bottleneck, and the enwiki benchmark
is bottlenecked on terms/postings and vectors, so regressions related to
points, doc values or stored fields are very unlikely to show up.
> Regression in stored fields compression in 9.0
> ----------------------------------------------
>
> Key: LUCENE-10170
> URL: https://issues.apache.org/jira/browse/LUCENE-10170
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Adrien Grand
> Priority: Minor
> Time Spent: 10m
> Remaining Estimate: 0h
>
> One of the Elasticsearch benchmarks detected a regression that is attributed
> to stored fields compression after moving to a 9.0 snapshot. I managed to
> isolate it to the change made in LUCENE-7521. It looks like LZ4 relied
> significantly on Direct16 for good performance, so replacing it with Packed64
> degraded compression speed.
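
To illustrate why the storage layout of an LZ4-style hash table can matter,
here is a minimal sketch, not Lucene code. It assumes, as the names suggest,
that Direct16 keeps one value per 16-bit slot while Packed64 packs values back
to back into longs. The class and method names below are hypothetical and
chosen only for illustration. The packed variant needs extra shifts, masks and
sometimes a second array access on every read and write, which is the kind of
per-access overhead that adds up in a compression hot loop.

```java
/** Hypothetical sketch, not Lucene code: two layouts for small unsigned values. */
final class HashTableStorageSketch {

  /** One value per short: each access is a single array load or store. */
  static final class DirectShortTable {
    private final short[] values;

    DirectShortTable(int size) {
      values = new short[size];
    }

    int get(int index) {
      return values[index] & 0xFFFF; // widen to an unsigned 16-bit value
    }

    void set(int index, int value) {
      values[index] = (short) value; // keeps the low 16 bits
    }
  }

  /** Values packed back to back into longs; a value may straddle two longs. */
  static final class PackedLongTable {
    private final long[] blocks;
    private final int bitsPerValue;
    private final long mask;

    PackedLongTable(int size, int bitsPerValue) {
      if (bitsPerValue < 1 || bitsPerValue > 31) {
        throw new IllegalArgumentException("bitsPerValue must be in [1, 31]");
      }
      this.bitsPerValue = bitsPerValue;
      this.mask = (1L << bitsPerValue) - 1;
      long totalBits = (long) size * bitsPerValue;
      this.blocks = new long[(int) ((totalBits + 63) >>> 6)];
    }

    int get(int index) {
      long bitPos = (long) index * bitsPerValue;
      int block = (int) (bitPos >>> 6);
      int shift = (int) (bitPos & 63);
      long value = blocks[block] >>> shift;
      if (shift + bitsPerValue > 64) {
        // The high bits of the value live in the next long.
        value |= blocks[block + 1] << (64 - shift);
      }
      return (int) (value & mask);
    }

    void set(int index, int value) {
      long v = value & mask;
      long bitPos = (long) index * bitsPerValue;
      int block = (int) (bitPos >>> 6);
      int shift = (int) (bitPos & 63);
      blocks[block] = (blocks[block] & ~(mask << shift)) | (v << shift);
      if (shift + bitsPerValue > 64) {
        int written = 64 - shift; // bits already stored in the first long
        blocks[block + 1] =
            (blocks[block + 1] & ~(mask >>> written)) | (v >>> written);
      }
    }
  }
}
```

Both tables expose the same get/set contract, so they are interchangeable from
the caller's point of view; only the cost per access differs. Measuring that
difference reliably would need a proper harness such as JMH rather than ad hoc
timing.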