[ 
https://issues.apache.org/jira/browse/LUCENE-9510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17195562#comment-17195562
 ] 

ASF subversion and git services commented on LUCENE-9510:
---------------------------------------------------------

Commit cdfdc1e0851478713b6f0e997bff3947cdaf98e9 in lucene-solr's branch 
refs/heads/branch_8x from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=cdfdc1e ]

LUCENE-9510: Don't pull a merge instance when flushing stored fields 
out-of-order. (#1872)

With recent changes to stored fields that split blocks into several sub
blocks, the merge instance has become much slower at random access since
it would decompress all sub blocks when accessing a document. Since
stored fields likely get accessed in random order at flush time when
index sorting is enabled, it's better not to use the merge instance.
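
The cost difference described above can be illustrated with a toy model.
This is only a sketch and not Lucene code: the class and method names are
hypothetical, and it assumes that a merge-style reader decompresses every sub
block of a block the first time it enters that block and reuses the result
while it stays there, while a regular reader decompresses a single sub block
per document access. The model counts decompressed sub blocks for a
sequential and an out-of-order visit of the same documents.

```java
final class SubBlockAccessDemo {

    /**
     * Counts decompressed sub blocks for the given visiting order.
     * wholeBlock = true models the assumed merge-style behaviour (decompress
     * all sub blocks on entering a block, reuse while staying in it);
     * wholeBlock = false models one sub block decompressed per access.
     */
    public static long subBlocksDecompressed(
            int[] docOrder, int docsPerSubBlock, int subBlocksPerBlock, boolean wholeBlock) {
        int docsPerBlock = docsPerSubBlock * subBlocksPerBlock;
        long decompressed = 0;
        int cachedBlock = -1; // block currently held decompressed, if any
        for (int doc : docOrder) {
            if (wholeBlock) {
                int block = doc / docsPerBlock;
                if (block != cachedBlock) {
                    decompressed += subBlocksPerBlock;
                    cachedBlock = block;
                }
            } else {
                decompressed += 1;
            }
        }
        return decompressed;
    }

    /** Doc IDs 0..numDocs-1 in increasing order, as during a regular merge. */
    public static int[] sequentialOrder(int numDocs) {
        int[] order = new int[numDocs];
        for (int i = 0; i < numDocs; i++) {
            order[i] = i;
        }
        return order;
    }

    /** Two blocks worth of doc IDs, alternating between the blocks on every access. */
    public static int[] alternatingOrder(int docsPerBlock) {
        int[] order = new int[2 * docsPerBlock];
        for (int i = 0; i < order.length; i++) {
            order[i] = (i % 2) * docsPerBlock + i / 2;
        }
        return order;
    }

    public static void main(String[] args) {
        int docsPerSubBlock = 2;
        int subBlocksPerBlock = 4;
        int docsPerBlock = docsPerSubBlock * subBlocksPerBlock;
        int[] seq = sequentialOrder(2 * docsPerBlock);
        int[] alt = alternatingOrder(docsPerBlock);
        System.out.println("sequential, whole block:   "
                + subBlocksDecompressed(seq, docsPerSubBlock, subBlocksPerBlock, true));  // 8
        System.out.println("sequential, per access:    "
                + subBlocksDecompressed(seq, docsPerSubBlock, subBlocksPerBlock, false)); // 16
        System.out.println("out-of-order, whole block: "
                + subBlocksDecompressed(alt, docsPerSubBlock, subBlocksPerBlock, true));  // 64
        System.out.println("out-of-order, per access:  "
                + subBlocksDecompressed(alt, docsPerSubBlock, subBlocksPerBlock, false)); // 16
    }
}
```

In this model the whole-block strategy wins for sequential access (8 versus
16 sub blocks) but does four times the work when the visiting order keeps
jumping between blocks (64 versus 16), which is the access pattern expected
at flush time with index sorting.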

On a synthetic benchmark that has one stored field and one numeric
doc-value field that is used for sorting and fed with random values,
this made indexing more than 4x faster.

> SortingStoredFieldsConsumer should use a format that has better random-access
> -----------------------------------------------------------------------------
>
>                 Key: LUCENE-9510
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9510
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> We noticed some indexing rate regressions in Elasticsearch after upgrading to 
> a new Lucene snapshot. The cause is that SortingStoredFieldsConsumer uses the 
> default codec to write stored fields on flush. Compression doesn't matter much 
> in this case, since these are temporary files that get removed on flush after 
> the segment is sorted anyway, so we could switch to a format that has faster 
> random access.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
