[
https://issues.apache.org/jira/browse/LUCENE-7396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398172#comment-15398172
]
Michael McCandless commented on LUCENE-7396:
--------------------------------------------
Thanks [~jpountz], I'm re-indexing the 1.2B taxi rides!
Can't we also use {{MutableReader.getByteAt}} when computing the cardinality of
each dim's suffix prefix byte during leaf block writing?
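
As a rough illustration of that idea (a sketch only, not the code in the patch): given some accessor that returns byte k of the i-th value, the number of distinct bytes at a given offset can be counted without copying whole values. The {{ByteAtReader}} interface, the class name and the method name below are hypothetical stand-ins; only the {{getByteAt}} name comes from the comment above, and its (i, k) signature here is an assumption.

```java
import java.util.BitSet;

public class PrefixByteCardinality {

  /** Hypothetical stand-in for a reader that exposes single bytes of packed values. */
  interface ByteAtReader {
    /** Returns byte k of the i-th value (assumed signature, for illustration). */
    byte getByteAt(int i, int k);
  }

  /**
   * Counts how many distinct byte values occur at the given byte offset
   * across the values in [from, to).
   */
  static int countDistinctBytesAt(ByteAtReader reader, int from, int to, int offset) {
    BitSet seen = new BitSet(256);
    for (int i = from; i < to; i++) {
      // Mask to 0..255 so negative bytes map to valid bit indices.
      seen.set(reader.getByteAt(i, offset) & 0xFF);
    }
    return seen.cardinality();
  }

  public static void main(String[] args) {
    // Four made-up 3-byte values standing in for one dimension of a leaf block.
    byte[][] values = { {0, 1, 5}, {0, 1, 7}, {0, 1, 5}, {0, 2, 9} };
    ByteAtReader reader = (i, k) -> values[i][k];
    // Cardinality of the byte at offset 2 across all four values.
    System.out.println(countDistinctBytesAt(reader, 0, values.length, 2));
  }
}
```

Reading one byte per value this way avoids materializing each full value just to inspect a single position.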
\{@cod mid} should be \{@code mid} in {{MutablePointsReaderUtils.partition}},
but hopefully {{ant precommit}} would catch that.
The patch looks great: +1 to commit. We can polish in follow-on issues ... I
think it's important to get the builds chewing on this great improvement.
I'll revert LUCENE-7390 once you've pushed.
> Speed up flush of 1-dimension points
> ------------------------------------
>
> Key: LUCENE-7396
> URL: https://issues.apache.org/jira/browse/LUCENE-7396
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Priority: Minor
> Attachments: LUCENE-7396.patch, LUCENE-7396.patch, LUCENE-7396.patch,
> LUCENE-7396.patch
>
>
> 1D points already have an optimized merge implementation which works when
> points come in order. So maybe we could make IndexWriter's PointValuesWriter
> sort before feeding the PointsFormat and somehow propagate the information to
> the PointsFormat?
> The benefit is that flushing could directly stream points to disk with little
> memory usage.