[jira] [Commented] (LUCENE-7396) Speed up flush of 1-dimension points

Michael McCandless (JIRA) Thu, 28 Jul 2016 13:51:36 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-7396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398172#comment-15398172
 ]


Michael McCandless commented on LUCENE-7396:
--------------------------------------------

Thanks [~jpountz], I'm re-indexing the 1.2B taxi rides!

Can't we also use {{MutableReader.getByteAt}} when computing the cardinality of 
each dim's suffix prefix byte while writing the leaf block?

\{@cod mid} should be \{@code mid} in {{MutablePointsReaderUtils.partition}}, 
but hopefully {{ant precommit}} would catch that.

The patch looks great: +1 to commit.  We can polish in follow-on issues ... I 
think it's important to get the builds chewing on this great improvement.

I'll revert LUCENE-7390 once you've pushed.

> Speed up flush of 1-dimension points
> ------------------------------------
>
>                 Key: LUCENE-7396
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7396
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-7396.patch, LUCENE-7396.patch, LUCENE-7396.patch, 
> LUCENE-7396.patch
>
>
> 1D points already have an optimized merge implementation which works when 
> points come in order. So maybe we could make IndexWriter's PointValuesWriter 
> sort before feeding the PointsFormat and somehow propagate the information to 
> the PointsFormat?
> The benefit is that flushing could directly stream points to disk with little 
> memory usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

