[
https://issues.apache.org/jira/browse/LUCENE-4688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555201#comment-13555201
]
Michael McCandless commented on LUCENE-4688:
--------------------------------------------
I think it's interesting/powerful to enable across-segment reuse: none
of our other reuse APIs (DocsEnum, D&PEnum) can do that.
But I'm not sure we should do it: taking full advantage of it
requires API changes (like the MTQ.getTermsEnum change) ... we'd have
to do something similar to Weight/Scorer to share the D&PEnum across
segments.
The patch itself is spooky: this BlockTree code is hairy, and I'm not
sure that the reset() isn't going to cause subtle corner-case bugs.
(Separately: we need to simplify this code: it's unapproachable now).
The benchmark gain is impressive, but we are talking about 10 seconds
over 2M docs, right? So 5 microseconds (0.005 msec) per document? In a
more realistic scenario (indexing more "normal" docs) surely this is a
minor part of the time ...
The app can always reuse itself per-segment today ... I think reuse is
rather expert so it's OK to offer that as the way to reuse?
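
To make the per-segment option concrete, here is a rough sketch of what
app-side reuse could look like. It is not Lucene code: Segment,
SegmentTermsEnum and PrimaryKeyLookup are hypothetical stand-ins, and
Segment.iterator(reuse) only imitates the Terms#iterator(reuse) idiom
mentioned in the issue. The app keeps one enum per segment and hands it
back on every lookup, so no cross-segment reset() is needed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class PerSegmentReuseSketch {

    /** Hypothetical stand-in for one segment's terms dictionary (not a Lucene class). */
    static final class Segment {
        final TreeSet<String> terms = new TreeSet<>();
        int enumsCreated = 0; // counts allocations so the effect of reuse is visible

        Segment(String... values) {
            terms.addAll(Arrays.asList(values));
        }

        /** Imitates iterator(reuse): return `reuse` if it belongs to this segment, else allocate. */
        SegmentTermsEnum iterator(SegmentTermsEnum reuse) {
            if (reuse != null && reuse.owner == this) {
                return reuse;
            }
            enumsCreated++;
            return new SegmentTermsEnum(this);
        }
    }

    /** Hypothetical stand-in for a terms enum bound to a single segment. */
    static final class SegmentTermsEnum {
        final Segment owner;

        SegmentTermsEnum(Segment owner) {
            this.owner = owner;
        }

        boolean seekExact(String term) {
            return owner.terms.contains(term);
        }
    }

    /** App-side cache: one enum per segment, handed back as `reuse` on every lookup. */
    static final class PrimaryKeyLookup {
        private final List<Segment> segments;
        private final SegmentTermsEnum[] cached;

        PrimaryKeyLookup(List<Segment> segments) {
            this.segments = segments;
            this.cached = new SegmentTermsEnum[segments.size()];
        }

        /** Returns the index of the first segment containing id, or -1 if none does. */
        int lookup(String id) {
            for (int i = 0; i < segments.size(); i++) {
                cached[i] = segments.get(i).iterator(cached[i]);
                if (cached[i].seekExact(id)) {
                    return i;
                }
            }
            return -1;
        }
    }

    public static void main(String[] args) {
        Segment a = new Segment("id1", "id2");
        Segment b = new Segment("id3");
        PrimaryKeyLookup lookup = new PrimaryKeyLookup(Arrays.asList(a, b));

        for (String id : new String[] {"id3", "id1", "missing", "id2"}) {
            System.out.println(id + " -> segment " + lookup.lookup(id));
        }
        System.out.println("enums created: a=" + a.enumsCreated + ", b=" + b.enumsCreated);
    }
}
```

The point of the sketch is only that the cache lives in the app, keyed by
segment, so each enum is reused strictly within the segment it came from.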
> Reuse TermsEnum in BlockTreeTermsReader
> ---------------------------------------
>
> Key: LUCENE-4688
> URL: https://issues.apache.org/jira/browse/LUCENE-4688
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/codecs
> Affects Versions: 4.0, 4.1
> Reporter: Simon Willnauer
> Fix For: 4.2, 5.0
>
> Attachments: LUCENE-4688.patch
>
>
> Opening a TermsEnum comes with a significant cost at this point if done
> frequently like primary key lookups or if many segments are present.
> Currently we don't reuse it at all and create a lot of objects even if the
> enum is just used for a single seekExact (i.e. TermQuery). Stressing the
> Terms#iterator(reuse) call shows significant gains with reuse...