[ https://issues.apache.org/jira/browse/LUCENE-8551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16688303#comment-16688303 ]
David Smiley commented on LUCENE-8551:
--------------------------------------
bq. For instance in the NRT case that means that you could have two consecutive
point-in-time views of the same index that disagree on the FieldInfo of a field?
Can you please elaborate on that? I'm unclear how NRT in particular relates.
> Purge unused FieldInfo on segment merge
> ---------------------------------------
>
> Key: LUCENE-8551
> URL: https://issues.apache.org/jira/browse/LUCENE-8551
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/index
> Reporter: David Smiley
> Priority: Major
>
> If a field is effectively unused (no norms, terms index, term vectors,
> docValues, stored value, points index), it will nonetheless hang around in
> FieldInfos indefinitely. It would be nice to be able to recognize an unused
> FieldInfo and allow it to disappear after a merge (or two).
> SegmentMerger merges FieldInfo (from each segment) as nearly the first thing
> it does. After that, it merges the different index parts, before it's known
> what's "used" or not. After writing, we theoretically know which fields are used or
> not, though we're not doing any bookkeeping to track it. Maybe we should
> track the fields used during writing so we write a filtered merged fieldInfo
> at the end instead of unfiltered up front? Or perhaps upon reading a
> segment, we make it cheap/easy for each index type (e.g. terms index, stored
> fields, ...) to know which fields have data for the corresponding type.
> Then, on a subsequent merge, we know up front to filter the FieldInfos.
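
To make the first idea in the description concrete (track which fields are actually written during a merge, then write a filtered FieldInfos at the end), here is a minimal sketch. It is not Lucene code and uses no Lucene API: UsedFieldTracker, markUsed and filter are hypothetical names invented for illustration, and fields are reduced to plain name strings. It assumes every index-part writer (terms, stored fields, docValues, ...) would report each field it writes data for, and it ignores field numbers and attributes entirely.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper (not part of Lucene): records which fields had data written during a merge. */
class UsedFieldTracker {
    // Names of fields for which at least one index part wrote data.
    private final Set<String> used = new HashSet<>();

    /** Assumed to be called by each index-part writer whenever it writes data for a field. */
    void markUsed(String fieldName) {
        used.add(fieldName);
    }

    /** Returns the merged field names that were marked used, preserving the input order. */
    List<String> filter(List<String> mergedFieldNames) {
        List<String> kept = new ArrayList<>();
        for (String name : mergedFieldNames) {
            if (used.contains(name)) {
                kept.add(name);
            }
        }
        return kept;
    }
}
```

With merged fields "title", "legacy_flag" and "body", and only "title" and "body" marked as used during writing, filter would keep "title" and "body" and drop "legacy_flag". Whether dropping a field this way is safe with respect to field numbering across segments is exactly the kind of question raised in the comments, and the sketch does not address it.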