[ 
https://issues.apache.org/jira/browse/LUCENE-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Busch updated LUCENE-2881:
----------------------------------

    Attachment: lucene-2881.patch

  * Creates a new FieldInfos for every segment
  * Changes FieldInfos so that the FieldInfo numbers within a single 
FieldInfos don't have to be contiguous. This allows a new segment to use the 
same numbering as the previous segment(s), even if not all fields are present 
in it
  * Adds a global fieldName -> fieldNumber map. When a new field is added to 
a FieldInfos, it tries, if possible, to reuse the number already assigned to 
that field
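
To make the numbering scheme above concrete, here is a minimal sketch. It is 
not taken from the patch: the class names GlobalFieldNumbers and 
SegmentFieldNumbers and all of their methods are hypothetical, and the sketch 
assumes the simple case where a globally assigned number is always free to 
reuse (it ignores the "if possible" conflicts the real code has to handle).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical global fieldName -> fieldNumber map, shared by all segments of one IW session. */
class GlobalFieldNumbers {
    private final Map<String, Integer> nameToNumber = new HashMap<>();
    private int nextNumber = 0;

    /** Returns the number already assigned to this name, or assigns the next free one. */
    synchronized int getOrAssign(String fieldName) {
        Integer number = nameToNumber.get(fieldName);
        if (number == null) {
            number = nextNumber++;
            nameToNumber.put(fieldName, number);
        }
        return number;
    }
}

/** Hypothetical per-segment field table; its numbers may contain gaps. */
class SegmentFieldNumbers {
    private final GlobalFieldNumbers global;
    private final Map<String, Integer> byName = new HashMap<>();
    // Keyed by number rather than stored in a dense array, so gaps are allowed.
    private final TreeMap<Integer, String> byNumber = new TreeMap<>();

    SegmentFieldNumbers(GlobalFieldNumbers global) {
        this.global = global;
    }

    /** Adds a field to this segment, reusing the globally assigned number. */
    int add(String fieldName) {
        Integer existing = byName.get(fieldName);
        if (existing != null) {
            return existing;
        }
        int number = global.getOrAssign(fieldName);
        byName.put(fieldName, number);
        byNumber.put(number, fieldName);
        return number;
    }

    /** Returns the field name for a number, or null if the field is absent from this segment. */
    String fieldName(int number) {
        return byNumber.get(number);
    }

    int size() {
        return byNumber.size();
    }
}
```

In this sketch, if a first segment adds "title", "body" and "date" in that 
order and a second segment adds only "date", the second segment holds a single 
field whose number matches the one used in the first segment, so its numbering 
is consistent across segments but not contiguous.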

All tests pass, though I still need to verify that the global map works 
correctly (it'd probably be good to add a test for that).  It'd also be nice 
to remove hasVectors and hasProx from SegmentInfo, but we could do that in a 
separate issue.

> Track FieldInfo per segment instead of per-IW-session
> -----------------------------------------------------
>
>                 Key: LUCENE-2881
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2881
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: Realtime Branch, CSF branch, 4.0
>            Reporter: Simon Willnauer
>            Assignee: Michael Busch
>             Fix For: Realtime Branch, CSF branch, 4.0
>
>         Attachments: lucene-2881.patch
>
>
> Currently FieldInfo is tracked per IW session to guarantee consistent global 
> field naming / ordering. IW carries FI instances over from previous segments, 
> which also carries over field properties like isIndexed. While consistent 
> field ordering per IW session appears to be important (for bulk merging of 
> stored fields, for example), carrying over other properties might become 
> problematic with Lucene's Codec support.  Codecs that rely on consistent 
> properties in FI will fail if FI properties are carried over.
> The DocValuesCodec (DocValuesBranch), for instance, writes files per segment 
> and field (using the field id within the file name). Yet if a particular 
> segment has no DocValues indexed but a previous segment in the same IW 
> session had DocValues, FieldInfo#docValues will be true, since those values 
> are reused from previous segments.
> We already work around this "limitation" in SegmentInfo with properties like 
> hasVectors or hasProx, which are really something we should manage per Codec 
> and Segment. Ideally FieldInfo would be managed per Segment and Codec so that 
> its properties are valid per segment. It also seems necessary to bind 
> FieldInfos to SegmentInfo logically, since it's really just per-segment 
> metadata.
