[jira] [Commented] (LUCENE-5618) DocValues updates send wrong fieldinfos to codec producers

Robert Muir (JIRA) Fri, 16 May 2014 22:29:33 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14000666#comment-14000666
 ]


Robert Muir commented on LUCENE-5618:
-------------------------------------

This looks good to me. My only concern (which can be a followup issue) is that
we should try to simplify much of the logic in
SegmentReader.initDocValuesProducers.

In general I think SegmentReader needs a cleanup to ensure things are fast.

This logic is now more complex than before because of back compat handling,
among other things, and it involves multiple full passes over fieldinfos/dv
fields. In general we should really try to avoid this.

It's now quite difficult to see what is happening in the common case (no
updates for a segment) via the
initDocValuesProducers/SegmentDocValues/getNumericXXX codepaths.

On that issue we should clean up other inefficiencies while we are there: for
example, we also want to reduce the overhead in SR.getNumericDocValues. Today
it does two hash lookups, when the method could just try 'dvFields' first and
optimize the common case.
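
To make the "try the cache first" idea concrete, here is a minimal sketch. It
is not the actual SegmentReader code: the class CachedFieldLookup, both maps,
and the lookup counter are hypothetical stand-ins (the 'cache' map plays the
role of 'dvFields', and 'fieldTypes' plays the role of the field metadata).
It assumes the cache only ever holds values that already passed the type
check, so a cache hit needs no second lookup.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a per-segment reader; not Lucene's SegmentReader.
class CachedFieldLookup {
    // Stand-in for field metadata (field name -> docvalues type).
    private final Map<String, String> fieldTypes = new HashMap<>();
    // Stand-in for 'dvFields': only validated numeric fields are ever cached here.
    private final Map<String, long[]> cache = new HashMap<>();
    // Counts map probes on the read path, for illustration only.
    int hashLookups = 0;

    void addNumericField(String name) {
        fieldTypes.put(name, "NUMERIC");
    }

    // Metadata-first shape: every call pays for two hash lookups, even on a cache hit.
    long[] getNumericSlow(String field) {
        hashLookups++;
        String type = fieldTypes.get(field);
        if (!"NUMERIC".equals(type)) {
            return null;
        }
        hashLookups++;
        long[] values = cache.get(field);
        if (values == null) {
            values = load(field);
            cache.put(field, values);
        }
        return values;
    }

    // Cache-first shape: a hit costs one lookup; only a miss checks the metadata.
    long[] getNumericFast(String field) {
        hashLookups++;
        long[] values = cache.get(field);
        if (values != null) {
            return values;
        }
        hashLookups++;
        String type = fieldTypes.get(field);
        if (!"NUMERIC".equals(type)) {
            return null;
        }
        values = load(field);
        cache.put(field, values);
        return values;
    }

    // Placeholder for the expensive producer call.
    private long[] load(String field) {
        return new long[] {1L, 2L, 3L};
    }

    public static void main(String[] args) {
        CachedFieldLookup reader = new CachedFieldLookup();
        reader.addNumericField("price");
        reader.getNumericFast("price"); // miss: cache probe plus metadata probe
        reader.getNumericFast("price"); // hit: cache probe only
        System.out.println("lookups: " + reader.hashLookups);
    }
}
```

If the cache held values of several docvalues types under one map, as a
shared per-field cache might, the fast path would also need a type check on
the cached object before returning it.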

But let's fix the bugs first; this approach looks good to me. Long-term we
should also investigate refactoring the livedocs format, maybe to use this
"files" approach recorded in the commit. Currently the LiveDocs codec API is
really horrible, and really it's just an updatable 1-bit numeric docvalues.
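
As a rough illustration of the "live docs is just an updatable 1-bit numeric
docvalues" view, the sketch below models liveness as a per-document value of
0 or 1 that can be updated in place. The class OneBitNumericValues and its
methods are hypothetical and do not correspond to any Lucene API; it only
assumes that a deletion can be treated as writing 0 for that document.

```java
import java.util.BitSet;

// Hypothetical illustration only; this type is not a Lucene API.
class OneBitNumericValues {
    private final BitSet bits;
    private final int maxDoc;

    OneBitNumericValues(int maxDoc) {
        this.maxDoc = maxDoc;
        this.bits = new BitSet(maxDoc);
        bits.set(0, maxDoc); // every document starts live (value 1)
    }

    // Numeric-docvalues-style read: 1 means live, 0 means deleted.
    long get(int docID) {
        checkBounds(docID);
        return bits.get(docID) ? 1L : 0L;
    }

    // An in-place "update" of the 1-bit value; deleting a document writes 0.
    void update(int docID, long value) {
        checkBounds(docID);
        if (value != 0L && value != 1L) {
            throw new IllegalArgumentException("value must be 0 or 1, got " + value);
        }
        bits.set(docID, value == 1L);
    }

    // Number of documents whose value is still 1.
    int liveCount() {
        return bits.cardinality();
    }

    private void checkBounds(int docID) {
        if (docID < 0 || docID >= maxDoc) {
            throw new IndexOutOfBoundsException("docID " + docID + " out of range");
        }
    }

    public static void main(String[] args) {
        OneBitNumericValues live = new OneBitNumericValues(4);
        live.update(2, 0L); // "delete" document 2
        System.out.println("live docs: " + live.liveCount());
    }
}
```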

> DocValues updates send wrong fieldinfos to codec producers
> ----------------------------------------------------------
>
>                 Key: LUCENE-5618
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5618
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Robert Muir
>            Assignee: Shai Erera
>            Priority: Blocker
>             Fix For: 4.9
>
>         Attachments: LUCENE-5618.patch, LUCENE-5618.patch
>
>
> Spinoff from LUCENE-5616.
> See the example there: docvalues readers get a fieldinfos, but it doesn't
> contain the correct ones, so they have invalid field numbers at read time.
> This should really be fixed. Maybe a simple solution is to not write
> "batches" of fields in updates but just have only one field per gen?
> This removes many-to-many relationships and would make things easy to understand.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

