[ 
https://issues.apache.org/jira/browse/LUCENE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756896#comment-13756896
 ] 

Robert Muir commented on LUCENE-5189:
-------------------------------------

By the way: the "general" issue is that for updates, it's unfortunately not 
enough to concern ourselves with data, we have to worry about metadata too:

I see at least 4 problems (and I have not thought about it completely):
# FieldInfo.attributes: these "writes" by the NumericDocValues impl will be 
completely discarded during update, because FieldInfo is per-segment, not 
per-commit.
# SegmentInfo.attributes: same as the above.
# Field doesn't exist in FieldInfo at all (because the segment the update 
applies to happens to have no values for the field).
# Field exists in FieldInfo, but is incomplete (because the segment the update 
applies to had, say, a stored-only or stored+indexed value for the field, but no 
dv one).
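
To make #1, #3 and #4 concrete, here is a toy model of the failure modes. It is 
only a sketch: ToyFieldInfo and ToySegment are hypothetical names, not Lucene 
classes, and the model simply assumes that per-segment metadata is persisted 
once at flush and never rewritten by an update.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only: none of these classes are Lucene APIs. */
final class ToyFieldInfo {
    final String name;
    final boolean hasDocValues;
    final Map<String, String> attributes = new HashMap<>();

    ToyFieldInfo(String name, boolean hasDocValues) {
        this.name = name;
        this.hasDocValues = hasDocValues;
    }

    ToyFieldInfo copy() {
        ToyFieldInfo c = new ToyFieldInfo(name, hasDocValues);
        c.attributes.putAll(attributes);
        return c;
    }
}

final class ToySegment {
    // Assumption: metadata is persisted once, at flush time, per segment.
    private final Map<String, ToyFieldInfo> persisted = new HashMap<>();

    void flush(ToyFieldInfo fi) {
        persisted.put(fi.name, fi.copy());
    }

    /** Simulates re-reading the segment's metadata: always a fresh copy. */
    ToyFieldInfo read(String field) {
        ToyFieldInfo fi = persisted.get(field);
        return fi == null ? null : fi.copy();
    }

    /** Simulates a doc-values update against this already-flushed segment. */
    String applyUpdate(String field, String formatName) {
        ToyFieldInfo fi = read(field);
        if (fi == null) {
            return "MISSING_FIELD";   // problem #3: no FieldInfo at all
        }
        if (!fi.hasDocValues) {
            return "NO_DOC_VALUES";   // problem #4: FieldInfo is incomplete
        }
        // Problem #1: the "write" lands on an in-memory copy and nothing
        // re-persists it, so it is gone the next time metadata is read.
        fi.attributes.put("format", formatName);
        return "OK";
    }
}
```

In this model, an update to a field that does have doc values reports "OK", 
yet a later read() of that field shows no attributes at all, which is the 
silent loss described in #1.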

PerFieldDVF is just one implementation that happens to use #1. Fixing it is 
fixing the symptom; that's why I say we really need to fix the disease instead, 
or things will get very ugly.

The only reason you don't see more problems with #1 and #2 is that currently 
they're not used very much (only by PerField and back-compat). If we had more 
codecs exercising the APIs, you would be seeing these problems already.

A perfectly good solution would be to remove these APIs completely for public 
use (which would solve #1 and #2). PerField(PF/DVF) could write its own .per 
file instead. Back-compat cruft could then use these now-internal-only APIs 
(and it won't matter since they don't support updates), or we could implement 
their hacks in another way.
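
As a rough illustration of the "own .per file" idea, the sketch below keeps the 
field-to-format mapping in a small file that the format itself owns, instead 
of in FieldInfo attributes. Everything here is an assumption made for 
illustration: the class name ToyPerFieldSidecar, the properties-style encoding, 
and the segment-plus-generation file name are all hypothetical and not taken 
from any patch.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Toy sketch only: not a Lucene API and not an actual file format. */
final class ToyPerFieldSidecar {

    /** Hypothetical naming: segment name, update generation, ".per". */
    static String fileName(String segment, long generation) {
        return segment + "_" + generation + ".per";
    }

    /** Encodes field -> format name as text the format can persist itself. */
    static String write(Map<String, String> fieldToFormat) {
        Properties props = new Properties();
        props.putAll(fieldToFormat);
        StringWriter out = new StringWriter();
        try {
            props.store(out, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    /** Decodes the mapping back, independent of any FieldInfo attributes. */
    static Map<String, String> read(String content) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(content));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, String> result = new TreeMap<>();
        for (String key : props.stringPropertyNames()) {
            result.put(key, props.getProperty(key));
        }
        return result;
    }
}
```

Because such a file would be written by the format for each generation, the 
mapping would not depend on per-segment attributes surviving an update.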

But this still leaves issues like #3 and #4.

Adding a boolean 'isFieldUpdate' doesn't really solve anything, and it totally 
breaks the whole concept of the codec being unaware of updates.

It is the wrong direction.

                
> Numeric DocValues Updates
> -------------------------
>
>                 Key: LUCENE-5189
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5189
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: core/index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>         Attachments: LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, 
> LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch
>
>
> In LUCENE-4258 we started to work on incremental field updates, however the 
> amount of changes is immense and hard to follow/consume. The reason is that 
> we targeted postings, stored fields, DV etc., all from the get go.
> I'd like to start afresh here, with numeric-dv-field updates only. There are 
> a couple of reasons for that:
> * NumericDV fields should be easier to update, if e.g. we write all the 
> values of all the documents in a segment for the updated field (similar to 
> how livedocs work, and previously norms).
> * It's a fairly contained issue, attempting to handle just one data type to 
> update, yet requires many changes to core code which will also be useful for 
> updating other data types.
> * It has value in and of itself, and we don't need to allow updating all the 
> data types in Lucene at once ... we can do that gradually.
> I have some working patch already which I'll upload next, explaining the 
> changes.
