[ https://issues.apache.org/jira/browse/LUCENE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756896#comment-13756896 ]
Robert Muir commented on LUCENE-5189:
-------------------------------------

By the way: the "general" issue is that for updates, it's unfortunately not enough to concern ourselves with data; we have to worry about metadata too.

I see at least 4 problems (and I have not thought about it completely):

1. FieldInfo.attributes: these "writes" by the NumericDocValues impl will be completely discarded during update, because they are per-segment, not per-commit.
2. SegmentInfo.attributes: same as the above.
3. Field doesn't exist in FieldInfo at all (because the segment the update applies to happens to have no values for the field).
4. Field exists in FieldInfo, but is incomplete (because the segment the update applies to had, say, a stored-only or stored+indexed value for the field, but no dv one).

PerFieldDVF is just one implementation that happens to use #1. Fixing it is fixing the symptom; that's why I say we really need to fix the disease instead, or things will get very ugly.

The only reason you don't see more problems with #1 and #2 is that currently these APIs are not used very much (only by PerField and back-compat). If we had more codecs exercising the APIs, you would be seeing these problems already.

A perfectly good solution would be to remove these APIs completely for public use (which would solve #1 and #2). PerField(PF/DVF) could write its own .per file instead. Back-compat cruft could then use these now-internal-only APIs (and it won't matter, since they don't support updates), or we could implement their hacks in another way.

But this still leaves issues like #3 and #4.

Adding a boolean 'isFieldUpdate' doesn't really solve anything, and it totally breaks the whole concept of the codec being unaware of updates. It is the wrong direction.
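To make problem #1 concrete, here is a minimal toy model in plain Java. It is not Lucene code and uses no Lucene APIs: ToySegment, its methods, the "format" attribute key, and the names "FormatA" and "FormatB" are all hypothetical, invented only for illustration. The one assumption it encodes is the one stated above: attributes are persisted once per segment, while updated values are written in a later generation.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Toy model only (hypothetical names, no Lucene classes involved). */
public class SegmentMetadataSketch {

    /** Stand-in for a segment whose metadata is written once, at flush time. */
    static final class ToySegment {
        // Persisted when the segment is first written; never rewritten afterwards.
        private final Map<String, String> fieldAttributes = new HashMap<>();
        // Data is rewritten per update generation, so this map grows with each update.
        private final Map<Integer, String> dataFormatByGeneration = new HashMap<>();
        private int generation = 0;

        /** Initial write: both the data and the codec's attributes are persisted. */
        void flush(String formatName, Map<String, String> attributesFromCodec) {
            dataFormatByGeneration.put(generation, formatName);
            fieldAttributes.putAll(attributesFromCodec);
        }

        /** Update: new data is written, but there is nowhere per-commit to store attributes. */
        void applyUpdate(String formatName, Map<String, String> attributesFromCodec) {
            generation++;
            dataFormatByGeneration.put(generation, formatName);
            // attributesFromCodec is deliberately dropped here: that is the problem being modelled.
        }

        /** What a reader would learn from the persisted metadata. */
        String recordedFormat() {
            return fieldAttributes.get("format");
        }

        /** What was actually used to write the latest generation of data. */
        String actualFormat() {
            return dataFormatByGeneration.get(generation);
        }
    }

    public static void main(String[] args) {
        ToySegment segment = new ToySegment();
        segment.flush("FormatA", Collections.singletonMap("format", "FormatA"));
        segment.applyUpdate("FormatB", Collections.singletonMap("format", "FormatB"));
        // prints "recorded=FormatA actual=FormatB"
        System.out.println("recorded=" + segment.recordedFormat()
                + " actual=" + segment.actualFormat());
    }
}
```

In this sketch the reader would pick the wrong format for the updated values, which is the kind of mismatch a per-field implementation relying on attributes would run into. Problems #3 and #4 are not modelled here.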
> Numeric DocValues Updates
> -------------------------
>
>                 Key: LUCENE-5189
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5189
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: core/index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>         Attachments: LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch,
>                      LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch,
>                      LUCENE-5189.patch
>
> In LUCENE-4258 we started to work on incremental field updates, however the
> amount of changes is immense and hard to follow/consume. The reason is that
> we targeted postings, stored fields, DV etc., all from the get go.
> I'd like to start afresh here, with numeric-dv-field updates only. There are
> a couple of reasons for that:
> * NumericDV fields should be easier to update, if e.g. we write all the
>   values of all the documents in a segment for the updated field (similar to
>   how livedocs work, and previously norms).
> * It's a fairly contained issue, attempting to handle just one data type to
>   update, yet requires many changes to core code which will also be useful for
>   updating other data types.
> * It has value in and of itself, and we don't need to allow updating all the
>   data types in Lucene at once ... we can do that gradually.
> I have some working patch already which I'll upload next, explaining the
> changes.