[
https://issues.apache.org/jira/browse/LUCENE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756896#comment-13756896
]
Robert Muir commented on LUCENE-5189:
-------------------------------------
By the way: the "general" issue is that for updates, it's unfortunately not
enough to concern ourselves with data; we have to worry about metadata too.
I see at least 4 problems (and I have not thought about it completely):
# FieldInfo.attributes: these "writes" by the NumericDocValues impl will be
completely discarded during update, because it's per-segment, not per-commit.
# SegmentInfo.attributes: same as the above
# Field doesn't exist in FieldInfo at all: (because the segment the update
applies to happens to have no values for the field)
# Field exists in FieldInfo, but is incomplete: (because the segment the update
applies to had, say, a stored-only or stored+indexed value for the field, but no
dv one).
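To make #1, #3 and #4 concrete, here is a toy model of the situation. It is
only a sketch and not Lucene code: ToyFieldInfo, ToySegment and the "format"
attribute key are hypothetical names made up for illustration. The one
assumption it encodes is the one stated above, that field metadata is persisted
once per segment (at flush) and is not rewritten when an update is applied.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: none of these classes are Lucene APIs.
final class SegmentMetadataSketch {

    /** Hypothetical stand-in for per-field metadata. */
    static final class ToyFieldInfo {
        final String name;
        final boolean hasDocValues;
        final Map<String, String> attributes = new HashMap<>();

        ToyFieldInfo(String name, boolean hasDocValues) {
            this.name = name;
            this.hasDocValues = hasDocValues;
        }

        ToyFieldInfo copy() {
            ToyFieldInfo c = new ToyFieldInfo(name, hasDocValues);
            c.attributes.putAll(attributes);
            return c;
        }
    }

    /** Hypothetical segment whose field metadata is persisted exactly once, at flush. */
    static final class ToySegment {
        private final Map<String, ToyFieldInfo> persisted = new HashMap<>();

        void flush(ToyFieldInfo fi) {
            persisted.put(fi.name, fi.copy());
        }

        /** Returns a fresh in-memory copy, or null if the segment never saw the field. */
        ToyFieldInfo load(String field) {
            ToyFieldInfo fi = persisted.get(field);
            return fi == null ? null : fi.copy();
        }
    }

    /** Problem #1: an attribute written during an update never reaches the persisted metadata. */
    static String attributeSeenAfterUpdate() {
        ToySegment segment = new ToySegment();
        ToyFieldInfo atFlush = new ToyFieldInfo("price", true);
        atFlush.attributes.put("format", "A");
        segment.flush(atFlush);

        // The update writes new values and records an attribute on the in-memory copy only.
        ToyFieldInfo duringUpdate = segment.load("price");
        duringUpdate.attributes.put("format", "B");

        // A later reader loads the per-segment metadata and still sees the old value.
        return segment.load("price").attributes.get("format");
    }

    /** Problems #3 and #4: the target segment has no usable dv metadata for the field. */
    static String describeUpdateTarget(ToySegment segment, String field) {
        ToyFieldInfo fi = segment.load(field);
        if (fi == null) return "missing";
        if (!fi.hasDocValues) return "incomplete";
        return "ok";
    }

    public static void main(String[] args) {
        System.out.println(attributeSeenAfterUpdate());
        ToySegment segment = new ToySegment();
        segment.flush(new ToyFieldInfo("title", false)); // stored-only field, no dv
        System.out.println(describeUpdateTarget(segment, "price"));
        System.out.println(describeUpdateTarget(segment, "title"));
    }
}
```

In this model the write of "B" is silently lost, and an update against a
segment that never saw the field (or saw it without dv) has no metadata to
build on. That is the shape of the problem, whatever the real fix turns out to
be.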
PerFieldDVF is just one implementation that happens to use #1. Fixing it is
fixing the symptom; that's why I say we really need to fix the disease instead,
or things will get very ugly.
The only reason you don't see more problems with #1 and #2 is that currently
these APIs are not used very much (only by PerField and back-compat). If we had
more codecs exercising the APIs, you would be seeing these problems already.
A perfectly good solution would be to remove these APIs completely for public
use (which would solve #1 and #2). PerField(PF/DVF) could write its own .per
file instead. Back compat cruft could then use these now-internal-only APIs
(and it won't matter since they don't support updates), or we could implement
their hacks in another way.
But this still leaves issues like #3 and #4.
Adding a boolean 'isFieldUpdate' doesn't really solve anything, and it totally
breaks the whole concept of the codec being unaware of updates.
It is the wrong direction.
> Numeric DocValues Updates
> -------------------------
>
> Key: LUCENE-5189
> URL: https://issues.apache.org/jira/browse/LUCENE-5189
> Project: Lucene - Core
> Issue Type: New Feature
> Components: core/index
> Reporter: Shai Erera
> Assignee: Shai Erera
> Attachments: LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch,
> LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch
>
>
> In LUCENE-4258 we started to work on incremental field updates; however, the
> amount of changes is immense and hard to follow/consume. The reason is that
> we targeted postings, stored fields, DV etc., all from the get-go.
> I'd like to start afresh here, with numeric-dv-field updates only. There are
> a couple of reasons for that:
> * NumericDV fields should be easier to update, if e.g. we write all the
> values of all the documents in a segment for the updated field (similar to
> how livedocs work, and previously norms).
> * It's a fairly contained issue, attempting to handle just one data type to
> update, yet requires many changes to core code which will also be useful for
> updating other data types.
> * It has value in and of itself, and we don't need to allow updating all the
> data types in Lucene at once ... we can do that gradually.
> I have some working patch already which I'll upload next, explaining the
> changes.