[
https://issues.apache.org/jira/browse/LUCENE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13753586#comment-13753586
]
Uwe Schindler commented on LUCENE-5189:
---------------------------------------
Hi,
I had an idea yesterday while thinking about this. Currently, as with deletes,
we can update DocValues based on an ID term (updating by docid is not easily
possible with IndexWriter). As the ID term can be anything, you could also use
a (group) key that updates many documents at once, just as you can delete all
documents with a specific term. The current code sets the given field to a
fixed value for all those documents. My two ideas are:
- we could also support update by query (as with deletes, you provide a
query that selects the documents to update)
- we could make "modifications" possible: instead of giving a value that is set
for all selected documents, we could provide a "callback" interface that is
used to modify the current docvalue (numeric or String) of the document to
update and returns the changed value. This would be a one-method interface, so
it could be used as a closure in Java 8, like {{writer.updateDocValues(term,
value -> value+1);}} (in Java 6/7 this would be {{writer.updateDocValues(term,
new NumericDocValuesUpdater() \{ public long update(long value) \{ return
value+1; \}\});}}). Servers like Solr or ElasticSearch could implement this
interface/closure using e.g. JavaScript, so one could execute a docvalues
update and pass a JavaScript function that is applied to every value. We just
have to think about concurrency: what happens if 2 threads are updating the
same value at the same time? Maybe this is already handled by the
BufferedDeletesQueue!?
I just wanted to write this down in this issue, so we could think about
supporting this. Of course the current patch is more important to get the
whole game running! Update by term/query is just one thing that is often
requested by users. The typical example is a webapp where you can vote for a
document. In that case one would execute the closure {{value -> value+1}}. If
we implement this at such a low level, the whole concurrency handling should be
easier than how it is currently implemented e.g. in Solr or ES.
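To make the callback idea concrete, here is a minimal sketch. It assumes a
one-method interface named NumericDocValuesUpdater (the name used in the
snippet above) and a toy in-memory class, ToyDocValuesStore, that stands in
for the index. Both are hypothetical and are not part of Lucene's API; the
coarse synchronized methods only illustrate that each read-modify-write has to
be atomic, and say nothing about how IndexWriter would actually handle it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical one-method callback: receives the current value, returns the new one. */
@FunctionalInterface
interface NumericDocValuesUpdater {
    long update(long value);
}

/** Toy in-memory stand-in for an index (assumption for illustration, NOT IndexWriter). */
class ToyDocValuesStore {
    // term -> ids of the documents carrying that term
    private final Map<String, List<Integer>> docsByTerm = new HashMap<>();
    // doc id -> current numeric doc value
    private final Map<Integer, Long> values = new HashMap<>();

    synchronized void addDocument(int docId, String term, long initialValue) {
        docsByTerm.computeIfAbsent(term, t -> new ArrayList<>()).add(docId);
        values.put(docId, initialValue);
    }

    /** Applies the callback to the value of every document matching the term. */
    synchronized void updateDocValues(String term, NumericDocValuesUpdater updater) {
        List<Integer> docs = docsByTerm.getOrDefault(term, Collections.<Integer>emptyList());
        for (int docId : docs) {
            // Read-modify-write under the store's lock, so concurrent
            // updaters cannot lose each other's increments.
            values.put(docId, updater.update(values.get(docId)));
        }
    }

    synchronized long getValue(int docId) {
        return values.get(docId);
    }
}

class VoteDemo {
    public static void main(String[] args) {
        ToyDocValuesStore store = new ToyDocValuesStore();
        store.addDocument(1, "group:a", 0L);
        store.addDocument(2, "group:a", 5L);
        store.addDocument(3, "group:b", 7L);

        // Java 8 closure form: one vote for every document in group:a.
        store.updateDocValues("group:a", value -> value + 1);

        // Java 6/7 form: the same update as an anonymous class.
        store.updateDocValues("group:a", new NumericDocValuesUpdater() {
            @Override
            public long update(long value) {
                return value + 1;
            }
        });

        System.out.println(store.getValue(1) + " " + store.getValue(2) + " " + store.getValue(3));
    }
}
```

In this sketch only the documents sharing the term are touched, and the
callback never sees more than the current value, which is what would let a
server plug in a scripted function without exposing the index itself.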
> Numeric DocValues Updates
> -------------------------
>
> Key: LUCENE-5189
> URL: https://issues.apache.org/jira/browse/LUCENE-5189
> Project: Lucene - Core
> Issue Type: New Feature
> Components: core/index
> Reporter: Shai Erera
> Assignee: Shai Erera
> Attachments: LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch,
> LUCENE-5189.patch
>
>
> In LUCENE-4258 we started to work on incremental field updates, however the
> amount of changes is immense and hard to follow/consume. The reason is that
> we targeted postings, stored fields, DV etc., all from the get go.
> I'd like to start afresh here, with numeric-dv-field updates only. There are
> a couple of reasons for that:
> * NumericDV fields should be easier to update, if e.g. we write all the
> values of all the documents in a segment for the updated field (similar to
> how livedocs work, and previously norms).
> * It's a fairly contained issue, attempting to handle just one data type to
> update, yet requires many changes to core code which will also be useful for
> updating other data types.
> * It has value in and of itself, and we don't need to allow updating all the
> data types in Lucene at once ... we can do that gradually.
> I have some working patch already which I'll upload next, explaining the
> changes.