[ https://issues.apache.org/jira/browse/LUCENE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13753586#comment-13753586 ]
Uwe Schindler commented on LUCENE-5189:
---------------------------------------

Hi,

I had an idea yesterday when thinking about this. Currently (as with deletes) we can update DocValues based on an ID term (updating by docid is not easily possible with IndexWriter). As the ID term can be anything, you could also use some (group) key that updates lots of documents (just as you can delete all documents with a specific term). The current code updates the given field for all those documents to a fixed value.

My two ideas are:
- We could also support update by query (that is, as with deletes, you provide a query that selects the documents to update).
- We could make "modifications" possible: instead of giving a value that is set for all selected documents, we could provide a "callback" interface that is used to modify the current docvalue (numeric or String) of the document to update and returns the changed value. This would be a one-method interface, so it could be used as a closure in Java 8, like {{writer.updateDocValues(term, value -> value+1);}} (in Java 6/7 this would be {{writer.updateDocValues(term, new NumericDocValuesUpdater() \{ public long update(long value) \{ return value+1; \}\});}}). Servers like Solr or ElasticSearch could implement this interface/closure using e.g. JavaScript, so one could execute a docvalues update and pass a JavaScript function that is applied to every value. We just have to think about concurrency: what happens if 2 threads are updating the same value at the same time? Maybe this is already handled by the BufferedDeletesQueue!?

I just wanted to write this down in this issue, so we could think about allowing this to be implemented. Of course the current patch is more important to get the whole game running! Being updatable by term/query is just one thing that users often request. The typical example is a webapp where you can vote for a document. In that case one would execute the closure {{value -> value+1}}.
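To make the callback idea concrete, here is a minimal sketch in plain Java, with no Lucene dependency. Everything in it is an assumption for illustration: {{ToyIndex}} is a hypothetical in-memory stand-in for a writer, {{NumericDocValuesUpdater}} and {{updateDocValues}} only mirror the names proposed above and are not an existing Lucene API, and the {{synchronized}} methods are just one simple answer to the two-threads question, not a statement about how BufferedDeletesQueue behaves.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DocValuesUpdateSketch {

    /** Hypothetical one-method callback, as proposed in the comment (not a Lucene interface). */
    interface NumericDocValuesUpdater {
        long update(long value);
    }

    /** Toy stand-in for an index: one numeric value per document plus a term -> docIDs map. */
    static class ToyIndex {
        private final List<Long> values = new ArrayList<>();
        private final Map<String, List<Integer>> docsByTerm = new HashMap<>();

        /** Adds a document with an initial value and the given terms; returns its docID. */
        synchronized int addDocument(long initialValue, String... terms) {
            int docId = values.size();
            values.add(initialValue);
            for (String term : terms) {
                docsByTerm.computeIfAbsent(term, t -> new ArrayList<>()).add(docId);
            }
            return docId;
        }

        /**
         * Applies the callback to the current value of every document matching the term.
         * Returns the number of documents updated. The method-level lock makes each
         * read-modify-write atomic with respect to other updates on this toy index.
         */
        synchronized int updateDocValues(String term, NumericDocValuesUpdater updater) {
            List<Integer> docs = docsByTerm.getOrDefault(term, Collections.emptyList());
            for (int docId : docs) {
                values.set(docId, updater.update(values.get(docId)));
            }
            return docs.size();
        }

        synchronized long getValue(int docId) {
            return values.get(docId);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ToyIndex index = new ToyIndex();
        int doc = index.addDocument(0L, "id:42");

        // Two "voters" increment the same document's value concurrently.
        Runnable voter = () -> {
            for (int i = 0; i < 1000; i++) {
                index.updateDocValues("id:42", value -> value + 1);
            }
        };
        Thread t1 = new Thread(voter);
        Thread t2 = new Thread(voter);
        t1.start();
        t2.start();
        t1.join();
        t2.join();

        System.out.println("votes = " + index.getValue(doc));
    }
}
```

Because every update runs under the same lock, no increment from either thread is lost in this toy model. A real implementation would more likely order updates through the writer's existing queue than hold a global lock, but that part is speculation here.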
If we implement this at such a low level, the whole concurrency handling should be easier than how it is currently implemented, e.g. in Solr or ES.

> Numeric DocValues Updates
> -------------------------
>
>                 Key: LUCENE-5189
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5189
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: core/index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>         Attachments: LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch
>
> In LUCENE-4258 we started to work on incremental field updates, however the amount of changes is immense and hard to follow/consume. The reason is that we targeted postings, stored fields, DV etc., all from the get-go.
> I'd like to start afresh here, with numeric-dv-field updates only. There are a couple of reasons for that:
> * NumericDV fields should be easier to update, if e.g. we write all the values of all the documents in a segment for the updated field (similar to how livedocs work, and previously norms).
> * It's a fairly contained issue, attempting to handle just one data type to update, yet requires many changes to core code which will also be useful for updating other data types.
> * It has value in and of itself, and we don't need to allow updating all the data types in Lucene at once ... we can do that gradually.
> I have a working patch already which I'll upload next, explaining the changes.