[ 
https://issues.apache.org/jira/browse/LUCENE-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13423397#comment-13423397
 ] 

Shai Erera commented on LUCENE-4258:
------------------------------------

bq. ...which is to update the contents of one field, without reindexing the 
entire document

I agree, but I distinguish between two operations:
# replacing the content of a field entirely with new content (or removing the 
field)
# updating the field's content by adding/removing individual terms
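
To make the distinction concrete, here is a rough sketch of the two operations on a toy in-memory model of a single document. None of this is Lucene API: the class `FieldUpdateSketch` and the methods `replaceField` and `updateTerms` are hypothetical names, and the field is modeled as a plain set of terms (no positions, frequencies, or norms).

```java
import java.util.*;

// Toy in-memory model of one document's fields; NOT Lucene's API.
// All names here are hypothetical and exist only for illustration.
public class FieldUpdateSketch {
    // field name -> set of terms (no positions, frequencies, or norms)
    private final Map<String, Set<String>> fields = new HashMap<>();

    // Operation 1: replace the whole field, or remove it when newTerms is null.
    public void replaceField(String field, Collection<String> newTerms) {
        if (newTerms == null) {
            fields.remove(field);
        } else {
            fields.put(field, new TreeSet<>(newTerms));
        }
    }

    // Operation 2: add and remove individual terms, leaving the rest untouched.
    // Removals are applied first, so a term listed in both collections is kept.
    public void updateTerms(String field, Collection<String> add, Collection<String> remove) {
        Set<String> terms = fields.computeIfAbsent(field, f -> new TreeSet<>());
        terms.removeAll(remove);
        terms.addAll(add);
    }

    // Current terms of the field, or an empty set if the field is absent.
    public Set<String> terms(String field) {
        return fields.getOrDefault(field, Collections.emptySet());
    }
}
```

With the first operation the caller must supply the field's complete new content; with the second it only needs to know the delta.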

bq. I think requiring no positions, no frequencies, and no norms makes it even 
more fringe. This means its not really useful for any search purposes. And we 
are a search engine library.

I disagree. Where I come from, the most common use case where such an operation 
would be useful is when a single change affects hundreds and sometimes thousands 
of documents. An example is a document-library-like application which manages 
folders with ACLs. You can add an ACL to a top-level folder and it affects all 
the documents and folders beneath it. That sometimes results in reindexing a 
huge number of documents.
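
A small sketch of that fan-out, assuming documents are identified by slash-separated paths (the class `AclFanOutSketch` and the method `affectedBy` are made up for this example and are not part of any library):

```java
import java.util.*;

// Hypothetical illustration of how one folder-level ACL change fans out
// to every document beneath that folder.
public class AclFanOutSketch {
    // Returns the paths of all documents located below the given folder.
    public static List<String> affectedBy(String folder, List<String> docPaths) {
        // Append a trailing slash so "/hr" does not match "/hrx/...".
        String prefix = folder.endsWith("/") ? folder : folder + "/";
        List<String> affected = new ArrayList<>();
        for (String path : docPaths) {
            if (path.startsWith(prefix)) {
                affected.add(path);
            }
        }
        return affected;
    }
}
```

Without field-level updates, every path returned here has to be reindexed in full even though only its ACL field changed.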

I don't diminish the use case of updating a field for scoring purposes, not at 
all. Just saying that starting by supporting one use case is better than 
supporting none.

Now, and this probably stems from my lack of understanding of the Lucene 
internals -- I see "supporting terms that omit TFAP" as a starting point 
because that is the easiest case, and even that requires a lot of understanding 
of the internals. After we do that, I'll feel more comfortable discussing other 
types of updates for other field types ... at least, I'll feel that I have more 
intelligent things to say :).

Regarding your other concerns, I share them with you, and we of course need to 
benchmark everything. I don't know how much this will affect search. But those 
updates will get merged away when segments are merged, so while I'm sure search 
will be affected, it's not for eternity - only until that segment is merged. 
And, I think we need to add a capability to MergePolicy to 
findSegmentsForMergeUpdates, just like we do for expungeDeletes.
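
A very rough sketch of what such a selection could look like, by analogy with picking segments according to their share of deleted documents. Everything here is an assumption: `Segment` is a stand-in class and not Lucene's segment metadata, `stackedUpdates` and `maxUpdateRatio` are invented for the example, and `findSegmentsForMergeUpdates` is just the name proposed above, not an existing MergePolicy method.

```java
import java.util.*;

// Hypothetical model only; none of these types exist in Lucene.
public class StackedSegmentSketch {
    // Stand-in for a segment: its name, document count, and the number of
    // documents that currently carry stacked field updates.
    public static final class Segment {
        final String name;
        final int docCount;
        final int stackedUpdates;

        public Segment(String name, int docCount, int stackedUpdates) {
            this.name = name;
            this.docCount = docCount;
            this.stackedUpdates = stackedUpdates;
        }
    }

    // Selects segments whose share of updated documents exceeds maxUpdateRatio.
    // Merging those segments would fold the stacked updates into the new segment.
    public static List<String> findSegmentsForMergeUpdates(List<Segment> segments,
                                                           double maxUpdateRatio) {
        List<String> selected = new ArrayList<>();
        for (Segment s : segments) {
            // Multiply instead of dividing so an empty segment cannot divide by zero.
            if (s.stackedUpdates > maxUpdateRatio * s.docCount) {
                selected.add(s.name);
            }
        }
        return selected;
    }
}
```

The threshold is the knob that trades merge cost against how long searches pay for the stacked updates; what value makes sense is exactly what the benchmarks would have to tell us.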

If the first step means that in order to update a field used for scoring (i.e. 
w/ norms) you need to replace the content of the field entirely with new 
content, I'm ok with it. As one esteemed member of this community always says, 
"progress, not perfection" - I'm totally sold on that!
                
> Incremental Field Updates through Stacked Segments
> --------------------------------------------------
>
>                 Key: LUCENE-4258
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4258
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Sivan Yogev
>   Original Estimate: 2,520h
>  Remaining Estimate: 2,520h
>
> Shai and I would like to start working on the proposal to Incremental Field 
> Updates outlined here (http://markmail.org/message/zhrdxxpfk6qvdaex).
