Document updates work as delete/add under the hood

2015-07-09 Thread chalitha udara Perera
Hi All, I have a requirement for updating a Lucene index (add a single field to existing docs and modify the value of another field). These documents contain many other fields that do not need any modification. But as I understand it, Lucene uses a delete/add mechanism even for single-field value updates.

Re: Document updates work as delete/add under the hood

2015-07-10 Thread Gimantha Bandara
Hi Chalitha, You can simply use indexWriter.updateDocument to update the existing index documents.

Re: Document updates work as delete/add under the hood

2015-07-10 Thread chalitha udara Perera
Hi Gimantha, Yes, it is possible to use IndexWriter.updateDocument() to update a document. But under the hood that method deletes the matching documents and re-indexes a new document. I need to update only a single field. Re-indexing a new document with updated field + other fields se
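
A minimal sketch of the delete/re-add behaviour described above, written as a toy in-memory index in plain Java. This is not Lucene code: ToyIndex and its methods are hypothetical names used only for illustration, each field value is treated as a single term, and all fields are assumed to be retrievable. The point is only that changing one field means marking the old document deleted and appending a complete replacement.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy append-only inverted index (hypothetical; NOT Lucene's implementation). */
final class ToyIndex {
    // "field:value" term -> ascending list of internal doc IDs (the postings list)
    private final Map<String, List<Integer>> postings = new HashMap<>();
    // fields of each document by internal doc ID; never modified once written
    private final List<Map<String, String>> stored = new ArrayList<>();
    // internal doc IDs that have been marked as deleted
    private final BitSet deleted = new BitSet();

    /** Appends a document and returns its new internal doc ID. */
    int add(Map<String, String> doc) {
        int docId = stored.size();
        stored.add(new HashMap<>(doc));
        for (Map.Entry<String, String> e : doc.entrySet()) {
            String term = e.getKey() + ":" + e.getValue();
            postings.computeIfAbsent(term, k -> new ArrayList<>()).add(docId);
        }
        return docId;
    }

    /** Returns the doc IDs matching field:value that are not marked deleted. */
    List<Integer> search(String field, String value) {
        List<Integer> hits = new ArrayList<>();
        String term = field + ":" + value;
        for (int docId : postings.getOrDefault(term, Collections.<Integer>emptyList())) {
            if (!deleted.get(docId)) {
                hits.add(docId);
            }
        }
        return hits;
    }

    /** Read-only view of the fields written for a doc ID. */
    Map<String, String> get(int docId) {
        return Collections.unmodifiableMap(stored.get(docId));
    }

    /** Update = mark every live match of idField:idValue deleted, then append the new doc. */
    int update(String idField, String idValue, Map<String, String> newDoc) {
        for (int docId : search(idField, idValue)) {
            deleted.set(docId);
        }
        return add(newDoc);
    }
}
```

To add one field in this model, the caller copies all existing fields of the old document, adds or changes the one field, and passes the whole map to update(); the old postings entries are never edited, only hidden by the deleted marker.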

Re: Document updates work as delete/add under the hood

2015-07-10 Thread Gimantha Bandara
Ah, I misread the thread; I thought you were using two APIs to achieve what updateDocument already does. Yes, it is an overhead, and it is harder for the user to keep track of the fields that do not need updating. There is already a Jira issue open for this [1]. [1] https://issues.apache.org/jira/browse/LUCENE

Re: Document updates work as delete/add under the hood

2015-07-10 Thread Erick Erickson
Well, if it's a docValues field, you can update in place at the Lucene level for certain types of simple values (numerics and strings, but not text types). See https://issues.apache.org/jira/browse/LUCENE-5189 for details. In essence, the reason it's a delete/re-add is that the structure of the postings list and
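
To illustrate why a per-document value column is easier to update than a postings list, here is a toy sketch in plain Java. It is an assumption-laden model, not Lucene's on-disk format or API: ToyDocValues and its methods are hypothetical names. It shows only that when one numeric value per document is stored in a column addressed by doc ID, changing that value touches a single slot and leaves doc IDs and every other field alone.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Toy column of one numeric value per document (hypothetical; NOT Lucene's format). */
final class ToyDocValues {
    // external key (for example a unique id term) -> internal doc ID
    private final Map<String, Integer> docIdByKey = new HashMap<>();
    // column[docId] holds the numeric value for that document
    private long[] column = new long[4];
    private int size = 0;

    /** Appends a document with its value and returns the new internal doc ID. */
    int add(String key, long value) {
        if (size == column.length) {
            column = Arrays.copyOf(column, size * 2);
        }
        column[size] = value;
        docIdByKey.put(key, size);
        return size++;
    }

    /** Overwrites the value of an existing document; no delete, no new doc ID. */
    boolean updateValue(String key, long newValue) {
        Integer docId = docIdByKey.get(key);
        if (docId == null) {
            return false;
        }
        column[docId] = newValue;
        return true;
    }

    /** Returns the current value for a doc ID. */
    long get(int docId) {
        if (docId < 0 || docId >= size) {
            throw new IndexOutOfBoundsException("no such doc: " + docId);
        }
        return column[docId];
    }

    /** Number of documents ever added. */
    int docCount() {
        return size;
    }
}
```

In this model, updateValue("a", 9) changes one array slot and docCount() stays the same, whereas changing an indexed term would mean removing the doc ID from one postings list and inserting it into another.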

Re: Document updates work as delete/add under the hood

2015-07-12 Thread chalitha udara Perera
Hi Erick, Thanks for the explanation. I am doing some experiments with off-line clustering on document features indexed in Lucene, and updating a few document fields in order to provide a different search experience. E.g. for text documents, insert the ID of the cluster that the document belongs to; for images create

Re: Document updates work as delete/add under the hood

2015-07-13 Thread Erick Erickson
bq: Is there any generic benchmark analysis done on the update rate of Lucene saying that it can handle X number of document updates without any performance issues? _Of course_ there will be a performance hit when indexing; the question is whether it's tolerable given your environment. How big is t