The machine is swamped with tests. I will run the experiment when the
machine is free.


Regards,
Ning


Ning Li
Search Technologies
IBM Almaden Research Center
650 Harry Road
San Jose, CA 95120



|---------+---------------------------->
|         |           Otis Gospodnetic |
|         |           <otis_gospodnetic|
|         |           @yahoo.com>      |
|         |                            |
|         |           05/09/2006 07:30 |
|         |           AM               |
|         |           Please respond to|
|         |           java-dev         |
|---------+---------------------------->
  
>------------------------------------------------------------------------------------------------------------------------------|
  |                                                                             
                                                 |
  |       To:       java-dev@lucene.apache.org                                  
                                                 |
  |       cc:                                                                   
                                                 |
  |       Subject:  Re: Supporting deleteDocuments in IndexWriter (Code and 
Performance Results Provided)                        |
  
>------------------------------------------------------------------------------------------------------------------------------|




I agree - a delete (typically for a Term that represents a "primary key"
for a Document in an index) followed by re-add of a Document is a very
common scenario, and I'd love to see the numbers for that.

Thanks,
Otis

> We experimented with three workloads:
>   - Insert only. 1.6M documents were inserted and the final
>     index size was 2.3GB.
>   - Insert/delete (big batches). The same documents were
>     inserted, but 25% were deleted. 1000 documents were
>     deleted for every 4000 inserted.
>   - Insert/delete (small batches). In this case, 5 documents
>     were deleted for every 20 inserted.

Thanks, these benchmarks are very important.

If you can do it, I'd love to see the results of a fourth benchmark,
which represents a typical situation (which you also mentioned)
of document updates: every single insert is preceded by a delete,
25% of which actually delete (the updated document existed previously)
and the rest end up not finding an old document and not deleting
anything. I expect this benchmark to show an even greater improvment
of your approach over the naive IndexModifier.


--
Nadav Har'El


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]





---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to