Mike,

Here you go:


IndexWriter:
----------------
$ java -classpath /Users/jibo/Desktop/iwork/lucene/java/trunk/build/lucene-core-2.9-dev.jar org.apache.lucene.index.CheckIndex /Users/jibo/Desktop/iwork/lucene/java/trunk/contrib/benchmark/work/index

NOTE: testing will be more thorough if you run java with '-ea:org.apache.lucene...', so assertions are enabled

Opening index @ /Users/jibo/Desktop/iwork/lucene/java/trunk/contrib/benchmark/work/index

Segments file=segments_a numSegments=1 version=FORMAT_DIAGNOSTICS [Lucene 2.9]
 1 of 1: name=_18 docCount=200000
   compound=true
   hasProx=true
   numFiles=1
   size (MB)=427.448
   diagnostics = {java.version=1.5.0_19, lucene.version=2.9-dev 779767M - 2009-05-28 17:02:17, os=Mac OS X, os.arch=i386, optimize=true, mergeDocStores=true, java.vendor=Apple Inc., os.version=10.5.7, source=merge, mergeFactor=4}
   no deletions
   test: open reader.........OK
   test: fields, norms.......OK [4 fields]
   test: terms, freq, prox...OK [3512343 terms; 80020204 terms/docs pairs; 163219760 tokens]
   test: stored fields.......OK [200000 total field count; avg 1 fields per doc]
   test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]

No problems were detected with this index.


ThreadedIndexWriter:
-----------------------------

$ java -classpath /Users/jibo/Desktop/iwork/lucene/java/trunk/build/lucene-core-2.9-dev.jar org.apache.lucene.index.CheckIndex /Users/jibo/Desktop/iwork/lucene/java/trunk/contrib/benchmark/work/index

NOTE: testing will be more thorough if you run java with '-ea:org.apache.lucene...', so assertions are enabled

Opening index @ /Users/jibo/Desktop/iwork/lucene/java/trunk/contrib/benchmark/work/index

Segments file=segments_3 numSegments=1 version=FORMAT_DIAGNOSTICS [Lucene 2.9]
 1 of 1: name=_q docCount=199970
   compound=true
   hasProx=true
   numFiles=3
   size (MB)=319.107
   diagnostics = {java.version=1.5.0_19, lucene.version=2.9-dev 779767M - 2009-05-28 17:02:17, os=Mac OS X, os.arch=i386, optimize=true, mergeDocStores=false, java.vendor=Apple Inc., os.version=10.5.7, source=merge, mergeFactor=6}
   docStoreOffset=0
   docStoreSegment=_0
   docStoreIsCompoundFile=false
   no deletions
   test: open reader.........OK
   test: fields, norms.......OK [4 fields]
   test: terms, freq, prox...OK [1227086 terms; 69244121 terms/docs pairs; 134390948 tokens]
   test: stored fields.......OK [199970 total field count; avg 1 fields per doc]
   test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]

No problems were detected with this index.


$
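
For a side-by-side view of the two reports above, here is a small Java sketch that computes the differences. The figures are copied from the CheckIndex output; the class and method names are hypothetical, and the code only does arithmetic on the reported numbers without opening either index.

```java
// Arithmetic on the two CheckIndex reports. Every figure is copied from the
// reports; the class and method names are hypothetical.
final class CheckIndexComparison {
    // Values reported for the plain IndexWriter run (segment _18).
    static final long BASE_DOCS = 200000L;
    static final long BASE_TERMS = 3512343L;
    static final long BASE_TOKENS = 163219760L;
    static final double BASE_SIZE_MB = 427.448;

    // Values reported for the ThreadedIndexWriter run (segment _q).
    static final long THREADED_DOCS = 199970L;
    static final long THREADED_TERMS = 1227086L;
    static final long THREADED_TOKENS = 134390948L;
    static final double THREADED_SIZE_MB = 319.107;

    // Percentage by which `other` is smaller than `baseline`.
    static double percentSmaller(double baseline, double other) {
        return 100.0 * (baseline - other) / baseline;
    }

    public static void main(String[] args) {
        System.out.println("missing docs:     " + (BASE_DOCS - THREADED_DOCS));
        System.out.println("missing terms:    " + (BASE_TERMS - THREADED_TERMS));
        System.out.println("missing tokens:   " + (BASE_TOKENS - THREADED_TOKENS));
        System.out.printf("segment size is %.1f%% smaller%n",
                percentSmaller(BASE_SIZE_MB, THREADED_SIZE_MB));
    }
}
```

By these figures the threaded run has 30 fewer documents and a reported segment size roughly 25% smaller; the sketch says nothing about why.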



On Jul 31, 2009, at 2:52 PM, Michael McCandless wrote:

Hmmm... can you run CheckIndex on both indexes and post the results?

 java org.apache.lucene.index.CheckIndex /path/to/index

Mike

On Fri, Jul 31, 2009 at 2:38 PM, Jibo John<jiboj...@mac.com> wrote:
The number of docs in the index is the same in both cases (200,000). I haven't altered the benchmark/ code, but I used a profiler to verify that the Benchmark main thread is closed only after all other threads are closed.

Thanks,
-Jibo


On Jul 31, 2009, at 2:34 AM, Michael McCandless wrote:

Hmm... this doesn't sound right.

That example (ThreadedIndexWriter) is meant to be a drop-in
replacement, wherever you use an IndexWriter, that keeps an
under-the-hood thread pool (using java.util.concurrent.*) to
add/update documents with multiple threads.

It should not result in a smaller index.

Can you sanity check the index?  Eg is numDocs() the same for both?
You definitely called close() on the writer, right? That method waits
for all threads to finish their work before actually closing.
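
A minimal sketch of that pattern, for illustration only. It is not the code from the book and uses no Lucene classes: DocWriter, CountingWriter and ThreadedDocWriter are hypothetical names, and DocWriter stands in for whatever writer is being wrapped. The point it shows is that close() drains the thread pool before closing the underlying writer.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the writer being wrapped; not a Lucene type. */
interface DocWriter {
    void addDocument(String doc);
    void close();
}

/** Toy writer that only counts documents; safe under concurrent adds. */
final class CountingWriter implements DocWriter {
    private final AtomicInteger docs = new AtomicInteger();
    public void addDocument(String doc) { docs.incrementAndGet(); }
    public void close() { }
    int numDocs() { return docs.get(); }
}

/** Drop-in wrapper that hands each add to a fixed-size thread pool. */
final class ThreadedDocWriter implements DocWriter {
    private final DocWriter delegate;
    private final ExecutorService pool;

    ThreadedDocWriter(DocWriter delegate, int numThreads) {
        this.delegate = delegate;
        this.pool = Executors.newFixedThreadPool(numThreads);
    }

    /** Queues the add and returns immediately; a pool thread does the work. */
    public void addDocument(String doc) {
        pool.execute(() -> delegate.addDocument(doc));
    }

    /** Stops accepting work, waits for queued adds, then closes the delegate. */
    public void close() {
        pool.shutdown();
        try {
            while (!pool.awaitTermination(1, TimeUnit.SECONDS)) {
                // Queued adds are still running; keep waiting.
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted before all adds finished", e);
        }
        delegate.close();
    }

    public static void main(String[] args) {
        CountingWriter target = new CountingWriter();
        DocWriter writer = new ThreadedDocWriter(target, 4);
        for (int i = 0; i < 1000; i++) {
            writer.addDocument("doc" + i);
        }
        writer.close();
        System.out.println("docs written: " + target.numDocs());
    }
}
```

If close() were skipped, or returned without waiting for the pool, documents still sitting in the queue would never reach the underlying writer, which is the kind of loss the numDocs() check is meant to catch.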

Mike

On Thu, Jul 30, 2009 at 8:01 PM, Jibo John<jiboj...@mac.com> wrote:

While trying out a few tuning options using contrib/benchmark as described in
the LIA (2nd edition) book, I made an interesting observation.

If I use a ThreadedIndexWriter (the example from lia2e, page 356) instead of
IndexWriter, the index size is reduced by 40% compared to using IndexWriter.
The index-related configuration was the same for both tests in the alg file.

I am curious how using a threaded index writer can have an impact on the
index size.

Appreciate your input.

Thanks,
-Jibo

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org


