Mike,
I verified that I have the latest source code.
Here are the alg files and the CheckIndex output.
----------------------------------------- indexwriter alg -----------------------------------------
analyzer=org.apache.lucene.analysis.standard.StandardAnalyzer
doc.maker=org.apache.lucene.benchmark.byTask.feeds.LineDocMaker
directory=FSDirectory
doc.stored = true #A
docs.file=wikipedia.lines.txt
ram.flush.mb=50
compound=false
merge.factor=5
doc.add.log.step=1000
doc.term.vector=false
doc.term.vector.positions=false
doc.term.vector.offsets=false
{ "Rounds" #B
ResetSystemErase
{ "BuildIndex"
-CreateIndex()
[ { "AddDocs" AddDoc > : 40000 ] : 5 #C
-CloseIndex()
}
NewRound
} : 1
RepSumByPrefRound BuildIndex #D
----------------------------------------- threadedindexwriter alg -----------------------------------------
analyzer=org.apache.lucene.analysis.standard.StandardAnalyzer
doc.maker=org.apache.lucene.benchmark.byTask.feeds.LineDocMaker
directory=FSDirectory
doc.stored = true #A
docs.file=wikipedia.lines.txt
ram.flush.mb=50
compound=false
merge.factor=5
doc.add.log.step=1000
doc.term.vector=false
doc.term.vector.positions=false
doc.term.vector.offsets=false
writer.num.threads=15
writer.max.thread.queue.size=75
work.dir=work_t
{ "Rounds" #B
ResetSystemErase
{ "BuildIndex"
-CreateThreadedIndex()
{ "AddDocs" AddDoc > : 200000
-CloseIndex()
}
NewRound
} : 1
RepSumByPrefRound BuildIndex #D
----------------------------------------- threadedindexwriter checkindex -----------------------------------------
$ java -classpath /Users/jibo/Desktop/iwork/lucene/java/trunk/build/lucene-core-2.9-dev.jar org.apache.lucene.index.CheckIndex /Users/jibo/Desktop/iwork/lucene/java/trunk/contrib/benchmark/work_t/index
NOTE: testing will be more thorough if you run java with '-ea:org.apache.lucene...', so assertions are enabled
Opening index @ /Users/jibo/Desktop/iwork/lucene/java/trunk/contrib/benchmark/work_t/index
Segments file=segments_3 numSegments=1 version=FORMAT_DIAGNOSTICS [Lucene 2.9]
1 of 1: name=_p docCount=199941
compound=true
hasProx=true
numFiles=3
size (MB)=317.1
diagnostics = {java.version=1.5.0_19, lucene.version=2.9-dev 779767M - 2009-05-28 17:02:17, os=Mac OS X, os.arch=i386, optimize=true, mergeDocStores=false, java.vendor=Apple Inc., os.version=10.5.7, source=merge, mergeFactor=5}
docStoreOffset=0
docStoreSegment=_0
docStoreIsCompoundFile=false
no deletions
test: open reader.........OK
test: fields, norms.......OK [4 fields]
test: terms, freq, prox...OK [1269552 terms; 67887116 terms/docs pairs; 133241176 tokens]
test: stored fields.......OK [199941 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]
No problems were detected with this index.
----------------------------------------- indexwriter checkindex -----------------------------------------
$ java -classpath /Users/jibo/Desktop/iwork/lucene/java/trunk/build/lucene-core-2.9-dev.jar org.apache.lucene.index.CheckIndex /Users/jibo/Desktop/iwork/lucene/java/trunk/contrib/benchmark/work/index
NOTE: testing will be more thorough if you run java with '-ea:org.apache.lucene...', so assertions are enabled
Opening index @ /Users/jibo/Desktop/iwork/lucene/java/trunk/contrib/benchmark/work/index
Segments file=segments_a numSegments=1 version=FORMAT_DIAGNOSTICS [Lucene 2.9]
1 of 1: name=_18 docCount=200000
compound=true
hasProx=true
numFiles=1
size (MB)=427.445
diagnostics = {java.version=1.5.0_19, lucene.version=2.9-dev 779767M - 2009-05-28 17:02:17, os=Mac OS X, os.arch=i386, optimize=true, mergeDocStores=true, java.vendor=Apple Inc., os.version=10.5.7, source=merge, mergeFactor=4}
no deletions
test: open reader.........OK
test: fields, norms.......OK [4 fields]
test: terms, freq, prox...OK [3512343 terms; 80020204 terms/docs pairs; 163219760 tokens]
test: stored fields.......OK [200000 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]
No problems were detected with this index.
---------------------------------------------------------------------------------------------------------
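To put the two CheckIndex reports side by side, here is a small sketch that subtracts the ThreadedIndexWriter figures from the IndexWriter figures. The numbers are copied from the output above; the class name CheckIndexDelta and its delta helper are made up for illustration. It only quantifies the gap and does not explain it.

```java
/** Compares the counts reported by the two CheckIndex runs above. */
class CheckIndexDelta {

    /** IndexWriter value minus ThreadedIndexWriter value. */
    static long delta(long indexWriter, long threaded) {
        return indexWriter - threaded;
    }

    public static void main(String[] args) {
        // Values taken verbatim from the two CheckIndex reports.
        System.out.println("docCount delta:        " + delta(200000L, 199941L));       // 59
        System.out.println("terms delta:           " + delta(3512343L, 1269552L));     // 2242791
        System.out.println("terms/docs pairs delta: " + delta(80020204L, 67887116L));  // 12133088
        System.out.println("tokens delta:          " + delta(163219760L, 133241176L)); // 29978584
    }
}
```

So the threaded run is short by 59 documents, and the term and token counts are much further apart than 59 documents alone would suggest.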
Thanks,
-Jibo
On Aug 1, 2009, at 2:08 AM, Michael McCandless wrote:
(Please note that ThreadedIndexWriter is source code available with
the upcoming revision to Lucene in Action.)
Phil, is it possible you are using an older version of the book's
source code? In particular, can you check whether your version of
ThreadedIndexWriter.java has this:
public void close(boolean doWait) throws CorruptIndexException, IOException {
  finish();
  super.close(doWait);
}
(I vaguely remember that being missing from earlier releases, which
could explain what you're seeing). If you are missing that, can you
download the current code from http://www.manning.com/hatcher3 and try
again?
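To illustrate why draining the queue before close matters, here is a minimal stdlib-only sketch. It is not the book's ThreadedIndexWriter: the class QueuedWriterSketch is hypothetical, a counter stands in for the real addDocument call, and the pool size of 4 is arbitrary. The point is only the ordering: close() calls finish() first, so every queued add has run before the count is read.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a writer that hands each add to a thread pool. */
class QueuedWriterSketch {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    private final AtomicInteger indexed = new AtomicInteger();

    /** Queue one "document"; the counter stands in for the real indexing work. */
    void addDocument() {
        pool.execute(() -> indexed.incrementAndGet());
    }

    /** Stop accepting work and block until every queued add has run. */
    void finish() throws InterruptedException {
        pool.shutdown();
        if (!pool.awaitTermination(60, TimeUnit.SECONDS)) {
            throw new IllegalStateException("queued adds did not finish in time");
        }
    }

    /** Drain the queue first, then report how many adds were applied. */
    int close() throws InterruptedException {
        finish();
        return indexed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        QueuedWriterSketch writer = new QueuedWriterSketch();
        for (int i = 0; i < 200000; i++) {
            writer.addDocument();
        }
        System.out.println("docs indexed at close: " + writer.close());
    }
}
```

If close() skipped finish() and read the counter right away, adds still sitting in the pool's queue might not be counted, which is the kind of shortfall a missing finish() call could cause.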
If that's not the problem... can you post the benchmark alg you are
using in each case?
Mike
On Fri, Jul 31, 2009 at 8:26 PM, Jibo John<jiboj...@mac.com> wrote:
Hi Phil,
It's 5 threads for IndexWriter.
For ThreadedIndexWriter, I used:
writer.num.threads=16
writer.max.thread.queue.size=80
Thanks,
-Jibo
On Jul 31, 2009, at 5:01 PM, Phil Whelan wrote:
Hi Jibo,
Your mergeFactor is different, and the resulting numFiles (segment
files) is different. Maybe each thread is responsible for a segment
file. Just curious - do you have 3 threads?
Phil
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org