My mail client died while sending this mail.. Sorry for any duplicate.
It is strange that it should take 20 second to gather fields, this is
the only thing that really suprises me. I'd expect it to be instant
compared to RAMDirectory. It is hard to say from the information you
provided. Did you perhaps lazy load field values from your
RAMDirectory and not retrieve them, or something like that?
Why your queries are slow is also hard to say, there can be many
reaons. 70k documents can be quite a few documents for II if they
contain enough text. Here are a few questions that may or may not be
helpful:
What is the content of the documents? Do they contain a lot of the
same text? Or are they all rather unique? The major thing that makes
II faster than RAMDirectory is that it does not have to deserialize
values from the bytestream. As the index grows binary searching for
documents containing a given term will start consume more time than
deserializing the index.
What speed do you see if you only load 10% (7k)?
Did you see the graphics in the package level javadocs?
http://lucene.apache.org/java/3_0_2/api/all/org/apache/lucene/store/instantiated/package-summary.html
karl
26 aug 2010 kl. 09.24 skrev Li Li:
I have about 70k document, the total indexed size is about 15MB(the
orginal text files' size).
dir=new RAMDirectory();
IndexWriter write=new IndexWriter(dir,...;
for(loop){
writer.addDocument(doc);
}
writer.optimize();
writer.close();
IndexReader ir=IndexReader.open(dir,true);
InstantiatedIndex ii=new InstantiatedIndex(ir);
InstantiatedIndexReader iir=new InstantiatedIndexReader(ii);
is=new IndexSearcher(ir);
is2=new IndexSearcher(iir);
I calculate the time by:
long searchStart=System.nanoTime();
TopDocs docs=is.search(bQuery,Integer.MAX_VALUE);
long searchEnd=System.nanoTime();
I searched 10,000 documents and the time of RAMDirectory
and instantiated
the time used is time1: 21s(21812978000 ns) time2:
20s(20713817000 ns)
I also calulate the time including get field value:
total1: 23852ms total2: 22610ms
it seems instantiated is not much faster than
RAMDirectory. Is there any thing wrong I used? my max memory is 4GB
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]