Aviran wrote:
First let me explain what I found out. I'm running Lucene on a 4 CPU server.
While doing some stress tests I've noticed (by doing full thread dump) that
searching threads are blocked on the method: public FieldInfo fieldInfo(int
fieldNumber) This causes for a significant cpu idle time.

What version of Lucene are you running? Also, can you please send the stack traces of the blocked threads, or at least a description of them? I'd be interested to see what context this happens in. In particular, which IndexReader and Searcher/Scorer/Weight methods does it happen under?


I noticed that the class org.apache.lucene.index.FieldInfos uses private
class members Vector byNumber and Hashtable byName, both of which are
synchronized objects. By changing the Vector byNumber to ArrayList byNumber
I was able to get 110% improvement in performance (number of searches per
second).

That's impressive! Good job finding a bottleneck!

My question is: do the fields byNumber and byName have to be synchronized
and what can happen if I'll change them to be ArrayList and HashMap which
are not synchronized ? Can this corrupt the index or the integrity of the
results?

I think that is a safe change. FieldInfos is only modifed by DocumentWriter and SegmentMerger, and there is no possibility of other threads accessing those instances. Please submit a patch to the developer mailing list.


Doug

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Reply via email to