Hello,
I was reading a posting from Doug Cutting (circa 2001) that stated the following:
"Multi-CPU and/or multi-disk systems can provide greater parallelism and hence query
throughput. However Lucene's FSDirectory serializes reads to a given file (since it
only has a single file descriptor per file) which limits i/o parallelism. Someone with
a large disk array would be better served by a Directory implementation that uses Java
1.4's new i/o classes. In particular, the FileChannel class supports reads that do not
move the file pointer, so that multiple reads on the same file can be in progress at
the same time."
I attempted to implement this suggestion, but did not have great success.
Basically, I copied the existing FSDirectory (from 1.3-rc1) and modified the
FCInputStream inner class. I changed it to get a FileChannel (channel) in the
constructor and to clone properly. But, mainly, I changed "readInternal" to look like
this:
    protected void readInternal(byte[] b, int offset, int len)
        throws IOException
    {
        channel.read(ByteBuffer.wrap(b, offset, len), getFilePointer());
    }
In other words, wrap the byte array, let the channel do the reading, and get the
current file pointer from the super class.
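One detail worth noting about the call above: FileChannel.read(ByteBuffer, long) returns the number of bytes it actually read, which may be fewer than requested, and -1 at end of file. Below is a minimal sketch of a positional read that loops until the requested range is filled. The class and method names (PositionalReads, readFully) are hypothetical helpers for illustration, not part of Lucene, and it uses current Java syntax rather than 1.4.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical helper class for illustration; not part of Lucene.
final class PositionalReads {
    private PositionalReads() {}

    /**
     * Fills b[offset .. offset+len) from the channel, starting at the given
     * absolute file position. The channel's own position is never moved.
     */
    static void readFully(FileChannel channel, byte[] b, int offset, int len, long position)
            throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(b, offset, len);
        long pos = position;
        // A single positional read may return fewer bytes than requested, so loop.
        while (buf.hasRemaining()) {
            int n = channel.read(buf, pos);
            if (n < 0) {
                throw new EOFException("read past end of file at position " + pos);
            }
            pos += n;
        }
    }
}
```

Inside readInternal this would presumably be called as PositionalReads.readFully(channel, b, offset, len, getFilePointer()), assuming the same channel field and superclass method described above.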
It works fine... the same queries return the same results, etc. But, the new
Directory implementation consistently falls a few ms short of the old one (over
sustained trials with various amounts of concurrency) in overall response time.
The old FSDirectory usually wins for both 'querying' (i.e. Searcher.search) and
loading (i.e. Hits.doc(i) ).
According to the FileChannel API, absolute reads should be able to occur concurrently.
However, the existing FSDirectory serializes access to the underlying files. So, I
figured that FSDirectory would be faster with a single search thread... but
FileChannelDirectory would win with multiple threads. Apparently, not so (given my
implementation :-). I also tested on a regular IDE HD and a SCSI. Both tests,
however, were Win2k based.
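To separate the raw i/o behaviour from the rest of Lucene, one option is a small stand-alone timing harness. The following is only a sketch under assumptions: the class name ConcurrentReadTimer and the method readAll are made up for illustration, and it uses current Java syntax. Several threads share one FileChannel and each issues absolute reads on its own set of blocks. It shows the access pattern only; it does not by itself prove that the JVM or the OS actually overlaps the reads.

```java
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical timing harness; names are illustrative, not Lucene API.
final class ConcurrentReadTimer {
    private ConcurrentReadTimer() {}

    /**
     * Reads the whole file in blockSize chunks, split across `threads` workers
     * that all share a single FileChannel. Returns the total bytes read.
     */
    static long readAll(Path file, int threads, int blockSize) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = channel.size();
            long blocks = (size + blockSize - 1) / blockSize;
            List<Future<Long>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int worker = t;
                Callable<Long> task = () -> {
                    long total = 0;
                    ByteBuffer buf = ByteBuffer.allocate(blockSize);
                    // Worker w handles blocks w, w + threads, w + 2*threads, ...
                    for (long blk = worker; blk < blocks; blk += threads) {
                        long pos = blk * blockSize;
                        buf.clear();
                        while (buf.hasRemaining()) {
                            // Absolute read: does not move the shared channel position.
                            int n = channel.read(buf, pos + buf.position());
                            if (n < 0) break; // end of file
                            total += n;
                        }
                    }
                    return total;
                };
                results.add(pool.submit(task));
            }
            long sum = 0;
            for (Future<Long> f : results) sum += f.get(); // wait for every worker
            return sum;
        } finally {
            pool.shutdown();
        }
    }
}
```

Wrapping calls such as readAll(path, 1, 4096) and readAll(path, 4, 4096) in System.nanoTime() measurements would give a rough comparison of one thread against several. The file should be much larger than RAM (or the OS file cache otherwise accounted for), since cached reads will hide any difference in disk parallelism.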
Does anyone know why I might not be seeing a performance increase for multiple
concurrent threads using my "FileChannelDirectory"?
Any ideas would be appreciated.
Thank you,
Tate