I've wanted something similar, for the same purpose -- to keep lucene
from consuming disk I/O resources when another process is running on
the same machine.

A general solution might be to define a simple interface such as

interface IndexInputOutputListener {
    void willReadBytes(int numberOfBytes);
    void willWriteBytes(int numberOfBytes);
}

If non-null, then Lucene can call these methods just before it calls
IndexInput.readBytes() or IndexOutput.writeBytes()   (referring to the
Lucene 1.9 API).  People could implement these methods to do whatever
they want, including throttling I/O, or just keeping I/O profiling
statistics.

Also the above interface could be split into two, an
IndexInputListener and IndexOutputListener, if people are likely to
use one and not the other.

In the meantime, you may be able to do what you want by extending or
decorating FSDirectory, and do the throttling just before or after
calling the real FSDirectory's read/write methods (in FSIndexOutput
and FSIndexInput inner classes).

-chris

On 8/2/05, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> Hi,
> 
> > Ok, let me rephrase the question. Assuming the RAMDirectory holds
> > approximately 500 MB of data which needs to be written to the
> > filesystem, I'm afraid that sending this much data in one shot might
> > choke the NFS. Is there a parameter with FSDirectory with which I can
> > instruct Lucene to restrict the amount of data my application writes
> > to the filesystem per second.
> 
> There is no support for that, but if you add it, this may be a nice
> enhancement.
> 
> > BTW, I could see maxMergeDocs, but not maxBufferedDocs with the
> > IndexWriter. Otis, is there really some parameter with that name? I
> > think maxMergeDocs may not work well here because the size of the
> > documents are pretty random, ranging from few hundred bytes to tens
> > of megabytes.
> 
> That's the one.   You are looking at 1.4.3, and I was referring to the
> new name of that parameter from 1.9-dev1.
> 
> Otis
> 
> > On 8/3/05, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> > > While not exactly what you are describing, you can use one of the
> > > IndexWriter parameters (maxBufferedDocs) to control the size of the
> > > RAMDirectory that's used as a buffer during indexing.
> > >
> > > Otis
> > >
> > > --- Gopikrishnan Subramani <[EMAIL PROTECTED]> wrote:
> > >
> > > > Hello,
> > > >
> > > > Is there a way I can control the IO bandwidth utilized by Lucene?
> > > >
> > > > Here is my scenario. RAMDirectory is used to build a in-memory
> > index
> > > > and finally the index size approaches a limit, the contents are
> > > > flushed to a FSDirectory. The index size could be approximately
> > 512
> > > > MB. I'm a bit concerned that this might affect other NFS clients
> > of
> > > > the same NFS server, even if it's for a short period. To avoid
> > this I
> > > > would like to implement some sort of limit on the amount of data
> > > > transferred to the filesystem akin to the --bwlimit with the
> > rsync
> > > > command.
> > > >
> > > > Thanks,
> > > > Gopi
> > > >
> > > >
> > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > > > For additional commands, e-mail: [EMAIL PROTECTED]
> > > >
> > > >
> > >
> > >
> > >
> > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > > For additional commands, e-mail: [EMAIL PROTECTED]
> > >
> > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
> >
> >
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 
>

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to