Re: Lucene 2.1, soon

Michael McCandless Fri, 19 Jan 2007 05:05:59 -0800

Chuck Williams wrote:

I need to support NFS and would not want to rely on the reader
refreshing in X minutes.  Setting X too small risks a query failure and
setting X too large wastes disk space.  X would need to be set for 100%
reader availability, implying a large value and a lot of disk space waste.


Agreed, this is the downside of the "delete after obsolete for X
minutes" deletion policy.

I like the idea of customizable delete policies in IndexFileDeleter.  My
current application does not have the need for multiple processes
accessing the same index, only many threads in a single process.  There
are multiple processes cooperating, but each has its own piece of the
index stored separately.  So, an in-memory reference count scheme would
work best.


Neat, so this would be a 4th deletion policy doing its own "local"
tracking.
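
A minimal sketch of what such "local" tracking could look like for many
threads in a single process. The class and method names (InMemoryFileRefs,
acquire, release, markObsolete) and the file names used with it are
hypothetical and are not part of Lucene; the sketch assumes readers only
open files that belong to the current commit.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical single-process tracker; names are illustrative, not Lucene API. */
final class InMemoryFileRefs {
    // Number of open readers per index file.
    private final Map<String, Integer> openCounts = new HashMap<>();
    // Files the writer no longer needs but that some reader still holds open.
    private final Set<String> obsolete = new HashSet<>();

    /** A reader thread calls this before opening a file. */
    synchronized void acquire(String file) {
        openCounts.merge(file, 1, Integer::sum);
    }

    /** A reader calls this on close; returns true if the file can now be deleted. */
    synchronized boolean release(String file) {
        Integer count = openCounts.get(file);
        if (count == null) {
            throw new IllegalStateException("not acquired: " + file);
        }
        if (count == 1) {
            openCounts.remove(file);
        } else {
            openCounts.put(file, count - 1);
        }
        // Deletable only if nobody holds it and the writer already gave it up.
        return !openCounts.containsKey(file) && obsolete.remove(file);
    }

    /** The writer calls this when a file leaves the index; true means deletable right away. */
    synchronized boolean markObsolete(String file) {
        if (openCounts.containsKey(file)) {
            obsolete.add(file); // defer deletion until the last reader releases it
            return false;
        }
        return true;
    }
}
```

With this shape, deletion is never driven by a timeout: a file goes away
as soon as the last in-process reader releases it.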

The point is that different applications have different needs.  This
could be addressed well by ensuring that IndexFileDeleter is nicely
customizable and has a few common policies available such as:  delete
immediately (current), delete after obsolete for X minutes, keep
in-memory reference counts, and keep persistent reference counts.  These
strategies might be used respectively by:  a Linux or Windows app with a
local file system; multiple processes sharing an index on NFS; a single
process with an index on NFS (or, as a more efficient strategy, a single
process on Windows); and, as an alternative solution, multiple processes
with an index on NFS.


Yes.  I think this is very much a "one size does not fit all"
situation.
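
To make the idea of pluggable policies concrete, here is a rough sketch
covering the first two policies listed above. The interface and class
names (DeletionPolicy, DeleteImmediately, DeleteAfterDelay) are
assumptions made for illustration only; they are not an existing Lucene
API, and a real policy would likely need more context than two timestamps.

```java
/** Hypothetical policy interface; not Lucene's actual API. */
interface DeletionPolicy {
    /**
     * Returns true if a file that became obsolete at obsoleteSinceMillis
     * may be deleted at nowMillis.
     */
    boolean mayDelete(long obsoleteSinceMillis, long nowMillis);
}

/** "Delete immediately": the behavior the thread describes as current. */
final class DeleteImmediately implements DeletionPolicy {
    @Override
    public boolean mayDelete(long obsoleteSinceMillis, long nowMillis) {
        return true;
    }
}

/** "Delete after obsolete for X minutes". */
final class DeleteAfterDelay implements DeletionPolicy {
    private final long delayMillis;

    DeleteAfterDelay(long minutes) {
        if (minutes < 0) {
            throw new IllegalArgumentException("minutes must be >= 0");
        }
        this.delayMillis = minutes * 60_000L;
    }

    @Override
    public boolean mayDelete(long obsoleteSinceMillis, long nowMillis) {
        // Too small a delay risks deleting files a reader still needs;
        // too large a delay keeps obsolete files on disk longer.
        return nowMillis - obsoleteSinceMillis >= delayMillis;
    }
}
```

The reference-counting policies would plug in behind the same kind of
interface, but need state about who currently holds each file.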

Reference count schemes might best be done at the Directory level,
analogous to what Linux does.  So long as all readers and the writer use
the same Directory it is easy to keep reference counts.

Perhaps IndexFileDeleter should be integrated into Directory?


Well, IndexFileDeleter needs to know a lot about the index format.
It needs to load the SegmentInfos for past commits, ask them for
their files, etc.  The ref counting I'm referring to here would
be in the IndexFileDeleter base class to track which index files
are pointed to by which commits.
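
Roughly, the commit-based ref counting described above could be sketched
as follows. CommitFileRefCounts and its methods are hypothetical names,
and each commit's file list is assumed to have been obtained already (for
example from its SegmentInfos); loading it is not shown.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not Lucene code: counts how many commits reference each file. */
final class CommitFileRefCounts {
    private final Map<String, Integer> refCounts = new HashMap<>();

    /** Called when a commit becomes visible: every file it names gains one reference. */
    void incRef(Collection<String> commitFiles) {
        for (String file : commitFiles) {
            refCounts.merge(file, 1, Integer::sum);
        }
    }

    /** Called when a commit is dropped; returns the files no remaining commit references. */
    List<String> decRef(Collection<String> commitFiles) {
        List<String> deletable = new ArrayList<>();
        for (String file : commitFiles) {
            Integer count = refCounts.get(file);
            if (count == null) {
                throw new IllegalStateException("file was never referenced: " + file);
            }
            if (count == 1) {
                refCounts.remove(file);
                deletable.add(file);
            } else {
                refCounts.put(file, count - 1);
            }
        }
        return deletable;
    }

    /** Number of commits currently referencing the file (0 if none). */
    int refCount(String file) {
        return refCounts.getOrDefault(file, 0);
    }
}
```

A deletion policy would then only decide which commits to drop; the files
returned by decRef are the ones that are safe to remove.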

Of course one might complain that this is throwing in the towel,
implementing a bunch of options instead of one elegant solution.


I think this is because we can't come up with a "one size fits
all".  I think this is similar to locking and the different
LockFactory implementations that we now provide.

Mike

