I feel like NFS is not a must, but a nice-to-have.  Lucene's been around for 
nearly a decade without proper NFS support :)
That said, it looks like Mike has a good plan and a lot of good will, and will 
try to keep support for NFS and various deletion policies external, so I'm 
looking forward to his work.

Otis

----- Original Message ----
From: robert engels <[EMAIL PROTECTED]>
To: java-dev@lucene.apache.org
Sent: Friday, January 19, 2007 5:28:42 PM
Subject: Re: [jira] Commented: (LUCENE-710) Implement "point in time" searching 
without relying on filesystem semantics

I don't dispute that NFS is used a lot. I think it is VERY debatable  
whether or not NFS is used in Lucene environments.

If they are the type of org to be using NFS to begin with, they are  
probably more accustomed to "server" software, and then you can deploy  
Lucene without any of this nonsense. And it should run MUCH faster since  
you do not need to rely on the OS for locking protocols.

On Jan 19, 2007, at 4:04 PM, Michael McCandless (JIRA) wrote:

>
>     [ https://issues.apache.org/jira/browse/LUCENE-710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12466164 ]
>
> Michael McCandless commented on LUCENE-710:
> -------------------------------------------
>
> OK, a few top level summary comments and then some specifics below:
>
>    * I don't want to modify the Lucene core (adding advisory read
>      locks, etc) just to handle the NFS case.  That risks hurting the
>      non-NFS cases.  "First do no harm."  "Keep it simple."  We've been
>      working hard to remove locking lately (lockless commits) and I'd
>      rather not add more locking back.
>
>      By implementing each approach (I think there are now 5 different
>      ideas) instead as its own deletion policy (subclass of
>      IndexFileDeleter) we contain the added complexity of locking,
>      file-based ref counts, etc, to just that one subclass of
>      IndexFileDeleter.
>
>    * I think NFS support is part of Lucene's core mission.
>
>      Lucene should try hard to be as portable as possible.
>
>      NFS is used *a lot*.
>
>      It's tempting to tell users to "upgrade OS", "upgrade NFS server
>      and/or client", etc, but taking that approach will only hurt our
>      users because typically this is not something they can control.
>
>      Now, if we had to bend over backwards for NFS, then I would agree
>      it's not worth it.  But, we don't have to: by allowing custom
>      deletion policies (which is a minor change) we can try out all of
>      the approaches suggested so far on this thread.
>
>      Rather than baking any one of these approaches into the Lucene
>      core, I'd rather just enable "custom deletion policies"; then
>      people can build out these policies outside of the core (eg in
>      "contrib" first).
>
>    * I agree that "giving a good error message when index is on NFS" is
>      really important and that a custom deletion policy alone doesn't
>      address this.
>      address this.
>
>      Marvin, I don't think your test will work either (see below).
>
>      But I really like this direction: does anyone know how (Java
>      friendly way) to determine that a given directory is on an NFS
>      mount?  That would be wonderful.  I will spin off a new thread
>      here.
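On Mike's question about detecting an NFS mount from Java: there is no portable JDK call for this, but one platform-specific possibility (Linux only, and purely a sketch -- `NfsMountCheck` and `looksLikeNfs` are hypothetical names, not anything in Lucene) is to scan /proc/mounts for an nfs-typed mount point that prefixes the directory:

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

class NfsMountCheck {
    /** Best-effort guess at whether absDir lives on an NFS mount.
     *  Linux-only: reads /proc/mounts; returns false when that file is
     *  absent (non-Linux), so "false" means "not known to be NFS". */
    static boolean looksLikeNfs(String absDir) throws IOException {
        File mounts = new File("/proc/mounts");
        if (!mounts.exists()) {
            return false;  // non-Linux platform: can't tell this way
        }
        BufferedReader in = new BufferedReader(new FileReader(mounts));
        try {
            String line;
            while ((line = in.readLine()) != null) {
                // /proc/mounts fields: device, mountpoint, fstype, options, ...
                String[] f = line.split(" ");
                if (f.length >= 3 && f[2].startsWith("nfs")
                        && absDir.startsWith(f[1])) {
                    return true;
                }
            }
            return false;
        } finally {
            in.close();
        }
    }
}
```

This is heuristic at best (bind mounts, automounters, and non-Linux platforms all defeat it), which is presumably why the question deserves its own thread.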
>
>
> Some specifics below:
>
> Marvin Humphrey wrote:
>
>> The first step to graceful degradation is a meaningful error message.
>> That means detecting a problem which normally requires both an
>> indexing process and a search process to trigger, so we have to
>> simulate it artificially with a test.
>>
>>    1) Open a file for read/write and write a few bytes to it.
>>    2) Delete the test file (without closing the handle).
>>    3) Try to read from the handle and if a "stale NFS filehandle"
>>       exception is caught, throw something more informative.
>>
>> Our first opportunity to perform this test occurs at index-creation
>> time.  This is essentially cost free.
>>
>> A second opportunity arises whenever deletions are performed.  Here,
>> there's a small cost involved, and it may not be worth it, as this
>> would only catch cases where an index was copied onto an NFS volume
>> rather than created there, then subsequently modified.
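Marvin's three-step probe could look roughly like this in Java (a sketch only; `NfsProbe` and `staleHandleDetected` are hypothetical names, and Mike's reply below explains why the stale-handle exception may never actually fire):

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class NfsProbe {
    /** Returns true if a stale-filehandle condition was observed in dir.
     *  On local disks, and on NFS clients that locally emulate
     *  "delete on last close", this returns false. */
    public static boolean staleHandleDetected(File dir) throws IOException {
        File probe = new File(dir, "nfs-probe.tmp");
        RandomAccessFile raf = new RandomAccessFile(probe, "rw");
        try {
            raf.write(new byte[]{1, 2, 3, 4});  // 1) write a few bytes
            probe.delete();                     // 2) delete without closing
            raf.seek(0);
            raf.read(new byte[4]);              // 3) read via the open handle
            return false;                       // read succeeded: no staleness seen
        } catch (IOException e) {
            // A "Stale NFS file handle" IOException would land here;
            // rethrow anything else so real errors aren't swallowed.
            if (String.valueOf(e.getMessage()).contains("Stale")) {
                return true;
            }
            throw e;
        } finally {
            raf.close();
            probe.delete();  // clean up if the delete was deferred (e.g. Windows)
        }
    }
}
```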
>
> I think this test won't work (though I haven't tested...).  Typically
> an NFS client will catch this case and locally emulate "delete on last
> close".  Worse, even if it doesn't, those bytes would likely be served
> from the client's cache, so the read would never hit "stale NFS
> filehandle".
>
>> But the good news is since we will allow subclassing to make your own
>> deletion policy, we can eventually do both of these approaches and  
>> our
>> users can pick one or do their own.
>>
>> The number of users this class will serve is diminishingly small.
>> Other mitigation strategies are available.
>>
>> 1) If we implement advisory read locks, many people who see this
>> error will no longer see it.  For those who do, the best option is to
>> upgrade the OS to a version which supports advisory locks over NFS.
>> Then an index on an NFS volume will behave as any other.
>> 2) If you don't actually need to put the index on an NFS volume, put
>> it somewhere else.
>> 3) Catch stale NFS filehandle exceptions in your search application
>> and refresh the reader when they occur.
>> 4) Maintain two copies of an index and do an rsync/switch.
>> 5) Hack Lucene.
>
> 5) isn't really a good option since we all can't even agree how to
> "hack Lucene" to make this work!  1) I think is too dangerous as part
> of the core.  2) typically this is not an option.  People choose NFS
> because they want to share the index.  4) is a fair amount of added
> complexity.  3) is the most viable option I see here, but it's not
> great because you're forced to refresh "right now".  What if warming
> takes 8 minutes?  What if "now" is a bad time because deletes were
> done by the writer but not yet adds?
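Option 3) in its generic shape, independent of any particular Lucene version (the `StaleRetry` helper below is illustrative, not real API; the search and reopen bodies are application-specific):

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Catch-and-reopen, the shape of option 3): run the search, and if the
// reader's files went stale underneath it (an IOException mentioning a
// stale handle), reopen the reader and retry once.
class StaleRetry {
    static <T> T withOneRetry(Callable<T> search, Runnable reopen)
            throws Exception {
        try {
            return search.call();
        } catch (IOException e) {
            if (String.valueOf(e.getMessage()).contains("Stale")) {
                reopen.run();          // forced refresh "right now" --
                return search.call();  // Mike's objection above in a nutshell
            }
            throw e;
        }
    }
}
```

The weakness Mike describes is visible in the sketch: the reopen happens at the moment of failure, whether or not warming is affordable or the writer is mid-update just then.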
>
>> Flexibility is not free.  There have been recent lamentations on
>> java-dev about how difficult it will be to merge the write interfaces
>> of IndexReader and IndexWriter to provide a single, unified class
>> through which all index modifications can be performed.  The exposure
>> of the IndexFileDeleter mechanism contributes to this problem -- it's
>> one more small step in the wrong direction.
>
> Yes there is an open question now on what to do about the confusion on
> using IndexReader vs IndexWriter.  I think moving towards "use
> IndexWriter for changes, use IndexReader for reading" is the best
> solution here.  But I don't see how this relates to allowing
> subclassing of IndexFileDeleter to make your own deletion policy.
>
>> Providing a subclassing/callback API is often an elegant strategy,
>> and it is surely better in this case than it would be to provide a
>> list of deletion policies for the user to select from.  However,
>> whenever possible, _no_ API is always a better solution -- especially
>> in a case like this one, where the functionality provided has nothing
>> to do with Lucene's core mission and is there solely to work around an
>> implementation-specific bug.
>
> I disagree on this point ("no" API is better than subclassing).  As
> you've said, this issue won't affect that many people (though I think
> it's a fairly large subset of our users).  Given that, I would not
> want to add file locking & additional complexity into the Lucene core,
> just to handle NFS.
>
> By allowing a different delete policy as a subclass of
> IndexFileDeleter we keep the changes required for supporting NFS way
> outside the Lucene core.  Since there's so much debate about which
> deletion policy is best we should create all of these in contrib to
> begin with and if something proves reliable we can eventually promote
> it into core Lucene.
>
> I think subclassing is perfect for this sort of situation.  It's like
> the various LockFactory implementations we have: there is no "one size
> fits all".
>
> Mike
>
>
>> Implement "point in time" searching without relying on filesystem
>> semantics
>> ---------------------------------------------------------------------------
>>
>>                 Key: LUCENE-710
>>                 URL: https://issues.apache.org/jira/browse/LUCENE-710
>>             Project: Lucene - Java
>>          Issue Type: Improvement
>>          Components: Index
>>    Affects Versions: 2.1
>>            Reporter: Michael McCandless
>>         Assigned To: Michael McCandless
>>            Priority: Minor
>>
>> This was touched on in recent discussion on dev list:
>>   http://www.gossamer-threads.com/lists/lucene/java-dev/41700#41700
>> and then more recently on the user list:
>>   http://www.gossamer-threads.com/lists/lucene/java-user/42088
>> Lucene's "point in time" searching currently relies on how the
>> underlying storage handles deletion files that are held open for
>> reading.
>> This is highly variable across filesystems.  For example, UNIX-like
>> filesystems usually do "delete on last close", and Windows filesystems
>> typically refuse to delete a file open for reading (so Lucene  
>> retries
>> later).  But NFS just removes the file out from under the reader, and
>> for that reason "point in time" searching doesn't work on NFS
>> (see LUCENE-673 ).
>> With the lockless commits changes (LUCENE-701 ), it's quite simple to
>> re-implement "point in time searching" so as to not rely on  
>> filesystem
>> semantics: we can just keep more than the last segments_N file (as
>> well as all files they reference).
>> This is also in keeping with the design goal of "rely on as little as
>> possible from the filesystem".  EG with lockless we no longer re-use
>> filenames (don't rely on filesystem cache being coherent) and we no
>> longer use file renaming (because on Windows it can fail).  This
>> would be another step of not relying on semantics of "deleting open
>> files".  The less we require from the filesystem, the more portable Lucene
>> will be!
>> Where it gets interesting is what "policy" we would then use for
>> removing segments_N files.  The policy now is "remove all but the  
>> last
>> one".  I think we would keep this policy as the default.  Then you
>> could imagine other policies:
>>   * Keep the past N days' worth
>>   * Keep the last N
>>   * Keep only those in active use by a reader somewhere (note: tricky
>>     how to reliably figure this out when readers have crashed, etc.)
>>   * Keep those "marked" as rollback points by some transaction, or
>>     marked explicitly as a "snapshot".
>>   * Or, roll your own: the "policy" would be an interface or abstract
>>     class and you could make your own implementation.
>> I think for this issue we could just create the framework
>> (interface/abstract class for "policy" and invoke it from
>> IndexFileDeleter) and then implement the current policy (delete all
>> but most recent segments_N) as the default policy.
>> In separate issue(s) we could then create the above more interesting
>> policies.
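The framework plus the "keep the last N" policy from the list above might be sketched like this (every name here -- `Commit`, `DeletionPolicy`, `onCommit`, and the policy classes -- is hypothetical; this is the shape being proposed, not existing Lucene API):

```java
import java.util.List;

// A commit point: one segments_N file plus the files it references.
interface Commit {
    String getSegmentsFileName();  // e.g. "segments_4"
    void delete();                 // ask IndexFileDeleter to drop this commit point
}

// The pluggable policy, invoked by IndexFileDeleter each time a new
// segments_N is written; commits are ordered oldest-first.
interface DeletionPolicy {
    void onCommit(List<? extends Commit> commits);
}

// The current default behavior as a policy: remove all but the most
// recent segments_N.
class KeepOnlyLastCommit implements DeletionPolicy {
    public void onCommit(List<? extends Commit> commits) {
        for (int i = 0; i < commits.size() - 1; i++) {
            commits.get(i).delete();
        }
    }
}

// "Keep the last N": older commit points survive long enough for NFS
// readers that still have them open.
class KeepLastNCommits implements DeletionPolicy {
    private final int n;
    KeepLastNCommits(int n) { this.n = n; }
    public void onCommit(List<? extends Commit> commits) {
        for (int i = 0; i < commits.size() - n; i++) {
            commits.get(i).delete();
        }
    }
}

// Trivial in-memory Commit, enough to exercise a policy without an index.
class FakeCommit implements Commit {
    final String name;
    boolean deleted = false;
    FakeCommit(String name) { this.name = name; }
    public String getSegmentsFileName() { return name; }
    public void delete() { deleted = true; }
}
```

The point of the shape is exactly what the issue argues: the NFS-specific complexity lives entirely in one policy implementation, and the default policy preserves today's behavior.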
>> I think there are some important advantages to doing this:
>>   * "Point in time" searching would work on NFS (it doesn't now
>>     because NFS doesn't do "delete on last close"; see LUCENE-673 )
>>     and any other Directory implementations that don't work
>>     currently.
>>   * Transactional semantics become a possibility: you can set a
>>     snapshot, do a bunch of stuff to your index, and then rollback to
>>     the snapshot at a later time.
>>   * If a reader crashes or machine gets rebooted, etc, it could  
>> choose
>>     to re-open the snapshot it had previously been using, whereas now
>>     the reader must always switch to the last commit point.
>>   * Searchers could search the same snapshot for follow-on actions.
>>     Meaning, user does search, then next page, drill down (Solr),
>>     drill up, etc.  These are each separate trips to the server, and
>>     if the searcher has been re-opened, the user can get inconsistent
>>     results (= lost trust).  But with this, one series of search
>>     interactions could explicitly stay on the snapshot it had started
>>     with.
>
> -- 
> This message is automatically generated by JIRA.
> -
> If you think it was sent incorrectly contact one of the
> administrators: https://issues.apache.org/jira/secure/Administrators.jspa
> -
> For more information on JIRA, see: http://www.atlassian.com/software/jira
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>

