Chris Hostetter <[EMAIL PROTECTED]> wrote on 24/08/2006 23:46:39:

>
> If I'm understanding this suggestion correctly, the main change in
> observable behavior will be that actions performed by a "reader" will
> never block or invalidate actions performed by a "writer" -- writers on
> the other hand can still block each other.
>

Yes, this is true: here readers do not block writers (or other readers), a
writer blocks readers, and a writer blocks other writers.

> This seems like it might be the opposite of what most people would want:
> that opening "reader" threads for doing searches need to be fast, and if a
> writer thread has to wait a half second that's okay.

Right... this is an important point that I missed - in the numbered-files
approach a reader never has to wait, while in this suggestion readers may
need to wait for a writer that is committing at that moment.

Still, it is interesting to notice that the way Lucene works today, reader
initializations also block one another, so readers initialize serially -
each reader needs to obtain the commit lock, initialize, and release the
lock. In this suggestion all readers initialize in parallel, and perhaps
re-initialize if a writer happens to commit at the same time.

Also, given the way that writers do their work - most of the work is done
outside the "commit-window" - the commit-window is both short and
"relatively rare".
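
As a rough illustration of the reader and writer sides of this suggestion,
here is a minimal sketch. The class name, the file layout (V1 as a long at
offset 0, V2 as a long at offset 8), the "rwd" open mode, and the retry
parameters are all assumptions made for the example; none of this is
existing Lucene code, and the write lock is assumed to be held by the caller.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of a two-counter version file (not Lucene code). */
class VersionFileSketch {
    private static final long V1_OFFSET = 0L; // assumed layout: V1 first
    private static final long V2_OFFSET = 8L; // assumed layout: V2 second

    /** Creates the version file in steady state: V1 == V2 == 0. */
    static void create(Path file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            raf.writeLong(0L);
            raf.writeLong(0L);
        }
    }

    private static long readAt(RandomAccessFile raf, long offset) throws IOException {
        raf.seek(offset);
        return raf.readLong();
    }

    private static void incrementAt(RandomAccessFile raf, long offset) throws IOException {
        long next = readAt(raf, offset) + 1;
        raf.seek(offset);
        raf.writeLong(next);
    }

    /** Writer side: bump V1, do the commit work, bump V2. */
    static void commit(Path file, Runnable commitWork) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rwd")) {
            incrementAt(raf, V1_OFFSET); // now V1 == V2 + 1
            commitWork.run();            // merge, delete, whatever
            incrementAt(raf, V2_OFFSET); // now V1 == V2 again
        }
    }

    /** Reads V2 and then V1; returns the version, or -1 if they differ. */
    static long stableVersion(Path file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long v2 = readAt(raf, V2_OFFSET);
            long v1 = readAt(raf, V1_OFFSET);
            return v1 == v2 ? v1 : -1L;
        }
    }

    /** Reader side: check, initialize, check again; retry until stable or timeout. */
    static long openReader(Path file, Runnable initReader, int maxAttempts, long waitMillis)
            throws IOException, InterruptedException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            long before = stableVersion(file);
            if (before >= 0) {
                initReader.run(); // read segment infos, open files
                if (stableVersion(file) == before) {
                    return before; // no commit overlapped the initialization
                }
            }
            Thread.sleep(waitMillis); // a writer is (or was) committing; try again
        }
        throw new IOException("timed out waiting for a stable index version");
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("index", ".vsn");
        create(file);
        commit(file, () -> { /* commit work would go here */ });
        long version = openReader(file, () -> { /* reader init would go here */ }, 10, 50L);
        System.out.println("reader opened at version " + version);
    }
}
```

Note that the reader never writes to the file and never takes a lock; the
cost is a possible re-initialization when a commit overlaps.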

>
> I also don't believe this would "solve" the NFS issues with regards to the
> commit lock -- as I recall, the problem stems from NFS not being able to
> guarantee transactional order of file operations (i.e.: I open the commit
> lock file, I modify and close segments, I close/delete the commit file --
> a remote NFS client might still see the original segments file after the
> commit file is deleted).  Your version file might suffer the same fate
> (with reader clients seeing V1==V2 because the whole file is a second
> stale).

I thought that the (cooperative) lock-file related problems with NFS stem
from deleteFile(), which may return a failure code due to a timeout
although it actually succeeded. The lock-releasing party may then retry the
delete and erroneously delete a lock file that another process has just
obtained.
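
The interleaving I have in mind can be sketched as a small deterministic
simulation. Everything here is hypothetical: the in-memory map stands in for
the remote directory, and deleteWithLostReply() models a delete that succeeds
on the server while the client is told it failed. It is not a model of any
real NFS client.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical simulation of the delete-retry race on a lock file. */
class LockFileRaceSketch {
    /** Stand-in for the remote directory: lock file name -> owner. */
    static final Map<String, String> remoteDir = new HashMap<>();

    /** Creates the lock file if it does not exist; returns true on success. */
    static boolean obtain(String lock, String owner) {
        return remoteDir.putIfAbsent(lock, owner) == null;
    }

    /** Delete that succeeds on the server but reports failure (simulated timeout). */
    static boolean deleteWithLostReply(String lock) {
        remoteDir.remove(lock);
        return false;
    }

    /** Ordinary delete; returns true if a file was removed. */
    static boolean delete(String lock) {
        return remoteDir.remove(lock) != null;
    }

    /** Runs the interleaving and returns the lock owner afterwards (null if none). */
    static String run() {
        remoteDir.clear();
        obtain("write.lock", "A");
        boolean reported = deleteWithLostReply("write.lock"); // A releases, reply lost
        obtain("write.lock", "B");                            // B legitimately acquires
        if (!reported) {
            delete("write.lock");                             // A retries, removes B's lock
        }
        return remoteDir.get("write.lock");
    }

    public static void main(String[] args) {
        System.out.println("lock owner after the race: " + run());
    }
}
```

After this sequence B still believes it holds the lock, but the lock file is
gone, so a third process could obtain it as well.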

The RFC for NFS version 2 (http://tools.ietf.org/html/rfc1094) says: "All
of the procedures in the NFS protocol are assumed to be synchronous.  When
a procedure returns to the client, the client can assume that the operation
has completed and any data associated with the request is now on stable
storage."

So if a writer performed actions { a1 , a2 } in this order and both
completed, it seems that a reader "seeing" the result of action a2 must also
"feel" the result of action a1. (This would prevent errors with the proposed
version number.) But I am no expert in NFS and may be wrong here.

>
>
> : Date: Thu, 24 Aug 2006 23:22:56 -0700
> : From: Doron Cohen <[EMAIL PROTECTED]>
> : Reply-To: java-dev@lucene.apache.org
> : To: java-dev@lucene.apache.org
> : Subject: Re: Lock-less commits
> :
> : I would like to discuss an additional approach, that requires small
> : changes to current Lucene implementation. Here, the index version
> : (currently in segments file) is maintained in a separate file, and is
> : used to synchronize between readers and writers, without requiring
> : readers to create/obtain any lock files, and without requiring readers
> : to write anything to disk.
> :
> : - Index version would be maintained in a separate, dedicated Version
> : file - (say .vsn) - one per index.
> : - Version file contains two occurrences of the version number - V1 and
> : V2.
> : - In steady state, V1 == V2, but during update V1 == V2+1.
> : - Every commit would:
> :   - obtain a write lock (as today), to guarantee single writer at a time.
> :   - increment V1 in that file, using RandomAccessFile API (RAF).
> :     notice: now V1 != V2.
> :   - do the commit work (merge, delete, whatever).
> :   - increment V2 in that file, using RAF.
> :     notice: now, again, V1 == V2.
> :   - release the write lock (as today)
> : - Every reader would read the version data in opposite order:
> :   (1) read V2 from the version file, using RAF.
> :   (2) read V1 from the version file, using RAF.
> :   (3) if not V1==V2 wait some time, and try again (from step 1), until
> : V1==V2, or timeout and fail.
> :   (4) initialize reader data (read segment infos, open files).
> :   (5) read again V2 then V1 using RAF.
> :   (6) if not V2==V1 or they changed from steps 1 and 2, try again (from
> : step 1), or timeout and fail.
> :
> :
> : A few points to notice:
> : - Using RAF protects from errors due to IO buffering.
> : - Only a tiny amount of version data is being read/written using RAF,
> : so performance should not degrade.
> : - Readers are not writing any data, so they are faster (a reader that
> : does deleteDoc is a writer in this regard).
> : - The opposite read/write order of RAF operations, i.e. writing V1 and
> : then V2 by writer but reading V2 and then V1 by reader, protects from
> : race conditions between readers and writers that otherwise might have
> : caused reading corrupt data and concluding wrongly that the data is
> : consistent while in fact it is not.
> : - By using RAF and by the order of operations above, this scheme would
> : work also for NFS (excluding the write lock mechanism which remains an
> : issue in NFS).
> : - For backward compatibility with current index structure the code can
> : fall back to obtain a commit lock file if the .vsn file does not exist,
> : and then create that .vsn file if it still does not exist.
> :
> : Regards,
> : Doron
> :
> :
> :
> : ---------------------------------------------------------------------
> : To unsubscribe, e-mail: [EMAIL PROTECTED]
> : For additional commands, e-mail: [EMAIL PROTECTED]
> :
>
>
>
> -Hoss
>
>

