Re: help on Lock.obtain(lockWaitTimeout)

2006-09-22 Thread Michael McCandless

Yonik Seeley wrote:

On 9/21/06, Michael McCandless [EMAIL PROTECTED] wrote:

Anyway, my first reaction was to change this to use
System.currentTimeMillis() to measure elapsed time, but then I
remembered that this is a dangerous approach because whenever the clock
on the machine is updated (eg by a time-sync NTP client) it would mess
up this function, causing it to either take longer than was asked for
(if the clock is moved backwards) or to time out in [much] less time
than was asked for (if the clock is moved forwards).


Um, wow... that's thorough design work!


Thanks :) I've hit just one too many bugs due to system time changing!
Time is always a sneaky thing to work with.  Basically you can't
really use system time as a reliable way to measure elapsed time.


In this case, I don't think it's something to worry about though.
NTP corrections are likely to be very small, not on the scale of
lock-obtain timeouts.
If one can't obtain a lock, it's due to something else asynchronously
happening, and that throws a lot bigger time variation into the
equation anyway.


Yes, I hope so, in a well-behaved server environment that's already
converged its clock and is tracking well to real time, has the right
command line options to ntp, and doesn't have an admin coming in and
making clock changes.  But in a more chaotic environment, such as a
user's desktop where the user could update the clock at random times,
it would be horrible to let such an event falsely throw a "Lock obtain
timed out" error in any desktop deployment of Lucene.

Even with lock-less commits we will still need to obtain the write
lock (eg in the interleaved add/delete case the write lock is acquired
fairly often, until we can fix IndexWriter to handle deletes).  Each of
these obtains is then vulnerable if a [too large] clock change is made
during the call.

Lucene doesn't currently have this issue (relying on currentTimeMillis
to measure elapsed time) so I'd hate to be the one to introduce it.
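
For comparison, here is a minimal sketch of a timeout loop that avoids
the wall clock altogether by using System.nanoTime(), which the JDK
documents as meant only for measuring elapsed time and unrelated to
system time. This is a swapped-in technique, not something proposed in
this thread and not Lucene's code: nanoTime needs Java 5 or later, the
lambda-friendly BooleanSupplier needs Java 8, and the names
MonotonicObtain, tryObtain and pollMs are hypothetical.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Lucene.
final class MonotonicObtain {

    // Polls tryObtain until it succeeds or timeoutMs has elapsed,
    // measured with System.nanoTime() instead of the wall clock.
    static boolean obtain(BooleanSupplier tryObtain, long timeoutMs, long pollMs) {
        if (pollMs <= 0) {
            throw new IllegalArgumentException("pollMs must be positive");
        }
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (true) {
            if (tryObtain.getAsBoolean()) {
                return true;
            }
            // Subtract rather than compare directly, so the check stays
            // correct even if the nanoTime counter wraps around.
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false; // timed out
            }
            long remainingMs = TimeUnit.NANOSECONDS.toMillis(remainingNanos) + 1;
            try {
                Thread.sleep(Math.min(pollMs, remainingMs));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
    }
}
```

A caller would pass something like () -> lock.obtain() as tryObtain; a
clock change made by NTP or by the user does not affect the deadline.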

Are there any objections to the "acquire a random test lock" approach?

If your locking is mis-configured, you will get an error on
creating the NativeFSLockFactory.  But if it is configured
properly, it will quickly get the lock (and release it) and move on.

Also, there is a single instance of NativeFSLockFactory per [canonical]
lock directory, so it would only be the first time (per JVM instance)
that the NativeFSLockFactory is created for the given directory that
this simple test would be performed.
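
Roughly, the idea could be sketched as below. This is only an
illustration under assumptions, not the actual NativeFSLockFactory
code: the class name TestedLockFactory, the forDirectory method and the
probe file naming are all hypothetical, and it uses the standard
java.nio FileChannel.tryLock() call for the native lock.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch, not Lucene's NativeFSLockFactory.
final class TestedLockFactory {

    // One factory per canonical lock directory, per JVM.
    private static final Map<String, TestedLockFactory> INSTANCES = new HashMap<>();

    private final File lockDir;

    private TestedLockFactory(File lockDir) throws IOException {
        this.lockDir = lockDir;
        acquireTestLock(); // fails fast if locking is mis-configured
    }

    static synchronized TestedLockFactory forDirectory(File dir) throws IOException {
        String key = dir.getCanonicalPath();
        TestedLockFactory factory = INSTANCES.get(key);
        if (factory == null) {
            // Only the first request for a directory runs the test lock.
            factory = new TestedLockFactory(new File(key));
            INSTANCES.put(key, factory);
        }
        return factory;
    }

    // Obtains and releases a native lock on a randomly named probe file.
    private void acquireTestLock() throws IOException {
        File probe = new File(lockDir, "test-" + UUID.randomUUID() + ".lock");
        try (RandomAccessFile file = new RandomAccessFile(probe, "rw");
             FileChannel channel = file.getChannel()) {
            FileLock lock = channel.tryLock();
            if (lock == null) {
                throw new IOException("could not obtain test lock in " + lockDir);
            }
            lock.release();
        } finally {
            probe.delete(); // runs after the channel has been closed
        }
    }
}
```

The random probe name keeps the test from colliding with a real lock
file or with another process running the same test.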

Mike

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: help on Lock.obtain(lockWaitTimeout)

2006-09-21 Thread Yonik Seeley

On 9/21/06, Michael McCandless [EMAIL PROTECTED] wrote:

Anyway, my first reaction was to change this to use
System.currentTimeMillis() to measure elapsed time, but then I
remembered that this is a dangerous approach because whenever the clock
on the machine is updated (eg by a time-sync NTP client) it would mess
up this function, causing it to either take longer than was asked for
(if the clock is moved backwards) or to time out in [much] less time
than was asked for (if the clock is moved forwards).


Um, wow... that's thorough design work!

In this case, I don't think it's something to worry about though.
NTP corrections are likely to be very small, not on the scale of
lock-obtain timeouts.
If one can't obtain a lock, it's due to something else asynchronously
happening, and that throws a lot bigger time variation into the
equation anyway.


-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server




Re: help on Lock.obtain(lockWaitTimeout)

2006-09-21 Thread Doron Cohen

For obtain(timeout), to prevent waiting too long you could compute the
maximum number of times that obtain() can be executed (assuming, as in
current code, that obtain() executes in no time). Then break if either it
was executed sufficiently many times or if time is up. I don't see how to
prevent waiting too short.
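
A minimal sketch of that idea, under assumptions: the class name
BoundedObtain, the tryObtain callback and the pollMs parameter are
hypothetical and this is not the current Lucene code. The attempt cap
bounds the wait when the clock moves backwards; the elapsed-time check
can still fire early when the clock jumps forwards, which is the
"waiting too short" case above.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Lucene.
final class BoundedObtain {

    // Stops when either the attempt budget is used up or the wall
    // clock says the timeout has passed, whichever comes first.
    static boolean obtain(BooleanSupplier tryObtain, long timeoutMs, long pollMs) {
        if (pollMs <= 0) {
            throw new IllegalArgumentException("pollMs must be positive");
        }
        // Assumes, as in the text, that each attempt takes no time.
        long maxAttempts = timeoutMs / pollMs + 1;
        long start = System.currentTimeMillis();
        for (long attempt = 1; attempt <= maxAttempts; attempt++) {
            if (tryObtain.getAsBoolean()) {
                return true;
            }
            if (attempt == maxAttempts) {
                break; // executed sufficiently many times
            }
            if (System.currentTimeMillis() - start >= timeoutMs) {
                break; // time is up according to the wall clock
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
        return false;
    }
}
```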

Btw, I wonder what happens if the time change caused by a sync occurs
in the middle of the sleep - since sleep is implemented natively this
must be taken care of correctly by the underlying OS...?

[EMAIL PROTECTED] wrote on 21/09/2006 13:05:06:
 On 9/21/06, Michael McCandless [EMAIL PROTECTED] wrote:
  Anyway, my first reaction was to change this to use
  System.currentTimeMillis() to measure elapsed time, but then I
  remembered that this is a dangerous approach because whenever the
  clock on the machine is updated (eg by a time-sync NTP client) it
  would mess up this function, causing it to either take longer than
  was asked for (if the clock is moved backwards) or to time out in
  [much] less time than was asked for (if the clock is moved forwards).

 Um, wow... that's thorough design work!

 In this case, I don't think it's something to worry about though.
 NTP corrections are likely to be very small, not on the scale of
 lock-obtain timeouts.
 If one can't obtain a lock, it's due to something else asynchronously
 happening, and that throws a lot bigger time variation into the
 equation anyway.


 -Yonik
 http://incubator.apache.org/solr Solr, the open-source Lucene search server



