Re: help on Lock.obtain(lockWaitTimeout)
Yonik Seeley wrote:
> On 9/21/06, Michael McCandless [EMAIL PROTECTED] wrote:
>> Anyway, my first reaction was to change this to use System.currentTimeMillis() to measure elapsed time, but then I remembered that is a dangerous approach, because whenever the clock on the machine is updated (e.g. by a time-sync NTP client) it would mess up this function, causing it either to take longer than was asked for (if the clock is moved backwards) or to time out in [much] less time than was asked for (if the clock is moved forwards).
>
> Um, wow... that's thorough design work!

Thanks :) I've hit just one too many bugs due to system time changing! Time is always a sneaky thing to work with. Basically, you can't really use system time as a reliable way to measure elapsed time.

> In this case, I don't think it's something to worry about though. NTP corrections are likely to be very small, not on the scale of lock-obtain timeouts. If one can't obtain a lock, it's due to something else asynchronously happening, and that throws a lot bigger time variation into the equation anyway.

Yes, I hope so, in a well-behaved server environment that has already converged its clock and is tracking well to real time, has the right command-line options to ntp, and doesn't have an admin coming in and making clock changes.

But on a more chaotic user's desktop, where the user could update the clock at random times themselves, it would be horrible to let such an event falsely throw a "Lock obtain timed out" error in any desktop deployment of Lucene. Even with lock-less commits we will still need to obtain the write lock (e.g. for the interleaved add/delete case: until we can fix IndexWriter to handle deletes, the write lock is being acquired fairly often). Each of these obtains is then vulnerable if [too large] a clock change is made during the call.

Lucene doesn't currently have this issue (relying on currentTimeMillis to measure elapsed time), so I'd hate to be the one to introduce it.
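To make the hazard discussed above concrete, here is a minimal sketch of a timeout check that trusts two wall-clock readings. The class and method names are hypothetical (they are not Lucene code), and the clock readings are simulated constants rather than real calls to System.currentTimeMillis(), so the effect of a clock step can be shown deterministically.

```java
public class WallClockTimeoutDemo {

    /** Naive check: has the timeout elapsed, judging only by two wall-clock readings? */
    public static boolean timedOut(long startMillis, long nowMillis, long timeoutMillis) {
        return nowMillis - startMillis >= timeoutMillis;
    }

    public static void main(String[] args) {
        long start = 1_000_000L; // simulated wall-clock reading when the wait began
        long timeout = 1_000L;   // caller asked to wait up to 1 second

        // 10 ms really pass and the clock is untouched: still waiting.
        System.out.println(timedOut(start, start + 10, timeout)); // false

        // 10 ms really pass, but the clock was stepped forward one hour:
        // the check reports a timeout far too early.
        System.out.println(timedOut(start, start + 10 + 3_600_000L, timeout)); // true

        // 5 s really pass, but the clock was stepped back one hour:
        // the check keeps waiting long past the requested timeout.
        System.out.println(timedOut(start, start + 5_000 - 3_600_000L, timeout)); // false
    }
}
```

As a side note that goes beyond what the thread proposes: on Java 5 and later, System.nanoTime() is documented as unrelated to wall-clock time and intended for measuring elapsed time, which would avoid both failure modes shown here.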
Are there any objections to the "acquire a random test lock" approach? If your locking is misconfigured, you will get an error on creating the NativeFSLockFactory. But if it is configured properly, it will quickly get the lock (and release it) and move on. Also, there is a single instance of NativeFSLockFactory per [canonical] lock directory, so this simple test would only be performed the first time (per JVM instance) that the NativeFSLockFactory is created for the given directory.

Mike
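The "acquire a random test lock" idea above could look roughly like the following. This is only a sketch under assumptions: LockProbe, verifyLocking and canLock are hypothetical names, the probe file naming is invented for illustration, and the real NativeFSLockFactory may do this differently. It uses the standard java.nio FileChannel.tryLock() call and try-with-resources (Java 7+).

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.util.UUID;

public class LockProbe {

    /** Acquire and release a lock on a randomly named file; throw if that fails. */
    public static void verifyLocking(File lockDir) throws IOException {
        if (!lockDir.isDirectory()) {
            throw new IOException("lock directory does not exist: " + lockDir);
        }
        // Random name so the probe cannot collide with a real lock file (assumed scheme).
        File probe = new File(lockDir, "probe-" + UUID.randomUUID() + ".lock");
        try (RandomAccessFile raf = new RandomAccessFile(probe, "rw");
             FileChannel channel = raf.getChannel()) {
            FileLock lock = channel.tryLock();
            if (lock == null) {
                throw new IOException("could not acquire test lock in " + lockDir);
            }
            lock.release();
        } finally {
            // Runs after the file has been closed, so the delete also works on Windows.
            probe.delete();
        }
    }

    /** Convenience wrapper that reports the outcome as a boolean. */
    public static boolean canLock(File lockDir) {
        try {
            verifyLocking(lockDir);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        File dir = new File(System.getProperty("java.io.tmpdir"));
        System.out.println("native locking usable in " + dir + ": " + canLock(dir));
    }
}
```

A factory following the message above would run such a probe once, when the instance for a given canonical directory is first created, and surface any failure as a construction error.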
Re: help on Lock.obtain(lockWaitTimeout)
On 9/21/06, Michael McCandless [EMAIL PROTECTED] wrote:
> Anyway, my first reaction was to change this to use System.currentTimeMillis() to measure elapsed time, but then I remembered that is a dangerous approach, because whenever the clock on the machine is updated (e.g. by a time-sync NTP client) it would mess up this function, causing it either to take longer than was asked for (if the clock is moved backwards) or to time out in [much] less time than was asked for (if the clock is moved forwards).

Um, wow... that's thorough design work!

In this case, I don't think it's something to worry about though. NTP corrections are likely to be very small, not on the scale of lock-obtain timeouts. If one can't obtain a lock, it's due to something else asynchronously happening, and that throws a lot bigger time variation into the equation anyway.

-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server
Re: help on Lock.obtain(lockWaitTimeout)
For obtain(timeout), to prevent waiting too long you could compute the maximum number of times that obtain() can be executed (assuming, as in the current code, that obtain() executes in no time). Then break if either it was executed sufficiently many times or time is up. I don't see how to prevent waiting too short a time.

Btw, I wonder what happens if the time change due to a sync occurs in the middle of the sleep. Since sleep is implemented natively, this must be taken care of correctly by the underlying OS...?

[EMAIL PROTECTED] wrote on 21/09/2006 13:05:06:
> On 9/21/06, Michael McCandless [EMAIL PROTECTED] wrote:
>> Anyway, my first reaction was to change this to use System.currentTimeMillis() to measure elapsed time, but then I remembered that is a dangerous approach, because whenever the clock on the machine is updated (e.g. by a time-sync NTP client) it would mess up this function, causing it either to take longer than was asked for (if the clock is moved backwards) or to time out in [much] less time than was asked for (if the clock is moved forwards).
>
> Um, wow... that's thorough design work!
>
> In this case, I don't think it's something to worry about though. NTP corrections are likely to be very small, not on the scale of lock-obtain timeouts. If one can't obtain a lock, it's due to something else asynchronously happening, and that throws a lot bigger time variation into the equation anyway.
>
> -Yonik
> http://incubator.apache.org/solr Solr, the open-source Lucene search server
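The suggestion at the top of this message (cap the number of obtain() attempts and also stop when the clock says time is up) might be sketched as follows. Everything here is illustrative: BoundedObtain, TryLock and the 10 ms poll interval are assumptions made for the example, not Lucene's Lock class or its actual constants, and the sketch assumes a finite, modest timeout.

```java
public class BoundedObtain {

    /** Hypothetical stand-in for a lock with a non-blocking obtain(). */
    public interface TryLock {
        boolean obtain();
    }

    /** Assumed sleep between attempts, in milliseconds. */
    public static final long POLL_INTERVAL_MS = 10;

    /**
     * Retry obtain() until it succeeds, the attempt budget is used up,
     * or the wall clock says the timeout has passed, whichever comes first.
     */
    public static boolean obtain(TryLock lock, long timeoutMs) {
        // Upper bound on attempts, assuming obtain() itself takes no time:
        // at most timeoutMs / POLL_INTERVAL_MS sleeps, so total sleep <= timeoutMs.
        long maxAttempts = timeoutMs / POLL_INTERVAL_MS + 1;
        long deadline = System.currentTimeMillis() + timeoutMs;

        for (long attempt = 1; ; attempt++) {
            if (lock.obtain()) {
                return true;
            }
            // The attempt cap guards against a clock moved backwards;
            // the deadline check ends the wait early if attempts run slowly.
            if (attempt >= maxAttempts || System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(POLL_INTERVAL_MS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated lock that becomes free on the third attempt.
        boolean got = obtain(() -> ++calls[0] >= 3, 1_000);
        System.out.println("obtained=" + got + " after " + calls[0] + " attempts");
    }
}
```

As the message notes, this only bounds the wait from above. If the clock jumps forward, the deadline check can still fire early, so the "waiting too short" case remains open.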