Elias Torres wrote:
Folks,

I have been doing some work on clustered search (more on future email)
and I begun looking at the task lock recently implemented in 3.1. I
believe (and have verified with Allen) that we might need to make some
changes to both the API and implementation so it works properly on a
cluster of size 2 or more. Currently I think we can have a couple of
places where two nodes can obtain a lock at the same time.

I'm working on some pseudo-code you all can review to make sure we are
safe from these issues above. I also want to refactor the interfaces to
reflect more the use of leases as opposed to locks.

interface LeaseManager {

  registerLease(String name, String id, long leaseTime);
  renewLease(String name, String id);
  unregisterLease(String name, String id);

}

I don't mind any of these changes if it's going to improve the implementation, but now that I look at it I don't see how this is all that different from what we have now ...

acquireLock(RollerTask)
releaseLock(RollerTask)
isLocked(RollerTask)

I can see how just calling things a lease is possibly a better way to describe what's happening, but calling registerLease() is the exact same thing as acquiring a lock in the current code. When you acquire a lock now you get the lock for a given amount of time, or until you manually release it.

The current code doesn't have any way to renew a lease (or lock), so that definitely sounds like a good addition if we plan to have tasks which may run over their initial lease time and want to extend the lease. I didn't really envision that happening, but it sounds reasonable.

And of course, unregistering a lease is the same thing as releasing a lock, right? So all in all I don't think we are really changing much.

But in any case, I have a few comments ...

1. I would prefer that these just remain in the ThreadManager where the existing methods are now. I don't think we need a whole new Manager for this.

2. I think we would still need an isLeased() method which is basically just a convenience for any task which is about to try and obtain a lease to see if it is already leased.

3. I also think that we want to stick with having these methods take in RollerTask objects rather than individualized parameters. The RollerTask class already forces subclasses to define a name and a lease time, so those are already included, and it should be easy enough to just add in an id attribute which would just need to be unique to each node of the cluster somehow.

I think the big changes which you didn't really detail in this email is that we would be shifting more of the logic around dealing with time from app time to database time to make sure that all nodes are properly synced. Then we would also want to ensure the leasing process was rigid enough to prevent any possible race conditions, which I believe at the moment is not the case. Accomplishing those two things definitely sound like a good idea to me.

-- Allen



If you notice, I'm introducing a new property (id) that will be used to
identify which node in the cluster has the lease in order that we can
allow for renewals if they desire so. Currently, we only have support
for task name and that's enough for what I'm trying to do. I'll explain
more in my next email and if I'm brief on this one is because I must
worry about dinner and putting the kids to bed.

-Elias

Reply via email to