Hello,

I need to synchronize access to a couple hundred thousand files[1].
Creating one lock object per file seems like a waste of resources, but
I cannot use a single global lock for all of them either: since the
locked operations go over the network, that would make the whole
application essentially single-threaded, even though most operations
act on different files.

My idea is therefore to create and destroy per-file locks on demand,
and to protect the creation and destruction with a global lock
(self.global_lock). For that, I add a "usage counter"
(wlock.user_count) to each lock and destroy the lock when the counter
reaches zero. The currently active lock objects are kept in a dict:

    def lock_s3key(self, s3key):

        self.global_lock.acquire()
        try:

            # If there is a lock object, use it
            if s3key in self.key_lock:
                wlock = self.key_lock[s3key]
                wlock.user_count += 1

            # otherwise create a new lock object
            else:
                wlock = WrappedLock()
                wlock.lock = threading.Lock()
                wlock.user_count = 1
                self.key_lock[s3key] = wlock

            lock = wlock.lock

        finally:
            self.global_lock.release()

        # Lock the key itself, outside the global lock, so that
        # waiting for one key does not block all other keys
        lock.acquire()


and similarly

    def unlock_s3key(self, s3key):

        # Lock the dictionary of lock objects
        self.global_lock.acquire()
        try:

            # Get the lock object
            wlock = self.key_lock[s3key]

            # Unlock the key
            wlock.lock.release()

            # This thread no longer uses the lock object
            wlock.user_count -= 1
            assert wlock.user_count >= 0

            # If no other thread uses the lock, dispose of it
            if wlock.user_count == 0:
                del self.key_lock[s3key]

        finally:
            self.global_lock.release()


WrappedLock is just an empty class that lets me attach the extra
user_count attribute to the lock.
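For reference, here is the whole scheme assembled into one
self-contained sketch (the class name S3KeyLocker and the use of
"with" for the global lock are my additions, not part of any existing
code):

```python
import threading

class WrappedLock:
    """Empty container: holds a Lock plus a user_count attribute."""
    pass

class S3KeyLocker:
    def __init__(self):
        self.global_lock = threading.Lock()
        self.key_lock = {}  # maps s3key -> WrappedLock

    def lock_s3key(self, s3key):
        # The global lock only protects the dict, never the
        # (potentially long) wait for the per-key lock itself.
        with self.global_lock:
            if s3key in self.key_lock:
                wlock = self.key_lock[s3key]
                wlock.user_count += 1
            else:
                wlock = WrappedLock()
                wlock.lock = threading.Lock()
                wlock.user_count = 1
                self.key_lock[s3key] = wlock
        # Acquire outside the global lock, so waiting on one key
        # does not block operations on other keys.
        wlock.lock.acquire()

    def unlock_s3key(self, s3key):
        with self.global_lock:
            wlock = self.key_lock[s3key]
            wlock.lock.release()
            wlock.user_count -= 1
            assert wlock.user_count >= 0
            # Dispose of the lock object once nobody uses it.
            if wlock.user_count == 0:
                del self.key_lock[s3key]
```

After a balanced lock/unlock sequence the dict is empty again, so the
number of live lock objects tracks the number of keys currently in
use rather than the total number of keys.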


My questions:

 - Does that look like a proper solution, or does anyone have a better
   one?

 - Did I overlook any deadlock possibilities?
 

Best,
Nikolaus



[1] Actually, it's not really files (because in that case I could use
    fcntl) but blobs stored on Amazon S3.
    

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
                                                         -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C