Is there a best practice on how to override __new__?

I have a base class, RDFObject, which is instantiated using a unique
identifier (a URI in this case). If an object with a given identifier
already exists, I want to return the existing object, otherwise, I
want to create a new object and add this new object to a cache. I'm
not sure if there is a name for such a creature, but I've seen the
name MultiSingleton in the archive.

This is not hard to do by overriding __new__(), as long as I use a
lock to keep the code thread-safe.

import threading
threadlock = threading.Lock()

class RDFObject(object):
    _cache = {}   # class variable, shared among all RDFObject instances
    def __new__(cls, *args, **kargs):
        assert len(args) >= 1
        uri = args[0]
        # check and insert under the lock, so two threads cannot both
        # create an object for the same uri
        with threadlock:
            if uri not in cls._cache:
                cls._cache[uri] = object.__new__(cls)
            return cls._cache[uri]
    def __init__(self, uri):
        pass
    # ...

However, I have the following problem: the __init__() method is called
every time you call RDFObject(), even when __new__() returns an
existing object from the cache.
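
To make the problem concrete, here is a minimal sketch that is independent of the RDFObject code above. The names Cached, init_calls and items are made up for illustration only, and locking is left out to keep it short:

```python
class Cached(object):
    _cache = {}      # one shared cache, as in RDFObject
    init_calls = 0   # hypothetical counter, only here to show the effect

    def __new__(cls, key):
        # return the cached object if there is one, else create it
        if key not in cls._cache:
            cls._cache[key] = object.__new__(cls)
        return cls._cache[key]

    def __init__(self, key):
        # runs on every Cached(key) call, cached object or not
        Cached.init_calls += 1
        self.items = []

a = Cached("k")
a.items.append(1)
b = Cached("k")      # same object, but __init__ ran again
print(a is b, Cached.init_calls, b.items)   # True 2 []
```

The second call returns the same object, yet the list filled through the first reference has been replaced by an empty one.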

The benefit of this multi-singleton is that I can put this class in a
module, call RDFObject(someuri), and simply keep adding state to it
(which is what we want). If the object already had some state, that
state is retained. If it did not exist yet, we get a new object.
For example:

x = RDFObject(someuri)
x.myvar = 123
# ...later in the code...
y = RDFObject(someuri)
assert y.myvar == 123

My fellow programmers and I tend to forget about this __init__() catch.
For example, when we subclass RDFObject:

class MySubclass(RDFObject):
    def __init__(self, uri):
        RDFObject.__init__(self, uri)
        self.somevar = []

Now, this does not work. The list is unintentionally re-initialized by
the second call:

x = MySubclass(someotheruri)
x.somevar.append(123)
# ...later in the code...
y = MySubclass(someotheruri)
assert y.somevar[0] == 123   # raises IndexError: somevar was reset to []

So I'm wondering: is there a best practice that allows the behaviour
we're looking for? (I can think of a few things, but I consider them
all rather ugly.) Is there a good way to suppress the second call to
__init__() from the base class? Perhaps even without overriding
__new__()?
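
One possible direction, as a minimal sketch rather than an established best practice: move the cache lookup into a metaclass __call__(), so that neither __new__() nor __init__() runs for an object that is already cached. The name CachingMeta and the example URI are hypothetical, the metaclass syntax assumes Python 3, and an RLock is used on the assumption that an __init__() might itself create other RDFObjects:

```python
import threading

class CachingMeta(type):
    """Hypothetical metaclass: construct each uri at most once."""

    def __call__(cls, uri, *args, **kwargs):
        with cls._lock:
            try:
                return cls._cache[uri]       # cached: skip __new__/__init__
            except KeyError:
                # type.__call__ runs __new__ and __init__ exactly once
                obj = super().__call__(uri, *args, **kwargs)
                cls._cache[uri] = obj
                return obj

class RDFObject(metaclass=CachingMeta):
    _cache = {}                   # shared by RDFObject and its subclasses
    _lock = threading.RLock()     # re-entrant, see assumption above

    def __init__(self, uri):
        self.uri = uri

class MySubclass(RDFObject):
    def __init__(self, uri):
        RDFObject.__init__(self, uri)
        self.somevar = []

x = MySubclass("http://example.org/a")   # made-up URI
x.somevar.append(123)
y = MySubclass("http://example.org/a")
assert y is x and y.somevar == [123]
```

As in the original code, the cache is shared across the class hierarchy, so a uri first created through RDFObject would later come back as a plain RDFObject even when requested through MySubclass.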
