Suppose that you want to implement a subclass of built-in class, to meet some specific design requirements.
Where in the Python documentation can one find the information required to determine the minimal[1] set of methods that one would need to override to achieve this goal? In my experience, educated guesswork doesn't get one very far with this question. Here's a *toy example*, just to illustrate this last point. Suppose that one wants to implement a subclass of dict, call it TSDict, to meet these two design requirements: 1. for each one of its keys, an instance of TSDict should keep a timestamp (as given by time.time, and accessible via the new method get_timestamp(key)) of the last time that the key had a value assigned to it; 2. other than the added capability described in (1), an instance of TSDict should behave *exactly* like a built-in dictionary. In particular, we should be able to observe behavior like this: >>> d = TSDict((('uno', 1), ('dos', 2)), tres=3, cuatro=4) >>> d['cinco'] = 5 >>> d {'cuatro': 4, 'dos': 2, 'tres': 3, 'cinco': 5, 'uno': 1} >>> d.get_timestamp('uno') 1293046586.644436 OK, here's one strategy, right out of OOP 101: #--------------------------------------------------------------------------- from time import time class TSDict(dict): def __setitem__(self, key, value): # save the value and timestamp for key as a tuple; # see footnote [2] dict.__setitem__(self, key, (value, time())) def __getitem__(self, key): # extract the value from the value-timestamp pair and return it return dict.__getitem__(self, key)[0] def get_timestamp(self, key): # extract the timestamp from the value-timestamp pair and return it return dict.__getitem__(self, key)[1] #--------------------------------------------------------------------------- This implementation *should* work (again, at least according to OOP 101), but, in fact, it doesn't come *even close*: >>> d = TSDict((('uno', 1), ('dos', 2)), tres=3, cuatro=4) >>> d['cinco'] = 5 >>> d {'cuatro': 4, 'dos': 2, 'tres': 3, 'cinco': (5, 1293059516.942985), 'uno': 1} >>> d.get_timestamp('uno') Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/tmp/tsdict.py", line 23, in get_timestamp return dict.__getitem__(self, key)[1] TypeError: 'int' object is not subscriptable >From the above you can see that TSDict fails at *both* of the design requirements listed above: it fails to add a timestamp to all keys in the dictionary (e.g. 'uno', ..., 'cuatro' didn't get a timestamp), and get_timestamp bombs; and it also fails to behave in every other respect exactly like a built-in dict (e.g., repr(d) reveals the timestamps and how they are kept). So back to the general problem: to implement a subclass of a built-in class to meet a given set of design specifications. Where is the documentation needed to do this without guesswork? Given results like the one illustrated above, I can think of only two approaches (other than scrapping the whole idea of subclassing a built-in class in the first place): 1) Study the Python C source code for the built-in class in the hopes of somehow figuring out what API methods need to be overridden; 2) Through blind trial-and-error, keep trying different implementation strategies and/or keep overriding additional built-in class methods until the behavior of the resulting subclass approximates sufficiently the design specs. IMO, both of these approaches suck. Approach (1) would take *me* forever, since I don't know the first thing about Python's internals, and even if I did, going behind the documented API like that would make whatever I implement very likely to break with future releases of Python. Approach (2) could also take a very long time (probably much longer than the implementation would take if no guesswork was involved), but worse than that, one would have little assurance that one's experimentation has truly uncovered all the necessary details; IME, programming-by-guesswork leads to numerous and often nasty bugs. Is there any other way? TIA! ~kj [1] The "minimal" bit in the question statement is just another way of specifying a "maximal" reuse of the built-in's class code. [2] For this example, I've accessed the parent's methods directly through dict rather than through super(TSDict, self), just to keep the code as uncluttered as possible, but the results are the same if one uses super. -- http://mail.python.org/mailman/listinfo/python-list