On Thu, 30 Dec 2004 17:36:57 +1000, Nick Coghlan <[EMAIL PROTECTED]> wrote:
>Bengt Richter wrote:
>> Essentially syntactic sugar to avoid writing id(obj) ? (and to get a little
>> performance improvement if they're written in C). I can't believe this
>> thread came from the lack of such sugar ;-)
>
>The downside of doing it that way is you have no means of getting from the
>id() stored as a key back to the associated object. Meaningful iteration
>(including listing of contents) becomes impossible. Doing the id() call at
>the Python level instead of internally to the interpreter is also relatively
>expensive.

ISTM

    d[id(obj)] = obj, classifier_func(obj)

gets around the iteration problem (IIRC a very similar suggestion was somewhere
in the thread). But if the id call is a significant portion of the cycle
budget, yeah, might want to "pursue" a collections solution ;-)

>> Or, for that matter, (if you are the designer) giving the objects an
>> obj.my_classification attribute (or indeed, property, if dynamic) as part
>> of their initialization/design?
>
>The main mutable objects we're talking about here are Python lists.

and really non-mutated Python lists?

>Selecting an alternate classification scheme using a subclass is the current
>recommended approach - this thread is about alternatives to that.

I'm getting the impression your meaning of "classification" is less about
classifying objects according to their interesting features than about how to
associate the resulting kind-of-thing info with the objects for more efficient
access than recalculating. In which case ISTM it is an optimization problem
that depends intimately on the particular features of interest in the data,
etc.

>I generally work with small enough data sets that I just use lists for
>classification (sorting test input data into inputs which worked properly, and
>those which failed for various reasons).
>However, I can understand wanting to use a better data structure when doing
>frequent membership testing, *without* having to make fundamental changes to
>an application's object model.

The DYFR thing ever lurks ;-)

>> Or subclass your graph node so you can do something readable like
>>     if node.is_leaf: ...
>> instead of
>>     if my_obj_classification[id(node)] == 'leaf': ...
>
>I'd prefer:
>    if node in leaf_nodes:
>        ...

Which is trivial to code, except for optimization issues, right? ;-)

>Separation of concerns suggests that a class shouldn't need to know about all
>the different ways it may be classified. And mutability shouldn't be a barrier
>to classification of an object according to its current state.

Agreed. I didn't mean to imply otherwise. I did mention possibly memoizing
classification functions as an optimization approach ;-)

>>>Hence why I suggested Antoon should consider pursuing
>>>collections.identity_dict and collections.identity_set if identity-based
>>>lookup would actually address his requirements. Providing these two data
>>>types seemed like a nice way to do an end run around the bulk of the
>>>'potentially variable hash' key problem.
>>
>> I googled for those ;-) I guess pursuing meant implementing ;-)
>
>Yup. After all, the collections module is about high-performance datatypes for
>more specific purposes than the standard builtins. identity_dict and
>identity_set seem like natural fits for dealing with annotation and
>classification problems where you don't want to modify the class definitions
>for the objects being annotated or classified.

Well, at least they ought to be comparatively easy to do.

>I don't want the capability enough to pursue it, but Antoon seems reasonably
>motivated :)

Let's see what happens ;-)

Regards,
Bengt Richter
--
http://mail.python.org/mailman/listinfo/python-list
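
For readers following the thread, here is a minimal sketch of the
d[id(obj)] = obj, classifier_func(obj) idea mentioned above, wrapped in a
small class so that lookup is by identity while iteration still yields the
original objects. The names IdentityDict and classify are hypothetical,
chosen only for illustration; this is not the collections.identity_dict
proposed in the thread, which is not an existing standard-library type.

```python
# Hypothetical illustration only: IdentityDict is not a real collections type.
class IdentityDict:
    """Map objects to values by identity (id(obj)), not by hash or equality."""

    def __init__(self):
        # id(obj) -> (obj, value). Storing obj keeps it alive, so its id
        # cannot be reused by another object while the entry exists.
        self._data = {}

    def __setitem__(self, obj, value):
        self._data[id(obj)] = (obj, value)

    def __getitem__(self, obj):
        return self._data[id(obj)][1]

    def __contains__(self, obj):
        return id(obj) in self._data

    def __len__(self):
        return len(self._data)

    def items(self):
        # Iteration gives back the original objects, not their ids.
        return list(self._data.values())


def classify(seq):
    # Hypothetical classifier_func: label a list by its current contents.
    return "empty" if not seq else "non-empty"


a, b = [], []            # equal but distinct, unhashable lists
kinds = IdentityDict()
kinds[a] = classify(a)
b.append(1)
kinds[b] = classify(b)
```

Because keys are ids, two equal lists get separate entries, and an unrelated
equal list is not found. The stored label reflects the object's state when it
was classified; mutating the object afterwards does not update it.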