Re: Objects, classes, metaclasses, and other things that go bump in the night
Dan Sugalski [EMAIL PROTECTED] writes:
> At 11:13 AM +0100 12/14/04, Leopold Toetsch wrote:
>> Dan Sugalski [EMAIL PROTECTED] wrote:
>>> subclass - To create a subclass of a class object
>>
>> Is existing and used.
>
> Right. I was listing the things we need in the protocol. Some of them we've got, some we don't, and some of the stuff we have we probably need to toss out or redo.
>
>>> add_parent - To add a parent to the class this is invoked on
>>> become_parent - Called on the class passed as a parameter to add_parent
>>
>> What is the latter used for?
>
> To give the newly added parent class a chance to do some setup in the child class, if there's a need for it. There probably won't be in most cases, but when mixing in classes of different families I think we're going to need this.

The chap who's writing Ruby on Rails, a very capable framework, reckons that some of these 'meta' method calls that Ruby has in abundance have really made his life a lot easier; they're not the sort of things you need to use very often, but they're fabulously useful when you do.
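For what it's worth, the add_parent / become_parent handshake quoted above could look roughly like the sketch below. It is Python rather than anything Parrot-specific, and ClassObject, ForeignFamilyClass and setup_log are made-up names for illustration only, not part of any real API.

```python
class ClassObject:
    """Hypothetical stand-in for a class object; not Parrot's real API."""

    def __init__(self, name):
        self.name = name
        self.parents = []
        self.setup_log = []  # records any setup a parent performed on this class

    def add_parent(self, parent):
        # Record the new parent, then give it a chance to set things up here.
        self.parents.append(parent)
        parent.become_parent(self)

    def become_parent(self, child):
        # Default: most classes need no setup in the child.
        pass


class ForeignFamilyClass(ClassObject):
    """Hypothetical class from another family that needs setup in each child."""

    def become_parent(self, child):
        child.setup_log.append(f"{self.name} reserved storage in {child.name}")


child = ClassObject("Child")
child.add_parent(ClassObject("PlainParent"))
child.add_parent(ForeignFamilyClass("ForeignParent"))
print(child.setup_log)
```

The hook does nothing in the common case, which matches the "probably won't be needed in most cases" expectation; only the class from the other family touches the child.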
Re: Objects, classes, metaclasses, and other things that go bump in the night
Dan Sugalski [EMAIL PROTECTED] wrote:
> subclass - To create a subclass of a class object

Is existing and used.

> add_parent - To add a parent to the class this is invoked on
> become_parent - Called on the class passed as a parameter to add_parent

What is the latter used for?

> class_type - returns some unique ID or other so all classes in one class family have the same ID

What is a class family?

> instantiate - Called to create an object of the class

Exists.

> add_method - called to add a method to the class
> remove_method - called to remove a method from a class

These are the other two parts of the fetchmethod vtable I presume. When during packfile loading a Sub PMC constant is encountered and it's a method, add_method should be called instead of Parrot_store_global?

> namespace_name - returns the name of this class' namespace

Should we really separate the namespace name from the class name?

> get_anonymous_subclass - to put the object into a singleton anonymous subclass

How is the singleton object created in the first place?

leo
Re: Objects, classes, metaclasses, and other things that go bump in the night
At 11:13 AM +0100 12/14/04, Leopold Toetsch wrote:
> Dan Sugalski [EMAIL PROTECTED] wrote:
>> subclass - To create a subclass of a class object
>
> Is existing and used.

Right. I was listing the things we need in the protocol. Some of them we've got, some we don't, and some of the stuff we have we probably need to toss out or redo.

>> add_parent - To add a parent to the class this is invoked on
>> become_parent - Called on the class passed as a parameter to add_parent
>
> What is the latter used for?

To give the newly added parent class a chance to do some setup in the child class, if there's a need for it. There probably won't be in most cases, but when mixing in classes of different families I think we're going to need this.

>> class_type - returns some unique ID or other so all classes in one class family have the same ID
>
> What is a class family?

The metaclass's class. I think. This is meant to be an identifier that says what kind of class a class is, so code can make some assumptions about internal structure and such. Everything that's based on a ParrotClass PMC, for example, would have the same class_type ID.

>> add_method - called to add a method to the class
>> remove_method - called to remove a method from a class
>
> These are the other two parts of the fetchmethod vtable I presume. When during packfile loading a Sub PMC constant is encountered and it's a method, add_method should be called instead of Parrot_store_global?

Yeah, I think so. I'm not too happy about it, but I think that's the way things will end up. Most classes will then go and stuff the methods into the namespace, but I think it'll have to be up to the classes whether they use the namespaces for methods or not.

>> namespace_name - returns the name of this class' namespace
>
> Should we really separate the namespace name from the class name?

Yes. We're going to have cases where we've got multiple classes with the same name but different namespaces. (Generally classes masquerading as other classes, but I can see some other cases)

>> get_anonymous_subclass - to put the object into a singleton anonymous subclass
>
> How is the singleton object created in the first place?

For now, I think singletons will all be objects of a normal class that get pulled into a singleton class, most likely because code's added or changed methods on the object rather than to the class.

--
Dan

--it's like this---
Dan Sugalski [EMAIL PROTECTED]
even samurai have teddy bears and even teddy bears get drunk
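To make the "pulled into a singleton class" idea concrete, here is a rough Python sketch, assuming the object starts life as an ordinary instance and is only moved once somebody adds a method to it. The helper names (add_method_to_object, the _is_singleton marker, the Dog class) are invented for the example; only get_anonymous_subclass borrows its name from the list above, and this is not how Parrot implements it.

```python
def get_anonymous_subclass(obj):
    """Move obj into a private, unnamed-in-any-namespace subclass of its class."""
    cls = type(obj)
    if "_is_singleton" in vars(cls):
        return cls  # already lives in its own singleton class
    # The new class reuses the parent's name but is registered nowhere.
    singleton = type(cls.__name__, (cls,), {"_is_singleton": True})
    obj.__class__ = singleton  # re-class the existing object in place
    return singleton


def add_method_to_object(obj, name, func):
    # Methods added "to the object" really land on its singleton class.
    setattr(get_anonymous_subclass(obj), name, func)


class Dog:
    def speak(self):
        return "woof"


rex, fido = Dog(), Dog()
add_method_to_object(rex, "speak", lambda self: "WOOF!")
print(rex.speak(), fido.speak(), isinstance(rex, Dog))
```

Only rex changes behaviour; fido and the Dog class itself are untouched, and rex still answers as a Dog because the singleton class inherits from it.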
MMD dispatch (was: Objects, classes, metaclasses, and other things that go bump in the night)
Dan Sugalski [EMAIL PROTECTED] wrote:
> Now, there is one big gotcha here -- multimethod dispatch. I'm not entirely sure that we can do this properly and still give classes full control over how methods are looked for and where they go.

Thinking about MMD a bit more, I can imagine the following scheme:

0) Our current MMD_table is 2-dimensional only and totally static. Furthermore it doesn't really fit into method lookups as it's an asymmetric second case.

1) Single method lookup.

Current state: works for objects and NCI class methods. Does not work dynamically for vtable methods. The switch-over from dynamic lookup with objects to static lookup in PMCs is done with two helper (meta-)classes (delegate.pmc for objects and deleg_pmc.pmc for objects derived from PMCs). This adds some extra if statements in find_method_direct and looks bulky.

Proposal: We just do a method call for all method-like, user methods that are currently handled in vtables. This means that we have internally:

    array.__push()
    array.__get_integer_keyed()
    scalar.__get_string()

and what not. But also:

    complex.imag()
    complex.real()
    io_pmc.eof()
    ...

The existing opcode syntax remains of course, but the method call syntax is an equivalent.

Please remember, when reading this, that we are currently not discussing performance; we want a complete and correct implementation. We can deal with performance implications later.

The vtable entry can remain for internal (inside Parrot) use, if we know which PMC type is used, and for internal vtable methods like init, mark, destroy, freeze, thaw, hash (!), and so on.

As there is no distinction between a method and what's currently a vtable, this expands easily to pythonic add-ons like PyInt.__oct__, which isn't supported directly, and of course to all object methods. There is no distinction between object and vtable class methods (except that some cases may be heavily optimized).

2) 2-dimensional MMD

This is basically just an abstraction of 1). A MMD method gets 2 entries in the globals table, one for the left class namespace and one for the right class. We have e.g. the visible opcode

    Pdest = Pleft + Pright

This gets an internal opcode:

    callmethod_MMD_2 add    # 2-dim MMD dispatch opcode

or a more specialized one with signature and arguments:

    callmethod_MMD_2_v_ppp add, Pdest, Pleft, Pright

to save on register setup overhead. This opcode does:

a) look in Pleft's class for methods named add, remember the tuple (function, searchoffset) for each found method
b) look in Pright's class and create such tuples, where the 2nd function argument is Pright's class. To accomplish this the namespace that is searched is e.g. \0right_class\0__2.
c) compare the two lists for the best match (based on some distance, e.g. the combined depth (searchoffset) of the found functions)
d) cache the result (function, left-type, right-type)
e) call function

Done. Further, as the classes for left and right are responsible for the method lookup, the left class can e.g. just ignore the MMD-search by returning a function and a flag that no further search should be done.

During class setup existing MMD functions in *.pmc are created by a sequence similar to:

    store_global class_name_left, add, func
    store_global class_name_right, 2, add, func

or

    store_global class_name_right\0__2, add, func

3) n-dimensional MMD

This is a straight-forward extension of 2) and handled by

    callmethod_MMD_3_sig func    (maybe)
    ...
    callmethod_MMD_n func, n

Comments welcome,
leo
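Steps a) to e) of the 2-dimensional case can be sketched in a few lines of Python. This is only an illustration under assumptions: two dicts stand in for the left-class namespace and the "\0class\0__2" namespace, Python's MRO position stands in for searchoffset, and store_mmd, candidates and dispatch are invented names rather than Parrot opcodes or functions.

```python
left_ns = {}   # (class, op) -> functions; stands in for the left class namespace
right_ns = {}  # (class, op) -> functions; stands in for the "__2" namespace
cache = {}     # (op, left type, right type) -> chosen function


def store_mmd(op, left_cls, right_cls, func):
    # One entry per side, as in the two store_global calls.
    left_ns.setdefault((left_cls, op), []).append(func)
    right_ns.setdefault((right_cls, op), []).append(func)
    cache.clear()  # any new method invalidates cached decisions


def candidates(ns, cls, op):
    """Map each function found to its search offset (depth in the MRO)."""
    found = {}
    for depth, klass in enumerate(cls.__mro__):
        for func in ns.get((klass, op), []):
            found.setdefault(func, depth)
    return found


def dispatch(op, left, right):
    key = (op, type(left), type(right))
    if key not in cache:
        lefts = candidates(left_ns, type(left), op)     # step a
        rights = candidates(right_ns, type(right), op)  # step b
        both = lefts.keys() & rights.keys()
        if not both:
            raise TypeError(f"no '{op}' method for these operand types")
        # step c: smallest combined depth wins; step d: cache the winner
        cache[key] = min(both, key=lambda f: lefts[f] + rights[f])
    return cache[key](left, right)                      # step e


class Num: pass
class Int(Num): pass
class Float(Num): pass

store_mmd("add", Num, Num, lambda l, r: "Num+Num")
store_mmd("add", Int, Int, lambda l, r: "Int+Int")
store_mmd("add", Int, Num, lambda l, r: "Int+Num")

print(dispatch("add", Int(), Int()))
print(dispatch("add", Int(), Float()))
print(dispatch("add", Float(), Int()))
```

Ties between equally distant candidates are not resolved in any defined order here; a real implementation would need a rule for that, and the proposal above leaves the exact distance function open.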
Re: Objects, classes, metaclasses, and other things that go bump in the night
Leopold Toetsch wrote:
> Dan Sugalski [EMAIL PROTECTED] wrote:
>> Anyway, so much for the 'outside world' view of objects as black box things that have properties and methods.
>
> [ ... ]
>
>> Almost everything we do here is going to be with method calls. There's very little that I can see that requires any faster access than that, so as long as we have a proper protocol that everyone can conform to, we should be OK.
>
> The distinction object vs class is still in the way a bit, when it comes to method calls.
>
>   $ python
>   ...
>   >>> i = 4
>   >>> i.__sub__(1)
>   3
>   >>> help(int)
>   ...
>   |  __sub__(...)
>   |      x.__sub__(y) <==> x-y
>
> Given that a PyInt is just a PMC it currently needs a lot of ugly hacks to implement the full set of operator methods. dynclasses/py*.pmc is mostly just duplicating existing code from standard PMCs.
>
> Iff the assembler just emits a method call for the opcode:
>
>   sub P0, P1, P2
>
> we have that equivalence too, with proper inheritance and static and dynamic operator overloading.
>
> I've duplicated FixedPMCArray.get_pmc_keyed_int() as a method, starting with this line:
>
>   METHOD PMC* __getitem__(INTVAL key) {

Point of order: discussions would be a lot easier if they didn't *start out* with pejorative terms like "ugly hacks".

One of my goals is to eliminate the need for

    if (!Interp_flags_TEST(INTERP, PARROT_PYTHON_MODE))

sprinkled throughout the various standard (i.e. Perl) PMCs.

Yes, this all needs to be radically refactored - and all perlisms and pythonisms removed from the standard classes. Meanwhile, feel free to ask questions or make suggestions.

With that out of the way, I will assert that in all the Python code I have seen, use of infix operators and square brackets for indexing overwhelmingly dominates direct use of the methods. Furthermore, it is clear that operators like sub_p_p_p and set_p_p_kic will always perform better than callmethodcc_sc __sub__ and callmethodcc_sc __getitem__ respectively.

Therefore, Pirate emits the opcodes whenever direct use is made of things like infix operators and square brackets. The implemented methods like __sub__, while rarely used, will operate correctly by invoking the opcodes. I *don't* see a need to heavily optimize for rarely used mechanisms.

I encourage you to check out Pirate. The IMCC output of pirate.py is now remarkably close to the output of pie-thon.pl.

- Sam Ruby
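For readers who don't have the Python side in their heads, the equivalence both of us are leaning on looks like this in plain Python. The Amount class and its _sub_impl helper are made up for the example; note also that in CPython the infix form is what ends up calling __sub__, whereas the approach described above has the method forward to the opcode, so the sketch only shows that the two spellings share one implementation.

```python
class Amount:
    """Hypothetical numeric wrapper; both spellings share one implementation."""

    def __init__(self, value):
        self.value = value

    def _sub_impl(self, other):
        # Plays the role of the single "fast path" implementation.
        return Amount(self.value - other.value)

    def __sub__(self, other):
        # The rarely used method spelling just forwards to it.
        return self._sub_impl(other)


i = 4
print(i - 1, i.__sub__(1))             # infix and method spelling on a built-in int

items = ["a", "b", "c"]
print(items[1], items.__getitem__(1))  # square brackets and __getitem__

x, y = Amount(10), Amount(3)
print((x - y).value, x.__sub__(y).value)
```

Whichever side does the forwarding, the user-visible requirement is only that both spellings give the same answer.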
Objects, classes, metaclasses, and other things that go bump in the night
Right, so with at least a basic rework of the string stuff in, it's time to turn our attention to objects and all the stuff that goes with them. I'd originally thought that the bits we'd put in place would be sufficient to do everyone's object system (well, all the languages that we explicitly care about at least) but, well... that turns out not to be the case.

Here's a quick recap on the architecture bits that already exist. Each PMC can:

*) Return a PMC which represents that PMC's class
*) Look up and return a PMC representing a named method
*) Look up and return a named property for the PMC

I think that we can accommodate most everyone's needs with this, assuming we set up some protocols and give a really big kick to the code in objects.c.

The rest of this should be prefaced with the caveat that I'm finding that the more I'm learning about how all the different languages do objects the less I actually know what's going on, so this all might be wrong, or at least really sub-optimal.

Anyway, with the vtable slots as they stand, we can make a method call on an object. There's a single op for this -- it calls the object's find_method function and then invokes the resulting PMC. The object is in complete control over what invokable PMC gets returned, and the invoked PMC has full control over what it actually does.

Python has this interesting concept where properties and methods are sort of mixed -- that is if you do:

    a = foo.some_thing

a will be either the property some_thing or the method object you'd get if you invoked some_thing on foo, bound up such that you can later do:

    a()

and it's the same as if you'd done foo.some_thing(). I think. Pretty sure, at least, though I'm not sure of the order.

Anyway, for this the core functionality is fine, though it assumes that the Python compiler will emit a fetch prop/test/fetch method/bind method sequence for the a=foo.bar statement. I think I'm OK with that, though I'm more than willing to take arguments to the contrary.
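Here's roughly what I mean, as a Python sketch. The fetch function spells the fetch prop/test/fetch method/bind sequence out by hand; it, Foo, and some_prop are made up for illustration, and it checks the object's own attributes first, which is the order Python itself uses for ordinary methods.

```python
import functools


class Foo:
    def __init__(self):
        self.some_prop = 42  # a per-object property

    def some_thing(self):
        return "called on " + type(self).__name__


def fetch(obj, name):
    """Hand-written fetch prop / test / fetch method / bind sequence."""
    props = vars(obj)                      # fetch prop: the object's own attributes
    if name in props:                      # test
        return props[name]
    method = getattr(type(obj), name)      # fetch method: plain function on the class
    return functools.partial(method, obj)  # bind it to the object


foo = Foo()
a = foo.some_thing       # what Python does natively: a bound method
b = fetch(foo, "some_thing")
print(a(), b(), fetch(foo, "some_prop"))
```

Both a() and b() behave like foo.some_thing(), and functools.partial is standing in for whatever binding support the core ends up providing.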
("It's not easy" isn't a good argument -- we're down at the machine level. Nothing's easy. "It breaks because of X and therefore has to be a fundamental thing to guarantee it works" is a good argument)

This assumes we have binding functionality that takes some parameters and produces an invokable PMC that uses those, and any other provided, parameters later on. Given that Perl 6 is doing higher-order functions (or curried functions, or whatever) where there's going to be a fair amount of binding going on I'm OK with core support for this, where "core" may be "we have a bind PMC type with some methods on it" or something like that. Dunno, we can work that out later.

PMCs also implement 'is', 'does', and 'can'. These are three vtable methods that allow code to query a PMC to see if it 'is' an instance of a named class, if it 'does' a particular interface, and whether it 'can' invoke the named method. PMCs are supposed to answer truthfully, however no assumptions are made as to how the PMC actually is, does, or can do something. It's perfectly acceptable for a PMC that has an AUTOLOAD (or its non-Perl equivalent) function to answer true to any can query, and how an interface is implemented in a PMC's parent classes (if indeed it even has any parent classes) is also generally irrelevant, so long as the answers are either correct or egregiously wrong.

Anyway, so much for the 'outside world' view of objects as black box things that have properties and methods. That part's easy, and I think we're fine there.

The tricky part is all the bits that back the objects. Or, rather, arranging so that everyone can reasonably have their own custom backing bits while still allowing mix-n-match inheritance between the bits, and providing enough information to do things properly, or to cleanly fail. We're not going to *require* that classes of different sorts be able to inherit from one another, just require that if they do allow that then they behave properly.
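Going back to the 'is', 'does', and 'can' queries for a moment, a small Python sketch may help show why the caller can't assume anything about how the answer is produced. Everything here (Pmc, Array, AutoloadingProxy, the isa spelling of 'is', the interfaces tuple) is invented for illustration and is not the vtable interface itself.

```python
class Pmc:
    """Hypothetical base offering the three queries."""

    interfaces = ()

    def isa(self, class_name):
        # 'is': answered here by walking the class hierarchy by name.
        return any(k.__name__ == class_name for k in type(self).__mro__)

    def does(self, interface):
        return interface in self.interfaces

    def can(self, method_name):
        return callable(getattr(self, method_name, None))


class Array(Pmc):
    interfaces = ("array",)

    def push(self, value):
        pass


class AutoloadingProxy(Pmc):
    """Answers true to any 'can' query because it builds methods on demand."""

    def can(self, method_name):
        return True

    def __getattr__(self, name):
        # Only reached for names not found normally, like AUTOLOAD.
        return lambda *args: (name, args)


arr, proxy = Array(), AutoloadingProxy()
print(arr.isa("Pmc"), arr.does("array"), arr.can("push"), arr.can("pop"))
print(proxy.can("anything"), proxy.anything(1, 2))
```

The two classes answer can() in completely different ways, and code calling them only ever sees the yes or no.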
Almost everything we do here is going to be with method calls. There's very little that I can see that requires any faster access than that, so as long as we have a proper protocol that everyone can conform to, we should be OK.

The one big exception to the "we use methods" thing is with composite objects. The only sane way I can think of to handle cross-system inheritance is through delegation and composition -- that is, if you have class A which inherits from Foo and A_Class, each of which is from a unique object system (like, say, CLOS, Perl 5's 'object system', and new-style Python classes), there's no way a single class system can reasonably express the differing semantics, so the only sane thing to do is have a composite object. The 'main' object is class A's (however that class instantiates objects) and hanging off it are two sub-objects, one for Foo and one for A_Class. These sub-objects will have a flag set marking