Re: Premature pessimization
Leopold Toetsch wrote:

Good. Now Evil Leo (who can't program in Python ;) writes some piece of code like this:

    $ cat m.py
    class M(type):
        def __new__(meta, name, base, vars):
            cls = type.__new__(meta, name, base, vars)
            cls.__add__ = myadd
            return cls

    def myadd(self, r):
        return 44 - r

    I = M('Int', (int,), {})
    i = I(5)
    print i
    print i + 2

    $ python m.py
    5
    42

What this means is that the __add__ method will not be directly used for either PyInt or PyString objects

Well, and that's not true, IMHO. See above. It has to be part of Parrot's method dispatch. What if your translator just sees the last 3 lines of the code and M is in some lib? That implies that you either can't translate to "$P0 = $P1 + $P2", or that you just translate or alias "__add__" to Parrot's "__add" and let Parrot fiddle around to find the correct method.

Here's the part that you snipped that addresses that question:

> And there is a piece that I haven't written yet that will do the
> reverse: if MMD_ADD is called on a PyObject that has not provided
> such behavior, then any __add__ method provided needs to be
> called.

* the method is returning a new PMC. This doesn't follow the signature of Parrot infix MMD operations.

Here I do think you are misunderstanding. The __add__ method with precisely that signature and semantics is defined by the Python language specification. It is (somewhat rarely) used directly, and therefore must be supported exactly that way.

    | __add__(...)
    |     x.__add__(y) <==> x+y

Parrot semantics are that the destination exists. But having a look at above "myadd", we probably have to adjust the calling conventions for overloaded infix operators, i.e. return the destination value. Or provide both schemes ... dunno.

Since you provided an Evil Leo sample, let me provide an Evil Sam sample:

    d = {
        "__init__": lambda self, x: setattr(self, "value", x),
        "__add__":  lambda self, x: str(self.value) + str(x.value)
    }

    def dict2class(d):
        class c: pass
        c.__dict__.update(d)
        return c

    c = dict2class(d)
    a = c(2)
    b = c(3)
    print a + b

Things to note:

1) classes which are created every time a function is called
2) classes are thin wrappers over a dictionary object

Now, given the above sample, let's revisit the statement that "The Python translator needs just a translation table for the common core methods." How, exactly, would that be done? Given that the method name is simply a string... used as a key in a dictionary... with a different parameter signature than the hypothetical Parrot __add method. That's why I say:

In the general case, looking for "reserved" method names at "compile" time doesn't work.

"__add__" is reserved in Python and corresponds directly to "__add" in Parrot. I don't think that doesn't work.

__add__ is *not* reserved in Python. There just is some syntactic sugar that provides a shorthand for certain signatures. I am free to define __add__ methods that have zero or sixteen arguments. I won't be able to call such methods with the convenient shorthand, but other than that, they should work.

I personally don't think that performance considerations should be out of bounds in these discussions

I've already shown that it's possible to go with fully dynamic dispatch *and* 30% faster for MMD and 70% faster for overloaded operations. First correct and complete, then speed considerations.

Neither of which match Python semantics. We are going to need a system where classes are anonymous, not global. Where methods are properties that can be added simply by calling the equivalent of set_pmc_keyed.

- Sam Ruby
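To make the arity point above concrete, here is a small sketch (the class Vec and its data are invented for illustration; Python 2 syntax to match the thread's examples): a three-argument __add__ is a perfectly ordinary method that the "+" shorthand can no longer reach, but a direct call still works.

    class Vec(object):
        def __init__(self, xs):
            self.xs = xs
        def __add__(self, other, scale):
            # three arguments: legal as a plain method, unreachable via "+"
            return Vec([a + scale * b for a, b in zip(self.xs, other.xs)])

    a = Vec([1, 2])
    b = Vec([3, 4])
    print a.__add__(b, 10).xs    # direct call works: [31, 42]
    # print (a + b).xs           # TypeError: __add__() takes exactly 3 arguments (2 given)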
Re: Premature pessimization
Sam Ruby <[EMAIL PROTECTED]> wrote:

> Ah! Now we are getting somewhere!

Yeah. That's the goal.

> So, why have I proceeded in this manner? Two reasons.

Fair enough, both.

>> So given that we have a set of language-neutral PMCs in core that do the
>> right thing, Python or Perl PMCs can inherit a lot of functionality from
>> core PMCs. Language-specific behavior is of course implemented in the
>> specific PMC.

> Agreed. One area that will require a bit more thought is error cases.

Yep. But let's just figure that out later. First the basics.

> I'll address your questions below, but for reference, here is the code
> that Pirate generates for "a=b+c":
>
>     find_type $I0, 'PyObject'
>     new $P0, $I0
>     find_lex $P1, 'b'
>     find_lex $P2, 'c'
>     $P0 = $P1 + $P2
>     store_lex -1, 'a', $P0

Good. Now Evil Leo (who can't program in Python ;) writes some piece of code like this:

    $ cat m.py
    class M(type):
        def __new__(meta, name, base, vars):
            cls = type.__new__(meta, name, base, vars)
            cls.__add__ = myadd
            return cls

    def myadd(self, r):
        return 44 - r

    I = M('Int', (int,), {})
    i = I(5)
    print i
    print i + 2

    $ python m.py
    5
    42

> What this means is that the __add__ method will not be directly used for
> either PyInt or PyString objects

Well, and that's not true, IMHO. See above. It has to be part of Parrot's method dispatch. What if your translator just sees the last 3 lines of the code and M is in some lib? That implies that you either can't translate to "$P0 = $P1 + $P2", or that you just translate or alias "__add__" to Parrot's "__add" and let Parrot fiddle around to find the correct method.

>> * the method is returning a new PMC. This doesn't follow the signature
>>   of Parrot infix MMD operations.

> Here I do think you are misunderstanding. The __add__ method with
> precisely that signature and semantics is defined by the Python language
> specification. It is (somewhat rarely) used directly, and therefore
> must be supported exactly that way.
>
>     | __add__(...)
>     |     x.__add__(y) <==> x+y

Parrot semantics are that the destination exists. But having a look at above "myadd", we probably have to adjust the calling conventions for overloaded infix operators, i.e. return the destination value. Or provide both schemes ... dunno.

> In the general case, looking for "reserved" method names at "compile"
> time doesn't work.

"__add__" is reserved in Python and corresponds directly to "__add" in Parrot. I don't think that doesn't work.

> ... As everything can be overridden, this dispatch must
> be done at runtime.

Exactly and that's what I want to achieve.

> I personally don't think that performance considerations should be out
> of bounds in these discussions

I've already shown that it's possible to go with fully dynamic dispatch *and* 30% faster for MMD and 70% faster for overloaded operations. First correct and complete, then speed considerations.

> - Sam Ruby

leo
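To make the "M is in some lib" scenario above concrete, the same m.py can be split across two files so that the translator only ever sees the client code (the file names are hypothetical, invented for illustration):

    # somelib.py -- never inspected by the translator
    class M(type):
        def __new__(meta, name, bases, vars):
            cls = type.__new__(meta, name, bases, vars)
            cls.__add__ = lambda self, r: 44 - r
            return cls

    # main.py -- the only code the translator actually compiles
    from somelib import M
    I = M('Int', (int,), {})
    i = I(5)
    print i + 2    # still prints 42; only runtime dispatch can know that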
Re: Premature pessimization
Ah! Now we are getting somewhere!

Leopold Toetsch wrote:
Sam Ruby <[EMAIL PROTECTED]> wrote:
Leopold Toetsch wrote:

So *all* lookups (complete with the asterisks) does not mean *all* lookups. How about ?

Let's first concentrate on simpler stuff like infix operators.

OK, but the point is that there will always be multiple mechanisms for dispatch.

Citing S06: "Operators are just subroutines with special names."

That statement is true for Perl. Same statement is true for Python. But the names vary based on the language.

Yes. So let's factor out the common part and have that in Parrot core, usable for Python and Perl and ... The PyInt PMC currently duplicates almost all functionality that *should* be in the Integer PMC. We have first to fix the Integer PMC to do the Right Thing. Then we need some syntax for multiple inheritance in PMCs. The same holds for other PMCs. It was already proposed that we should have a language-neutral Hash PMC.

No question that that is the intended final goal. What you see in the current python dynclasses is not representative of that final goal.

So, why have I proceeded in this manner? Two reasons.

First, I am not about to make random, unproven changes to the Parrot core until I am confident that the change is correct. Cloning a class temporarily gives me a playground to validate my ideas.

Second, I am not going to wait around for Warnocked questions and proposals to be addressed.

Now, neither of the above are absolutes. You have seen me make changes to the core - but only when I was relatively confident. And I *have* put on hold trying to reconcile object oriented semantics as this is both more substantial and seemed to be something that was likely to be addressed.

Also, while I am not intending to make speculative changes to the core of Parrot, I don't have any objections to anybody making changes on my behalf. If you see some way of refactoring "my" code, go for it. It isn't "mine" - it is the community's. The one thing I would like to ask is that test cases that currently pass continue to pass. The dynclass unit tests are part of the normal test. Additionally, the tests in languages/parrot have been the ones driving most of my implementation lately.

I do realize that that means checking out Pirate. Even though I don't agree with it, I do understand Michal's licensing issues. The reason I am not investing much time in resolving this issue is that Pirate is exactly one source file and could quickly be rewritten using the Perl 6 Grammar engine once that functionality becomes sufficiently complete.

So given that we have a set of language-neutral PMCs in core that do the right thing, Python or Perl PMCs can inherit a lot of functionality from core PMCs. Language-specific behavior is of course implemented in the specific PMC.

Agreed. One area that will require a bit more thought is error cases. The behavior of integer divide by zero is likely to be different in each language. This could be approached in a number of different ways. One is by cloning such methods, like I have done. Another is to wrap such methods, catch the exception that is thrown, and handle it in a language specific manner. A better approach would be for the core to call out to a method on such error cases. Subclasses could simply inherit the common core behavior and override this one method. It also means that the "normal" execution path length (i.e., when dividing by values other than zero) is optimal; it is only the error paths that involve extra dispatches. That's an easy case.

Overflow is a bit more subtle. Some languages might want to wrap the results (modulo 2**32). Some languages might want an exception. Other languages might want promotion to BigInt. Even if promotion to BigInt were the default behavior, subclasses would still want to override it. In Python's case, promotion to PyLong (which ideally would inherit from and trivially specialize and extend BigInt) would be the desired effect.

Even this is only one aspect of a more general case: all morphing behavior needs to be overridable by subclasses. I believe that this can be easily handled by the current Parrot architecture by virtue of the fact that destination objects must be created before methods are called, and such destination objects can override morph methods. But it would help the cause if code were written to promote things to "Integer" instead of "PerlInt".

Yes, at the moment, I'm guilty of this too.

Second: method dispatch. I've looked a bit into PyObject. It seems that you start rolling your own method dispatch. Please don't get me wrong, I'm not criticizing your implementation. It might also be needed for some reasons I'm just overlooking and it's currently needed because core functionality isn't totally finished.

I'll address your questions below, but for reference, here is the code that Pirate generates for "a=b+c":

    find_type $I0, 'PyObject'
    new $P0, $I0
    find_lex $P1, 'b'
    find_lex $P2, 'c'
    $P0 = $P1 + $P2
    store_lex -1, 'a', $P0
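For reference, the stock CPython behavior for the two cases discussed earlier in this message (divide by zero and integer overflow), shown in the thread's Python 2 syntax:

    try:
        1 / 0
    except ZeroDivisionError:
        print "divide by zero raises in Python"   # other languages may wrap, die, or return Inf

    print type(2 ** 63)    # <type 'long'> under Python 2: ints promote instead of wrapping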
Re: Premature pessimization
Sam Ruby <[EMAIL PROTECTED]> wrote:

> Leopold Toetsch wrote:
> So *all* lookups (complete with the asterisks) does not mean *all* lookups.
> How about ?

Let's first concentrate on simpler stuff like infix operators.

>> Citing S06: "Operators are just subroutines with special names."

> That statement is true for Perl. Same statement is true for Python.
> But the names vary based on the language.

Yes. So let's factor out the common part and have that in Parrot core, usable for Python and Perl and ...

The PyInt PMC currently duplicates almost all functionality that *should* be in the Integer PMC. We have first to fix the Integer PMC to do the Right Thing. Then we need some syntax for multiple inheritance in PMCs. The same holds for other PMCs. It was already proposed that we should have a language-neutral Hash PMC.

So given that we have a set of language-neutral PMCs in core that do the right thing, Python or Perl PMCs can inherit a lot of functionality from core PMCs. Language-specific behavior is of course implemented in the specific PMC.

Second: method dispatch. I've looked a bit into PyObject. It seems that you start rolling your own method dispatch. Please don't get me wrong, I'm not criticizing your implementation. It might also be needed for some reasons I'm just overlooking and it's currently needed because core functionality isn't totally finished.

Anyway - and please correct me if my assumptions are not true - I'll try to factor out the common part again. You have in PyObject e.g.:

    METHOD PMC* __add__(PMC *value) {
        PMC * ret = pmc_new(INTERP, dynclass_PyObject);
        mmd_dispatch_v_ppp(INTERP, SELF, value, ret, MMD_ADD);
        return ret;
    }

I see six issues with that kind of approach:

* The "__add" method should be in Parrot core. That's what I've described in the MMD dispatch proposal.
* the method is returning a new PMC. This doesn't follow the signature of Parrot infix MMD operations.
* well, it's dispatching twice. First the "__add__" method for PyObjects has to be searched for, then the mmd_dispatch is done.
* it'll very likely not work together with other HLLs. It's a python-only solution.
* rolling your own dispatch still doesn't help, if a metaclass overloads the C<+> operation
* code duplication

So how would I do it:

* prelim: above mentioned core PMC cleanup is done. Inheritance works: a PyInt isa(PyObject, Integer)
* the core PMCs define methods, like your "__add__" except that our naming convention is "__add". The Python translator needs just a translation table for the common core methods.
* Method dispatch is done at the opcode level. add Px, Py, Pz just does the right thing. It calls the thingy that implements the "__add" method, being in core or overloaded shouldn't and doesn't matter. If inheritance changes at runtime it just works. And the other way round: Py."__add"(Pz, Px) is the same. Again it doesn't matter, if it's a core PMC, a Python PMC or an overloaded PASM/PIR multi sub (or a Python metaclass). The only difference is the changed signature. But that's how Parrot core defines overloaded infix operations. We have to do that anyway.

It's just the correct way to go. (And please no answers WRT efficiency ;-)

leo
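A small Python sketch of the signature difference discussed above (the class Num and the name parrot_add are invented stand-ins, not actual Parrot API): Python's __add__ builds and returns a fresh result, while the destination-style "__add" writes into an existing object.

    class Num(object):
        def __init__(self, v):
            self.v = v
        def __add__(self, other):              # Python convention: return a new object
            return Num(self.v + other.v)
        def parrot_add(self, other, dest):     # destination-style convention, sketched in Python
            dest.v = self.v + other.v
            return dest

    a, b, dest = Num(2), Num(3), Num(0)
    print (a + b).v                  # 5, freshly created by __add__
    print a.parrot_add(b, dest).v    # 5, stored into the pre-existing dest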
Re: Premature pessimization
Leopold Toetsch wrote:
Sam Ruby <[EMAIL PROTECTED]> wrote:
Leopold Toetsch wrote:

Another good reason to use pmc->vtable->find_method in *all* lookups. So the PMC has full control over the dispatch.

How does one lookup the C method? That will remain a VTABLE entry, right?

Yes of course.

So *all* lookups (complete with the asterisks) does not mean *all* lookups. How about ?

All *external* names should be looked up using find_method.

Yes. But what's an external name? What happens if a Perl6 guru installs a global overloaded add function:

    multi sub *infix:<+>(Int $l, Int $r) { ... }

for example to count all "add" operations or whatever. This can currently be done with the C opcode. But that installs only one MMD function for one type pair. What if some classes inherit the "__add" method from C. Or:

    multi sub *infix:<+>(Any $l, Any $r) { ... }

which presumably would influence all Perl types. Citing S06: "Operators are just subroutines with special names."

That statement is true for Perl. Same statement is true for Python. But the names vary based on the language. IMCC should be independent of Perl semantics.

Doing the same overloading with prefix or postfix operators OTOH would change vtable methods. That's currently not possible because the vtable dispatch is static, except for objects where it's dynamic.

We also need to be aware that in the general case, we are talking about two lookups. One to find the language specific wrapper. And one to find the common code which backs it.

Yes, but when the HLL method is found (e.g. PyStr.index) you just call the implementation statically and adjust e.g. return values.

The second dispatch may be spelled "string_str_index", ...

the second dispatch is basically not dispatch, it's a function call.

So far, I have utterly failed to get my point across. Either all languages are going to need to adopt a common set of special names (not going to happen), or Parrot is a single language runtime, or the mechanism provided to do a language level dispatch needs to be orthogonal to the way that Parrot internal dispatches are done. Currently, Parrot is the latter. It is the suggestion that it moves towards the previous alternative that I object to.

Why is this issue so important to me? I see a lot of things I like about Parrot, but at the moment, the notions that are baked into the core as to how classes should work is not one of them.

You can straightforwardly use Parrot's object system for attributes and methods with all stuff known at compile time, AFAIK. You basically just extend the find_method and get_attr_str vtables.

"straightforwardly"... "known at compile time"? Python's classes are glorified hashes, populated at runtime. They are essentially anonymous and lexically scoped.

WRT find_method: S13 mentions multi subs per package or even lexically scoped. So we'll probably have to extend the method lookup anyway.

I see a change that went in today to invalidate a method cache if a store_global call is made. That presumes a lot about how find_method works internally. An assumption that will not be valid for a number of languages.

Well, the builtin method lookup system with the default find_method implementation is using it. But the cache is fully transparent.

VTABLE_xxx depends only on pointer arithmetic and dereferencing. Yes, 20% of the operations may need more, and that should be provided for, but if 80% of the calls can be done this way, the performance benefits are significant.

Please don't start arguing with estimated performance benefits. I've already shown that we can do the current MMD operations 30% faster - but with dynamic dispatch too. We need at the opcode level a dynamic dispatch. Internally the vtable call remains. I've written that already a few times.

I think that all we can agree upon here is the MMD support is both incomplete and unoptimized. Making decisions based on how this implementation performs is premature.

As you tirelessly point out, this is all internal. The way the ops are coded in the IMCC source need not change. However, prematurely mapping *all* method names to strings so that a second method dispatch can be done in 100% of the cases is a premature pessimization.

Well, better schemes for method overloading (e.g. in Python metaclass) are always welcome. I've proposed a scheme. And yes, it's an implementation detail.

I watched the CLR folks try to come to consensus on as simple a matter as multiple inheritance. Each language will approach this differently. The best Parrot can hope to do is to standardize on a protocol, and to provide enough primitives to enable languages to build proper support.

[ dynclasses ]

If anybody has any issues with this plan, please let me know now.

Python specific stuff should of course move out from core. For the rest: It depends. If you end with a duplicate of CPython's Objects directory interoperability will be difficult. If interoperability is defined as implementing a language independe
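A concrete illustration of the cache-invalidation worry above (the class name C and the method greet are invented for illustration; Python 2 syntax): methods appear and change by mutating the class, not through any store_global-style operation.

    C = type('C', (object,), {})          # an essentially anonymous class built at runtime
    obj = C()
    C.greet = lambda self: "first"        # method added by mutating the class dict
    print obj.greet()                     # first
    C.greet = lambda self: "second"       # replaced again; any method cache has to notice
    print obj.greet()                     # second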
Re: Premature pessimization
Sam Ruby <[EMAIL PROTECTED]> wrote:

> Leopold Toetsch wrote:
>> Another good reason to use pmc->vtable->find_method in *all* lookups. So
>> the PMC has full control over the dispatch.

> How does one lookup the C method? That will remain a
> VTABLE entry, right?

Yes of course.

> All *external* names should be looked up using find_method.

Yes. But what's an external name? What happens if a Perl6 guru installs a global overloaded add function:

    multi sub *infix:<+>(Int $l, Int $r) { ... }

for example to count all "add" operations or whatever. This can currently be done with the C opcode. But that installs only one MMD function for one type pair. What if some classes inherit the "__add" method from C. Or:

    multi sub *infix:<+>(Any $l, Any $r) { ... }

which presumably would influence all Perl types. Citing S06: "Operators are just subroutines with special names."

Doing the same overloading with prefix or postfix operators OTOH would change vtable methods. That's currently not possible because the vtable dispatch is static, except for objects where it's dynamic.

> We also need to be aware that in the general case, we are talking about
> two lookups. One to find the language specific wrapper. And one to
> find the common code which backs it.

Yes, but when the HLL method is found (e.g. PyStr.index) you just call the implementation statically and adjust e.g. return values.

> The second dispatch may be spelled "string_str_index", ...

the second dispatch is basically not dispatch, it's a function call.

> Why is this issue so important to me? I see a lot of things I like
> about Parrot, but at the moment, the notions that are baked into the
> core as to how classes should work is not one of them.

You can straightforwardly use Parrot's object system for attributes and methods with all stuff known at compile time, AFAIK. You basically just extend the find_method and get_attr_str vtables.

WRT find_method: S13 mentions multi subs per package or even lexically scoped. So we'll probably have to extend the method lookup anyway.

> I see a change that went in today to invalidate a method cache if a
> store_global call is made. That presumes a lot about how find_method
> works internally. An assumption that will not be valid for a number of
> languages.

Well, the builtin method lookup system with the default find_method implementation is using it. But the cache is fully transparent.

> VTABLE_xxx depends only on pointer arithmetic and dereferencing. Yes,
> 20% of the operations may need more, and that should be provided for,
> but if 80% of the calls can be done this way, the performance benefits
> are significant.

Please don't start arguing with estimated performance benefits. I've already shown that we can do the current MMD operations 30% faster - but with dynamic dispatch too. We need at the opcode level a dynamic dispatch. Internally the vtable call remains. I've written that already a few times.

> As you tirelessly point out, this is all internal. The way the ops are
> coded in the IMCC source need not change. However, prematurely mapping
> *all* method names to strings so that a second method dispatch can be
> done in 100% of the cases is a premature pessimization.

Well, better schemes for method overloading (e.g. in Python metaclass) are always welcome. I've proposed a scheme. And yes, it's an implementation detail.

[ dynclasses ]

> If anybody has any issues with this plan, please let me know now.

Python specific stuff should of course move out from core. For the rest: It depends. If you end with a duplicate of CPython's Objects directory interoperability will be difficult.

> - Sam Ruby

leo
Re: Premature pessimization
Leopold Toetsch wrote:

The current delegation internals are not likely a good match for languages like Python or Ruby. I see such languages as implementing their own semantics for classes, and Parrot limiting its knowledge to methods like findmeth and invoke.

Another good reason to use pmc->vtable->find_method in *all* lookups. So the PMC has full control over the dispatch.

How does one lookup the C method? That will remain a VTABLE entry, right?

All *external* names should be looked up using find_method. All names needed by the Parrot runtime need to be orthogonal to the set of names available for use by languages. Ideally this would be by virtue of using a separate mechanism (e.g., VTABLES), but acceptably it could be done by naming convention (leading double underscores with no trailing double underscores).

We also need to be aware that in the general case, we are talking about two lookups. One to find the language specific wrapper. And one to find the common code which backs it.

The second dispatch may be spelled "string_str_index", in which case the loader does all the necessary fixup. The second dispatch may be spelled "VTABLE_get_pmc_keyed" in which case we are looking at a couple of pointer hops and some integer arithmetic. The second dispatch may be spelled "SUPER()" and use some combination of these. Or it could be using find_method, or perhaps even another technique entirely. But in all cases, we are looking at two different things.

= = =

Why is this issue so important to me? I see a lot of things I like about Parrot, but at the moment, the notions that are baked into the core as to how classes should work is not one of them.

I see a change that went in today to invalidate a method cache if a store_global call is made. That presumes a lot about how find_method works internally. An assumption that will not be valid for a number of languages.

... What do we benefit from the premature pessimisation of mapping this to a string?

Well, method names happen to be strings ;)

Using the table below, at the moment, there are exactly zero strings materialized and/or compared against during the execution of vtable/PMC methods.

I don't get that sentence.

VTABLE_xxx depends only on pointer arithmetic and dereferencing. Yes, 20% of the operations may need more, and that should be provided for, but if 80% of the calls can be done this way, the performance benefits are significant.

As you tirelessly point out, this is all internal. The way the ops are coded in the IMCC source need not change. However, prematurely mapping *all* method names to strings so that a second method dispatch can be done in 100% of the cases is a premature pessimization.

= = =

I plan to spend the next day or so looking into catching named exceptions, after which point I plan to implement Python classes on Parrot. I do not anticipate using globals or ParrotObject or ParrotClass in that implementation. I expect that there will be tests that replace methods, and I hope that optimizations that attempt to cache the results of find_method don't prevent these updates from being seen.

I don't believe that it is a good idea for Python specific functionality to be present in the core Parrot runtime. For that reason, I will implement this functionality inside dynclasses. If the runtime provides reasonable hooks for the functionality, I will use them. If not, I will temporarily clone and modify what I need, with the intent of getting it working first, and then refactoring the appropriate parts back into the Parrot runtime, this time with the necessary intercept points. In rare cases - when I feel confident enough - I may modify the Parrot runtime to provide the hooks I need. And all of this will be backed by test cases so that such refactoring can be done confidently and correctly.

If anybody has any issues with this plan, please let me know now.

- Sam Ruby
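A rough Python sketch of the "two lookups" shape described earlier in this message (PyStr.index is the example name used in the thread; the BACKING table is an invented stand-in for whatever common code a wrapper forwards to):

    BACKING = {"index": lambda s, sub: s.find(sub)}      # shared, language-neutral implementation

    class PyStr(str):
        def index(self, sub):                            # language-specific wrapper, Python spelling
            pos = BACKING["index"](self, sub)
            if pos < 0:
                raise ValueError("substring not found")  # adjust the result for Python semantics
            return pos

    print PyStr("parrot").index("rot")    # 3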
Re: Premature pessimization
Sam Ruby <[EMAIL PROTECTED]> wrote:

> Leopold Toetsch wrote:
>> There is of course a test case. I have mentioned it at least 20 times ;)
>> t/op/gc_13.imc - currently using lexicals.

> $ parrot t/op/gc_13.imc
> 3 * 5 == 15!
> What's broken?

By using lexicals it works now.

> The current delegation internals are not likely a good match for
> languages like Python or Ruby. I see such languages as implementing
> their own semantics for classes, and Parrot limiting its knowledge to
> methods like findmeth and invoke.

Another good reason to use pmc->vtable->find_method in *all* lookups. So the PMC has full control over the dispatch.

>>> ... What do we benefit from the premature
>>> pessimisation of mapping this to a string?
>>
>> Well, method names happen to be strings ;)

> Using the table below, at the moment, there are exactly zero strings
> materialized and/or compared against during the execution of vtable/PMC
> methods.

I don't get that sentence.

> ... To the extent that these represent common operations, this
> can be significant from a performance perspective.

^ Ha? *SCNR*

>> I'm proposing that *internally* (implementation detail ...) one scheme
>> should be enough. Again: the opcodes don't change.

> I don't think that there can be a single dynamic mechanism that works
> for all languages, let alone one that works for Parrot internals too.

What's wrong with the find_method vtable?

> For starters, "findmeth" is a method. One that merits its own VTABLE
> entry (otherwise, how would you find it?).

Exactly. The "find_method" vtable is responsible for locating the method. As such it can't be overloaded directly. The "find_method" vtable is part of the meta-class system inside PMCs. The meta-class for Parrot standard objects is ParrotObject, which handles the "find_method" by calling code in objects.c. If that dispatch scheme doesn't fit at all for a language it can of course create its own meta-class with an appropriate "find_method" vtable. The important thing is that from the opcode level all continues to work as long as the opcodes are using "find_method".

> MMD might help out with multiple language interop, but Python as
> currently designed doesn't need any such facility.

That's not quite true. CPython has a rather bulky manual dispatch to find the correct operation for e.g. "long + int". The result is totally the same with Parrot's MMD dispatch - find a possibly internal function that is able to add a long integer and a normal integer. You can now of course duplicate the whole standard PMCs and roll your own code. But that's error prone and increases code size for no good reason and it will very likely not help for HLL interoperability.

> I asked my question poorly, and I think you answered it, but the
> important point is that subroutines can change the lexical variables of
> their callers - enough so that either a second level of indirection is
> required

The second level is the name hash. Nothing changes here. If the compiler knows the lexical it can currently emit code with an indexed access to the lexical data array. This corresponds to an indexed access inside the register store. If the lexical isn't known because it's somewhere in an outer pad (or it's even created dynamically) the hash lookup jumps in, as now.

> - Sam Ruby

leo
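A very loose Python analogue of "a language brings its own find_method", as discussed earlier in this message (the class DictBacked and its method table are invented; this only mirrors the idea of funnelling every lookup through one overridable hook):

    class DictBacked(object):
        def __init__(self, methods):
            self._methods = methods              # the per-class method table
        def __getattr__(self, name):             # plays the role of find_method here
            try:
                return self._methods[name].__get__(self)
            except KeyError:
                raise AttributeError(name)

    obj = DictBacked({"twice": lambda self: 2 * 21})
    print obj.twice()    # 42, found through the custom lookup hook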
Re: Premature pessimization
Leopold Toetsch writes:

> > On a semi-related note, can I get a classoffset without doing a hash
> > lookup? That is, can I store the class number I get assigned somewhere
> > for quick fetching?

Hey now, you're citing the Luke Palmer that writes code. Don't confuse him with the Luke Palmer who does software design. They really don't get along so well...

Luke
Re: Premature pessimization
Leopold Toetsch wrote:
Sam Ruby <[EMAIL PROTECTED]> wrote:
Leopold Toetsch wrote:

My philosophy is simple: things without test cases tend not to get fixed, and when fixed, tend not to stay fixed.

There is of course a test case. I have mentioned it at least 20 times ;) t/op/gc_13.imc - currently using lexicals.

    $ parrot t/op/gc_13.imc
    3 * 5 == 15!

What's broken?

I am worried about Parrot trying to establish common public names for common functions. Many of the examples you gave did not include the prerequisite double underscores. For example: "sub", "imag", "eof".

Ah ok. Sorry. "__subtract". OTOH some might be that common and the usage is the same in all languages that we might use that common name. But that's probably not worth the effort. So yes, we should establish the notion that all methods follow that convention.

Cool.

Re: VTABLES... I disagree with you on this one. Prematurely mapping an operation to a string is a premature pessimization.

Except that we are doing that already. Please read through classes/delegate.c. The problem is that this scheme doesn't work for MMD of real objects.

I don't know enough to have an informed opinion yet on MMD. But __set_integer_native is not a MMD operation. It already is efficiently dispatched to the correct method.

For PMCs yes. For objects not efficiently and only with a separate delegate meta-class.

The current delegation internals are not likely a good match for languages like Python or Ruby. I see such languages as implementing their own semantics for classes, and Parrot limiting its knowledge to methods like findmeth and invoke.

... What do we benefit from the premature pessimisation of mapping this to a string?

Well, method names happen to be strings ;)

Using the table below, at the moment, there are exactly zero strings materialized and/or compared against during the execution of vtable/PMC methods. To the extent that these represent common operations, this can be significant from a performance perspective.

Anyway, I'll try to summarize our current method dispatch scheme:

                 vtable            MMD               NCI method
    opcode       set P0, 4         add P0, P1, P2    io."__eof"()
    PMC          pmc->vtable->..   mmd_dispatch_*    callmethod
    object       delegate.c        1)                callmethod
    object(PMC)  deleg_pmc.c       1)                callmethod

NCI methods are working, the method lookup is dynamic, inheritance works.

MMD inheritance is totally static (set up during PMC compilation). PMCs dispatch with the mmd_dispatch_* functions in ops/*.ops.

1) overloading a MMD infix operation needs the mmdvtregister opcode, but this does not affect any class inheritance and it's just for the given two types. Multi-dimensional MD is not implemented.

For vtables, objects and objects derived from PMCs use two helper classes that bridge the dynamic inheritance of objects to the static inheritance in PMCs. This doesn't support runtime overloading of methods that defaulted to the PMC method.

Looking at 5 different schemes that work for ~50% of the cases can - well - make one pessimistic ;)

I'm proposing that *internally* (implementation detail ...) one scheme should be enough. Again: the opcodes don't change.

I don't think that there can be a single dynamic mechanism that works for all languages, let alone one that works for Parrot internals too. For starters, "findmeth" is a method. One that merits its own VTABLE entry (otherwise, how would you find it?). Others that map cleanly onto existing language syntax also can benefit from such optimizations.

MMD might help out with multiple language interop, but Python as currently designed doesn't need any such facility.

The vtable method set_integer_native clearly maps to a real method that can be either inherited or be provided by the user. This doesn't work for MMD functions.

Again, set_integer_native is not an MMD function?

__set_integer_native is a vtable method.

Cool.

No. I'm not ditching lexical pads at all. They are of course needed. The top pad is in the registers. Outer pads are looked up as usual.

I guess I don't understand. I guess it is time for a test case. ;-) How would the following work if the top pad is in registers?

The top pad is always the currently executing one.

    var = 1
    def g():
        global var
        var = 2
    print "before", var
    g()
    print "after", var

It's obviously up to the "global" to do the right thing. My impression (I don't have more Python knowledge ;) is, that "global" refers to the outermost lexical pad in the main module.

Such behavior is the default for perl5:

    my $var = 1;
    sub g {
        $var = 2;
    }

As there is no "my $var" in the inner pad, that's just a C opcode. It follows the C pointer to the outer pad and finds in the lexical hash the "$var" which maps to a position in the register store of the outer pad.
Re: Premature pessimization
Sam Ruby <[EMAIL PROTECTED]> wrote:

> Leopold Toetsch wrote:
> My philosophy is simple: things without test cases tend not to get
> fixed, and when fixed, tend not to stay fixed.

There is of course a test case. I have mentioned it at least 20 times ;) t/op/gc_13.imc - currently using lexicals.

> I am worried about Parrot trying to establish common public names for
> common functions. Many of the examples you gave did not include the
> prerequisite double underscores. For example: "sub", "imag", "eof".

Ah ok. Sorry. "__subtract". OTOH some might be that common and the usage is the same in all languages that we might use that common name. But that's probably not worth the effort. So yes, we should establish the notion that all methods follow that convention.

>>> Re: VTABLES... I disagree with you on this one. Prematurely mapping an
>>> operation to a string is a premature pessimization.
>>
>> Except that we are doing that already. Please read through
>> classes/delegate.c. The problem is that this scheme doesn't work for
>> MMD of real objects.

> I don't know enough to have an informed opinion yet on MMD. But
> __set_integer_native is not a MMD operation. It already is efficiently
> dispatched to the correct method.

For PMCs yes. For objects not efficiently and only with a separate delegate meta-class.

> ... What do we benefit from the premature
> pessimisation of mapping this to a string?

Well, method names happen to be strings ;)

Anyway, I'll try to summarize our current method dispatch scheme:

                 vtable            MMD               NCI method
    opcode       set P0, 4         add P0, P1, P2    io."__eof"()
    PMC          pmc->vtable->..   mmd_dispatch_*    callmethod
    object       delegate.c        1)                callmethod
    object(PMC)  deleg_pmc.c       1)                callmethod

NCI methods are working, the method lookup is dynamic, inheritance works.

MMD inheritance is totally static (set up during PMC compilation). PMCs dispatch with the mmd_dispatch_* functions in ops/*.ops.

1) overloading a MMD infix operation needs the mmdvtregister opcode, but this does not affect any class inheritance and it's just for the given two types. Multi-dimensional MD is not implemented.

For vtables, objects and objects derived from PMCs use two helper classes that bridge the dynamic inheritance of objects to the static inheritance in PMCs. This doesn't support runtime overloading of methods that defaulted to the PMC method.

Looking at 5 different schemes that work for ~50% of the cases can - well - make one pessimistic ;)

I'm proposing that *internally* (implementation detail ...) one scheme should be enough. Again: the opcodes don't change.

>> The vtable method set_integer_native clearly maps to a real method that
>> can be either inherited or be provided by the user. This doesn't work
>> for MMD functions.

> Again, set_integer_native is not an MMD function?

__set_integer_native is a vtable method.

>> No. I'm not ditching lexical pads at all. They are of course needed. The
>> top pad is in the registers. Outer pads are looked up as usual.

> I guess I don't understand. I guess it is time for a test case. ;-)
> How would the following work if the top pad is in registers?

The top pad is always the currently executing one.

>    var = 1
>    def g():
>        global var
>        var = 2
>    print "before", var
>    g()
>    print "after", var

It's obviously up to the "global" to do the right thing. My impression (I don't have more Python knowledge ;) is, that "global" refers to the outermost lexical pad in the main module.

> Such behavior is the default for perl5:
>
>    my $var = 1;
>    sub g {
>        $var = 2;
>    }

As there is no "my $var" in the inner pad, that's just a C opcode. It follows the C pointer to the outer pad and finds in the lexical hash the "$var" which maps to a position in the register store of the outer pad.

"my ($a, $b)" in the innermost pad (in sub g) would map directly to registers as "my $var" maps to a register in "main" (in the outer pad). It's the same as currently, except that we have now a separate array that holds the lexicals. I was just mapping that array into the preserved area of registers, which would make this area of course variable-sized with the nice benefit that "my int $i" also works (we don't have natural int lexicals currently).

> - Sam Ruby

leo
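A toy model of the pad chain described above (the class Pad and its methods are invented; it only mirrors the outward walk through the name hashes, not Parrot's actual data structures):

    class Pad(object):
        def __init__(self, outer=None):
            self.slots = {}           # the "lexical hash" of this pad
            self.outer = outer        # link to the enclosing pad
        def find_pad(self, name):
            pad = self
            while pad is not None:
                if name in pad.slots:
                    return pad
                pad = pad.outer
            return None
        def store(self, name, value):
            pad = self.find_pad(name) or self
            pad.slots[name] = value
        def fetch(self, name):
            pad = self.find_pad(name)
            if pad is None:
                raise NameError(name)
            return pad.slots[name]

    main = Pad()
    main.slots["$var"] = 1
    g = Pad(outer=main)          # sub g has no "my $var" of its own ...
    g.store("$var", 2)           # ... so the store walks out and hits main's pad
    print main.fetch("$var")     # 2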
Re: Premature pessimization
Leopold Toetsch wrote:
Sam Ruby <[EMAIL PROTECTED]> wrote:
Leopold Toetsch wrote:

"correct". I've discovered and analysed the problem with continuations. I've made a proposal to fix that. No one has said that it's technically wrong or couldn't work. It seems you are liking the idea, but Dan doesn't. Now what?

I would suggest focusing on one issue at a time and starting with writing a test case.

But, overall, it is clear that you are frustrated.

... a bit by the lack of progress that can be achieved to e.g. fix broken continuation behavior. But not much ;)

My philosophy is simple: things without test cases tend not to get fixed, and when fixed, tend not to stay fixed.

... It shows when you give responses like "Doesn't really matter" when people like me ask questions about your proposals. That, in turn, makes people like me frustrated.

Well, you worried about different classes having identical method names, but the methods do different things. I asked why this is a problem. I gave an example. Here is one more: obj."__set_integer_native"(10) does 2 totally different things for arrays (set array size) or for scalars (assign value 10).

I am worried about Parrot trying to establish common public names for common functions. Many of the examples you gave did not include the prerequisite double underscores. For example: "sub", "imag", "eof". These names are not likely to have a common behavior across all languages.

Re: VTABLES... I disagree with you on this one. Prematurely mapping an operation to a string is a premature pessimization.

Except that we are doing that already. Please read through classes/delegate.c. The problem is that this scheme doesn't work for MMD of real objects.

I don't know enough to have an informed opinion yet on MMD. But __set_integer_native is not a MMD operation. It already is efficiently dispatched to the correct method. What do we benefit from the premature pessimisation of mapping this to a string?

... Add to that the fact that such names can conflict with usages by languages of this runtime.

Parrot's method names (two leading underscores + vtable name) are reserved. And again, when two distinct classes use the same method name this is no conflict at all. Issues with classes in an inheritance chain are easily avoided by following that convention.

Again, my point is that the examples need to follow that convention. Without exception.

... Overall, the design of the opcodes and PMC layout should focus on trying to retain as much information as possible.

Nothing gets lost. Visible opcodes don't change. An example:

    .sub m
        $P0 = newclass "Foo"
        $I0 = find_type "Foo"
        .local pmc obj
        obj = new $I0
        obj = 5
    .end

running that snippet gives:

    Can't find method '__set_integer_native' for object 'Foo'.

The vtable method set_integer_native clearly maps to a real method that can be either inherited or be provided by the user. This doesn't work for MMD functions.

Again, set_integer_native is not an MMD function?

Re: continuations... frankly, I'm hoping that a solution emerges that doesn't involve significant reduction in functionality. I might be misunderstanding you, but it sounds to me like you are proposing ditching lexical pads.

No. I'm not ditching lexical pads at all. They are of course needed. The top pad is in the registers. Outer pads are looked up as usual.

I guess I don't understand. I guess it is time for a test case. ;-) How would the following work if the top pad is in registers?

    var = 1
    def g():
        global var
        var = 2
    print "before", var
    g()
    print "after", var

Such behavior is the default for perl5:

    my $var = 1;
    sub g {
        $var = 2;
    }
    print "before $var\n";
    g();
    print "after $var\n";

- Sam Ruby
Re: Premature pessimization
Sam Ruby <[EMAIL PROTECTED]> wrote:

> Leopold Toetsch wrote:
>>
>> "correct". I've discovered and analysed the problem with continuations.
>> I've made a proposal to fix that. No one has said that it's technically
>> wrong or couldn't work. It seems you are liking the idea, but Dan
>> doesn't. Now what?

> I would suggest focusing on one issue at a time and starting with
> writing a test case.

> But, overall, it is clear that you are frustrated.

... a bit by the lack of progress that can be achieved to e.g. fix broken continuation behavior. But not much ;)

> ... It shows when you
> give responses like "Doesn't really matter" when people like me ask
> questions about your proposals. That, in turn, makes people like me
> frustrated.

Well, you worried about different classes having identical method names, but the methods do different things. I asked why this is a problem. I gave an example. Here is one more: obj."__set_integer_native"(10) does 2 totally different things for arrays (set array size) or for scalars (assign value 10).

> Re: VTABLES... I disagree with you on this one. Prematurely mapping an
> operation to a string is a premature pessimization.

Except that we are doing that already. Please read through classes/delegate.c. The problem is that this scheme doesn't work for MMD of real objects.

> ... Add to that the
> fact that such names can conflict with usages by languages of this
> runtime.

Parrot's method names (two leading underscores + vtable name) are reserved. And again, when two distinct classes use the same method name this is no conflict at all. Issues with classes in an inheritance chain are easily avoided by following that convention.

> ... Overall, the design of the opcodes and PMC layout should focus
> on trying to retain as much information as possible.

Nothing gets lost. Visible opcodes don't change. An example:

    .sub m
        $P0 = newclass "Foo"
        $I0 = find_type "Foo"
        .local pmc obj
        obj = new $I0
        obj = 5
    .end

running that snippet gives:

    Can't find method '__set_integer_native' for object 'Foo'.

The vtable method set_integer_native clearly maps to a real method that can be either inherited or be provided by the user. This doesn't work for MMD functions.

> Re: continuations... frankly, I'm hoping that a solution emerges that
> doesn't involve significant reduction in functionality. I might be
> misunderstanding you, but it sounds to me like you are proposing
> ditching lexical pads.

No. I'm not ditching lexical pads at all. They are of course needed. The top pad is in the registers. Outer pads are looked up as usual.

> - Sam Ruby

leo
Re: Premature pessimization
Ashley Winters wrote:

On Sun, 5 Dec 2004 11:46:24 -0700, Luke Palmer <[EMAIL PROTECTED]> wrote:

Leopold Toetsch writes:

This term came up in a recent discussion[1]. But I'd like to give this term a second meaning.

Except what you're talking about here is premature *optimization*. You're expecting certain forms of the opcodes to be slow (that's the pessimization part), but then you're actually using opcodes that you think can be made faster. This is a classic Knuth example.

So the sequence of using classoffset/getattribute is a _feature_ necessary for completeness, rather than a "pessimization" to compensate for the belief that not explicitly caching would be hopelessly slow? Ahh. I'm learning much. :)

On a semi-related note, can I get a classoffset without doing a hash lookup? That is, can I store the class number I get assigned somewhere for quick fetching?

*SCNR*

- leo
Re: Premature pessimization
Leopold Toetsch wrote: "correct". I've discovered and analysed the problem with continuations. I've made a proposal to fix that. No one has said that it's technically wrong or couldn't work. It seems you are liking the idea, but Dan doesn't. Now what? I would suggest focusing on one issue at a time and starting with writing a test case. But, overall, it is clear that you are frustrated. It show when you give responses like "Doesn't really matter" when people like me ask questions about your proposals. That, in turn, makes people like me frustrated. Re: classoffset... I don't currently use this opcode, nor do I plan to. This opcode has a builtin assumption that attributes are stored as attributes. They may in fact be methods or properties in the Parrot sense, or Property's in the Python sense. Re: VTABLES... I disagree with you on this one. Prematurely mapping an operation to a string is a premature pessimization. Add to that the fact that such names can conflict with usages by languages of this runtime. Overall, the design of the opcodes and PMC layout should focus on trying to retain as much information as possible. Re: continuations... frankly, I'm hoping that a solution emerges that doesn't involve significant reduction in functionallity. I might be misunderstanding you, but it sounds to me like you are proposing ditching lexical pads. - Sam Ruby
Re: Premature pessimization
Luke Palmer <[EMAIL PROTECTED]> wrote:

> Leopold Toetsch writes:
>> This term came up in a recent discussion[1]. But I'd like to give this
>> term a second meaning.

> Except what you're talking about here is premature *optimization*.

No. I don't think so.

>> During design considerations we (including me of course too) tend to
>> discard or pessimize ideas early, because they look inefficient

And I've summarized a bunch of examples. Maybe it wasn't too clear, but the conclusion is of course: spilling, lexical refetching, using method calls, ... (expensive looking operations) should not be discarded early. They can be optimized *later*.

> ... Inline caching is
> just another optimization. We need to be feature complete.

That's exactly what I've written. I've described this idea as a possible later optimization.

> And if you really feel like you need to optimize something, look into
> the lexicals-as-registers idea you had to fix continuation semantics.
> Then you can work under the guise of fixing something broken.

Yeah. Have a look at chromatic's 4 c-words:

    compiles -> correct -> complete -> competitive

"compiles" isn't really given, when there are compilers that either can't compile an -O3 build of the switched core or take ages to complete (if memory usage even permits that), choking the CG* cores.

"correct". I've discovered and analysed the problem with continuations. I've made a proposal to fix that. No one has said that it's technically wrong or couldn't work. It seems you are liking the idea, but Dan doesn't. Now what?

"complete" - read e.g. my recent postings WRT MMD (and perl.cvs.parrot:)

"competitive" - well, I like a speedy Parrot, that's obvious. It's a nice to have. But not only. Folks at conferences get interested in Parrot because it can prove that programs will run faster.

> Luke

leo
Re: Premature pessimization
On Sun, 5 Dec 2004 11:46:24 -0700, Luke Palmer <[EMAIL PROTECTED]> wrote:

> Leopold Toetsch writes:
> > This term came up in a recent discussion[1]. But I'd like to give this
> > term a second meaning.
>
> Except what you're talking about here is premature *optimization*.
> You're expecting certain forms of the opcodes to be slow (that's the
> pessimization part), but then you're actually using opcodes that you
> think can be made faster. This is a classic Knuth example.

So the sequence of using classoffset/getattribute is a _feature_ necessary for completeness, rather than a "pessimization" to compensate for the belief that not explicitly caching would be hopelessly slow?

Ahh. I'm learning much. :)

Ashley Winters
Re: Premature pessimization
On Sun, 5 Dec 2004 11:46:24 -0700, Luke Palmer <[EMAIL PROTECTED]> wrote:

> Leopold Toetsch writes:
> > This term came up in a recent discussion[1]. But I'd like to give this
> > term a second meaning.
>
> Except what you're talking about here is premature *optimization*.

Yes, indeed.

Cheers,
Michael
Re: Premature pessimization
Leopold Toetsch writes:

> This term came up in a recent discussion[1]. But I'd like to give this
> term a second meaning.

Except what you're talking about here is premature *optimization*. You're expecting certain forms of the opcodes to be slow (that's the pessimization part), but then you're actually using opcodes that you think can be made faster. This is a classic Knuth example.

If I were Dan, here's what I'd say (though perhaps I'd say it in somewhat a different tone :-): don't worry about it. Inline caching is just another optimization. We need to be feature complete. Leo, speed is never off your mind, so we can be fairly certain that we won't be making any decisions that are going to bite us speed-wise. Plus, at this point, we can change interfaces (when we go past feature complete into real-life benchmarked optimization). Yeah, the compilers will have to change, but really that's not a big issue. I've been writing compilers for parrot for a while, and stuff is always changing, and it usually means one or two lines of code for me if I designed well.

And if you really feel like you need to optimize something, look into the lexicals-as-registers idea you had to fix continuation semantics. Then you can work under the guise of fixing something broken.

Luke

> During design considerations we (including me of course too) tend to
> discard or pessimize ideas early, because they look inefficient. Recent
> examples are e.g.
> - spilling
> - lexical (re-)fetching
> - method calls
>
> We have even some opcodes that work around the estimated bad effect of
> the operation. E.g. for attribute access:
>
>     classoffset Ix, Pobj, "attr"
>     getattribute Pattr, Pobj, Ix
>
> instead of a plain and clear, single opcode:
>
>     getattribute Pattr, Pobj, "attr"
>
> We are thinking that the former is more efficient because we can cache
> the expensive hash lookup in the index C and do just integer math to
> access other attributes. (This is BTW problematic anyway, as the side
> effect of a getattribute could be a change to the class structure).
>
> Anyway, a second example is the indexed access of lexicals additionally
> to the named access. Again a premature pessimization with opcode support.
>
> Another one is the "newsub" opcode, invented to be able to create the
> subroutine object (and possibly the return continuation) outside of a
> loop, where the function is called.
>
> All of these pessimized operations have one common part: one or several
> costly lookups with a constant key.
>
> One possible solution is inline caching.
>
> A cache is of course kind of an optimization. OTOH when we know that we
> don't have to pay any runtime penalty for certain operations, possible
> solutions are again thinkable that would otherwise just be discarded as
> they seem to be expensive.
>
> I'll comment on inline caching in a second mail.
>
> leo
>
> [1] Michael Walter in Re: [CVS ci] opcode cleanup 1 . minus 177 opcodes
> I'm fully behind Dan's answer - first fix broken stuff, then optimize.
> OTOH when compilers already are choking on the source it seems that this
> kind of pessimization isn't really premature ;)
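A minimal sketch of the inline-caching idea mentioned above (CallSiteCache is an invented name; a real implementation would also have to invalidate the cache when the class is modified): a call site remembers the last class it saw and reuses the looked-up method for as long as that same class keeps showing up.

    class CallSiteCache(object):
        def __init__(self):
            self.klass = None
            self.method = None
        def lookup(self, obj, name):
            k = type(obj)
            if k is not self.klass:              # miss: pay for the full lookup once
                self.klass = k
                self.method = getattr(k, name)
            return self.method.__get__(obj)      # hit: just bind the cached method

    site = CallSiteCache()
    print site.lookup("parrot", "upper")()   # PARROT (full lookup)
    print site.lookup("polly", "upper")()    # POLLY  (cache hit, same class)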