Re: [RFC] imcc calling conventions
On Thu, Feb 20, 2003 at 09:55:35AM -0800, Steve Fink wrote:
> I think this has been discussed before, but are we all okay with this callee-save-everything policy? At the very least, I'd be tempted to add a bitmasked saveall/restoreall pair to reduce the amount of cache thrashing. (saveall 0b00100110) It just seems odd that you have to either save all 32 of one of the types of registers, or to save selected ones onto a different stack. But it *is* simpler to copy over the entire register file to a stack frame, I guess.

I agree that the bitmasked savesome/restoresome would be less simple. I suspect that a bitmasked version would JIT very nicely. Since when did parrot trade simplicity for speed?

Nicholas Clark
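A minimal sketch of the bitmasked save/restore idea discussed above, written in Python rather than as Parrot ops. The names `savesome` and `restoresome` come from the message; their signatures, the list-based register file and the frame layout are assumptions made purely for illustration.

```python
NUM_REGS = 32  # the thread speaks of 32 registers per register type


def savesome(regs, mask, stack):
    """Push only the registers whose bit is set in mask (hypothetical op)."""
    saved = [regs[i] for i in range(NUM_REGS) if mask & (1 << i)]
    # The mask travels with the frame so the restore knows what to write back.
    stack.append((mask, saved))


def restoresome(regs, stack):
    """Pop the latest frame and write the saved values back (hypothetical op)."""
    mask, saved = stack.pop()
    values = iter(saved)
    for i in range(NUM_REGS):
        if mask & (1 << i):
            regs[i] = next(values)


regs = list(range(100, 100 + NUM_REGS))
stack = []
savesome(regs, 0b00100110, stack)                     # saves registers 1, 2 and 5
regs[1], regs[2], regs[3], regs[5] = -1, -2, -3, -5   # the callee clobbers some
restoresome(regs, stack)
```

After the restore, registers 1, 2 and 5 hold their old values again while register 3 keeps the clobbered value. Note that both functions test every bit of the mask in a loop, which a plain copy of the whole register file does not need to do.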
Re: Using imcc as JIT optimizer
If memory serves me right, Dan Sugalski wrote:
> This sounds pretty interesting, and I bet it could make things faster. The one thing to be careful of is that it's easy to get yourself into a position where you spend more time optimizing the code you're JITting than you win in the end.

I think that's not the case for ahead-of-time optimisations. As long as the JIT is not the optimiser, you could take your time optimising. The topic is really misleading ... or am I the one who's wrong?

> You also have to be very careful that you don't reorder things, since there's not enough info in the bytecode stream to know what can and can't be moved. (Which is something we need to deal with in IMCC as well)

I'm assuming that the temporaries are the things being moved around here? Since imcc already moves them around anyway and the programmer makes no assumptions about their positions, this shouldn't be a problem?

The only question I have here: how does imcc identify loops? I've been using if goto to loop around, which is exactly the way assembly does it. But that sounds like a lot of work identifying the loops and optimising accordingly.

To make it more clear -- identifying tight loops and the usage weights correctly. 10 uses of $I0 outside the loop vs 1 use of $I1 inside a 100 times loop. Which will come first?

Gopal
--
The difference between insanity and genius is measured by success
Re: Objects, methods, attributes, properties, and other related frobnitzes
On Fri, Feb 21, 2003 at 04:34:42PM -0500, Dan Sugalski wrote:
> If A isa B, we certainly wouldn't want to call A's AUTOLOAD on a method before we looked to see if B had a concrete instance of that method.

Right. The best you could probably do is note where you found the first AUTOLOAD so that when you do reach the end of the ISA search you don't need to do the whole search again.

But is this programming for the common case, or is it premature optimization?

Graham.
non-inline text in parrot assembly?
Parrot assembly supports inline strings, but are there any plans to have it support a distinct .string (or similar) asm section? The main benefit would be easier compatibility/portability with existing assembly code generators. Is anybody aware of an existing assembly format that doesn't support a separate string section? -Tupshin
Re: non-inline text in parrot assembly?
Tupshin Harper wrote:
> Parrot assembly supports inline strings, but are there any plans to have it support a distinct .string (or similar) asm section? The main benefit would be easier compatibility/portability with existing assembly code generators. Is anybody aware of an existing assembly format that doesn't support a separate string section?

You can use the .constant (PASM) or .const (IMCC) syntax, to keep strings visually together.

> -Tupshin

leo
Re: [RFC] imcc calling conventions
Steve Fink wrote:
> On Feb-18, Leopold Toetsch wrote:
>> [ ... ]
>>     .return mi# [1]
>>     restoreall
>>     ret
>> .end
>
> My immediate reaction to this (okay, I really saw this before in perl6-generated code) is why don't the values from .return and restoreall get mixed up?

Yep. It is at least confusing at the beginning.

> You may want to add a brief description of the kazillion different stacks that Parrot uses. There are six, I think:

7 if you count the intstack too.

> I think this has been discussed before, but are we all okay with this callee-save-everything policy?

I'm not that convinced of it - I still think the subroutine knows best how to preserve its registers.

> ... At the very least, I'd be tempted to add a bitmasked saveall/restoreall pair to reduce the amount of cache thrashing. (saveall 0b00100110) It just seems odd that you have to either save all 32 of one of the types of registers, or to save selected ones onto a different stack. But it *is* simpler to copy over the entire register file to a stack frame, I guess.

Probably yes.

> Taking that farther, I've always liked systems that divide up the registers into callee-save, caller-save, and scratch (nobody-save?)

Register windows, like (AFAIK) IA64 has? This would avoid a lot of saveall/restoreall/saveX/restoreX for smaller subroutines.

> Maybe that's just me. And I vaguely recall that there was some discussion I didn't follow about how that interferes with tail-call optimization. (To me, tail call optimization == replace recursive call with a goto to the end of the function preamble)

Dan did mention it, when arguing for callee-save calling conventions. But I don't know exactly how it works.

> Or, as another stab at the same problem, does Parrot really need 32*4 registers? I keep thinking we might be better off with 16 of each type. But maybe I'm just grumbling.

I think 16 might not be enough. So the current setting is ok.

leo
Re: Configure.pl --cgoto=0 doesn't work
Nicholas Clark wrote:
> On Fri, Feb 21, 2003 at 08:34:05AM +0100, Leopold Toetsch wrote:
>> Case 2) should disable only core_ops_cg.c but not core_ops_cgp.c
>
> But surely we'd also like a flag to disable core_ops_cgp.c but leave core_ops_cg.c?

IMHO no. Why disable the faster and much smaller core, and keep the big and slow core?

> How many cores are there now? Is there a way to make a modular flag system that lets people configure any arbitrary combination that they wish to build?

In the long run, we should build the normal function based core (used e.g. for trace) and the fastest core that is available. Though I can imagine that we additionally want to have the most memory efficient core too - which would not be a prederefed one.

Here is a summary in terms of speed:

    JIT
    CGP (obsoletes CGoto Prederef)
    Switched Prederef (obsoletes Prederef)
    Switched
    Normal

When PBC code size matters we could have:

    CGoto
    Switched
    Normal

So, the plain prederefed core is always obsolete now.

> And an easy way for the tinderbox machines to build all applicable, and run tests for each built core in turn?

    $ make quickfulltest

> Nicholas Clark

leo
Re: Using imcc as JIT optimizer
Gopal V wrote:
> I'm assuming that the temporaries are the things being moved around here?

It is not so much a matter of moving things around, but a matter of allocating (and renumbering) parrot (or for JIT) processor registers. These are of course mainly temporaries, but even when you have some find_lexical/do_something/store_lexical, imcc selects the best register for all involved ops; whether they are temps or variables doesn't really matter.

> The only question I have here: how does imcc identify loops? I've been using if goto to loop around, which is exactly the way assembly does it. But that sounds like a lot of work identifying the loops and optimising accordingly.

Here are the basic blocks, the CFG and the loop info of:

    0     set I0, 10
    1 x:
    1     unless I0, y
    2     dec I0
    2     print I0
    2     print "\n"
    2     branch x
    3 y:
    3     end

    Dumping the CFG:
    ---
    0 (0)    -> 1      <-
    1 (1)    -> 2 3    <- 2 0
    2 (1)    -> 1      <- 1
    3 (0)    ->        <- 1

    Loop info
    -
    loop 0, depth 1, size 2, entry 0, contains blocks:
    1 2

> To make it more clear -- identifying tight loops and the usage weights correctly. 10 uses of $I0 outside the loop vs 1 use of $I1 inside a 100 times loop. Which will come first?

This is basically the current score calculation used for register allocation:

    r->score = r->use_count + (r->lhs_use_count << 2);
    r->score += 1 << (loop_depth * 3);

> Gopal

leo
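One way to get from a CFG like the one dumped above to a list of loops is the textbook dominator / back-edge method, sketched below in Python. This is not claimed to be imcc's algorithm: the function names and the dictionary representation are assumptions, and only the four blocks and their edges are taken from the message.

```python
def dominators(succ, entry):
    """Return (dom, pred): dom[b] is the set of blocks on every path entry -> b."""
    blocks = set(succ)
    pred = {b: set() for b in blocks}
    for b, targets in succ.items():
        for t in targets:
            pred[t].add(b)
    dom = {b: set(blocks) for b in blocks}
    dom[entry] = {entry}
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for b in blocks - {entry}:
            preds = [dom[p] for p in pred[b]]
            new = {b} | (set.intersection(*preds) if preds else set())
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom, pred


def natural_loops(succ, entry):
    """Return (header, sorted body) for each back edge tail -> header."""
    dom, pred = dominators(succ, entry)
    loops = []
    for tail, targets in succ.items():
        for head in targets:
            if head in dom[tail]:        # the edge target dominates its source
                body, work = {head}, [tail]
                while work:              # walk predecessors back to the header
                    b = work.pop()
                    if b not in body:
                        body.add(b)
                        work.extend(pred[b])
                loops.append((head, sorted(body)))
    return loops


# Edges as in the dump: 0 -> 1, 1 -> 2 3, 2 -> 1, 3 -> (none)
cfg = {0: [1], 1: [2, 3], 2: [1], 3: []}
loops = natural_loops(cfg, 0)
```

For this graph the only back edge is 2 -> 1, so the sketch finds one loop containing blocks 1 and 2, which agrees with the "contains blocks: 1 2" line of the dump. The sketch reports the loop header (block 1); what imcc means by "entry 0" is not explained in the message.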
Re: IMCC's bsr handling
Steve Fink wrote:
> [Apologies if you receive this twice,

No problem, the duplicate filter in procmail takes care of it.

> On Sat, Feb 08, 2003 at 12:19:35PM +0100, Leopold Toetsch wrote:
>> When we want these kind of branches, then they must be more high level, defining all possible branch targets, e.g. like a switch statement.
>
> I think all that means that I *can* specify a set of labels that the instruction might jump to, and guarantee that if it jumps to anywhere else that it won't affect any registers.

I think that should be enough to allocate registers in a safe way, though it might not be the most efficient way to do it. It really depends on the complexity of such code pieces. When regex code is interspersed with normal code, it will certainly not be possible to reuse registers when the control flow is not known.

> ... For now, I'm prototyping using a heavyweight mechanism. If that gets to be too unwieldy, maybe I'll take a look at implementing something like
>
>     bsr $I0 = _label1 | _label2 | REGISTER_PRESERVING_LOCATION
>
> (Ignore the syntax!). It's only needed for imcc, right? (I wouldn't need to propagate it through to the JIT or anything, would I?)

It's for the register allocator. But it really depends on calling conventions. When the bsr's do a saveall/restoreall and arguments are passed on the stack, then it's no problem: the bsr is a noop in terms of the CFG. When the bsr's have e.g. pdd03 calling conventions, then each possible control flow has to be tracked and the allocated registers must match.

> How is invoke handled? Is it assumed to always use the full PDD06 calling conventions?

s/06/03/ - No. When code is only called internally, it can use any calling convention that fits best.

leo
L-valueness of Arrays vs. Lists
On Tue, 11 Feb 2003, Michael Lazzaro wrote:
> What is the utility of the perl5 behavior:
>
>     \($a,$b,$c)
>
> meaning
>
>     (\$a, \$b, \$c)
>
> Do people really do that? ... Can someone give an example of an actual, proper, use?

Yes, I've used it like this:

    for (\($a,$b,$c)) {
        $$_++;
    }

to be sure that it works on all versions, since

    for ($a,$b,$c) {
        $_++;
    }

works differently on different versions. (Actually, I don't have an old-enough version on hand to check just when that was, so it must have been 5.004 or before.)

This change didn't start to bite me until P5.6.0, when values %HASH became an Lvalue too, whereupon

    for ( values %HASH ) {
        s/^prefix//;
        ...
    }
    ... do something else with %HASH

stopped working.

So, I would urge making as many things as possible Lvalues (and magical references) right from the start of P6, just so as we don't break things by making them so later.

-Martin
--
Help Microsoft stamp out software piracy: give Linux to a friend today...
Re: Arrays, lists, referencing
I would like to chip in in favour of the "list is value, array is container" side of the argument. However I think that needs clarifying.

A reference is a value; the thing it refers to is a container. An anonymous container is a container with no references from any symbol table. It can lose its anonymity by creating such a reference.

A list is an ordered set of references to (normally anonymous) containers. An array is a container that contains a list. When created, an array contains the empty list.

The operations push, pop, shift, unshift, extend, truncate and element auto-vivify replace the value in the array with another value similar to the old one. Assignment replaces the value in the array with an entirely new value. Operations on individual elements of an array do not affect the value of the array, just the values in the containers that the array's list members refer to.

Possible definition: Except for obvious arrays and hashes (those involving @ or % in the expression), anything evaluated inside a list in R-value context is itself in reference context. Named arrays and hashes are in flatten-to-reference-to-member context. Anything evaluated inside a list in Lvalue context is itself in reference context. Assignment to a list dereferences successive elements of each side. Passing a list as parameters dereferences each element unless the prototype says otherwise.

Almost all of these containers are either elidable at compile time, or will be needed soon anyway -- eg, as elements in the formal parameter list; so there's no practical cost to this definition.

On a related topic...

I like to be able to program in a pure functional mode as much as possible, mainly because it's the easiest to prove correctness, but also because it offers the greatest scope for compile-time optimisation. What I would like is for the language to enable as many compile-time optimisations as possible, by making the optimisation-friendly choices the shorter easy-to-type defaults.
One of those, rather non-obviously, is choosing pass-by-value rather than pass-by-reference. And rather deeper pass-by-value than the sort of list I've talked about above: a list would be a set of actual values, not a set of references to containers containing values. And we could extend this to things other than arrays/lists.

It's important to understand that I'm talking about the semantics of the language, not the implementation. The point is that the implementation is still free to pass by reference when it can be sure that the receiving function won't fiddle with it. That can be guaranteed if you can see (or infer) all the way down to the leaf function calls at compile time. (This gets complicated at trust boundaries, but we can work on that.)

One of the things I found most irksome moving from C++ to Java was that Java took away both "pass object by value" AND "pass object by const reference". Couple that with some rather bad choices of value vs container in the standard class library, and the result was that one had no way to be sure that an object wouldn't get modified once you handed it over as a parameter or referent to some random method.

Since then languages such as ECMAscript have copied that behaviour, and it seems that P6 is looking more and more like a clone of that language ... and that worries me.

I would like to argue in favour of pass by value being the default in the absence of some explicit prototype, because it allows greater type-safety, and because the opposite default interacts badly with out-of-order execution such as parallelism, and bars some optimisations that can be applied to closures. (We do want Perl to run fast on parallel hardware, don't we?)

The relationship to the array/list thing is this: that it's not just pass-by-value to functions and methods, it's about implicit R-valueness in any context that doesn't absolutely require L-valueness.
All this is orthogonal to the concept of object: in C++ an object can be used to implement either a value (such as string) or a container (such as vector); it would be nice to be able to do this in P6 too. -Martin PS: sorry for the long post...
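The hazard described in the post above, where a caller cannot be sure an object survives a call unmodified, can be shown in a few lines of Python, which passes object references much as Java does. This is an illustrative sketch only: `normalise_in_place` and `call_by_value` are hypothetical names, and the deep copy stands in for the pass-by-value semantics being argued for, not for any Perl 6 design.

```python
import copy


def normalise_in_place(words):
    """A callee that 'fiddles' with the container it was handed."""
    for i, word in enumerate(words):
        words[i] = word.lower()
    return words


def call_by_value(func, arg):
    """Emulate pass-by-value semantics by handing the callee a deep copy."""
    return func(copy.deepcopy(arg))


original = ["Foo", "Bar"]
result = call_by_value(normalise_in_place, original)  # caller's list is untouched

shared = ["Foo", "Bar"]
normalise_in_place(shared)                            # caller's list is rewritten
```

With the copying wrapper the caller's list is unchanged after the call; without it the callee's edits are visible to the caller. An implementation that can prove the callee never writes to its argument could skip the copy, which is the freedom the post asks to leave to the compiler.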
Re: non-inline text in parrot assembly?
Leopold Toetsch wrote:
> Tupshin Harper wrote:
>> Thanks. Apparently I'm being daft. I don't see any mention of pasm sections (constant or otherwise) in the pod docs, nor do any of the examples appear to use a constants section. What am I missing?
>
> Sorry nothing. There are only IIRC 3 tests in parrot and 3 in imcc using these features.
>
>     $ perldoc assemble.pl

Actually you're wrong ;-) I was missing something, and that of course was perldoc assemble.pl. ;-) Thanks for the pointer, that contains a *lot* of information that doesn't appear to be anywhere else (.constant, for example, is never mentioned in docs/*.pod).

> But they are not very well covered in the main docs.

I would vote to move virtually all of this information out of assemble.pl and into docs/parrot_assembly.pod (or something similar), and have the perldoc for assemble.pl just be an overview + usage information.

Thanks.

-Tupshin
Re: Objects, methods, attributes, properties, and other related frobnitzes
At 9:33 AM + 2/22/03, Graham Barr wrote:
> On Fri, Feb 21, 2003 at 04:34:42PM -0500, Dan Sugalski wrote:
>> If A isa B, we certainly wouldn't want to call A's AUTOLOAD on a method before we looked to see if B had a concrete instance of that method.
>
> Right. The best you could probably do is note where you found the first AUTOLOAD so that when you do reach the end of the ISA search you don't need to do the whole search again.
>
> But is this programming for the common case, or is it premature optimization?

I'm thinking premature optimization, and if not that at least something that can be put off until later.

--
                                        Dan

--it's like this---
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk
Re: invoke
At 10:20 AM -0800 2/20/03, Steve Fink wrote:
> The invoke op is bothering me -- namely, it disturbs me that it implicitly operates on P0. I know that P0 is the correct register to use according to pdd03, but I dislike having it be implicit. The user is required to set the rest of the pdd03 conventions up manually, so I don't see any need for invoke to be different.

It isn't, though. You have to set up P0 just like all the other registers you're using. It's not like invoke makes you specify them either. (Though I could certainly see a case for a version that takes counts as parameters and sets up the I registers appropriately, for speed reasons)

> And it makes it much more clear what registers are being used if you have to pass in a PMC as an argument.
>
> So would anyone mind if I eliminated the zero-arg invoke op in favor of a one-arg invoke that takes a single PMC? (I may also have situations where I don't need to follow pdd03, and it would be more convenient to use a different register.)

Leave the zero-arg version in there, since the common case will be invoking routines that are conforming to the calling conventions, and thus have all the registers set up per PDD03. I fully expect anything with even a minimal amount of self-introspection will be rummaging around in that sub object, so having it in a fixed location will be the right thing.

I'm OK with a one-arg version, as long as it's made explicit in the docs that code that uses it makes no guarantees about the state of any of the registers.

--
                                        Dan

--it's like this---
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk
Re: [RFC] imcc calling conventions
At 9:55 AM -0800 2/20/03, Steve Fink wrote:
> I think this has been discussed before, but are we all okay with this callee-save-everything policy?

Nope. It's safe to say that a lot of folks aren't. :) Still, I think it's the right way to go in the general case. Most sub calls of any complexity will be saving off all the I and P registers (at least, and probably the S registers too), so I don't see it saving anything except in the trivial leaf sub case. Which, I realize, is common for some styles of programming, but those aren't the styles that people generally use with perl/python/ruby.

> At the very least, I'd be tempted to add a bitmasked saveall/restoreall pair to reduce the amount of cache thrashing. (saveall 0b00100110) It just seems odd that you have to either save all 32 of one of the types of registers, or to save selected ones onto a different stack. But it *is* simpler to copy over the entire register file to a stack frame, I guess.

Faster in a number of ways. I considered the bitmask, but then you have the issue of lots of bit tests, which aren't cheap in software. (Hardware yes, but we don't have that on our side) Besides that, doing a simple bit-blast is hardware accelerated on many systems, and cache friendly on others, which makes it a better option overall. At one point I did a test and found that whole-register-frame saves were faster than saving three individual registers in a frame, though we do have a relatively heavy-weight general purpose stack.

> Taking that farther, I've always liked systems that divide up the registers into callee-save, caller-save, and scratch (nobody-save?) Maybe that's just me. And I vaguely recall that there was some discussion I didn't follow about how that interferes with tail-call optimization. (To me, tail call optimization == replace recursive call with a goto to the end of the function preamble)

I can see doing this. If there was some sort of metainformation that would allow us to know at compile time that registers were safe we could emit different code, though there's still the issue of nested calls where there's limited info.

> Or, as another stab at the same problem, does Parrot really need 32*4 registers? I keep thinking we might be better off with 16 of each type. But maybe I'm just grumbling.

Yeah, 32 is a bunch. I've considered going with 16 on and off, and still might.

--
                                        Dan

--it's like this---
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk
Re: Objects, methods, attributes, properties, and other related frobnitzes
At 11:46 PM -0500 2/21/03, Benjamin Goldberg wrote:
> My bit of example code was merely to demonstrate that UNIVERSAL::can() from perl5 clearly has the problem that Andy Wardley worries about wrt freezing to a particular definition...
>
> Thus, it *may* be a good idea to *not* provide a user-code-level means of obtaining method handles, o

No. Python allows fetching a handle to the current method definition, and it seems a reasonable thing in some circumstances, so it needs to be supported. They may be the wrong answer in many circumstances, but that doesn't mean they're the wrong answer in all circumstances.

--
                                        Dan

--it's like this---
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk
Re: Using imcc as JIT optimizer
At 4:28 PM +0100 2/22/03, Leopold Toetsch wrote:
> Gopal V wrote:
>> Direct hardware maps (like using CX for loop count etc) will need to be platform dependent? Or you could have a fixed reg that can be used for loop count (and gets mapped on hardware appropriately).
>
> We currently don't have special registers, like %ecx for loops; they are not used in JIT either. My Pentium manual states that these ops are not the fastest. But in the long run, we should have some hints, that e.g. i386 needs %ecx as shift count, or that div uses %edx. But probably i386 is the only weird architecture with such ugly restrictions - and with far too few registers.

I'm OK with adding in documentation that encourages using particular registers for particular purposes, or having some sort of metadata for the JIT that notes loop registers or something. As long as it's out of band and optional, that's cool.

--
                                        Dan

--it's like this---
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk
Re: Using imcc as JIT optimizer
Nicholas Clark wrote in perl.perl6.internals :
>>     r->score = r->use_count + (r->lhs_use_count << 2);
>>     r->score += 1 << (loop_depth * 3);
> [...]
> I wonder how hard it would be to make a --fsummon-nasal-demons flag for gcc that added trap code for all classes of undefined behaviour, and caused code to abort (or something more colourfully undefined) if anything undefined gets executed. I realise that code would run very slowly, but it would be a very very useful debugging tool.

What undefined behaviour are you referring to exactly? The shift overrun? AFAIK it's very predictable (given one int size).

Cases of potential undefined behavior can usually be detected at compile-time. I imagine that shift overrun detection can be enabled via an ugly macro and a cpp symbol.

(what's a nasal demon? can't find the nasald(8) manpage)
Re: Using imcc as JIT optimizer
Please don't take the following as a criticism of imcc - I'm sure I manage to write code with things like this all the time.

On Sat, Feb 22, 2003 at 08:13:59PM +0530, Gopal V wrote:
> If memory serves me right, Leopold Toetsch wrote:
>>     r->score = r->use_count + (r->lhs_use_count << 2);
>>     r->score += 1 << (loop_depth * 3);
>
> Ok ... deeper the loop the more important the var is .. cool.

until variables in 11 deep loops go undefined? (it appears to be a signed int)

I'm not sure how to patch this specific instance - just trap loop depths over 10? Should score be unsigned?

More importantly, how do we trap these sort of things in the general case? I wonder how hard it would be to make a --fsummon-nasal-demons flag for gcc that added trap code for all classes of undefined behaviour, and caused code to abort (or something more colourfully undefined) if anything undefined gets executed. I realise that code would run very slowly, but it would be a very very useful debugging tool.

Nicholas Clark
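The "11 deep loops" figure above can be checked with a little arithmetic on the loop bonus `1 << (loop_depth * 3)`. The Python sketch below uses arbitrary-precision integers to compute the exact value and compares it with the largest value a signed int can hold; the 32-bit width is an assumption, and the function names are made up for the example.

```python
INT_BITS = 32                          # assumption: score is a 32-bit signed int
INT_MAX = (1 << (INT_BITS - 1)) - 1    # 2147483647


def loop_bonus(loop_depth):
    """Exact value of 1 << (loop_depth * 3), with no C-style width limit."""
    return 1 << (loop_depth * 3)


def first_unrepresentable_depth():
    """Smallest loop depth whose bonus no longer fits in the signed int."""
    depth = 0
    while loop_bonus(depth) <= INT_MAX:
        depth += 1
    return depth


limit = first_unrepresentable_depth()
```

At depth 10 the bonus is 2^30, which still fits. At depth 11 the shift count is 33, which is at least the width of a 32-bit int, and in C a shift by a count that large is undefined behaviour. That is why trapping loop depths over 10 is the threshold mentioned in the message.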
Re: stabs support
Steve Fink wrote:
> First -- wow, thanks! I tried out the stabs stuff for the JIT yesterday, and it's really helpful to be able to step through PASM code from within emacs's gud mode.

Thank you for liking it :)

> I had one problem, though -- whenever stepping over a keyed op (eg set I0, P0[3]), gdb fails to recognize that it reaches any more lines and instead runs the whole thing to completion and exits.

I could never figure out when gdb just continues.

> ... But I remember noticing that gdb prints out the current PASM line number after that second 'si' on the keyed op.

Looks then like a gdb error to me.

> In my local copy (currently locked away on my home hard drive, so I can't post it from here at work), I also added stabs entries for all the PMC registers (in addition to the current S, I, and N registers.) You can see the PMC's data fields and type. It looks something like:
>
>     (gdb) p P0
>     (PMC*) 0xdeadbeef
>     (gdb) p *P0
>     {
>       vtable = 0xdeadbeef,
>       pobj = {
>         u = { int_val = 17, pmc_val = 0x17 },
>         flags = 381741
>       }
>     }
>     (gdb) p *P0->vtable
>     { base_type = PerlArray }
>
> (I added an enumeration for the PMC types).

Wow, fine, fine.

> And I'd just like to say that stabs is a mess. Is DWARF2 any better?

Yep - Ought to be, but I didn't have a look at it.

leo
Re: Using imcc as JIT optimizer
If memory serves me right, Leopold Toetsch wrote:
>> I'm assuming that the temporaries are the things being moved around here?
>
> It is not so much a matter of moving things around, but a matter of allocating (and renumbering) parrot (or for JIT) processor registers.

Ok .. well I sort of understood that the first N registers will be the ones MAPped? So I thought re-ordering/sorting was the operation performed.

Direct hardware maps (like using CX for loop count etc) will need to be platform dependent? Or you could have a fixed reg that can be used for loop count (and gets mapped on hardware appropriately).

>> does it. But that sounds like a lot of work identifying the loops and optimising accordingly.
>
> Loop info
> -
> loop 0, depth 1, size 2, entry 0, contains blocks:
> 1 2

Hmm.. this is what I said sounds like a lot of work ... which still remains true from my perspective :-)

>     r->score = r->use_count + (r->lhs_use_count << 2);
>     r->score += 1 << (loop_depth * 3);

Ok ... deeper the loop the more important the var is .. cool.

Gopal
--
The difference between insanity and genius is measured by success
Re: invoke
Steve Fink wrote:
> The invoke op is bothering me -- namely, it disturbs me that it implicitly operates on P0. I know that P0 is the correct register to use according to pdd03, but I dislike having it be implicit. The user is required to set the rest of the pdd03 conventions up manually, so I don't see any need for invoke to be different. And it makes it much more clear what registers are being used if you have to pass in a PMC as an argument.

Sean O'Rourke proposed a long time ago that we should have B<invoke_p> too. The current B<invoke> is fine for pdd03 only (where (almost) all other P registers might be parameters), but for calling Subs, compiled code, coroutines and so on, there is really no reason not to be able to select the object which should get invoke'd.

> So would anyone mind if I eliminated the zero-arg invoke op in favor of a one-arg invoke that takes a single PMC? (I may also have situations where I don't need to follow pdd03, and it would be more convenient to use a different register.)

Yep. At least add B<invoke Px>.

leo
Re: Using imcc as JIT optimizer
Gopal V wrote:
> If memory serves me right, Leopold Toetsch wrote:
>
> Ok .. well I sort of understood that the first N registers will be the ones MAPped? So I thought re-ordering/sorting was the operation performed.

Yep. Register renumbering, so that the top N used (in terms of score) registers are I0, I1, ..In-1

> Direct hardware maps (like using CX for loop count etc) will need to be platform dependent? Or you could have a fixed reg that can be used for loop count (and gets mapped on hardware appropriately).

We currently don't have special registers, like %ecx for loops; they are not used in JIT either. My Pentium manual states that these ops are not the fastest. But in the long run, we should have some hints, that e.g. i386 needs %ecx as shift count, or that div uses %edx. But probably i386 is the only weird architecture with such ugly restrictions - and with far too few registers.

>> Loop info
>
> Hmm.. this is what I said sounds like a lot of work ... which still remains true from my perspective :-)

There is still a lot of work, yes, but some things already are done:

        set I10, 10
    x:
        if I10, ok
        branch y
    ok:
        set I0, 1
        sub I10, I10, I0
        print I10
        print "\n"
        branch x
    y:
        end

Ends up (with imcc -O2p) as:

        set I0, 10
        set I1, 1
    x:
        unless I0, y
        sub I0, I1
        print I0
        print "\n"
        branch x
    y:
        end

You can see:

    opt1 sub I10, I10, I0 => sub I10, I0
    if_branch if ... ok
    label ok deleted
    found invariant set I0, 1
    inserting it in blk 0 after set I10, 10

The latter one is working out from the innermost loop.

> Gopal

leo
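The renumbering step described above ("the top N used, in terms of score, registers are I0, I1, ...") amounts to a sort by score. The Python sketch below illustrates only that ranking; the function name, the tie-break by name and the example scores are invented for illustration and are not taken from imcc.

```python
def renumber(scores, prefix="I"):
    """Map each symbolic register to I0, I1, ... in descending score order.

    scores maps a register name to its allocation score. Ties are broken by
    name so that the result is deterministic (an arbitrary choice here).
    """
    ranked = sorted(scores, key=lambda name: (-scores[name], name))
    return {name: f"{prefix}{i}" for i, name in enumerate(ranked)}


# Hypothetical scores: a loop counter, a loop-invariant constant, a rare temp.
example_scores = {"$I_temp": 5, "$I_counter": 40, "$I_const": 12}
mapping = renumber(example_scores)
```

With these made-up scores the counter becomes I0, the constant I1 and the rarely used temporary I2. A JIT that maps only the first few Parrot registers onto hardware registers would then pick up the highest-scoring ones.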
Re: Objects, methods, attributes, properties, and other related frobnitzes
Graham Barr wrote:
> On Fri, Feb 21, 2003 at 04:34:42PM -0500, Dan Sugalski wrote:
>> If A isa B, we certainly wouldn't want to call A's AUTOLOAD on a method before we looked to see if B had a concrete instance of that method.
>
> Right. The best you could probably do is note where you found the first AUTOLOAD so that when you do reach the end of the ISA search you don't need to do the whole search again.

Unless we changed the language in such a way that we could *tell* whether or not we should try calling A's AUTOLOAD.

Currently, in perl5, if you have "package A; sub foo;", then the method search will stop in A and call A's autoload, since it *knows* that A has an appropriate method. Obviously, we don't really want to force our users to stub every method (though this would be *one* way of avoiding the need for a second pass for AUTOLOADs)... If the language had an AUTOPROTO/AUTOSTUB of some sort, we could call it and find out where the hierarchy search should stop in that class.

> But is this programming for the common case, or is it premature optimization?

Well, first ask what are the common cases for autoloading in perl5...

I think that the *most* common case is AutoLoader/SelfLoader. If Devel::SelfStubber is used with either of those, then that stops the hierarchy search for methods in the right place, not needing a second pass.

There are also definitions of XS constants ... though these are not, in general, used as methods, so I suppose we can ignore them for now.

And finally, there are object property accessors, so that one can write $c = $obj->color, instead of $c = $obj->{color}. These are sometimes done in AUTOLOAD... But stubs are rarely, if ever, provided for them, so calling this type of method almost always requires two passes through the inheritance hierarchy.

There's also another case that's not-so-common, but mainly due to the difficulties of doing it right in perl5. You've suggested keeping track of where we found the *first* AUTOLOAD ...
but what happens if we want to inherit from *two* classes with AUTOLOAD methods? In perl5, you'd have to use NEXT.pm, which, imho, is fairly ugly internally, and not especially efficient (plus it's not in the core). Perl6 should have a built-in mechanism to allow an AUTOLOAD method to either make a call to the next AUTOLOAD, a la NEXT.pm, (this might be fairly expensive), or throw an exception saying that that particular method isn't supplied by this AUTOLOAD, and have the search continue (possibly much less expensive). -- $;=qq qJ,krleahciPhueerarsintoitq;sub __{0 my$__;s ee substr$;,$,++$__%$,--,1,qq;;;ee; $__2__}$,=22+$;=~y yiy y;__ while$;;print
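The single-pass idea discussed in this thread, remembering the first AUTOLOAD seen while searching for a concrete method, can be sketched as follows. This is a Python model, not Perl or Parrot code: the dictionaries standing in for @ISA and the symbol tables, and the names `linearize` and `find_method`, are assumptions for illustration. It models only the "first AUTOLOAD" fallback, not stubs or chaining to a next AUTOLOAD.

```python
def linearize(cls, isa):
    """Depth-first, left-to-right search order over the inheritance graph."""
    order = [cls]
    for parent in isa.get(cls, []):
        for c in linearize(parent, isa):
            if c not in order:
                order.append(c)
    return order


def find_method(cls, name, isa, methods):
    """Return (class, method) for a concrete method, else the first AUTOLOAD."""
    first_autoload = None
    for c in linearize(cls, isa):
        table = methods.get(c, set())
        if name in table:
            return c, name                 # a concrete method always wins
        if first_autoload is None and "AUTOLOAD" in table:
            first_autoload = c             # remember it, but keep searching
    if first_autoload is not None:
        return first_autoload, "AUTOLOAD"  # no second walk of the hierarchy
    return None


isa = {"A": ["B"], "B": []}
methods = {"A": {"AUTOLOAD"}, "B": {"color"}}
concrete = find_method("A", "color", isa, methods)
fallback = find_method("A", "size", isa, methods)
missing = find_method("B", "size", isa, methods)
```

Here a call to `color` on A resolves to B's concrete method even though A has an AUTOLOAD, while a call to `size` falls back to A's AUTOLOAD after one pass over the hierarchy.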
Re: non-inline text in parrot assembly?
Leopold Toetsch wrote:
> You can use the .constant (PASM) or .const (IMCC) syntax, to keep strings visually together.
>
> leo

Thanks. Apparently I'm being daft. I don't see any mention of pasm sections(constant or otherwise) in the pod docs, nor do any of the examples appear to use a constants section. What am I missing?

-Tupshin
Re: Using imcc as JIT optimizer
Nicholas Clark wrote:
> r->score += 1 << (loop_depth * 3);
>
> until variables in 11 deep loops go undefined?

Not undefined, but spilled. First *oops*, but second, of course this is all not final. I did change the scoring several times from the code base that, AFAIK, Angel Faus implemented. And we don't currently have any code that goes near the complexity of such a deeply nested loop.

There are probably a *lot* of such gotchas in the whole CFG code in imcc. I'm currently on some failing perl6 tests, when using optimization, all in regexen tests, which do a lot of branching.

> I'm not sure how to patch this specific instance - just trap loop depths over 10? Should score be unsigned?

A linear counting of loop_depth will do it, e.g.

  r->score += 100 * loop_depth;

Or always score variables in deeper nested loops higher than those outside, or ...

> More importantly, how do we trap these sort of things in the general case?

With a lot of tests.

> I wonder how hard it would be to make a --fsummon-nasal-demons flag for gcc that added trap code for all classes of undefined behaviour, and caused code to abort (or something more colourfully undefined) if anything undefined gets executed. I realise that code would run very slowly, but it would be a very very useful debugging tool.

I'm currently adding asserts to e.g. loop detection code. Last one (to be checked in) is:

  /* we could also take the depth of the first contained
   * block, but below is a check, that an inner loop is fully
   * contained in an outer loop */

This is a check that all blocks of a more deeply nested loop are contained totally in the outer loop, so that there can't be basic blocks outside. But in regex code, this seems not to be true - or a prior stage of optimization messes things up.

These issues are hard to debug, as they are deeply buried in ~400 basic blocks with ~1000 edges connecting those.

  perl6 $ ../imcc/imcc -O1 -d70 t/rx/basic_2.imc 2>&1 | less

> Nicholas Clark

leo
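To make the "11 deep loops" remark concrete, here is a small sketch in Python that models the two scoring formulas from this message with a 32-bit signed int. It is an illustration under stated assumptions: a 32-bit int is assumed, the helper names (to_int32, shift_is_defined, exponential_score, linear_score) are made up, and the wrapped value shown for the over-wide shift is only one plausible outcome, since in C that shift is undefined and nothing is promised.

```python
# Illustration only: models a 32-bit signed C int in Python.
# In real C, 1 << n is undefined once the result no longer fits in int,
# so the "wrapped" results below are just one plausible outcome.

INT_BITS = 32  # assumption: 32-bit int


def to_int32(value):
    """Wrap an arbitrary Python int into the signed 32-bit range."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def shift_is_defined(loop_depth):
    """True if 1 << (loop_depth * 3) fits in a signed 32-bit int."""
    return 0 <= loop_depth * 3 < INT_BITS - 1


def exponential_score(use_count, lhs_use_count, loop_depth):
    """The shift-based weighting quoted at the top of the message."""
    score = use_count + (lhs_use_count << 2)
    return to_int32(score + (1 << (loop_depth * 3)))


def linear_score(use_count, lhs_use_count, loop_depth):
    """The linear alternative: 100 * loop_depth."""
    score = use_count + (lhs_use_count << 2)
    return to_int32(score + 100 * loop_depth)
```

Under this model a loop depth of 10 is the last safe one (a shift by 30), which matches the "trap loop depths over 10" suggestion. At depth 11 the shift-based bonus wraps away entirely, so the variable scores no higher than one used once outside any loop, while the linear formula keeps growing.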
Re: non-inline text in parrot assembly?
Tupshin Harper wrote:
> Leopold Toetsch wrote:
> > You can use the .constant (PASM) or .const (IMCC) syntax, to keep strings visually together.
>
> Thanks. Apparently I'm being daft. I don't see any mention of pasm sections(constant or otherwise) in the pod docs, nor do any of the examples appear to use a constants section. What am I missing?

Sorry, nothing. IIRC, there are only 3 tests in parrot and 3 in imcc using these features.

  $ perldoc assemble.pl
  $ perldoc languages/imcc/docs/syntax.pod
  $ perldoc languages/imcc/docs/macros.pod

But they are not very well covered in the main docs.

Additionally, string (and key and float) constants are a distinct section in PBC, only the assembler doesn't care - or OTOH there is now syntax to reference a string constant directly. This is all done via the constant table.

> -Tupshin

leo
Re: Using imcc as JIT optimizer
On Sat, Feb 22, 2003 at 08:44:12PM -, Rafael Garcia-Suarez wrote:
> Nicholas Clark wrote in perl.perl6.internals :
> > r->score = r->use_count + (r->lhs_use_count << 2);
> > r->score += 1 << (loop_depth * 3);
> [...]
> > I wonder how hard it would be to make a --fsummon-nasal-demons flag for gcc that added trap code for all classes of undefined behaviour, and caused code to abort (or something more colourfully undefined) if anything undefined gets executed. I realise that code would run very slowly, but it would be a very very useful debugging tool.
>
> What undefined behaviour are you referring to exactly ? the shift overrun ? AFAIK it's very predictable (given one int size). Cases of

Will you accept a shortcut written in perl? The shift op uses C signed integers:

  $ perl -MConfig -le 'print foreach ($^O, $Config{byteorder}, 1 << 32)'
  linux
  1234
  0

vs

  $ perl -MConfig -le 'print foreach ($^O, $Config{byteorder}, 1 << 32)'
  linux
  1234
  1

  $ perl -MConfig -le 'print foreach ($^O, $Config{byteorder}, 1 << 32)'
  linux
  4321
  1

vs

  $ perl -MConfig -le 'print foreach ($^O, $Config{byteorder}, 1 << 32)'
  linux
  4321
  0

(All 4 are Debian GNU/Linux. And both architectures that give 0 for a shift of 32 happen to give 1 for a shift of 256. But I wouldn't count on it for all architectures.)

> potential undefined behavior can usually be detected at compile-time. I

In this specific case, maybe. In the general case, no. Signed integer arithmetic overflowing is undefined behavior.

> imagine that shift overrun detection can be enabled via an ugly macro and a cpp symbol.
>
> (what's a nasal demon ? can't find the nasald(8) manpage)

Demons flying out of your nose. One alleged consequence of undefined behaviour. Another is your computer turning into a butterfly. I guess a third is Microsoft releasing a bug free program.

Nicholas Clark
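One way to see how the same shift can print 0 on one machine and 1 on another is to model hardware that only looks at the low few bits of the shift count. The sketch below is in Python and is an assumption-laden illustration: the function name shl32 and the choice of 5-bit and 6-bit count masks are made up to reproduce the pattern of outputs quoted above, and the thread does not say which CPUs were actually involved.

```python
def shl32(value, count, count_bits):
    """32-bit left shift on hypothetical hardware that only looks at the
    low `count_bits` bits of the shift count."""
    effective = count & ((1 << count_bits) - 1)
    return (value << effective) & 0xFFFFFFFF


# Hypothetical CPU that keeps 5 bits of the count: a shift by 32 acts like 0.
five_bit_32 = shl32(1, 32, 5)

# Hypothetical CPU that keeps 6 bits: a shift by 32 really pushes the bit out,
# but a shift by 256 acts like a shift by 0 again.
six_bit_32 = shl32(1, 32, 6)
six_bit_256 = shl32(1, 256, 6)
```

Under these assumptions the 6-bit model gives 0 for a shift of 32 and 1 for a shift of 256, the same combination reported above, which is a reminder that the result depends on the machine and not on anything the C standard guarantees.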
Re: Using imcc as JIT optimizer
On Sat, Feb 22, 2003 at 09:27:04PM +, nick wrote:
> On Sat, Feb 22, 2003 at 08:44:12PM -, Rafael Garcia-Suarez wrote:
> > What undefined behaviour are you referring to exactly ? the shift overrun ? AFAIK it's very predictable (given one int size). Cases of
>
> Will you accept a shortcut written in perl? The shift op uses C signed integers:

Oops. The logical shift uses *un*signed integers, except under use integer

  $ perl -MConfig -le 'use integer; print foreach ($^O, $Config{byteorder}, 1 << 32)'
  linux
  1234
  0

  $ perl -MConfig -le 'use integer; print foreach ($^O, $Config{byteorder}, 1 << 32)'
  linux
  1234
  1

  $ perl -MConfig -le 'use integer; print foreach ($^O, $Config{byteorder}, 1 << 32)'
  linux
  4321
  0

  $ perl -MConfig -le 'use integer; print foreach ($^O, $Config{byteorder}, 1 << 32)'
  linux
  4321
  1

So there's actually no difference in the numbers. But as I'm being a pedant I ought to get the facts right. [I guess it's my fault for drinking Australian wine :-)]

Nicholas Clark
Re: non-inline text in parrot assembly?
Tupshin --

Parrot Byte Code (.pbc) files (aka packfiles) have multiple sections, but Parrot Assembly (.pasm) files do not reference them explicitly. Literal constants are *implicitly* placed in the constant section of the .pbc file upon assembly. The .constant or .const directives allow you to name your constants, but the net result is equivalent.

Regards,

-- Gregor

Tupshin Harper [EMAIL PROTECTED]
02/22/2003 02:31 PM

To: Leopold Toetsch [EMAIL PROTECTED]
cc: [EMAIL PROTECTED]
Subject: Re: non-inline text in parrot assembly?

Leopold Toetsch wrote:
> You can use the .constant (PASM) or .const (IMCC) syntax, to keep strings visually together.
>
> leo

Thanks. Apparently I'm being daft. I don't see any mention of pasm sections(constant or otherwise) in the pod docs, nor do any of the examples appear to use a constants section. What am I missing?

-Tupshin
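The point that named and literal constants end up in the same place can be pictured with a toy constant pool. The sketch below is in Python and is purely illustrative: ConstantTable and its methods are hypothetical names, and the layout has nothing to do with the real packfile format or with how the Parrot assembler is written.

```python
class ConstantTable:
    """Toy constant pool: each distinct literal is stored once."""

    def __init__(self):
        self.entries = []  # the "constant section": one slot per literal
        self.index = {}    # (type name, literal) -> slot number
        self.names = {}    # constant name -> slot number

    def intern(self, literal):
        """Return the slot for a literal, adding it on first use."""
        key = (type(literal).__name__, literal)
        if key not in self.index:
            self.index[key] = len(self.entries)
            self.entries.append(literal)
        return self.index[key]

    def define(self, name, literal):
        """Give a literal a name; the stored entry is the same either way."""
        self.names[name] = self.intern(literal)
        return self.names[name]


table = ConstantTable()
inline_slot = table.intern("Hello, world")           # literal used inline
named_slot = table.define("GREETING", "Hello, world")  # same literal, named
```

In this model the inline literal and the named constant resolve to the same slot, which is the sense in which "the net result is equivalent".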
Re: invoke
On Feb-22, Leopold Toetsch wrote:
> Steve Fink wrote:
> > So would anyone mind if I eliminated the zero-arg invoke op in favor of a one-arg invoke that takes a single PMC? (I may also have situations where I don't need to follow pdd03, and it would be more convenient to use a different register.)
>
> Yep. At least add B<invoke Px>.

Ok, done.
Re: Objects, methods, attributes, properties, and other related frobnitzes
Dan Sugalski wrote:
> Benjamin Goldberg wrote:
> > Graham Barr wrote:
> > > Dan Sugalski wrote:
> > > > If A isa B, we certainly wouldn't want to call A's AUTOLOAD on a method before we looked to see if B had a concrete instance of that method.
> > >
> > > Right. The best you could probably do is note where you found the first AUTOLOAD so that when you do reach the end of the ISA search you don't need to do the whole search again.
> >
> > Unless we changed the language in such a way that we could *tell* whether or not we should try calling A's AUTOLOAD.
>
> Given that we have to run perl 5 code with the same expressed semantics as perl 5, and also are going to run python and ruby code properly, this isn't a tenable option. We're the implementors. While we can complain about the semantics we have to express, we don't get to not express them.

Nothing says that we can't have a different semantic for each language we're running. When running perl5 code, we could fetch methods and perform method caching one way, and when running perl6 code, we could fetch methods and perform method caching a different way... and possibly a different technique for each of python and tcl.

I was going to say that I probably ought to write my idea up in an RFC, and see how people react, and get Larry's approval... but, I discovered that someone else thought of this idea before me, and wrote it up!

http://dev.perl.org/rfc/232.pod

--
$;=qq qJ,krleahciPhueerarsintoitq;sub __{0 my$__;s ee substr$;,$,++$__%$,--,1,qq;;;ee; $__2__}$,=22+$;=~y yiy y;__ while$;;print
Re: Objects, methods, attributes, properties, and other related frobnitzes
At 7:56 PM -0500 2/22/03, Benjamin Goldberg wrote:
> Dan Sugalski wrote:
> > Benjamin Goldberg wrote:
> > > Graham Barr wrote:
> > > > Dan Sugalski wrote:
> > > > > If A isa B, we certainly wouldn't want to call A's AUTOLOAD on a method before we looked to see if B had a concrete instance of that method.
> > > >
> > > > Right. The best you could probably do is note where you found the first AUTOLOAD so that when you do reach the end of the ISA search you don't need to do the whole search again.
> > >
> > > Unless we changed the language in such a way that we could *tell* whether or not we should try calling A's AUTOLOAD.
> >
> > Given that we have to run perl 5 code with the same expressed semantics as perl 5, and also are going to run python and ruby code properly, this isn't a tenable option. We're the implementors. While we can complain about the semantics we have to express, we don't get to not express them.
>
> Nothing says that we can't have a different semantic for each language we're running.

Well, almost nothing. Nothing much besides me, at least.

This isn't the place to ponder alternate semantics for existing or proposed languages. That's what the language lists are for. If you want perl 6 to behave in some particular way, go to perl6-language or petition Larry. (I'd not suggest bringing it up on Python-dev, but if you want to brave it, well, good luck)

--
Dan

--it's like this---
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk
Re: [RFC] imcc calling conventions
Dan Sugalski wrote:
> At 9:55 AM -0800 2/20/03, Steve Fink wrote:
> [snip]
> > Or, as another stab at the same problem, does Parrot really need 32*4 registers? I keep thinking we might be better off with 16 of each type. But maybe I'm just grumbling.
>
> Yeah, 32 is a bunch. I've considered going with 16 on and off, and still might.

Given that registers are allocated with the lower numbers being the ones used more often, how about having 32 registers, as we now have, but two different ops for saving -- one of which saves registers 0 .. 15, the other saves all 0 .. 31.

Or is this just a dumb idea?

--
$;=qq qJ,krleahciPhueerarsintoitq;sub __{0 my$__;s ee substr$;,$,++$__%$,--,1,qq;;;ee; $__2__}$,=22+$;=~y yiy y;__ while$;;print
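The "save 0 .. 15" versus "save 0 .. 31" idea, and the bitmasked saveall floated earlier in the thread, can both be expressed as one masked save. The following Python sketch is a thought experiment only: RegisterFile, save_masked, restore and the two mask constants are hypothetical names, and no such ops are claimed to exist in Parrot.

```python
NUM_REGS = 32

SAVE_LOW_HALF = 0x0000FFFF  # registers 0 .. 15
SAVE_ALL = 0xFFFFFFFF       # registers 0 .. 31


class RegisterFile:
    """Toy register file for one register type, with a save stack."""

    def __init__(self):
        self.regs = [0] * NUM_REGS
        self.stack = []

    def save_masked(self, mask):
        """Push only the registers whose bit is set in `mask`.
        Returns how many registers were copied."""
        frame = [(i, self.regs[i]) for i in range(NUM_REGS) if (mask >> i) & 1]
        self.stack.append(frame)
        return len(frame)

    def restore(self):
        """Pop the most recent frame and write its registers back."""
        for i, value in self.stack.pop():
            self.regs[i] = value


rf = RegisterFile()
rf.regs[1] = 10
rf.regs[20] = 99
copied = rf.save_masked(SAVE_LOW_HALF)  # copies 16 registers, not 32
rf.regs[1] = 0
rf.regs[20] = 0
rf.restore()  # register 1 comes back; register 20 was never saved
```

The trade-off is visible in the return value: the half save copies 16 slots and a mask such as 0b00100110 copies only 3, at the price of any register outside the mask not surviving the call.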
access to partial registers?
Sorry for all the questions... these are the trials and tribulations of dealing with a newbie trying to get up to speed with the current state of parrot.

So here's another question: Is it possible and/or meaningful to read and write from a part of a register (e.g. a single word) in pasm?

As with my previous questions, I'm not really interested in pbc issues/format (with exceptions of course), just learning the intricacies of pasm.

-Tupshin