Re: [RFC] imcc calling conventions

2003-02-22 Thread Nicholas Clark
On Thu, Feb 20, 2003 at 09:55:35AM -0800, Steve Fink wrote:
 I think this has been discussed before, but are we all okay with this
 callee-save-everything policy? At the very least, I'd be tempted to
 add a bitmasked saveall/restoreall pair to reduce the amount of cache
 thrashing. (saveall 0b00100110) It just
 seems odd that you have to either save all 32 of one of the types of
 registers, or to save selected ones onto a different stack. But it
 *is* simpler to copy over the entire register file to a stack frame, I
 guess.

I agree that the bitmasked savesome/restoresome would be less simple.
I suspect that a bitmasked version would JIT very nicely.

Since when did parrot trade simplicity for speed?

Nicholas Clark


Re: Using imcc as JIT optimizer

2003-02-22 Thread Gopal V
If memory serves me right, Dan Sugalski wrote:
 This sounds pretty interesting, and I bet it could make things 
 faster. The one thing to be careful of is that it's easy to get 
 yourself into a position where you spend more time optimizing the 
 code you're JITting than you win in the end.

I think that's not the case for ahead of time optimisations . As long
as the JIT is not the optimiser , you could take your time optimising.

The topic is really misleading ... or am I the one who's wrong ?

 You also have to be very careful that you don't reorder things, since 
 there's not enough info in the bytecode stream to know what can and 
 can't be moved. (Which is something we need to deal with in IMCC as 
 well)

I'm assuming that the temporaries are the things being moved around here ?.
Since imcc already moves them around anyway and the programmer makes
no assumptions about their positions -- this shouldn't be a problem ?.

The only question I have here , how does imcc identify loops ?. I've
been using if goto to loop around , which is exactly the way assembly
does it. But that sounds like a lot of work identifying the loops and
optimising accordingly.

To make it more clear -- identifying tight loops and the usage weights
correctly. 10 uses of $I0 outside the loop vs 1 use of $I1 inside a 100
times loop. Which will come first ?. 

Gopal
-- 
The difference between insanity and genius is measured by success


Re: Objects, methods, attributes, properties, and other related frobnitzes

2003-02-22 Thread Graham Barr
On Fri, Feb 21, 2003 at 04:34:42PM -0500, Dan Sugalski wrote:
 If A isa B, we certainly wouldn't want to call A's AUTOLOAD on a 
 method before we looked to see if B had a concrete instance of that 
 method.

Right. The best you could probably do is note where you found the first AUTOLOAD
so that when you do reach the end of the ISA search you don't need to do
the whole search again.
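
The idea above can be sketched in C as a single walk over a linearized ISA list that remembers the first AUTOLOAD it passes. This is only an illustration: the `Class` and `Lookup` types and the `find_method` function are hypothetical names invented here, not Parrot or perl5 internals, and the sketch assumes the ISA list has already been flattened into search order.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical class record; field names are illustrative only. */
typedef struct Class {
    const char *name;
    const char *const *methods;  /* NULL-terminated list of concrete methods */
    int has_autoload;            /* nonzero if the class defines AUTOLOAD */
} Class;

typedef struct Lookup {
    const Class *found_in;       /* class holding a concrete method, or NULL */
    const Class *first_autoload; /* first AUTOLOAD passed on the way, or NULL */
} Lookup;

static int class_has_method(const Class *c, const char *method)
{
    for (size_t i = 0; c->methods[i] != NULL; i++)
        if (strcmp(c->methods[i], method) == 0)
            return 1;
    return 0;
}

/* Walk the linearized ISA list once. A concrete method anywhere in the
 * list wins; otherwise the caller can fall back to first_autoload
 * without repeating the search. */
Lookup find_method(const Class *const *isa, size_t n, const char *method)
{
    Lookup r = { NULL, NULL };
    for (size_t i = 0; i < n; i++) {
        if (class_has_method(isa[i], method)) {
            r.found_in = isa[i];
            return r;
        }
        if (r.first_autoload == NULL && isa[i]->has_autoload)
            r.first_autoload = isa[i];
    }
    return r;
}
```

With A (AUTOLOAD only) searched before B (concrete `foo`), a lookup of `foo` resolves to B, while a lookup of an unknown method reports A's AUTOLOAD as the fallback.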

But is this programming for the common case ? or is it premature
optimization.

Graham.


non-inline text in parrot assembly?

2003-02-22 Thread Tupshin Harper
Parrot assembly supports inline strings, but are there any plans to have 
it support a distinct .string (or similar) asm section? The main benefit 
would be easier compatibility/portability with existing assembly code 
generators. Is anybody aware of an existing assembly format that doesn't 
support a separate string section?

-Tupshin



Re: non-inline text in parrot assembly?

2003-02-22 Thread Leopold Toetsch
Tupshin Harper wrote:

Parrot assembly supports inline strings, but are there any plans to have 
it support a distinct .string (or similar) asm section? The main benefit 
would be easier compatibility/portability with existing assembly code 
generators. Is anybody aware of an existing assembly format that doesn't 
support a separate string section?


You can use the .constant (PASM) or .const (IMCC) syntax, to keep 
strings visually together.


-Tupshin


leo



Re: [RFC] imcc calling conventions

2003-02-22 Thread Leopold Toetsch
Steve Fink wrote:

On Feb-18, Leopold Toetsch wrote:
[ ... ]


  .return mi# [1]
  restoreall
  ret
.end
My immediate reaction to this (okay, I really saw this before in
perl6-generated code) is why don't the values from .return and
restoreall get mixed up?


Yep. It is at least confusing at the beginning.


You may want to add a brief description of the kazillion different
stacks that Parrot uses. There are six, I think:


7 if you count the intstack too.


I think this has been discussed before, but are we all okay with this
callee-save-everything policy? 


I'm not that convinced of it - I still think the subroutine knows best, 
how to preserve its registers.

... At the very least, I'd be tempted to
add a bitmasked saveall/restoreall pair to reduce the amount of cache
thrashing. (saveall 0b00100110) It just
seems odd that you have to either save all 32 of one of the types of
registers, or to save selected ones onto a different stack. But it
*is* simpler to copy over the entire register file to a stack frame, I
guess.


Probably yes.


Taking that farther, I've always liked systems that divide up the
registers into callee-save, caller-save, and scratch (nobody-save?)


Register windows, like (AFAIK) IA64 has? This would avoid a lot of 
saveall/restoreall/saveX/restoreX for smaller subroutines.


Maybe that's just me. And I vaguely recall that there was some
discussion I didn't follow about how that interferes with tail-call
optimization. (To me, tail call optimization == replace recursive
call with a goto to the end of the function preamble)


Dan did mention it, when arguing for callee-save calling conventions. But 
I don't know how it exactly works.


Or, as another stab at the same problem, does Parrot really need 32*4
registers? I keep thinking we might be better off with 16 of each
type. But maybe I'm just grumbling.


I think 16 might be not enough. So the current setting is ok.

leo




Re: Configure.pl --cgoto=0 doesn't work

2003-02-22 Thread Leopold Toetsch
Nicholas Clark wrote:

On Fri, Feb 21, 2003 at 08:34:05AM +0100, Leopold Toetsch wrote:

Case 2) should disable only core_ops_cg.c but not core_ops_cgp.c

But surely we'd also like a flag to disable core_ops_cgp.c but leave
core_ops_cg.c?


IMHO no. Why disable the faster and much smaller core, and keep the big 
and slow core?


How many cores are there now? Is there a way to make a modular flag system
that lets people configure any arbitrary combination that they wish to build?


In the long run, we should build the normal function based core (used 
e.g. for trace) and the fastest core that is available. Though I can 
imagine that we additionally want to have the most memory efficient 
core too - which would not be a prederefed one.

Here is a summary in terms of speed:

  JIT
  CGP (obsoletes CGoto & Prederef)
  Switched Prederef (obsoletes Prederef)
  Switched
  Normal

When PBC code size matters we could have:

  CGoto
  Switched
  Normal

So, the plain prederefed core is always obsolete now.


And an easy way for the tinderbox machines to build all applicable, and
run tests for each built core in turn?


$ make quickfulltest


Nicholas Clark


leo





Re: Using imcc as JIT optimizer

2003-02-22 Thread Leopold Toetsch
Gopal V wrote:

I'm assuming that the temporaries are the things being moved around here ?.


It is not so much a matter of moving things around, but a matter of 
allocating (and renumbering) parrot (or for JIT) processor registers. 
These are of course mainly temporaries, but even when you have some 
find_lexical/do_something/store_lexical, imcc selects the best register 
for all involved ops, temps or variables it doesn't really matter.


The only question I have here , how does imcc identify loops ?. I've
been using if goto to loop around , which is exactly the way assembly
does it. But that sounds like a lot of work identifying the loops and
optimising accordingly.


Here are basic blocks, the CFG and loop info of

0       set I0, 10
1 x:
1       unless I0, y
2       dec I0
2       print I0
2       print "\n"
2       branch x
3 y:
3       end

Dumping the CFG:
----------------
0 (0) -> 1      <-
1 (1) -> 2 3    <- 2 0
2 (1) -> 1      <- 1
3 (0) ->        <- 1

Loop info
---------
loop 0,  depth 1, size 2, entry 0, contains blocks:
1 2
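
For readers wondering how a loop like this can be recovered from "if ... goto" style code, here is a minimal C sketch of the textbook natural-loop construction, run on the four blocks from the dump. It is not taken from imcc: the `succ` table and the `natural_loop` function are hypothetical, and the sketch assumes the back edge (block 2 branching to block 1) has already been identified, for example from dominator information.

```c
#include <string.h>

#define N_BLOCKS  4
#define MAX_EDGES 2

/* Successor lists of the four basic blocks; -1 marks an unused slot. */
static const int succ[N_BLOCKS][MAX_EDGES] = {
    {  1, -1 },  /* block 0 falls through to block 1 */
    {  2,  3 },  /* block 1: fall into the body, or exit to y */
    {  1, -1 },  /* block 2: branch x */
    { -1, -1 },  /* block 3: end */
};

static int has_edge(int from, int to)
{
    for (int i = 0; i < MAX_EDGES; i++)
        if (succ[from][i] == to)
            return 1;
    return 0;
}

/* Collect the natural loop of the back edge tail -> head: the head, the
 * tail, and every block that reaches the tail without passing through
 * the head. in_loop must have N_BLOCKS entries. Returns the loop size. */
int natural_loop(int head, int tail, int in_loop[N_BLOCKS])
{
    int stack[N_BLOCKS];
    int top = 0, size = 1;

    memset(in_loop, 0, N_BLOCKS * sizeof in_loop[0]);
    in_loop[head] = 1;
    if (!in_loop[tail]) {
        in_loop[tail] = 1;
        size++;
        stack[top++] = tail;
    }
    while (top > 0) {
        int b = stack[--top];
        /* Walk backwards over predecessors of b. */
        for (int p = 0; p < N_BLOCKS; p++) {
            if (has_edge(p, b) && !in_loop[p]) {
                in_loop[p] = 1;
                size++;
                stack[top++] = p;
            }
        }
    }
    return size;
}
```

Calling `natural_loop(1, 2, in_loop)` marks blocks 1 and 2 and returns 2, which agrees with the "size 2, contains blocks: 1 2" line in the dump.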

To make it more clear -- identifying tight loops and the usage weights
correctly. 10 uses of $I0 outside the loop vs 1 use of $I1 inside a 100
times loop. Which will be come first ?. 


This is basically the current score calculation used for register 
allocation:

 r->score = r->use_count + (r->lhs_use_count << 2);

  r->score += 1 << (loop_depth * 3);
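
To make the arithmetic concrete, here is a small self-contained C sketch of the two statements. The `SymReg` struct is a hypothetical stand-in that carries only the fields named in the formula, and the sketch assumes the loop bonus is applied once per register for a single loop depth; the real allocator may accumulate it differently.

```c
/* Hypothetical stand-in for the per-register record; only the fields
 * that appear in the formula are modelled. */
typedef struct SymReg {
    int use_count;      /* total number of uses */
    int lhs_use_count;  /* uses as an assignment target */
    int score;
} SymReg;

/* Apply the two statements once, for one loop depth (an assumption of
 * this sketch). */
int compute_score(SymReg *r, int loop_depth)
{
    r->score = r->use_count + (r->lhs_use_count << 2);
    r->score += 1 << (loop_depth * 3);
    return r->score;
}
```

Under these assumptions, a register with 10 uses outside any loop scores 10 + 1 = 11, while a register with a single use at loop depth 1 scores 1 + 8 = 9, so the loop bonus narrows the gap but does not yet overtake ten plain uses; at depth 2 the bonus alone is 64.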


Gopal
leo




Re: IMCC's bsr handling

2003-02-22 Thread Leopold Toetsch
Steve Fink wrote:

[Apologies if you receive this twice, 


No problem, the duplicate filter in procmail takes care of it.


On Sat, Feb 08, 2003 at 12:19:35PM +0100, Leopold Toetsch wrote:

When we want these kind of branches, then they must be more high level, 
defining all possible branch targets, e.g. like a switch statement.


I think all that means that I *can* specify a set of labels that the
instruction might jump to, and guarantee that if it jumps to anywhere
else that it won't affect any registers. 


I think that should be enough to allocate registers in a safe way. 
Though it might not be the most efficient way to do it. It really depends 
on the complexity of such code pieces. When regex code is interspersed 
with normal code, it will for sure not be possible to reuse 
registers when the control flow is not known.

... For now, I'm prototyping
using a heavyweight mechanism. If that gets to be too unwieldy, maybe
I'll take a look at implementing something like
  bsr $I0 = _label1 | _label2 | REGISTER_PRESERVING_LOCATION

(Ignore the syntax!). It's only needed for imcc, right? (I wouldn't
need to propagate it through to the JIT or anything, would I?)


It's for the register allocator. But it really depends on calling 
conventions. When the bsr's do a saveall/restoreall and arguments are 
passed on stack then it's no problem, the bsr is a noop then in 
terms of CFG. When the bsr's have e.g. pdd03 calling conventions, then 
each possible control flow has to be tracked and allocated registers 
must match.


How is invoke handled? Is it assumed to always use the full PDD06
calling conventions?


s/06/03/ - No. When code is only called internally, it can use any 
calling convention, that fits best.

leo



L-valueness of Arrays vs. Lists

2003-02-22 Thread Martin D Kealey
On Tue, 11 Feb 2003, Michael Lazzaro wrote:
 What is the utility of the perl5 behavior:

  \($a,$b,$c)

 meaning

  (\$a, \$b, \$c)

 Do people really do that? ...  Can someone give an example of an actual,
 proper, use?

Yes, I've used it like this:

   for (\($a,$b,$c)) {
  $$_++;
   }

to be sure that it works on all versions, since

   for ($a,$b,$c) {
  $_++;
   }

works differently on different versions.  (Actually, I don't have an
old-enough version on hand to check just when that was, so it must have been
5.004 or before.)

This change didn't start to bite me until P5.6.0, when values %HASH became
an Lvalue too, whereupon

   for ( values %HASH ) {
  s/^prefix//;
  ...
   }
   ... do something else with %HASH

stopped working.

So, I would urge making as many things as possible Lvalues (and magical
references) right from the start of P6, just so as we don't break things by
making them so later.

-Martin

-- 
Help Microsoft stamp out software piracy: give Linux to a friend today...




Re: Arrays, lists, referencing

2003-02-22 Thread Martin D Kealey

I would like to chip in in favour of the "list is value, array is container"
side of the argument. However I think that needs clarifying.

A reference is a value; the thing it refers to is a container.

An anonymous container is a container with no references from any symbol
table.  It can lose its anonymity by creating such a reference.

A list is an ordered set of references to (normally anonymous) containers.

An array is a container that contains a list.  When created, an array
contains the empty list. The operations push, pop, shift, unshift, extend,
truncate and element auto-vivify replace the value in the array with another
value similar to the old one. Assignment replaces the value in the array
with an entirely new value.

Operations on individual elements of an array do not affect the value of the
array, just the values in the containers that the array's list members refer
to.

Possible definition:

Except for obvious arrays and hashes (those involving @ or % in the
expression), anything evaluated inside a list in R-value context is itself
in reference context.  Named arrays and hashes are in
flatten-to-reference-to-member context.  Anything evaluated inside a list in
Lvalue context is itself in reference context.  Assignment to a list
dereferences successive elements of each side.  Passing a list as parameters
dereferences each element unless the prototype says otherwise.

Almost all of these containers are either elidable at compile time, or will
be needed soon anyway -- eg, as elements in the formal parameter list; so
there's no practical cost to this definition.


On a related topic...

I like to be able to program in a pure functional mode as much as
possible, mainly because it's the easiest to prove correctness, but also
because it also offers the greatest scope for compile-time optimisation.

What I would like is for the language to enable as many compile-time
optimisations as possible, by making the optimisation-friendly choices the
shorter easy-to-type defaults.

One of those, rather inobviously, is choosing pass-by-value rather than
pass-by-reference. And rather deeper pass-by-value than the sort of list
I've talked about above: a list would be a set of actual values, not a set
of references to containers containing values. And we could extend this
to things other than arrays/lists.

It's important to understand that I'm talking about the semantics of the
language, not the implementation. The point is that the implementation is
still free to pass by reference when it can be sure that the receiving
function won't fiddle with it.  That can be guaranteed if you can see (or
infer) all the way down to the leaf function calls at compile time.  (This
gets complicated at trust boundaries, but we can work on that.)

One of the things I found most irksome moving from C++ to Java was that Java
took away both pass object by value AND pass object by const reference.
Couple that with some rather bad choice of value vs container in the
standard class library, and the result was that one had no way to be sure
that an object wouldn't get modified once you handed it over as a
parameter or referent to some random method.

Since then languages such as ECMAscript have copied that behaviour, and it
seems that P6 is looking more and more like a clone of that language ... and
that worries me.

I would like to argue in favour of pass by value to be the default in the
absence of some explicit prototype, because it allows greater type-safety,
and because the opposite default interacts badly with out-of-order execution
such as parallelism, and bars some optimisations that can be applied to
closures. (We do want Perl to run fast on parallel hardware, don't we?)

The relationship to the array/list thing is this: that it's not just
pass-by-value to functions and methods, it's about implicit R-valueness in
any context that doesn't absolutely require L-valueness.

All this is orthogonal to the concept of object: in C++ an object can be
used to implement either a value (such as string) or a container (such as
vector); it would be nice to be able to do this in P6 too.

-Martin

PS: sorry for the long post...




Re: non-inline text in parrot assembly?

2003-02-22 Thread Tupshin Harper
Leopold Toetsch wrote:

Tupshin Harper wrote:

Thanks. Apparently I'm being daft. I don't see any mention of pasm 
sections(constant or otherwise) in the pod docs, nor do any of the 
examples appear to use a constants section.  What am I missing?
Sorry nothing.
There are only IIRC 3 tests in parrot and 3 in imcc using these features.
$ perldoc assemble.pl
Actually you're wrong ;-)
I was missing something, and that of course was perldoc assemble.pl. ;-)
Thanks for the pointer, that contains a *lot* of information that 
doesn't appear to be anywhere else(.constant, for example, is never 
mentioned in docs/*.pod).

But they are not very well covered in the main docs.
I would vote to move virtually all of this information out of 
assemble.pl and into docs/parrot_assembly.pod (or something similar), 
and have the perldoc for assemble.pl just be an overview + usage 
information.

Thanks.

-Tupshin



Re: Objects, methods, attributes, properties, and other related frobnitzes

2003-02-22 Thread Dan Sugalski
At 9:33 AM + 2/22/03, Graham Barr wrote:
On Fri, Feb 21, 2003 at 04:34:42PM -0500, Dan Sugalski wrote:
 If A isa B, we certainly wouldn't want to call A's AUTOLOAD on a
 method before we looked to see if B had a concrete instance of that
 method.
Right. The best you could probably do is note where you found the 
first AUTOLOAD
so that when you do reach the end of the ISA search you don't need to do
the whole search again.

But is this programming for the common case ? or is it premature
optimization.
I'm thinking premature optimization, and if not that at least 
something that can be put off until later.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: invoke

2003-02-22 Thread Dan Sugalski
At 10:20 AM -0800 2/20/03, Steve Fink wrote:
The invoke op is bothering me -- namely, it disturbs me that it
implicitly operates on P0. I know that P0 is the correct register to
use according to pdd03, but I dislike having it be implicit. The user
is required to set the rest of the pdd03 conventions up manually, so I
don't see any need for invoke to be different.
It isn't, though. You have to set up P0 just like all the other 
registers you're using. It's not like invoke makes you specify them 
either. (Though I could certainly see a case for a version that takes 
counts as parameters and sets up the I registers appropriately, for 
speed reasons)

And it makes it much
more clear what registers are being used if you have to pass in a PMC
as an argument.
So would anyone mind if I eliminated the zero-arg invoke op in favor
of a one-arg invoke that takes a single PMC? (I may also have
situations where I don't need to follow pdd03, and it would be more
convenient to use a different register.)
Leave the zero-arg version in there, since the common case will be 
invoking routines that are conforming to the calling conventions, and 
thus have all the registers set up per PDD03. I fully expect anything 
with even a minimal amount of self-introspection will be rummaging 
around in that sub object, so having it in a fixed location will be 
the right thing.

I'm OK with a one-arg version, as long as it's made explicit in the 
docs that code that uses it makes no guarantees about the state of 
any of the registers.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: [RFC] imcc calling conventions

2003-02-22 Thread Dan Sugalski
At 9:55 AM -0800 2/20/03, Steve Fink wrote:
I think this has been discussed before, but are we all okay with this
callee-save-everything policy?
Nope. It's safe to say that a lot of folks aren't. :)

Still, I think it's the right way to go in the general case. Most sub 
calls of any complexity will be saving off all the I and P registers 
(at least, probably the S registers too), so I don't see it saving 
anything except in the trivial leaf sub case. Which, I realize, is 
common for some styles of programming, but those aren't the styles 
that people use with perl/python/ruby generally.

At the very least, I'd be tempted to
add a bitmasked saveall/restoreall pair to reduce the amount of cache
thrashing. (saveall 0b00100110) It just
seems odd that you have to either save all 32 of one of the types of
registers, or to save selected ones onto a different stack. But it
*is* simpler to copy over the entire register file to a stack frame, I
guess.
Faster in a number of ways. I considered the bitmask, but then you 
have the issue of lots of bit tests which isn't cheap in software. 
(Hardware yes, but we don't have that on our side) Besides that, 
doing a simple bit-blast is hardware accelerated on many systems, and 
cache friendly on others, which makes it a better option overall.

At one point I did a test and found that whole-register-frame saves 
were faster than saving three individual registers in a frame, though 
we do have a relatively heavy-weight general purpose stack.
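
The trade-off described above can be sketched in C by putting the two strategies side by side. This is not Parrot's implementation: `IntFrame`, `save_all` and `save_some` are hypothetical names, and the frame models only one register type with 32 slots.

```c
#include <stdint.h>
#include <string.h>

#define NUM_REGS 32

/* Hypothetical register frame for one register type. */
typedef struct IntFrame {
    long regs[NUM_REGS];
} IntFrame;

/* Whole-frame save: one block copy, no per-register decisions. */
void save_all(IntFrame *dst, const IntFrame *src)
{
    memcpy(dst, src, sizeof *dst);
}

/* Bitmasked save: bit i of mask selects register i. This costs one bit
 * test and one conditional branch per register. Returns the number of
 * registers copied. */
int save_some(IntFrame *dst, const IntFrame *src, uint32_t mask)
{
    int copied = 0;
    for (int i = 0; i < NUM_REGS; i++) {
        if (mask & (UINT32_C(1) << i)) {
            dst->regs[i] = src->regs[i];
            copied++;
        }
    }
    return copied;
}
```

The mask from the example, 0b00100110 (0x26), selects registers 1, 2 and 5. The sketch says nothing about which variant is faster on a given machine; it only shows where the per-bit tests come from.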

Taking that farther, I've always liked systems that divide up the
registers into callee-save, caller-save, and scratch (nobody-save?)
Maybe that's just me. And I vaguely recall that there was some
discussion I didn't follow about how that interferes with tail-call
optimization. (To me, tail call optimization == replace recursive
call with a goto to the end of the function preamble)
I can see doing this. If there was some sort of metainformation that 
would allow us to know at compile time that registers were safe we 
could emit different code, though there's still the issue of nested 
calls where there's limited info.

Or, as another stab at the same problem, does Parrot really need 32*4
registers? I keep thinking we might be better off with 16 of each
type. But maybe I'm just grumbling.
Yeah, 32 is a bunch. I've considered going with 16 on and off, and still might.
--
Dan
--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Objects, methods, attributes, properties, and other related frobnitzes

2003-02-22 Thread Dan Sugalski
At 11:46 PM -0500 2/21/03, Benjamin Goldberg wrote:
My bit of example code was merely to demonstrate that UNIVERSAL::can()
from perl5 clearly has the problem that Andy Wardley worries about wrt
freezing to a particular definition...  Thus, it *may* be a good idea
to *not* provide a user-code-level means of obtaining method handles, o
No. Python allows fetching a handle to the current method definition, 
and it seems a reasonable thing in some circumstances, so it needs to 
be supported. They may be the wrong answer in many circumstances, but 
that doesn't mean they're the wrong answer in all circumstances.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Using imcc as JIT optimizer

2003-02-22 Thread Dan Sugalski
At 4:28 PM +0100 2/22/03, Leopold Toetsch wrote:
Gopal V wrote:
Direct hardware maps (like using CX for loop count etc) will need to be
platform dependent ?. Or you could have a fixed reg that can be used for
loop count (and gets mapped on hardware appropriately).


We currently don't have special registers, like %ecx for loops, they 
are not used in JIT either. My Pentium manual states, that these ops 
are not the fastest.
But in the long run, we should have some hints, that e.g. i386 needs 
%ecx as shift count, or that div uses %edx. But probably i386 is the 
only weird architecture with such ugly restrictions - and with far 
too few registers.
I'm OK with adding in documentation that encourages using particular 
registers for particular purposes, or having some sort of metadata 
for the JIT that notes loop registers or something. As long as it's 
out of band and optional, that's cool.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Using imcc as JIT optimizer

2003-02-22 Thread Rafael Garcia-Suarez
Nicholas Clark wrote in perl.perl6.internals :
 
r->score = r->use_count + (r->lhs_use_count << 2);
  
 r->score += 1 << (loop_depth * 3);
[...]
 I wonder how hard it would be to make a --fsummon-nasal-demons flag for gcc
 that added trap code for all classes of undefined behaviour, and caused
 code to abort (or something more colourfully undefined) if anything
 undefined gets executed. I realise that code would run very slowly, but it
 would be a very very useful debugging tool.

What undefined behaviour are you referring to exactly ? the shift
overrun ? AFAIK it's very predictable (given one int size). Cases of
potential undefined behavior can usually be detected at compile-time. I
imagine that shift overrun detection can be enabled via an ugly macro
and a cpp symbol.

(what's a nasal demon ? can't find the nasald(8) manpage)


Re: Using imcc as JIT optimizer

2003-02-22 Thread Nicholas Clark
Please don't take the following as a criticism of imcc - I'm sure I manage
to write code with things like this all the time.


On Sat, Feb 22, 2003 at 08:13:59PM +0530, Gopal V wrote:
 If memory serves me right, Leopold Toetsch wrote:

r->score = r->use_count + (r->lhs_use_count << 2);
  
 r->score += 1 << (loop_depth * 3);
 
 Ok ... deeper the loop the more important the var is .. cool.

until variables in 11 deep loops go undefined?
(it appears to be a signed int)
I'm not sure how to patch this specific instance - just trap loop depths over
10? Should score be unsigned? 
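
One possible shape for such a patch, given purely as a sketch and not as the fix that was applied to imcc, is to compute the bonus in an unsigned type and clamp the shift count below the width of that type. The function name `loop_bonus` is hypothetical.

```c
#include <limits.h>

/* Loop-nesting bonus with the shift count clamped, so the shift never
 * reaches the width of the type and the result stays defined for any
 * depth. */
unsigned int loop_bonus(int loop_depth)
{
    const int max_shift = (int)(sizeof(unsigned int) * CHAR_BIT) - 1;
    int shift;

    if (loop_depth < 0)
        loop_depth = 0;
    if (loop_depth > max_shift / 3)  /* also avoids overflow in depth * 3 */
        shift = max_shift;
    else
        shift = loop_depth * 3;
    return 1u << shift;
}
```

With a 32-bit unsigned int, depths up to 10 shift by three bits per level as before and anything deeper saturates at a shift of 31. Whether the surrounding score field should then become unsigned as well is left open here, as it is in the message.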

More importantly, how do we trap these sort of things in the general case?

I wonder how hard it would be to make a --fsummon-nasal-demons flag for gcc
that added trap code for all classes of undefined behaviour, and caused
code to abort (or something more colourfully undefined) if anything
undefined gets executed. I realise that code would run very slowly, but it
would be a very very useful debugging tool.

Nicholas Clark


Re: stabs support

2003-02-22 Thread Leopold Toetsch
Steve Fink wrote:

First -- wow, thanks! I tried out the stabs stuff for the JIT
yesterday, and it's really helpful to be able to step through PASM
code from within emacs's gud mode.


Thank you for liking it :)


I had one problem, though -- whenever stepping over a keyed op (eg
set I0, P0[3]), gdb fails to recognize that it reaches any more
lines and instead runs the whole thing to completion and exits. 


I could never figure out, when gdb just continues.

... But I remember
noticing that gdb prints out the current PASM line number after that
second 'si' on the keyed op.


Looks then like a gdb error to me.


In my local copy (currently locked away on my home hard drive, so I
can't post it from here at work), I also added stabs entries for all
the PMC registers (in addition to the current S, I, and N registers.)
You can see the PMC's data fields and type. It looks something like:
  (gdb) p P0
(PMC*) 0xdeadbeef
  (gdb) p *P0
{ vtable = 0xdeadbeef, pobj = { u = { int_val = 17, pmc_val = 0x17 }, flags = 381741 } }
  (gdb) p *P0->vtable
{ base_type = PerlArray }
(I added an enumeration for the PMC types). 


Wow, fine, fine.


And I'd just like to say that stabs is a mess. Is DWARF2 any better?


Yep - Ought to be, but I didn't have a look at it.

leo




Re: Using imcc as JIT optimizer

2003-02-22 Thread Gopal V
If memory serves me right, Leopold Toetsch wrote:
  I'm assuming that the temporaries are the things being moved around here ?.
 
 
 It is not so much a matter of moving things around, but a matter of 
 allocating (and renumbering) parrot (or for JIT) processor registers. 

Ok .. well I sort of understood that the first N registers will be the
ones MAPped ?. So I thought re-ordering/sorting was the operation performed.

Direct hardware maps (like using CX for loop count etc) will need to be
platform dependent ?. Or you could have a fixed reg that can be used for
loop count (and gets mapped on hardware appropriately). 

  does it. But that sounds like a lot of work identifying the loops and
  optimising accordingly.

 Loop info
 -
 loop 0,  depth 1, size 2, entry 0, contains blocks:
 1 2

Hmm.. this is what I said sounds like a lot of work ... which still 
remains true from my perspective :-)

   r->score = r->use_count + (r->lhs_use_count << 2);
 
 r->score += 1 << (loop_depth * 3);

Ok ... deeper the loop the more important the var is .. cool.

Gopal
-- 
The difference between insanity and genius is measured by success


Re: invoke

2003-02-22 Thread Leopold Toetsch
Steve Fink wrote:

The invoke op is bothering me -- namely, it disturbs me that it
implicitly operates on P0. I know that P0 is the correct register to
use according to pdd03, but I dislike having it be implicit. The user
is required to set the rest of the pdd03 conventions up manually, so I
don't see any need for invoke to be different. And it makes it much
more clear what registers are being used if you have to pass in a PMC
as an argument.


Sean O'Rourke proposed a long time ago that we should have 
invoke_p too. The current invoke is fine for pdd03 only (where 
(almost) all other P registers might be parameters), but for calling 
Subs, compiled code, coroutines and so on, there is really no need to 
not be able to select the object which should get invoke'd.


So would anyone mind if I eliminated the zero-arg invoke op in favor
of a one-arg invoke that takes a single PMC? (I may also have
situations where I don't need to follow pdd03, and it would be more
convenient to use a different register.)
Yep. At least add invoke Px.

leo



Re: Using imcc as JIT optimizer

2003-02-22 Thread Leopold Toetsch
Gopal V wrote:

If memory serves me right, Leopold Toetsch wrote:


Ok .. well I sort of understood that the first N registers will be the
ones MAPped ?. So I thought re-ordering/sorting was the operation performed.


Yep. Register renumbering, so that the top N used (in terms of score) 
registers are I0, I1, ..In-1
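
A rough C sketch of that renumbering step: sort the registers by descending score and hand out new numbers in that order, so the hottest ones land in the low-numbered slots that the JIT maps to hardware. `RegInfo` and `renumber_by_score` are hypothetical names for illustration and do not come from the imcc source.

```c
#include <stdlib.h>

/* Hypothetical record for one register before and after renumbering. */
typedef struct RegInfo {
    int old_num;  /* register number before renumbering */
    int score;    /* usage score, higher means hotter */
    int new_num;  /* filled in by renumber_by_score */
} RegInfo;

/* Order by descending score; ties keep the lower old number first so
 * the result is deterministic. */
static int by_score_desc(const void *a, const void *b)
{
    const RegInfo *ra = a, *rb = b;
    if (ra->score != rb->score)
        return (ra->score < rb->score) - (ra->score > rb->score);
    return (ra->old_num > rb->old_num) - (ra->old_num < rb->old_num);
}

/* After the call, regs[0] is the highest-scoring register and
 * regs[i].new_num == i. */
void renumber_by_score(RegInfo *regs, size_t n)
{
    qsort(regs, n, sizeof regs[0], by_score_desc);
    for (size_t i = 0; i < n; i++)
        regs[i].new_num = (int)i;
}
```

For example, registers 10, 3 and 7 with scores 5, 40 and 12 would be renumbered so that old register 3 becomes 0, 7 becomes 1 and 10 becomes 2.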


Direct hardware maps (like using CX for loop count etc) will need to be
platform dependent ?. Or you could have a fixed reg that can be used for
loop count (and gets mapped on hardware appropriately). 


We currently don't have special registers, like %ecx for loops, they are 
not used in JIT either. My Pentium manual states, that these ops are not 
the fastest.
But in the long run, we should have some hints, that e.g. i386 needs 
%ecx as shift count, or that div uses %edx. But probably i386 is the 
only weird architecture with such ugly restrictions - and with far too 
few registers.


Loop info

Hmm.. this is what I said sounds like a lot of work ... which still 
remains true from my perspective :-)


There is still a lot of work, yes, but some things already are done:

	set I10, 10
x: 
if I10, ok
	branch y
ok: 
set I0, 1
	sub I10, I10, I0
	print I10
	print "\n"
	branch x
y:
	end

Ends up (with imcc -O2p) as:

set I0, 10
set I1, 1
x:
unless I0, y
sub I0, I1
print I0
print "\n"
branch x
y:
end
You can see:

opt1 sub I10, I10, I0 = sub I10, I0
if_branch if ... ok
label ok deleted
found invariant set I0, 1
inserting it in blk 0 after set I10, 10 

The latter one is working out from the most inner loop.


Gopal
leo



Re: Objects, methods, attributes, properties, and other related frobnitzes

2003-02-22 Thread Benjamin Goldberg
Graham Barr wrote:
 
 On Fri, Feb 21, 2003 at 04:34:42PM -0500, Dan Sugalski wrote:
  If A isa B, we certainly wouldn't want to call A's AUTOLOAD on a
  method before we looked to see if B had a concrete instance of that
  method.
 
 Right. The best you could probably do is note where you found the
 first AUTOLOAD so that when you do reach the end of the ISA search you
 don't need to do the whole search again.

Unless we changed the language in such a way that we could *tell*
whether or not we should try calling A's AUTOLOAD.

Currently, in perl5, if you have "package A; sub foo;", then the method
search will stop in A and call A's autoload, since it *knows* that A has
an appropriate method.

Obviously, we don't really want to force our users to stub every method
(though this would be *one* way of avoiding the need for a second pass
for AUTOLOADs)...  If the language had an AUTOPROTO/AUTOSTUB of some
sort, we could call it and find out where the hierarchy search should
stop in that class.

 But is this programming for the common case ? or is it premature
 optimization.

Well, first ask what are the common cases for autoloading in perl5...

I think that the *most* common case is AutoLoader/SelfLoader.

If Devel::SelfStubber is used with either of those, then that stops the
hierarchy search for methods in the right place, not needing a second
pass.

There are also definitions of XS constants ... though these are not, in
general, used as methods, so I suppose we can ignore them for now.

And finally, there are object property accessors, so that one can write
$c = $obj->color, instead of $c = $obj->{color}.  These are sometimes
done in AUTOLOAD... But stubs are rarely, if ever, provided for them, so
calling this type of method almost always requires two passes through
the inheritance hierarchy.

There's also another case that's not-so-common, but mainly due to the
difficulties of doing it right in perl5.  You've suggested keeping track
of where we found the *first* AUTOLOAD ... but what happens if we want
to inherit from *two* classes with AUTOLOAD methods?  In perl5, you'd
have to use NEXT.pm, which, imho, is fairly ugly internally, and not
especially efficient (plus it's not in the core).  Perl6 should have a
built-in mechanism to allow an AUTOLOAD method to either make a call to
the next AUTOLOAD, a la NEXT.pm, (this might be fairly expensive), or
throw an exception saying that that particular method isn't supplied by
this AUTOLOAD, and have the search continue (possibly much less
expensive).

-- 
$;=qq qJ,krleahciPhueerarsintoitq;sub __{0 
my$__;s ee substr$;,$,++$__%$,--,1,qq;;;ee;
$__2__}$,=22+$;=~y yiy y;__ while$;;print


Re: non-inline text in parrot assembly?

2003-02-22 Thread Tupshin Harper
Leopold Toetsch wrote:

You can use the .constant (PASM) or .const (IMCC) syntax, to keep 
strings visually together.

leo

Thanks. Apparently I'm being daft. I don't see any mention of pasm 
sections(constant or otherwise) in the pod docs, nor do any of the 
examples appear to use a constants section.  What am I missing?

-Tupshin



Re: Using imcc as JIT optimizer

2003-02-22 Thread Leopold Toetsch
Nicholas Clark wrote:

  r->score += 1 << (loop_depth * 3);

until variables in 11 deep loops go undefined?


Not undefined, but spilled. First, *oops*; second, of course none of this
is final. I have changed the scoring several times from the code base that,
AFAIK, Angel Faus implemented. And we don't currently have any code that
comes near the complexity of such deeply nested loops.

There are probably a *lot* of such gotchas in the whole CFG code in 
imcc. I'm currently working on some perl6 tests that fail when
optimization is used; all are regex tests, which do a lot of branching.


I'm not sure how to patch this specific instance - just trap loop depths over
10? Should score be unsigned? 


A linear counting of loop_depth will do it, e.g.

  r->score += 100 * loop_depth;

Or always score variables in more deeply nested loops higher than those
outside, or ...
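
A small sketch of why the shifted score breaks down and what the linear
variant looks like. The field names score, use_count and lhs_use_count come
from the snippets quoted in this thread; the Reg type and both helper
functions are made up for illustration and are not imcc's code.

```c
#include <limits.h>

/* Hypothetical stand-in for imcc's register descriptor. */
typedef struct {
    int use_count;
    int lhs_use_count;
    int score;
} Reg;

/* 1 << (loop_depth * 3) on a signed int is only defined while the
 * result is representable, i.e. while the shift count is at most
 * width - 2. With a 32-bit int that means loop_depth <= 10. */
int shift_is_defined(int loop_depth)
{
    int width = (int)(sizeof(int) * CHAR_BIT);
    return loop_depth >= 0 && loop_depth <= (width - 2) / 3;
}

/* Linear scoring: grows with nesting depth but cannot run past the
 * width of an int for any realistic depth. */
void score_linear(Reg *r, int loop_depth)
{
    r->score  = r->use_count + (r->lhs_use_count << 2);
    r->score += 100 * loop_depth;
}
```

With a 32-bit int, depth 10 is the last depth for which the shifted form is
well defined, which matches the "11 deep loops" remark above.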


More importantly, how do we trap these sort of things in the general case?


With a lot of tests.


I wonder how hard it would be to make a --fsummon-nasal-demons flag for gcc
that added trap code for all classes of undefined behaviour, and caused
code to abort (or something more colourfully undefined) if anything
undefined gets executed. I realise that code would run very slowly, but it
would be a very very useful debugging tool.


I'm currently adding asserts to e.g. loop detection code. Last one (to 
be checked in) is:

/* we could also take the depth of the first contained
 * block, but below is a check, that an inner loop is fully
 * contained in an outer loop
 */
This checks that all blocks of a more deeply nested loop are entirely
contained in the outer loop, so that no basic blocks can lie outside it.
But in regex code this seems not to be true - or a prior optimization
stage messes things up.

These issues are hard to debug, as they are deeply buried in ~400 basic
blocks with ~1000 edges connecting those.
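
The containment property described above can be written down as a small
check. This is only a sketch under stated assumptions: the Loop record is
hypothetical, and basic blocks are represented as bits in a 64-bit mask,
which is far too small for the ~400 blocks mentioned here and is not the
set representation imcc uses.

```c
#include <stdint.h>

/* Hypothetical loop record: bit i of `blocks` is set when basic
 * block i belongs to the loop. */
typedef struct {
    int      depth;
    uint64_t blocks;
} Loop;

/* An inner loop is fully contained if it has no block outside the outer. */
static int loop_contains(const Loop *outer, const Loop *inner)
{
    return (inner->blocks & ~outer->blocks) == 0;
}

/* Count pairs where a more deeply nested loop shares blocks with a
 * shallower loop but is not fully contained in it. */
int count_nesting_violations(const Loop *loops, int n)
{
    int i, j, bad = 0;
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            if (loops[j].depth > loops[i].depth &&
                (loops[i].blocks & loops[j].blocks) != 0 &&
                !loop_contains(&loops[i], &loops[j]))
                bad++;
    return bad;
}
```

In debugging builds this could sit behind an assert, e.g.
assert(count_nesting_violations(loops, n) == 0), so that a broken loop
structure is reported where it is detected rather than much later.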

perl6 $ ../imcc/imcc -O1 -d70 t/rx/basic_2.imc 2>&1 | less


Nicholas Clark


leo





Re: non-inline text in parrot assembly?

2003-02-22 Thread Leopold Toetsch
Tupshin Harper wrote:

Leopold Toetsch wrote:

You can use the .constant (PASM) or .const (IMCC) syntax, to keep 
strings visually together.


Thanks. Apparently I'm being daft. I don't see any mention of pasm 
sections(constant or otherwise) in the pod docs, nor do any of the 
examples appear to use a constants section.  What am I missing?


Sorry, nothing.
There are, IIRC, only 3 tests in parrot and 3 in imcc using these features.
$ perldoc assemble.pl
$ perldoc languages/imcc/docs/syntax.pod
$ perldoc languages/imcc/docs/macros.pod
But they are not very well covered in the main docs.

Additionally, string (and key and float) constants are a distinct 
section in PBC, only the assembler doesn't care - or OTOH there is now 
syntax to reference a string constant directly. This is all done via the 
constant table.

-Tupshin


leo



Re: Using imcc as JIT optimizer

2003-02-22 Thread Nicholas Clark
On Sat, Feb 22, 2003 at 08:44:12PM -, Rafael Garcia-Suarez wrote:
 Nicholas Clark wrote in perl.perl6.internals :
  
 r->score = r->use_count + (r->lhs_use_count << 2);
   
  r->score += 1 << (loop_depth * 3);
 [...]
  I wonder how hard it would be to make a --fsummon-nasal-demons flag for gcc
  that added trap code for all classes of undefined behaviour, and caused
  code to abort (or something more colourfully undefined) if anything
  undefined gets executed. I realise that code would run very slowly, but it
  would be a very very useful debugging tool.
 
 What undefined behaviour are you referring to exactly ? the shift
 overrun ? AFAIK it's very predictable (given one int size). Cases of

Will you accept a shortcut written in perl? The shift op uses C signed
integers:

$ perl -MConfig -le 'print foreach ($^O, $Config{byteorder},  1 << 32)'
linux
1234
0

vs

$ perl -MConfig -le 'print foreach ($^O, $Config{byteorder},  1 << 32)'
linux
1234
1

$ perl -MConfig -le 'print foreach ($^O, $Config{byteorder},  1 << 32)'
linux
4321
1

vs

$ perl -MConfig -le 'print foreach ($^O, $Config{byteorder},  1 << 32)'
linux
4321
0

(All 4 are Debian GNU/Linux.
Both architectures that give 0 for a shift of 32 happen to give 1 for
a shift of 256, but I wouldn't count on that for all architectures.)

 potential undefined behavior can usually be detected at compile-time. I

In this specific case, maybe. In the general case, no: signed integer
arithmetic overflowing is undefined behaviour.
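
Short of a compiler flag, the shift case at least can be trapped by hand. A
minimal sketch in C follows; checked_shl() is a name invented for this
example, not an existing function. It refuses any shift count that is not
smaller than the width of the type, which is the case C leaves undefined
and which produced the differing results above.

```c
#include <limits.h>
#include <stdbool.h>

/* Shift `value` left by `count` bits. Returns false, leaving *out
 * untouched, when the count is not smaller than the width of
 * unsigned int, since C does not define the result of such a shift. */
bool checked_shl(unsigned int value, unsigned int count, unsigned int *out)
{
    if (count >= sizeof(unsigned int) * CHAR_BIT)
        return false;
    *out = value << count;
    return true;
}
```

A caller can then decide whether an out-of-range count should abort, be
clamped, or yield 0, instead of inheriting whatever the CPU happens to do.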

 imagine that shift overrun detection can be enabled via an ugly macro
 and a cpp symbol.
 
 (what's a nasal demon ? can't find the nasald(8) manpage)

Demons flying out of your nose. One alleged consequence of undefined
behaviour. Another is your computer turning into a butterfly. I guess a
third is Microsoft releasing a bug-free program.

Nicholas Clark


Re: Using imcc as JIT optimizer

2003-02-22 Thread Nicholas Clark
On Sat, Feb 22, 2003 at 09:27:04PM +, nick wrote:
 On Sat, Feb 22, 2003 at 08:44:12PM -, Rafael Garcia-Suarez wrote:

  What undefined behaviour are you referring to exactly ? the shift
  overrun ? AFAIK it's very predictable (given one int size). Cases of
 
 Will you accept a shortcut written in perl? The shift op uses C signed
 integers:

Oops. The logical shift uses *un*signed integers, except under use integer:

$ perl -MConfig -le 'use integer; print foreach ($^O, $Config{byteorder},  1 << 32)'
linux
1234
0

$ perl -MConfig -le 'use integer; print foreach ($^O, $Config{byteorder},  1 << 32)'
linux
1234
1

$ perl -MConfig -le 'use integer; print foreach ($^O, $Config{byteorder},  1 << 32)'
linux
4321
0

$ perl -MConfig -le 'use integer; print foreach ($^O, $Config{byteorder},  1 << 32)'
linux
4321
1

So there's actually no difference in the numbers. But as I'm being a pedant I
ought to get the facts right. [I guess it's my fault for drinking Australian
wine :-)]

Nicholas Clark


Re: non-inline text in parrot assembly?

2003-02-22 Thread gregor
Tupshin --

Parrot Byte Code (.pbc) files (aka packfiles) have multiple sections, but
Parrot Assembly (.pasm) files do not reference them explicitly. Literal
constants are *implicitly* placed in the constant section of the .pbc file
upon assembly. The .constant or .const directives allow you to name your
constants, but the net result is equivalent.


Regards,

-- Gregor





Tupshin Harper [EMAIL PROTECTED]
02/22/2003 02:31 PM

 
To: Leopold Toetsch [EMAIL PROTECTED]
cc: [EMAIL PROTECTED]
Subject:Re: non-inline text in parrot assembly?


Leopold Toetsch wrote:

 You can use the .constant (PASM) or .const (IMCC) syntax, to keep 
 strings visually together.


 leo

Thanks. Apparently I'm being daft. I don't see any mention of pasm 
sections(constant or otherwise) in the pod docs, nor do any of the 
examples appear to use a constants section.  What am I missing?

-Tupshin






Re: invoke

2003-02-22 Thread Steve Fink
On Feb-22, Leopold Toetsch wrote:
 Steve Fink wrote:

 So would anyone mind if I eliminated the zero-arg invoke op in favor
 of a one-arg invoke that takes a single PMC? (I may also have
 situations where I don't need to follow pdd03, and it would be more
 convenient to use a different register.)
 
 Yep. At least add B<invoke Px>.

Ok, done.


Re: Objects, methods, attributes, properties, and other related frobnitzes

2003-02-22 Thread Benjamin Goldberg
Dan Sugalski wrote:
 Benjamin Goldberg wrote:
 Graham Barr wrote:
 Dan Sugalski wrote:
 If A isa B, we certainly wouldn't want to call A's AUTOLOAD on a
 method before we looked to see if B had a concrete instance of that
 method.

 Right. The best you could probably do is note where you found the
 first AUTOLOAD so that when you do reach the end of the ISA search
 you don't need to do the whole search again.

 Unless we changed the language in such a way that we could *tell*
 whether or not we should try calling A's AUTOLOAD.
 
 Given that we have to run perl 5 code with the same expressed
 semantics as perl 5, and also are going to run python and ruby code
 properly, this isn't a tenable option.
 
 We're the implementors. While we can complain about the semantics we
 have to express, we don't get to not express them.

Nothing says that we can't have a different semantic for each language
we're running.

When running perl5 code, we could fetch methods and perform method
caching one way, and when running perl6 code, we could fetch methods and
perform method caching a different way... and possibly a different
technique for each of python and tcl.

I was going to say that I probably ought to write my idea up in an RFC,
and see how people react, and get Larry's approval... but, I discovered
that someone else thought of this idea before me, and wrote it up!

   http://dev.perl.org/rfc/232.pod

-- 
$;=qq qJ,krleahciPhueerarsintoitq;sub __{0 
my$__;s ee substr$;,$,++$__%$,--,1,qq;;;ee;
$__2__}$,=22+$;=~y yiy y;__ while$;;print


Re: Objects, methods, attributes, properties, and other related frobnitzes

2003-02-22 Thread Dan Sugalski
At 7:56 PM -0500 2/22/03, Benjamin Goldberg wrote:
Dan Sugalski wrote:
 Benjamin Goldberg wrote:
 Graham Barr wrote:
 Dan Sugalski wrote:
 If A isa B, we certainly wouldn't want to call A's AUTOLOAD on a
 method before we looked to see if B had a concrete instance of that
 method.
 Right. The best you could probably do is note where you found the
 first AUTOLOAD so that when you do reach the end of the ISA search
 you don't need to do the whole search again.
 Unless we changed the language in such a way that we could *tell*
 whether or not we should try calling A's AUTOLOAD.
 Given that we have to run perl 5 code with the same expressed
 semantics as perl 5, and also are going to run python and ruby code
 properly, this isn't a tenable option.
 We're the implementors. While we can complain about the semantics we
 have to express, we don't get to not express them.
Nothing says that we can't have a different semantic for each language
we're running.
Well, almost nothing. Nothing much besides me, at least.

This isn't the place to ponder alternate semantics for existing or 
proposed languages. That's what the language lists are for. If you 
want perl 6 to behave in some particular way, go to perl6-language or 
petition Larry. (I'd not suggest bringing it up on Python-dev, but if 
you want to brave it, well, good luck)
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: [RFC] imcc calling conventions

2003-02-22 Thread Benjamin Goldberg
Dan Sugalski wrote:
 
 At 9:55 AM -0800 2/20/03, Steve Fink wrote:
[snip]
 Or, as another stab at the same problem, does Parrot really need 32*4
 registers? I keep thinking we might be better off with 16 of each
 type. But maybe I'm just grumbling.
 
 Yeah, 32 is a bunch. I've considered going with 16 on and off, and
 still might.

Given that registers are allocated with the lower numbers being the ones
used more often, how about having 32 registers, as we now have, but two
different ops for saving -- one of which saves registers 0 .. 15, the
other saves all 0 .. 31.  Or is this just a dumb idea?
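
As a rough sketch of what such a pair of ops would boil down to: one save
routine with a count, called with 16 or 32. The Frame record and both
functions are hypothetical and only model a single register type as an
array of long; this is not Parrot's register stack code.

```c
#include <string.h>

#define NUM_REGS 32

/* Hypothetical stack frame holding the low `count` registers. */
typedef struct {
    long saved[NUM_REGS];
    int  count;
} Frame;

/* Save registers 0 .. count-1 (count would be 16 or 32). */
void save_regs(Frame *f, const long *regs, int count)
{
    f->count = count;
    memcpy(f->saved, regs, (size_t)count * sizeof regs[0]);
}

/* Restore exactly the registers that were saved into the frame. */
void restore_regs(const Frame *f, long *regs)
{
    memcpy(regs, f->saved, (size_t)f->count * sizeof regs[0]);
}
```

Saving 16 registers copies half as much memory, but anything in registers
16 .. 31 is then not preserved across the call, so the register allocator
would have to know which variant the caller used.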

-- 
$;=qq qJ,krleahciPhueerarsintoitq;sub __{0 
my$__;s ee substr$;,$,++$__%$,--,1,qq;;;ee;
$__2__}$,=22+$;=~y yiy y;__ while$;;print


access to partial registers?

2003-02-22 Thread Tupshin Harper
Sorry for all the questions...these are the trials and tribulations of 
dealing with a newbie trying to get up to speed with the current state 
of parrot. So here's another question:

Is it possible and/or meaningful to read from and write to part of a 
register (e.g. a single word) in pasm?

As with my previous questions, I'm not really interested in pbc 
issues/format (with exceptions, of course), just in learning the
intricacies of pasm.

-Tupshin