Zcode interpreter release

2005-08-24 Thread Amir Karger
Hi!

I adopted the Zcode interpreter that leo posted in February. Luckily,
he did the hard parts.

I added some opcodes, README/CHANGES/TODO files, and testing, and,
finally, did my first Parrot checkin. How exciting!

There's much much more to do - 41 more opcodes just for version 3 of
the Z-machine. And then of course there's Dan's Holy Grail of making
the Zops into native parrot ops...

The only bad news is there's something wrong with my make test.
The following work:
- perl t/harness in languages/Zcode
- perl t/whatever.t in languages/Zcode
- make test in languages/Zcode
- make Zcode.test in languages/

However, I couldn't put Zcode into languages/testall.  It breaks when
it tries to run my tests. I managed to narrow this down to a very
weird Perl behavior I don't understand at all:

testportal:~> mkdir z
testportal:~> cd z
testportal:~/z> mkdir y
testportal:~/z> touch y/foo
testportal:~/z> perl -we 'system("cd y")'
Can't exec "cd": No such file or directory at -e line 1.
testportal:~/z> perl -we 'system("cd y && ls")'
foo

Because of the above, my tests break when they do:
 run_command("$parrot z3.pir $test_file", CD => "Zcode")
 (I need to cd into Zcode so that I can run z3.pir and find the z*.pir
files it depends on.)

Any thoughts on why this is happening?

-Amir Karger


Re: Demagicalizing pairs

2005-08-24 Thread Damian Conway

Larry wrote:


Plus I still think it's a really bad idea to allow intermixing of
positionals and named.  We could allow named at the beginning or end
but still keep a constraint that all positionals must occur together
in one zone.


If losing the magic from =>'d pairs isn't buying us named args wherever we 
like, why are we contemplating it?




I suspect a lot of people would still prefer to write named args with =>,


I'd say so.



so we should put some thought into making it syntactically trivial, if
not automatic like it is now.   Even making named() a listop would help.


I'd say that's the best alternative. I'd certainly prefer that to repurposing 
:(...)




I hate to say it, but the named args should probably be marked
with : instead of + in the signature.

One other idle thought is that, if we don't mind blowing a different
kind of consistency, and if we s/+/:/ in sigs, a sig containing
:$foo could instead be written $:foo (presuming we take : away from
privates as we've postulated),


Yes please. Underscore is much better in that role.

Damian



Re: Demagicalizing pairs

2005-08-24 Thread Damian Conway

Larry mused:

On the other hand, I'm not all that attached to colon itself. 


I *am*!!!



If, as proposed elsewhere, we get rid of the %Foo:: notation in favor of
some Foo variant, then trailing :: becomes available (ignoring ??/::
for the moment), and

new Dog:: tail => 'long'

almost makes sense, insofar as it kinda looks like it's marking Dog
as a type name, even though it isn't.  But

new Dog:: :tail<long>

doesn't look so good.


Nor do object methods:

  wag $dog:: 'tail';

  say $fh:: $whatever;




On the other hand, looking at it from the other end, the MMD
tiebreaking notation is a little hard to spot, since colon is easy
to miss.


Is it??? I've been writing quite a bit of MMD notation, and I think the colon 
is very obvious...and exactly the right visual weight.




Maybe there's something that shows up better in a signature
that also works as the invocant marker and, by extension, the indirect
object marker.  Since it's an ordering kind of thing, you'd kind of
like to work > into it somehow, since the left side is of greater
importance than the right.  Unfortunately, though, the good ones are
all taken.  Maybe some digraph like

method new ($what* $:tail) {...}
method new ($what+ $:tail) {...}
method new ($what. $:tail) {...}
method new ($what| $:tail) {...}
method new ($what>> $:tail) {...}

giving

new Dog* :tail<long>
new Dog+ :tail<long>
new Dog. :tail<long>
new Dog| :tail<long>
new Dog>> :tail<long>

I guess that last one is equivalent to:

method new ($what» $:tail) {...}
new Dog» :tail<long>

which I could maybe get used to.  It kind of looks like a prompt to me.


Not one of these is anything close to as readable as:

new Dog: :tail<long>

name $dog: 'Rover';

say $fh:   @whatever;

*Please* don't give up on the colon there. It's much more readable. I 
especially like it for setting up objects:


$person = Contact.new;

 first_name $person: 'George';
family_name $person: 'Bush';
      title $person: 'President';
      email $person: '[EMAIL PROTECTED]';
     spouse $person: search $contacts: 'Laura';



The ordinary MMD might look like

multi foo ($a, $b, $c» $d)

And Lisp-like MMD fallback on every argument would look like

multi foo ($a» $b» $c» $d»)

I suppose that particular use of » could be construed as encouraging
people not to do that.  :-)


I truly believe that using the French quotes or (shudder!) their Texan 
equivalents here would be a dire step backwards. They're already overloaded 
for word lists and hyperoperators. I think using them for an invocant marker 
as well would simply be too much.


The colon really was (and still is) the right choice here.

Damian



Re: ~ and + vs. generic eq

2005-08-24 Thread Damian Conway

Larry wrote:


Or we could have a different operator that coerces like == and eq, only
via .snap:

if [1,2,3] equals [1,2,3] { say "true" } else { say "false" }

(Actual name negotiable, of course).  The advantage of the latter approach
is that you can say

@foo equals @bar

and the .snaps are automatically distributed.  Otherwise you'd have to say

@foo.snap eqv @bar.snap

which is a pain.  On top of which, equals doesn't actually have to be
implemented in terms of .snap--it could just compare the current
values of the mutable objects directly.  (Just as =:= doesn't have
to be implemented in terms of .id.)


Just a meta-point...one thing we really do need to be careful of is not ending 
up with 17 different equality operators (like certain languages I shall 
refrain from naming). So far we're contemplating:


=:=
~~
==
eq
eqv
equals

Do we really need even that many???

Damian



Re: ~ and + vs. generic eq

2005-08-24 Thread Yuval Kogman
On Tue, Aug 23, 2005 at 16:32:37 -0700, Larry Wall wrote:
 Hmm, well, I don't think «op» is valid syntax, but you did say
 semantics, so I can't criticize that part.  :-)

What is «op», btw?

Is it

circumfix:{'«','»'} (Code &op --> Code); # takes some code, returns 
a listop

or

precircumfix:{'«','»'} (Code &op, [EMAIL PROTECTED] --> List);

 I don't know how close ~~ and eqv will end up.  There are some
 differences in emphasis, and when two operators get too much like each
 other, I tend to add more differences to make them inhabit different
 parts of the solution space.  One current difference is that, despite
 the symmetry of ~~, it's not actually a symmetrical operator much of
 the time, such as when matching values to rules.  ~~ is intended to
 be heavily dwimmical, so it's allowed to do various kinds of abstract
 coercions to figure out some mystical good enough quotient.  But eqv
 on the other hand should probably be false in asymmetrical situations.
 The implementation of ~~ may delegate to eqv in certain symmetrical
 situations, of course.

Right... Magic is defined in the base definitions of ~~:

infix:~~ ($x, Rule $r) { ... }
infix:~~ ($x, Code &test) { &test($x) }

And so on and so forth, and then it is extended by the extender to
make cool aggregate operations, but even this doesn't have to be the
same for ~~ and eqv, it's just that eqv should have good builtins
for collections, is all.

 should say true, since those are the same values, and they can't
 change.  However, in Perl-5-Think, [1,2,3] produces mutable arrays,
 so unless we come up with some kind of fancy COW for [1,2,3] to be
 considered immutable until someone, er, mutes it, I think eqv would
 have to return false, and consider two such objects to be different
 values (potentially different if not actually different).

Well, when I use it as

if (@array eqv [1, 2, 3]) {

}

I think it's obvious that I'm checking what is the value right
now, and ditto when I say

my $str = "foo";
$hash{$str} = 1;
$str ~= "bar";
$hash{$str}; # not the same

Arguably use of an array as a hash key is using a reference to a
container, and use of a scalar as a hash key is using the value
inside a container, so in a sense the hash key didn't change when I
appended the string, but this distinction is subtle and mostly an
implementation detail.
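
For comparison, Python draws the same value-versus-container line
explicitly; a small illustrative snippet, not from the thread:

    h = {}
    s = "foo"
    h[s] = 1
    s += "bar"             # rebinds s; the "foo" entry stays put
    assert "foo" in h and s not in h
    try:
        h[[1, 2, 3]] = 1   # a mutable container is refused as a key
    except TypeError:
        pass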


 It would be possible to declare %hash some way that forces a snapshot
 via some kind of serialization or other, but then it gets hard to keep
 the identity around.  Then the question arises how we doctor
 
 if [1,2,3] eqv [1,2,3] { say "true" } else { say "false" }

eqvs! it pronounces very well: 'eekyoovies'. Try it:

if «1 2 3» eekyooviesses «1 2 3» say "true", otherwise say "false"

It's fun, but I hate it even more than eqv ;-)

 to do the same snapshot comparison.  Arguably ~~ could do it, since it's
 explicitly *not* about identity.
 
 if [1,2,3] ~~ [1,2,3] { say "true" } else { say "false" }

But ~~ is not "is the same", it's "matches". This is also true, and
not what I want eqv for:

if [1, 2, 3] ~~ [code { 1 }, rx/\d+/, Num] { say "true" } else { say 
"false" }

 Or we could have a more explicit way of doing whatever it is that the
 snapshot hash does to each argument.
 
 if [1,2,3].snap eqv [1,2,3].snap { say "true" } else { say "false" }

I think the opposite is better, make snapshotting by default, and
mutable value equality false by saying

[1, 2, 3] eqv [1, 2, 3] :always # or :forever

-- 
 ()  Yuval Kogman [EMAIL PROTECTED] 0xEBD27418  perl hacker 
 /\  kung foo master: *shu*rik*en*sh*u*rik*en*s*hur*i*ke*n*: neeyah





Re: Zcode interpreter release

2005-08-24 Thread Anatoly Vorobey
Perl's system() tries to execute an external command directly when it 
doesn't find any shell metacharacters in the command line.

Otherwise, it defaults to the shell.

-- 
avva
There's nothing simply good, nor ill alone -- John Donne



Re: Python PMC's

2005-08-24 Thread Leopold Toetsch

Sam Ruby wrote:


Let me try again to move the discussion from subjective adjectives to
objective code.  Consider:


[ example code ]


If you run this, you will get 1,2,3.

When called as a function, f will return the value of the second
parameter.  When called as a method, the same code will need to return
the value of the first parameter.


The calls to f() work more or less w/o problems in branches/leo-ctx5.


I'm open to suggestions as to what PIR code should be emitted by Pirate
to correspond to the above code.


A stripped down PIR-only, pythonless translation is below.


The way this currently works with Pirate and the existing Py*.pmc
closely follows the Python definition, which you can find here:

http://users.rcn.com/python/download/Descriptor.htm#functions-and-methods


Good explanation, thanks.


To illustrate how this works today, lets consider the call to foo.f(2)
above.   This involves both a find_method and an invoke.  Lets consider
each in turn:

 * pyclass.find_method first calls getprop on f.
 * pyclass.find_method then calls __get__ on the pyfunc returned,
   passing the object on which this method was called on.
 * pyfunc.__get__ creates a new PyBoundMeth object, saving both the
   original function and the object.
 * This PyBoundMeth object is then returned

then:

 * pyboundmeth.invoke shifts all registers right, inserts the original
   object as the first parameter, and then calls invoke on the original
   function.
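
For reference, the same dance can be acted out in a few lines of plain
CPython; this is only an illustration of the descriptor protocol, not
Parrot code:

    # find_method ~ fetching the function and calling its __get__;
    # invoke ~ calling the resulting bound-method object.
    def f(x, y):
        return y

    class Foo(object):
        f = f                      # plain function as class attribute

    foo = Foo()
    bound = Foo.__dict__['f'].__get__(foo, Foo)   # ~ pyfunc.__get__
    print(bound(2))                # ~ pyboundmeth.invoke: prints 2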

Needless to say, this doesn't necessarily show off Parrot to its best
advantage from a performance perspective.  I tried a number of times to
argue for a combined find_and_invoke VTABLE entry as much of the above
can be optimized if you know that result of the find_method will only be
used a single time for an invoke.  If you like, you can scan the
archives to see what reception this suggestion received.


Well I think that the find_and_invoke is just the callmethodcc opcode as 
used in my translation below. The call to __get__ and the BoundMethod 
shouldn't be necessary, *if* there is no user-defined descriptor. That 
is, getattribute checks whether the attribute provides __get__ and, if 
so, calls it.


The yet unsupported thing is

  g = foo.g
  ...
  g(3)

which really needs a BoundSub object. But this is just a special case of 
currying, or a special case of Perl6's .assuming(). Therefore I think 
that Parrot should support this natively.
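
In Python terms such a BoundSub is just partial application of the
invocant as the first argument; a hedged sketch, not a proposed Parrot
API:

    from functools import partial

    def g(self, y):                # the unbound function
        return y

    class Foo(object):
        pass

    foo = Foo()
    bound_g = partial(g, foo)      # what "g = foo.g" would capture
    print(bound_g(3))              # prints 3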



- Sam Ruby


leo
#
#  def f(x,y):
#    return y

.sub f
    .param pmc x
    .param pmc y
    .return (y)
.end

#
#  class Foo:
.sub create_Foo
    .local pmc self, Foo
    self = subclass "Py", "Foo"
    addattribute self, "f"
    addattribute self, "g"
    Foo = find_global "Foo", "Foo"
    store_global "Foo", Foo
.end

.namespace ["Foo"]

.sub Foo
    .local pmc o
    o = new "Foo"
    .return (o)
.end

#    f = f
#    def g(self,y):
#      return y

.sub __init method
    .local pmc f, g
    f = find_name "f"
    setattribute self, "f", f
    g = find_name "g"
    setattribute self, "g", g
.end

.sub g
    .param pmc self
    .param pmc y
    .return (y)
.end

.namespace []
.sub main @MAIN
    .local pmc foo, g
    init_python()

    create_Foo()

#
#  foo = Foo()
#
    foo = Foo()
#  g = foo.g
#
    g = getattribute foo, "g"
    # TODO create bound Sub inside getattribute like so
    ## $I0 = isa g, "Sub"
    ## unless $I0 goto no_sub
    ## $P0 = new .BoundSub, g
    ## assign $P0, foo
    ## g = $P0
    ## no_sub:

#  print f(0,1)
    $P0 = f(0, 1)
    print_item $P0
    print_newline

#  print foo.f(2)

    # emulate python find_name, which checks attributes too
    push_eh m_nf
    $P0 = foo."f"(2)
    clear_eh
    goto m_f
m_nf:
    # getattribute would also check if __get__ is there
    $P1 = getattribute foo, "f"
    $P0 = foo.$P1(2)
m_f:
    print_item $P0
    print_newline

#  print g(3)
#    $P0 = g(3)
#    print_item $P0
#    print_newline
.end

#

.sub init_python
    .local pmc py, bs
    py = newclass "Py"
.end



Re: Demagicalizing pairs

2005-08-24 Thread John Macdonald
On Wed, Aug 24, 2005 at 04:27:03PM +1000, Damian Conway wrote:
 Larry wrote:
 
 Plus I still think it's a really bad idea to allow intermixing of
 positionals and named.  We could allow named at the beginning or end
 but still keep a constraint that all positionals must occur together
 in one zone.
 
 If losing the magic from =>'d pairs isn't buying us named args wherever we 
 like, why are we contemplating it?

When calling a function, I would like to be able to have a
mixture of named and positional arguments. The named argument
acts as a tab into the argument list and subsequent unnamed
arguments continue on.  That allows you to use a name for a
group of arguments:

move( from => $x, $y, delta => $up, $right );

In this case, there could even be an optional z-coordinate
argument for each of the from and delta groups.

The named group concept works well for interfaces that use the
same groups in many different functions.  It is especially
powerful in languages which do not have structured types,
which means it is not so necessary in Perl, but even here,
you often are computing the components (like $up and $right
above) separately, rather than always computing a single
structured value (which would mean writing delta => (x => $up,
y => $right) instead).

-- 


Perl 6 code - a possible compile, link, run cycle

2005-08-24 Thread Yuval Kogman
WRT to PIL and compilation and all that, I think it's time to think
about how the linker might look.

As I see it the compilation chain with the user typing this in the
prompt:

perl6 foo.pl

perl6 is a compiled perl 6 script that takes an input file, and
compiles it, and then passes the compiled unit to the default
runtime (e.g. parrot).

perl6 creates a new instance of the perl compiler (presumably an
object). The compiler will only compile the actual file 'foo.pl',
and disregard any 'require', 'use', or 'eval' statements.

The compiler produces an object representing a linkable unit, which
will be discussed later.

At this point the runtime kicks in. The runtime really runs compiled
byte code for the runtime linker, which takes the compiled unit that
the compiler emitted and prepares it for execution.

The runtime linker checks if any inclusions of outside code have
been made, and if so, invokes a search routine with the foreign
module plugin responsible. For example 'use python:Numerical' will
use the python module plugin to produce a linkable unit.

A given module will normally traverse a search path, find some
source code, check to see if there is a valid cached version of the
source code, and if needed, recompile the source code into another
linkable unit.

As the linker gets new linkable units it checks if they have any
dependencies of their own, and eventually resolves all the data and
code that modules take from one another.

The resulting mess has only one requirement: that it can be run by
the runtime - that is, byte code can be extracted out of it.
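
As a rough sketch of that resolution loop (all names here are
hypothetical; Python only for concreteness):

    def link(root, load_unit):
        linked, pending = {}, [root]
        while pending:
            unit = pending.pop()
            if unit.name in linked:
                continue                     # already resolved
            linked[unit.name] = unit
            for dep in unit.dependencies():  # 'use', 'require', python:..., etc.
                pending.append(load_unit(dep))
        for unit in linked.values():
            unit.resolve(linked)             # wire up cross-unit code and data
        return linked                        # byte code extractable from these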

If the modules expose more than just byte code with resolved
dependencies in the modules, for example type annotations,
serialized PIL, serialized perl 6 code, and so forth, it may, at
this point, do any amount of static analysis as it pleases,
recompiling, relinking, optimizing, inlining, performing early
resolution (especially of MMD)  and otherwise modifying code
(provided it was asked to do this by the user).

The optimization premise is: by the time it's linked it probably
won't change too much. Link time is a magic time for resolving calls,
inlining values, folding newly discovered constants, and so forth.

Furthermore, a linker may cache the link between two modules,
regardless of the calling script, so that optimization does not have
to be repeated.

The result is still the same: code that can be executed by the
runtime. It just might be more efficient.

The linker also must always be able to get the original version of
the linked byte code back, either by reversing some changes, or
keeping the original.

At this point the runtime's runloop kicks in, starting at the start
point in the byte code, and doing its thing.

Runtime loading of code (e.g. eval 'sub foo { }') simply reiterates
the above procedure:

'sub foo { }' is compiled by the compiler, creating a linkable unit
(that can give 'sub foo { }' to the world). The runtime linker gets
a fault saying byte code state is changing:
$compiled_code_from_eval is being amended to
$linked_byte_code_in_runtime_loop.  The linker must then link the
running code to the result of eval. To do this it may need to undo
its optimizations that assumed there was no sub foo.

For example, if there is a call to 'foo()' somewhere in foo.pl, the
linker may have just inlined a 'die "no such sub foo()"' error
instead of the call. Another linker may have put in code to do a
runtime search for the 'foo' symbol and apply it. The linker that
did a runtime search that may fail doesn't need to change anything,
but the linker which inlined a fatal error must undo that
optimization now that things have changed.

The behavior of the linker WRT such things is dependent on the
deployment setting. In a long running mod_perl application there may
even be a linker that optimizes code as time goes by, slowly
changing things to be more and more static. As the process
progresses through time, the probability of new code being
introduced is lower, so the CPU time is invested better.
Furthermore, latency is not hindered, and startup is fast because
the linker doesn't do any optimizations in the beginning. This is
part of the proposed optimizer chain, as brought up on p6l a month
or so ago.

Anyway, back to runtime linking. Once the code is consistent again,
e.g. calls to foo() will now work as expected, eval gets the
compiled code, and runs it. It just happens that 'sub foo { }' has
no runtime effects, so eval returns, and normal execution is
resumed.

To get the semantics of 'perl -c' you force the linker to resolve
everything, but don't actually go to the runloop.

Linkable units are first class objects, and may be of a different
class. This has merits when, for example, a linkable unit is
implemented by an FFI wrapper. The FFI wrapper determines at link
time what the foreign interface looks like, and then behaves like
the linkable unit one might expect if it were a native interface. It
can generate bytecode to call 

Re: Python PMC's

2005-08-24 Thread Sam Ruby
Leopold Toetsch wrote:
 Sam Ruby wrote:
 
 Let me try again to move the discussion from subjective adjectives to
 objective code.  Consider:
 
 [ example code ]
 
 If you run this, you will get 1,2,3.

 When called as a function, f will return the value of the second
 parameter.  When called as a method, the same code will need to return
 the value of the first parameter.
 
 The calls to f() work more or less w/o problems in branches/leo-ctx5.
 
 I'm open to suggestions as to what PIR code should be emitted by Pirate
 to correspond to the above code.
 
 A stripped down PIR-only, pythonless translation is below.

Thanks for doing this.  Keeping the conversation anchored with code is
very helpful.  I have a number of comments on that code that are
unimportant to the topic at hand (example: classes aren't global in
Python), but for now, I'll focus on comments that are relevant.

 The way this currently works with Pirate and the existing Py*.pmc
 closely follows the Python definition, which you can find here:

 http://users.rcn.com/python/download/Descriptor.htm#functions-and-methods
 
 Good explanation, thanks.
 
 To illustrate how this works today, lets consider the call to foo.f(2)
 above.   This involves both a find_method and an invoke.  Lets consider
 each in turn:

  * pyclass.find_method first calls getprop on f.
  * pyclass.find_method then calls __get__ on the pyfunc returned,
passing the object on which this method was called on.
  * pyfunc.__get__ creates a new PyBoundMeth object, saving both the
original function and the object.
  * This PyBoundMeth object is then returned

 then:

  * pyboundmeth.invoke shifts all registers right, inserts the original
object as the first parameter, and then calls invoke on the original
function.

 Needless to say, this doesn't necessarily show off Parrot to its best
 advantage from a performance perspective.  I tried a number of times to
 argue for a combined find_and_invoke VTABLE entry as much of the above
 can be optimized if you know that result of the find_method will only be
 used a single time for an invoke.  If you like, you can scan the
 archives to see what reception this suggestion received.
 
 Well I think that the find_and_invoke is just the callmethodcc opcode as
 used in my translation below. The call to __get__ and the BoundMethod
 shouldn't be necessary, *if* there is no user-defined descriptor. That
 is, getattribute checks whether the attribute provides __get__ and, if
 so, calls it.

Take a look at the definition of callmethodcc in ops/object.ops.  It is
a find_method followed by an invoke.  It is the VTABLE operations that I
would like to see combined.

Note that Python provides __get__ methods for you on every function and
method, i.e., it is *not* optional, getattribute will always succeed.
Observe:

[EMAIL PROTECTED]:~$ python
Python 2.4.1 (#2, Mar 30 2005, 21:51:10)
[GCC 3.3.5 (Debian 1:3.3.5-8ubuntu2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> def f(): pass
...
>>> f.__get__
<method-wrapper object at 0xb7e0404c>
>>> class c:
...   def m(self): pass
...
>>> c.m.__get__
<method-wrapper object at 0xb7e0440c>


 The yet unsupported thing is
 
   g = foo.g
   ...
   g(3)
 
 which really needs a BoundSub object. But this is just a special case of
 currying, or a special case of Perl6's .assuming(). Therefore I think
 that Parrot should support this natively.

I agree that BoundSub is a special case of currying, or what Perl6 calls
assuming.  I also agree that Parrot should support this natively.

 - Sam Ruby

[snip]

Below is from the sample that Leo provided.

 #  print foo.f(2)
 
 # emulate python find_name, which checks attributes too
 push_eh m_nf
 $P0 = foo."f"(2)
 clear_eh
 goto m_f
 m_nf:
 # getattribute would also check if __get__ is there
 $P1 = getattribute foo, "f"
 $P0 = foo.$P1(2)
 m_f:
 print_item $P0
 print_newline

Note that the code above would need to be emitted for *every* method
call, as there is no way of knowing at compile time what property you
will get and how it is defined.  In addition to increasing code size, it
is error prone: for example, the above only appears to check for __get__
if the call fails, so the pie-thon b0 test will fail as hooking __get__
is integral to the way it provides a trace.  Also, there are other
reasons that exceptions may be thrown, and we are incurring the overhead
of setting up exception blocks, etc.  Again, this would apply to
*every* method call.

An alternate solution without these problems would be

 #  print foo.f(2)
 $P0 = foo."f"(2)
 print_item $P0
 print_newline

Ideally, callmethodcc would result in one vtable call to enable
interesting optimizations for those that care to provide it.  The
default implementation for this vtable entry could be a call to
find_method and invoke.

- Sam Ruby


Re: Demagicalizing pairs

2005-08-24 Thread Paul Seamons
I don't think this example reads very clearly.  Visually you have to parse 
until you see the next => and then backtrack one word to figure out the key.

 move( from => $x, $y, delta => $up, $right );

Personally I'd write that as either

  move(from => [$x, $y], delta => [$up, $right]);

OR, assuming I have a Position object and a Vector object,

  move(from => $pos1, delta => $vec1);

The original example just seems difficult to parse.

Paul


Re: [pirate] Re: Python PMC's

2005-08-24 Thread Kevin Tew
I agree the following would be cool.
However, in the general case this type of code inference is HARD to do.
I believe that the optimizations you are looking for would require a
combination of type inference and graph reduction.
PyPy may be the eventual answer.
Don't get me wrong, I think it is great and the long term goal.
As soon as we compile and run correctly, I'm all about optimizations.

Good insight, Michal.
Kevin

Michal Wallace wrote:
 Hey Sam,
I agree with what you're saying in this
thread. This is slightly off topic, but
I wanted to point something out.

In general, python has so many places
where things have to be dynamic that you 
really can't know this kind of thing at 
compile time, especially if you allow for 
eval/exec, or if you allow the code to be 
used as a module. 

However if you treat the code as a closed 
system, *and* you have access to python at 
compile time, then we can optimize away a 
lot of these questions.

For example, in your original code:

  def f(x,y):
return y

  class Foo:
f = f
def g(self,y):
  return y

  foo = Foo()

  g=foo.g

  print f(0,1)
  print foo.f(2)
  print g(3)


Once you know how python works, it's *obvious* 
that this prints 1,2,3. I see no reason why the 
compiler couldn't figure this out up front just 
by walking the tree.

In fact, a good optimizing compiler would see the
return y lines and just get rid of those methods
completely. 

I'd like to allow for the ability to do certain 
optimizations like this up front, sacrificing 
flexibility for speed. I know there are many
programs that won't allow for this, but for the
ones that do, I'd like to be able to do a sort
of static compile like this. 

In other words, sometimes a python-like language
is a desirable thing. (But of course this should
all be optional so that we can also be 100%
python compatible)

Sincerely,
 
Michal J Wallace
Sabren Enterprises, Inc.
-
contact: [EMAIL PROTECTED]
hosting: http://www.cornerhost.com/
my site: http://www.withoutane.com/
-

___
pirate mailing list
[EMAIL PROTECTED]
http://cornerhost.com/mailman/listinfo/pirate
  



Re: Python PMC's

2005-08-24 Thread Leopold Toetsch

Sam Ruby wrote:

Leopold Toetsch wrote:



A stripped down PIR-only, pythonless translation is below.



  (example: classes aren't global in Python),


Yes, of course. The stripping down essentially means the absence of any 
lexical handling. But as you say, this doesn't matter for these sub and 
method calls.



Note that Python provides __get__ methods for you on every function and
method, i.e., it is *not* optional, getattribute will always succeed.


That's fine. But if it's not overridden by user code, you know exactly 
what it is doing. Therefore you can just emulate it, I think.



#  print foo.f(2)

   # emulate python find_name, which checks attributes too
   push_eh m_nf
   $P0 = foo."f"(2)
   clear_eh
   goto m_f
m_nf:
   # getattribute would also check if __get__ is there
   $P1 = getattribute foo, "f"
   $P0 = foo.$P1(2)
m_f:



Note that the code above would need to be emitted for *every* method
call, 


Above snippet should not be emitted for a method call, it just emulates 
it in PIR. It should demonstrate how Python's find_method() could be 
implemented:


- try to find a method, else
- check attributes
- call __get__, if it is user provided

(or whatever order CPython actually uses).

The return value is a callable sub.
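
A rough Python rendition of that order (only the shape of it; the real
rules live in pyclass.pmc and CPython):

    def find_method(obj, name):
        for klass in type(obj).__mro__:      # 1. try to find a method, in MRO
            if name in klass.__dict__:
                attr = klass.__dict__[name]
                get = getattr(attr, '__get__', None)
                if get is not None:          # 3. call __get__ if provided
                    return get(obj, type(obj))
                return attr
        d = getattr(obj, '__dict__', {})
        if name in d:                        # 2. else check attributes
            return d[name]
        raise AttributeError(name)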


Ideally, callmethodcc would result in one vtable call to enable
interesting optimizations for those that care to provide it.  The
default implementation for this vtable entry could be a call to
find_method and invoke.


The interesting optimizations are:
- cache the find_method in the global method cache.
  This happens already, if the method string is constant.
- use a PIC [1] (which is tied to the opcode) and cache
  the final result of find_method (or any similar lookup)
- run the invoked Sub inside the same run loop - specifically
  not within Parrot_run_meth_fromc_args

[1] polymorphic inline cache

The important thing is that the method lookup and the subroutine 
invocation happen from inside the runloop. Run cores that allow 
rewriting of opcodes (prederefed and JIT) can insert faster equivalent 
opcodes in these cases.


Have a look at src/pic.c (which doesn't implement a lot - it's more a 
proof of concept now). Anyway here is an example that is implemented:


  new P0, "PyInt"    # new_p_sc

When this opcode gets executed the first time, the type number of the 
class is looked up and then the opcode is replaced with the faster variant:


  new P0, 89 # new_p_ic  - arbitrary number for PyInt

The same scheme can be applied for all these opcodes that consist of any 
kind of a lookup (MMD, methods, attributes, ...) and some further 
action. The lookup is done once at runtime (and again only after cache 
invalidation).



- Sam Ruby


leo



Re: Zcode interpreter release

2005-08-24 Thread wolverian
On Tue, Aug 23, 2005 at 05:09:26PM -0400, Amir Karger wrote:
 testportal:~/z> perl -we 'system("cd y")'

The right thing to do (tm) here is chdir("y"), but if 'cd' is just an
example and not the actual command, the right thing is system LIST form:

system $command => @args;

-- 
wolverian




Re: Demagicalizing pairs

2005-08-24 Thread John Williams
On Wed, 24 Aug 2005, Damian Conway wrote:
 Larry wrote:

  Plus I still think it's a really bad idea to allow intermixing of
  positionals and named.  We could allow named at the beginning or end
  but still keep a constraint that all positionals must occur together
  in one zone.

 If losing the magic from =>'d pairs isn't buying us named args wherever we
 like, why are we contemplating it?

I've lost track of the score in this thread, but I thought I would throw
a couple pennies into the fountain.

I really dread the thought of losing C< name => value > for named
parameters.  Off the top of my head, Ada, PL/SQL, and PHP all use that
syntax, so it would be a shame to lose something that newbies might find
familiar.

On the other hand I would like it if the adverbial named parameter
style C< :name(value) > were allowed at the beginning as well as the end of
the parameter list.  I think adverbs read better when they are next to the
verbs they modify, and it would be nice if I didn't have to resort to macro
magic to get them there.

Mixing named and positionals is bad though.

~ John Williams




Re: Demagicalizing pairs

2005-08-24 Thread Chip Salzenberg
On Wed, Aug 24, 2005 at 08:38:39AM -0400, John Macdonald wrote:
 When calling a function, I would like to be able to have a
 mixture of named and positional arguments. The named argument
 acts as a tab into the argument list and subsequent unnamed
 arguments continue on.

I see a main point of named parameters as freeing the caller from the
tyranny of argument order (and vice versa).  It seems to me you're
asking for the worst of both worlds.
-- 
Chip Salzenberg [EMAIL PROTECTED]


Re: Python PMC's

2005-08-24 Thread Sam Ruby
Leopold Toetsch wrote:
 Sam Ruby wrote:
 
 Leopold Toetsch wrote:
 
 A stripped down PIR-only, pythonless translation is below.
 
   (example: classes aren't global in Python),
 
 Yes, of course. The stripping down essentially means the absence of any
 lexical handling. But as you say, this doesn't matter for these sub and
 method calls.

Agreed.

 Note that Python provides __get__ methods for you on every function and
 method, i.e., it is *not* optional, getattribute will always succeed.
 
 That's fine. But if it's not overriden by user code, you exactly know
 what it is doing. Therefore you can just emulate it, I think.

I'm not as sure as you appear to be, but we can worry about this later.

 #  print foo.f(2)

# emulate python find_name, which checks attributes too
push_eh m_nf
 $P0 = foo."f"(2)
clear_eh
goto m_f
 m_nf:
# getattribute would also check if __get__ is there
 $P1 = getattribute foo, "f"
$P0 = foo.$P1(2)
 m_f:
 
 Note that the code above would need to be emitted for *every* method
 call, 
 
 Above snippet should not be emitted for a method call, it just emulates
 it in PIR. It should demonstrate how Python's find_method() could be
 implemented:

With that clarification, I agree in principle.

 - try to find a method, else
 - check attributes
 - call __get__, if it is user provided
 
 (or whatever order CPython actually uses).

What's in dynclass/pyclass.pmc matches Python's semantics more closely.

 The return value is a callable sub.

More precisely: a curried function call.  This is an important
distinction; to see why, see below.

 Ideally, callmethodcc would result in one vtable call to enable
 interesting optimizations for those that care to provide it.  The
 default implementation for this vtable entry could be a call to
 find_method and invoke.
 
 The interesting optimizations are:
 - cache the find_method in the global method cache.
   This happens already, if the method string is constant.

Note that you would then be caching the results of a curried function
call.  This result depends not only on the method string, but also on
the particular object upon which it was invoked.

 - use a PIC [1] (which is tied to the opcode) and cache
   the final result of find_method (or any similar lookup)

Again, the results will depend on the object.

 - run the invoked Sub inside the same run loop - specifically
   not within Parrot_run_meth_fromc_args

I don't understand this.  (Note: it may not be important to this
discussion that I do understand this - all that is important to me is
that it works, and somehow I doubt that Parrot_run_meth_fromc_args cares
whether a given function is curried or not).

 [1] polymorphic inline cache
 
 The important thing is that the method lookup and the subroutine
 invocation happens from inside the runloop. Run cores that allow
 rewriting of opcodes (prederefed and JIT) can insert faster equivalent
 opcodes in these cases.
 
 Have a look at src/pic.c (which doesn't implement a lot - it's more a
 proof of concept now). Anyway here is an example that is implemented:
 
   new P0, "PyInt"    # new_p_sc
 
 When this opcode gets executed the first time, the type number of the
 class is looked up and then the opcode is replaced with the faster variant:
 
   new P0, 89 # new_p_ic  - arbitrary number for PyInt
 
 The same scheme can be applied for all these opcodes that consist of any
 kind of a lookup (MMD, methods, attributes, ...) and some further
 action. The lookup is done once at runtime (and again only after cache
 invalidation).

The above works because PyInt is a constant.  It probably can be
extended to handle things that seem unlikely to change very rapidly.

But the combination of decisions on how to handle the passing of the
self parameter to a method, keeping find_method and invoke separated
at the VTABLE level, and the semantics of Python make the notion of
caching the results of find_method problematic.

 - Sam Ruby
 
 leo

- Sam Ruby


Re: Demagicalizing pairs

2005-08-24 Thread Dave Whipp
I've been trying to think about how to make this read right without too 
much line noise. I think Luke's keyword approach (named) is on the 
right track.


If we want named params at both start and end, then it's bound to be a 
bit confusing. But perhaps we can say that they're always at the end -- 
but either at the end of the invocant section or the end of the args.


Also, "named" is a bit of a clumsy name. "where" and "given" are taken, 
so I'll use "with":


I think something like these read nicely, without too much line noise:

  draw_polygon $canvas: @vertices with color => 'red';

  draw_polygon $canvas with color => 'red': @vertices;


Dave.


Re: Demagicalizing pairs

2005-08-24 Thread John Macdonald
On Wed, Aug 24, 2005 at 10:12:39AM -0700, Chip Salzenberg wrote:
 On Wed, Aug 24, 2005 at 08:38:39AM -0400, John Macdonald wrote:
  When calling a function, I would like to be able to have a
  mixture of named and positional arguments. The named argument
  acts as a tab into the argument list and subsequent unnamed
  arguments continue on.
 
 I see a main point of named parameters to free the caller from the
 tyranny of argument order (and vice versa).  It seems to me you're
 asking for the worst of both worlds.

Perhaps I didn't make it clear in my original message -
I agree that arbitrary mixing of named and positional is
usually a bad thing.

The only place where I find it useful is with a group of
arguments that are always provided in the same order, used one
or more times each by a number of functions, with additional
arguments for some/all of those functions.

So, a function that takes position and/or vector values would
provide a name for each vector/position, but expect each to have
an x, a y, and (possibly) a z argument following the name.

I saw this in the DO system - a shell written at CDC back in
the late 70's.  The provided scripts were designed so that
all programming scripts used the same sequence of arguments
after the OPT keyword, the LINK keyword, etc.

As I said originally, the value is diluted in a language
with structured data types - you can use a single argument
for a position that is a hash or array which contains the
x/y/z components within it.

The named group helps especially if you generally want to
provide separate-but-related arguments.  This tends to be
things like an optional sub-action that requires multiple
parameters if it is used at all.

So, I'm mostly saying that a mixture of named and positional
arguments is not ALWAYS bad, and that there may be some
value in permitting such a mixture in certain circumstances.

-- 


Re: Calling positionals by name in presence of a slurpy hash

2005-08-24 Thread Nicholas Clark
On Tue, Aug 23, 2005 at 10:11:37AM -0700, Larry Wall wrote:

 setting up the proxy hash.  It's possible that COW hashes can be made
 to work efficiently.  We'll need to copy hashes if we want to modify
 them to pass to subfunctions, just as when you change your environment
 it doesn't affect your parent process's environment variables.

I would assume that for parrot, with vtables, a simple COW hash would be
very efficient. By simple, I mean one that does a complete copy and separation
the first time someone tries to write to one of the copies, rather than the
more complex concept of maintaining state on partial copies.

My hunch would be that a simple system of COW plus an overlay hash would
work very well for the case of adding to (or deleting from) a hash passed in
as default arguments, because 95% of the time that hash is only directly
manipulated by functions back up the call frame, so for the duration of the
call would be unchanged.

But this is all arm-wavy, and needs real code to analyse before committing
to it as a strategy.
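
For concreteness, the simple variant might look something like this
(Python standing in for the vtable layer; a sketch only):

    class CowHash(object):
        def __init__(self, shared):
            self._data, self._owned = shared, False

        def __getitem__(self, key):
            return self._data[key]           # reads hit the shared dict

        def __setitem__(self, key, value):
            if not self._owned:              # first write: copy and separate
                self._data, self._owned = dict(self._data), True
            self._data[key] = value

    env = {'PATH': '/bin'}
    child = CowHash(env)                     # share until written to
    child['PATH'] = '/usr/bin'               # child diverges here
    assert env['PATH'] == '/bin' and child['PATH'] == '/usr/bin'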

Nicholas Clark


Re: Python PMC's

2005-08-24 Thread Leopold Toetsch


On Aug 24, 2005, at 19:45, Sam Ruby wrote:



Leopold Toetsch wrote:

Sam Ruby wrote:





The return value is a callable sub.


More precisely: a curried function call.  This is an important
distinction; to see why, see below.


A callable sub may be of course a curried one - yes.


The interesting optimizations are:
- cache the find_method in the global method cache.
  This happens already, if the method string is constant.


Note that you would then be caching the results of a curried function
call.  This result depends not only on the method string, but also on
the particular object upon which it was invoked.


No, the inner Parrot_find_method_with_cache just caches the method for 
a specific class (obeying C3 method resolution order as in Python). 
There is no curried function at that stage.





- use a PIC [1] (which is tied to the opcode) and cache
  the final resut of find_method (or any similar lookup)


Again, the results will depends on the object.


Yes. The nice thing with a PIC is that it is per bytecode location. You 
have a different PIC and a different cache for every call site. The 
prologue of a PIC opcode is basically:


  if cache.version == interpreter.version:
 (cache.function)(args)
  else:
 # do dynamic lookup
 # update cache then repeat

The 'version' compare depends on the cached thingy, and is more 
explicit in individual implementations. But the principle remains 
always the same: you create a unique id that depends on the variables 
of the lookup and remember it. Before invoking the cached result you 
compare actual with cached ids. If there is a cache miss, there are 2 
or 3 more cache slots to consult before doing just the dynamic original 
scheme again (and maybe rewrite the opcode again to just the dynamic 
one in case of too many cache misses).
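
The same shape in executable pseudocode (a sketch of the idea, not of
src/pic.c; slot count and structure are illustrative):

    class CallSite(object):
        SLOTS = 4                            # a few slots catch ~99.5% of calls

        def __init__(self, meth_name):
            self.name, self.cache = meth_name, []

        def call(self, obj, *args):
            t = type(obj)                    # the "version" being guarded on
            for cached_type, meth in self.cache:
                if cached_type is t:         # hit: run the remembered callee
                    return meth(obj, *args)
            meth = getattr(t, self.name)     # miss: do the dynamic lookup...
            if len(self.cache) < self.SLOTS:
                self.cache.append((t, meth)) # ...update the cache, then call
            return meth(obj, *args)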


src/pic.c and ops/pic.ops have an implementation for sub Px, Py - just 
for fun and profit to show the performance in the MOPS benchmark ;-)


The important part in ops/pic.ops is:

lr_types = (left->vtable->base_type << 16) | 
right->vtable->base_type;

if (lru->lr_type == lr_types) {
runit_v_pp:
((mmd_f_v_pp)lru->f.real_function)(interpreter, left, right);

(the lru is part of the cache structure; the above code will only be run 
when both types are <= 0xffff)


MMD depends on the two involved types. This is compared before calling 
the cached function directly. As the cache is per bytecode location, 
there is a fair chance (95 %) that the involved types for this very 
code location are matching.


The same is true for plain method calls. The callee depends on the 
invocant and the method name, which is usually a constant. Therefore 
you can compare the cached version with the actual invocant type and 
normally, with a match, just run the cached function immediately.


Currying is only important for placing the 'self' into the arguments - 
the actual lookup was already done earlier and doesn't influence the 
called function. Actually a curried subroutine is a 'Sub' object already 
and is invoked directly without any further method lookup. There is no 
caching involved in the call of a curried sub.





- run the invoked Sub inside the same run loop - specifically
  not within Parrot_run_meth_fromc_args


I don't understand this.  (Note: it may not be important to this
discussion that I do understand this - all that is important to me is
that it works, and somehow I doubt that Parrot_run_meth_fromc_args 
cares whether a given function is curried or not).


There are two points to be considered:
- currying: the effect is that some call arguments (in this special 
case the object) are already fixed. The argument passing code has 
therefore the duty to insert these known (and remembered) arguments 
into the params for the callee. For the BoundMethod this is of course, 
shift all arguments up by one, and make the object 'self' the first 
param of the sub.
- the second point is only related to call speed aka optimization. It's 
just faster to run a PIR sub in the same run loop, than to create a new 
run loop.



The above works because PyInt is a constant.  It probably can be
extended to handle things that seem unlikely to change very rapidly.


Yes. That's the 'trick' behind PIC. It works best the more constant the 
items are. But as said above, method names and invocants usually don't 
vary *per bytecode location*. Literature states ~95 % of method calls 
are monomorphic (one type of invocant), and 99.5 % are cached within 4 
cache slots. Look at some typical code


   a.'foo'(x)
   ...
   b.'bar'(y)

Both method calls have a distinct cache. The method names are constant. 
Therefore the callee depends only on the invocant. The types of 'a' or 
'b' are typically the same (except maybe inside compilers AST visit 
methods or some such). The same schme applies to plain method or 
attribute lookups.



But the combination of decisions on how to handle the passing of the
self parameter to a method, keeping find_method and 

Re: Python PMC's

2005-08-24 Thread Sam Ruby
Leopold Toetsch wrote:
 
 Note that you would then be caching the results of a curried function
 call.  This result depends not only on the method string, but also on
 the particular object upon which it was invoked.
 
 No the inner Parrot_find_method_with_cache just caches the method for
 a specific class (obeying C3 method resolution order as in Python).
 There is no curried function at that stage.

Where does Parrot_find_method_with_cache get this data?

Remember, in Python, there are no methods, there are only properties.

Of course, there is a find_method VTABLE entry, and the implementation
of this function calls __get__ which performs the curry function, per
the Python specifications.

Does Parrot_find_method_with_cache cache the results of the previous
call to find_method?

- Sam Ruby


Re: Python PMC's

2005-08-24 Thread Leopold Toetsch


On Aug 24, 2005, at 23:34, Sam Ruby wrote:



Leopold Toetsch wrote:



Note that you would then be caching the results of a curried function
call.  This result depends not only on the method string, but also on
the particular object upon which it was invoked.


No, the inner Parrot_find_method_with_cache just caches the method for

a specific class (obeying C3 method resolution order as in Python).
There is no curried function at that stage.


Where does Parrot_find_method_with_cache get this data?

Remember, in Python, there are no methods, there are only properties.


The above Parrot interface function tries to locate Sub objects in the 
namespace of the invocant's class and in its MRO. When you go 
back to the PIR translation of your code there is e.g.


  class Foo:
def g(self,y):

I have translated this to:

.namespace [Foo]
.sub g
.param pmc self

Therefore all the static / default / common methods inside Python 
code would be callable with the plain Parrot method call syntax.


The classname contributes the namespace. The sub declaration creates an 
entry in that namespace, which is retrievable as:


  func = findglobal "Foo", "g"

And as the namespace is exactly the classname, this is exactly what 
find_method for a Foo object does when trying to locate a method 
named "g" (I'm omitting here further lookups according to MRO due to 
other parents). The actual namespace that the class's find_method looks 
into can further be fine-tuned with the vtable function 
VTABLE_namespace_name(interp, class).


Other methods, like the one defined by f = f, would of course need a 
fallback to attribute lookup (please can we keep the term attribute, 
which is also used in CPython, and not property). Therefore the 
'find_method' inside Py code should also consult the attributes of the 
object (and yes, properties for per-object overrides).


Well, this is at least my understanding of how it could work.



Of course, there is a find_method VTABLE entry, and the implementation
of this function calls __get__ which performs the curry function, per
the Python specifications.


Yes. But currying isn't needed for a plain method call, when the 
__get__ isn't user defined. Why first curry the function, 
creating a new curried sub PMC, then during invocation shift 
arguments, and eventually run it? This isn't the common case. Even if 
CPython does it like this now (and Python folks are discussing an 
optimized opcode on python-dev), it's not better or more correct - just 
the result is important. And that works as is now for Parrot method 
calls.




Does Parrot_find_method_with_cache cache the results of the previous
call to find_method?


It does one dynamic lookup and caches per class/method the result. The 
cache is invalidated by a store_global (which could influence the 
dynamic search result).



- Sam Ruby


leo



Re: [pirate] Re: Python PMC's

2005-08-24 Thread Michal Wallace
On Wed, 24 Aug 2005, Sam Ruby wrote:

[huge cut]
 
 Below is from the sample that Leo provided.
 
  #  print foo.f(2)
  
  # emulate python find_name, which checks attributes too
  push_eh m_nf
  $P0 = foo."f"(2)
  clear_eh
  goto m_f
  m_nf:
  # getattribute would also check if __get__ is there
  $P1 = getattribute foo, "f"
  $P0 = foo.$P1(2)
  m_f:
  print_item $P0
  print_newline
 
 Note that the code above would need to be emitted for *every* method
 call, as there is no way of knowing at compile time what property you
 will get and how it is defined.


Hey Sam,

I agree with what you're saying in this
thread. This is slightly off topic, but
I wanted to point something out.

In general, python has so many places
where things have to be dynamic that you 
really can't know this kind of thing at 
compile time, especially if you allow for 
eval/exec, or if you allow the code to be 
used as a module. 

However if you treat the code as a closed 
system, *and* you have access to python at 
compile time, then we can optimize away a 
lot of these questions.

For example, in your original code:

  def f(x,y):
return y

  class Foo:
f = f
def g(self,y):
  return y

  foo = Foo()

  g=foo.g

  print f(0,1)
  print foo.f(2)
  print g(3)


Once you know how python works, it's *obvious* 
that this prints 1,2,3. I see no reason why the 
compiler couldn't figure this out up front just 
by walking the tree.

In fact, a good optimizing compiler would see the
return y lines and just get rid of those methods
completely. 

I'd like to allow for the ability to do certain 
optimizations like this up front, sacrificing 
flexibility for speed. I know there are many
programs that won't allow for this, but for the
ones that do, I'd like to be able to do a sort
of static compile like this. 

In other words, sometimes a python-like language
is a desirable thing. (But of course this should
all be optional so that we can also be 100%
python compatible)

Sincerely,
 
Michal J Wallace
Sabren Enterprises, Inc.
-
contact: [EMAIL PROTECTED]
hosting: http://www.cornerhost.com/
my site: http://www.withoutane.com/
-



Re: Zcode interpreter release

2005-08-24 Thread Amir Karger
On 8/23/05, I wrote:
 I adopted the Zcode interpreter that leo posted in February. 
 
 The only bad news is there's something wrong with my make test.
 I managed to narrow this down to a very
 weird Perl behavior I don't understand at all:
 
 testportal:~> mkdir z
 testportal:~> cd z
 testportal:~/z> mkdir y
 testportal:~/z> touch y/foo
 testportal:~/z> perl -we 'system("cd y")'
 Can't exec "cd": No such file or directory at -e line 1.
 testportal:~/z> perl -we 'system("cd y && ls")'
 foo

Several people pointed out that I didn't perldoc -f system. Sorry! 
Btw, even after reading the docs, I still don't understand why Perl
would pass a cd command to a piece of the shell that can't understand
it. Granted, I shouldn't do it anyway, because then Perl will exit the
shell it created for the system() and my cd will be useless.

I changed my tests to chdir to the right directory before doing
anything, and now it seems to work no matter where you run it.

So now, v0.2 can officially be released. Hooray!

-Amir


Re: Zcode interpreter release

2005-08-24 Thread Andrew Rodland
On Wednesday 24 August 2005 04:26 pm, Amir Karger wrote:

 Several people pointed out that I didn't perldoc -f system. Sorry!
 Btw, even after reading the docs, I still don't understand why Perl
 would pass a cd command to a piece of the shell that can't understand
 it. Granted, I shouldn't do it anyway, because then Perl will exit the
 shell it created for the system() and my cd will be useless.

It's not a piece of the shell; it's no shell at all. Perl is trying to 
execute /bin/cd (or rather, 'cd' somewhere in your search path), as provided 
for by execvp. As to why perl does it: perl doesn't know or care that the 
thing you're naming is a shell builtin; it simply tries to run what you told 
it to. If you wanted to force it to work as a builtin, you could use 
system('sh', '-c', 'cd /foo'); -- but as you've already noted, it wouldn't be 
good for anything anyway.


Andrew


Re: Python PMC's

2005-08-24 Thread Sam Ruby
Leopold Toetsch wrote:
 
 Above Parrot interface function tries to locate Sub objects in the
 namespace of the invocant's class and in the MRO of it. When you go back
 to the PIR translation of your code there is e.g.
 
   class Foo:
 def g(self,y):
 
 I have translated this to:
 
 .namespace [Foo]
 .sub g
 .param pmc self
 
 Therefore all the static / default / common methods inside Python code
 would be callable with the plain Parrot method call syntax.
 
 The classname contributes the namespace. The sub declaration creates an
 entry in that namespace, which is retrievable as:
 
   func = findglobal "Foo", "g"

Honestly, I don't believe that that is workable for Python.  Modules are
global in Python.  Classes are lexically scoped.  All subs emitted by
Pirate are @anon.

Modules are namespaces and can contain classes, functions, and
variables.  The way to retrieve a method from a Python class defined in
module __main__ would be:

  $P1 = findglobal "__main__", "Foo"
  getattribute $P2, $P1, 'g'

- Sam Ruby


Re: Zcode interpreter release

2005-08-24 Thread Joshua Juran

On Aug 24, 2005, at 7:42 PM, Andrew Rodland wrote:


On Wednesday 24 August 2005 04:26 pm, Amir Karger wrote:


Several people pointed out that I didn't perldoc -f system. Sorry!
Btw, even after reading the docs, I still don't understand why Perl
would pass a cd command to a piece of the shell that can't understand
it. Granted, I shouldn't do it anyway, because then Perl will exit the
shell it created for the system() and my cd will be useless.


It's not a piece of the shell; it's no shell at all. Perl is trying to
execute /bin/cd (or rather, 'cd' somewhere in your search path), as provided
for by execvp. As to why perl does it: perl doesn't know or care that the
thing you're naming is a shell builtin; it simply tries to run what you told
it to. If you wanted to force it to work as a builtin, you could use
system('sh', '-c', 'cd /foo'); -- but as you've already noted, it wouldn't be
good for anything anyway.


Another way to execute a shell builtin is to append a semicolon:  
system( "cd /foo;" ).


Josh