Re: grsecurity interfering with the parrot JIT/build

2004-10-19 Thread Leopold Toetsch
Christian Jaeger wrote:
Ok, I've looked at the test_exec_linux source, and tried it out 
separately; it's clear what happens to me now: I've enabled the 
following GrSecurity option, which makes the mprotect system call fail 
with a permission error - the test even outputs this as the line 
failure: Permission denied.

CONFIG_GRKERNSEC_PAX_MPROTECT
  Enabling this option will prevent programs from
   - changing the executable status of memory pages that were
 not originally created as executable,
With that option enabled we'd need a different 
mem_alloc_executable/mem_free_executable pair. Can you try to replace 
the allocation to use mmap with the PROT_EXEC bit set?
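
A minimal sketch of the kind of probe this suggests, written in Python rather than Parrot's C and assuming a Unix-like system where the mmap module exposes PROT_EXEC. The function name can_map_executable is hypothetical. It asks for the pages to be executable when they are created, instead of changing them afterwards with mprotect. The full option text quoted elsewhere in this thread also lists "creating executable pages from anonymous memory" as restricted, so the probe may still report failure under PaX.

```python
import mmap


def can_map_executable(length=mmap.PAGESIZE):
    """Return True if an anonymous read/write/execute mapping can be created."""
    prot = mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC
    flags = mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS
    try:
        # fileno -1 requests anonymous memory that is not backed by a file.
        region = mmap.mmap(-1, length, flags=flags, prot=prot)
    except OSError:
        # A restrictive kernel policy is expected to surface here as an error.
        return False
    region.close()
    return True


print("executable mapping allowed:", can_map_executable())
```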

I did perl Configure.pl --optimize. I still get a parrot that is about 4 
times slower than perl5, 
... with the fib test. Yes. That's the reason for the ongoing discussion 
for a different calling scheme.

... and for which none of the -f,-g,-P,-S,-C,-j 
options does seem to make a significant difference in speed. (see data 
below).
That was a bug that is already fixed. Please update from CVS.
leo


Re: [perl #32036] [BUG] t/pmc/signal.t fails

2004-10-19 Thread Leopold Toetsch
Will Coleda [EMAIL PROTECTED] wrote:

 My machine did happen to be under a bit of a load at the time the test
 ran, but that doesn't seem like much of an excuse. =)

The test waits for the signal to be delivered within a second or so. Under load
it may simply fail.

 t/pmc/signal...Hangup

I saw that once too: looks like the test script got the signal.

leo


Re: grsecurity interfering with the parrot JIT/build

2004-10-19 Thread Christian Jaeger
At 14:47 Uhr +0200 16.10.2004, Leopold Toetsch wrote:
Anyway, JIT memOk. There is a test in config/auto/jit/test_exec_openbsd.in,
Ok, I've looked at the test_exec_linux source, and tried it out 
separately; it's clear what happens to me now: I've enabled the 
following GrSecurity option, which makes the mprotect system call 
fail with a permission error - the test even outputs this as the line 
failure: Permission denied.

CONFIG_GRKERNSEC_PAX_MPROTECT
  Enabling this option will prevent programs from
   - changing the executable status of memory pages that were
 not originally created as executable,
   - making read-only executable pages writable again,
   - creating executable pages from anonymous memory.
  You should say Y here to complete the protection provided by
  the enforcement of non-executable pages.
  NOTE: you can use the 'chpax' utility to control this
  feature on a per file basis. chpax is available at
  http://pax.grsecurity.net
I've not yet had the time to reboot that machine with a kernel w/o 
that option, so I'm not sure it'd work with the other grsecurity/PAX 
features. (In any case, the problem remains that one cannot build the 
JIT if GRKERNSEC_PAX_MPROTECT is enabled because the user has no 
chance running chpax; it should be possible to switch PAX off for 
some path using GrSecurity's ACL rules but I don't know if that reads 
the inodes into the kernel as LIDS does, if so then it's a real 
conflict.)

But I've built parrot now on another machine which is not running 
GrSecurity.  This machine is running a not up-to-date Debian Sarge 
(gcc 3.3.2) and has an Intel Pentium III 1 GHz CPU.

I did perl Configure.pl --optimize. I still get a parrot that is 
about 4 times slower than perl5, and for which none of the 
-f,-g,-P,-S,-C,-j options does seem to make a significant difference 
in speed. (see data below).

I checked the Makefile, it has:
CFLAGS = -D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN 
-I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -O3 
-DDISABLE_GC_DEBUG=1 -DNDEBUG -g  -Dan_Sugalski -Larry -Wall 
-Wstrict-prototypes -Wmissing-prototypes -Winline -Wshadow 
-Wpointer-arith -Wcast-qual -Wcast-align -Wwrite-strings 
-Waggregate-return -Winline -W -Wno-unused -Wsign-compare 
-Wformat-nonliteral -Wformat-security -Wpacked 
-Wdisabled-optimization -mno-accumulate-outgoing-args -Wno-shadow 
-falign-functions=16   -I./include -I./blib/include  -DHAS_JIT -DI386 
-DHAVE_COMPUTED_GOTO

and include/parrot/config.h has:
#define PARROT_JIT_CAPABLE  1
Below are some timings, and the output of the configuration.
Does this indicate that something is wrong or not?
Thanks
Christian.
time perl -w fib.pl
fib(28) = 317811
real 0m2.193s   user 0m2.180s   sys 0m0.010s
time python fib.py
fib(28) = 317811
real 0m1.723s   user 0m1.700s
time ../../parrot fib.imc
fib(28) = 317811 11.031775s
real 0m11.113s  user 0m8.130s   sys 0m0.030s
with -f:
real 0m17.399s  user 0m8.060s
real 0m8.540s   user 0m7.900s
with -g:
real 0m8.904s   user 0m7.590s
with -P:
real 0m11.088s  user 0m8.860s
with -S:
real 0m8.607s   user 0m8.220s
with -j:
real 0m7.884s   user 0m7.860s
real 0m11.019s  user 0m9.040s
with -j -O:
real 0m8.292s   user 0m8.280s
with -j -O8:
real 0m7.873s   user 0m7.870s
with -j --optimize=9:
real 0m7.401s   user 0m7.340s
time perl -w oo6.pl
50
real 0m2.567s   user 0m2.550s
time python oo6.py
50
real 0m2.679s   user 0m2.560s
time ../../parrot oo6.imc
50
real 0m7.539s   user 0m7.470s
with -j:
real 0m8.080s   user 0m8.040s
with -C:
real 0m7.345s   user 0m7.190s
real 0m7.619s   user 0m7.360s
with -j:
real 0m9.734s   user 0m7.630s
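
The "about 4 times slower" figure can be checked directly from the user times above. A small sketch, using the default-core fib run (user 0m8.130s for parrot, user 0m2.180s for perl); the helper name seconds is hypothetical.

```python
def seconds(stamp):
    """Convert a time(1)-style stamp such as '0m8.130s' to seconds."""
    minutes, rest = stamp.split("m")
    return int(minutes) * 60 + float(rest.rstrip("s"))


perl_user = seconds("0m2.180s")     # perl -w fib.pl
parrot_user = seconds("0m8.130s")   # ../../parrot fib.imc, no core option
ratio = parrot_user / perl_user
print(f"parrot/perl user time: {ratio:.2f}x")  # prints 3.73x
```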

Output from  perl Configure.pl --optimize --verbose-step=JIT :
...
Determining architecture, OS and JIT capability...
Setting Configuration Data:
(
archname => 'i386-linux-thread-multi',
cpuarch => 'i386',
osname => 'linux',
);
-e jit/i386/core.jit = yes
Setting Configuration Data:
(
asmfun_o => '',
);
Setting Configuration Data:
(
jitarchname => 'i386-linux',
jitcpuarch => 'i386',
jitcpu => 'I386',
jitosname => 'LINUX',
jitcapable => '1',
cc_hasjit => ' -DHAS_JIT -DI386',
TEMP_jit_h => '$(INC)/jit.h',
TEMP_jit_o => '$(SRC)/jit$(O) $(SRC)/jit_cpu$(O) 
$(SRC)/jit_debug$(O) $(SRC)/jit_debug_xcoff$(O)',
);
Setting Configuration Data:
(
TEMP_exec_h => '$(INC)/jit.h $(INC)/exec.h $(INC)/exec_dep.h 
$(INC)/exec_save.h',
TEMP_exec_o => '$(SRC)/exec$(O) $(SRC)/exec_cpu$(O) 
$(SRC)/exec_save$(O)',
execcapable => '1',
);
 (has_exec_protect cc -D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS 
-DDEBIAN  -I/usr/local/include -D_LARGEFILE_SOURCE 
-D_FILE_OFFSET_BITS=64 -O3 -DDISABLE_GC_DEBUG=1 

Re: [perl #32036] [BUG] t/pmc/signal.t fails

2004-10-19 Thread Jeff Clites
On Oct 19, 2004, at 12:42 AM, Leopold Toetsch wrote:
Will Coleda [EMAIL PROTECTED] wrote:
t/pmc/signal...Hangup
I saw that once too: looks like the test script got the signal.
That's what my patch from last week was supposed to fix--I'm surprised 
it's still happening. We should currently be making sure not to kill 
the harness.

(However, if there's another copy of parrot running--for instance, 
stopped in a debugger somewhere--then that may get the signal 
erroneously.)

JEff


Re: [perl #32035] [PATCH] tests and fixes for Integer and Undef PMCs

2004-10-19 Thread Nicholas Clark
On Mon, Oct 18, 2004 at 09:26:09PM -0400, Dan Sugalski wrote:
 A
 I have started a test script for the Integer PMC. In that process I found
 strangeness in get_string(). set_integer_native() can be inherited from the
 Scalar PMC.
 
 For the Undef PMC I fixed an error in set_number_native().
 
 A patch is attached. The file t/pmc/integer.t is new.
 
 Applied, though the patch didn't have t/pmc/integer.t in it.

Presumably due to the long standing issue with the perl.org mail system
stripping any attachments ending .t

Nicholas Clark


Re: Problems with 0.1.1 release on x86-64

2004-10-19 Thread Leopold Toetsch
Brian Wheeler [EMAIL PROTECTED] wrote:

 Here's the diff against the current CVS.

Please append the patch.

,--[ messed up ]---
|  +
|   cc_gen('config/auto/mema
|
|  On Thu, 2004-10-14 at 06:37, Leopold Toetsch wrote:
`--

leo


Re: Python, Parrot, and lexical scopes

2004-10-19 Thread Leopold Toetsch
Sam Ruby [EMAIL PROTECTED] wrote:

 ...  Suggestions welcome, in
 particular, a PIR equivalent to the Perl would be most helpful.

It could be something like below. Some remarks:

* we don't have a notion to create a Closure PMC, so these closures are
  handcrafted. (NB: a subroutine with a .yield inside gets already
  created as a Coroutine PMC constant)
* the import statement is simulated too by storing the lexicals into the
  caller's frame. This would very likely be another Python opcode.
* the builtins would be at the top-level pad. The lexical opcodes do not
  fully comply with Python's name lookup, which is something like this:

for d in (locals, globals, builtins):
    try:
        f = d[wrapped_name]
        return f
    except KeyError:
        pass
    except:
        raise
# raise NameError
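
The lookup order sketched above can be written as a small runnable function. This is only an illustration of the locals/globals/builtins chain, not how CPython or Parrot implements it; the function name lookup and its parameters are hypothetical.

```python
import builtins


def lookup(name, local_ns, global_ns, builtin_ns=vars(builtins)):
    """Search locals, then globals, then builtins; raise NameError if absent."""
    for scope in (local_ns, global_ns, builtin_ns):
        try:
            return scope[name]
        except KeyError:
            pass  # not in this scope, try the next one
    raise NameError(f"name {name!r} is not defined")


print(lookup("foo", {}, {"foo": 2}))            # found in globals
print(lookup("foo", {"foo": 1}, {"foo": 2}))    # locals shadow globals
print(lookup("len", {}, {}) is len)             # falls through to builtins
```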

# scope1.pir

.namespace []
.sub __main__ @MAIN
    new_pad 0
    # from scope2 import *
    load_bytecode "scope2.pir"
    $P0 = find_global "scope2", "_scope2__import"
    $P0()
    # print f(), foo
    .local pmc f
    f = find_lex "f"
    $P2 = f()
    .local pmc foo
    foo = find_lex "foo"
    print_item $P2
    print_item foo
    print_newline
    # foo = 1
    $P4 = new Undef
    $P4 = 1
    store_lex -1, "foo", $P4
    foo = find_lex "foo"
    # print f(), foo
    $P3 = f()
    print_item $P3
    print_item foo
    print_newline
.end

# scope2.pir

.namespace ["scope2"]
.sub _scope2__init @LOAD
    new_pad -1
    # foo = 2
    $P0 = new Undef
    $P0 = 2
    store_lex -1, "foo", $P0
    # def f(): return foo
    .local pmc f
    f = new Closure
    set_addr f, _f
    store_lex -1, "f", f
    $P1 = new Closure
    set_addr $P1, _scope2__import
    store_global "scope2", "_scope2__import", $P1
.end
.sub _scope2__import    # @CLOSURE
    $P0 = find_lex "foo"
    store_lex -2, "foo", $P0
    $P0 = find_lex "f"
    store_lex -2, "f", $P0
.end
.sub _f     # @CLOSURE
    $P0 = find_lex "foo"
    .pcc_begin_return
    .return $P0
    .pcc_end_return
.end

leo


Re: NOTICE: New interpreter naming (people with pending patches, read this now)

2004-10-19 Thread Leopold Toetsch
Brent 'Dax' Royal-Gordon [EMAIL PROTECTED] wrote:
 The naming of the interpreter structure has changed.  The struct is
 now called parrot_interp_t;

Thanks,

leo


Re: [Proposal] JIT, exec core, threads, and architectures

2004-10-19 Thread Leopold Toetsch
Jeff Clites [EMAIL PROTECTED] wrote:
 On Oct 17, 2004, at 3:18 AM, Leopold Toetsch wrote:

 Nevertheless we have to create managed objects (a Packfile PMC) so
 that we can recycle unused eval-segments.

 True, and some eval-segments are done as soon as they run (eval "3 +
 4"), whereas others may result in code which needs to stay around (eval
 "sub {}"), and even in the latter case not _all_ of the code
 generated in the eval would need to stay around. It seems that it may
 be hard to determine what can be recycled, and when.

Well, not really. As long as you have a reference to the code piece,
it's alive.

 And we have to protect the packfile dictionary with mutexes, when this
 managing structure changes i.e. when new segments get chained into
 this list or when they get destroyed.

 Yes, though it's not clear to me if all eval-segments will need to go
 into a globally-accessible dictionary. (e.g., it seems the "3 + 4" case
 above would not.)

It probably depends on the generated code. If this code creates globals
(e.g. Sub PMCs) it ought to stay around.

[ toss constant op variations ]

 For PIR yes, but the PASM assembler can't know for sure what register
 would be safe to use--the code could be using its own obscure calling
 conventions.

PASM would need rewriting to only use the available ops, basically.

 JEff

leo


Re: Parrot Forth 0.1

2004-10-19 Thread Leopold Toetsch
Michel Pelletier [EMAIL PROTECTED] wrote:
 The Python interpreter could use this method too
 to really spank CPython, which has implicit
 stack traffic that cannot be easily optimized
 out.

That's not needed. The translator can easily create register code, even
from Python bytecode, which is stack oriented.

leo


Re: [Summary] Register stacks again

2004-10-19 Thread Leopold Toetsch
Matt Fowles [EMAIL PROTECTED] wrote:
 All~

 This feels similar in spirit to the old framestacks that we used to
 have.  I thought that we moved away from those to single frame things
 so that we did not have to perform special logic around continuations.
  I would feel more comfortable if someone explained both the initial
 motivation of leaving the chunked system and why this does not violate
 that motivation or that motivation was wrong.

The problem currently is that we do too much copying. The caller has to
preserve its registers. That is currently done by copying onto the
frame stacks. After function return there's another copy going on to
restore registers.

Until around Parrot 0.0.3 there were chunked stacks *with* an
indirection for the register frame pointers. During development of the
JIT system these indirections got dropped to be able to use absolute
addresses for registers in JIT code and for about 3% of more
performance.

To support continuations the chunks were first copied then COWed, and
later replaced by the single frame stack we now have.

What I want to achieve is to find the best combination of all these
variations. That is:

- again one indirection for register access. The cost is near zero
  because almost all JIT subsystems are already using register indirect
  addressing.
- but only one frame stack, not 4 to be able to have the frame pointer
  in a CPU register
- no COW copying of frames because that is expensive too. Instead the
  register chunks are compacted occasionally during GC.

 Thanks,
 Matt

leo


Re: Parrot Forth 0.1

2004-10-19 Thread Darryl
michel wrote:
Whether or not an old definition is retained if
a word is redefined is a different question, in
the case of Parakeet, it will increment by two
because all high level words are looked up by
name at run-time via indirect threading.
This is an incorrect __Forth__ behaviour.  gForth's is correct.
If you want print+ to increment by two, it should be redefined to do that.
A compiled word should not change its behaviour.
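
The two behaviours being argued about can be modelled in a few lines. This is a sketch in Python, not Forth, and the word name inc stands in for the thread's print+ example; compile_early and compile_late are hypothetical names. Early binding captures the definition that exists at compile time (the behaviour Darryl calls correct), while late binding looks the word up by name on every run (the Parakeet behaviour michel describes).

```python
words = {}  # the dictionary of defined words


def define(name, fn):
    words[name] = fn


def compile_early(name):
    """Bind to the definition that exists when the caller is compiled."""
    target = words[name]
    return lambda x: target(x)


def compile_late(name):
    """Look the word up by name each time the compiled caller runs."""
    return lambda x: words[name](x)


define("inc", lambda x: x + 1)
early = compile_early("inc")
late = compile_late("inc")
define("inc", lambda x: x + 2)   # redefine the word afterwards

print(early(10), late(10))  # prints 11 12
```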
matt wrote:
I'd be interested in knowing which was the
correct behavior.

I suspect it is implementation defined, but
unfortunately taygeta.com is not working for me
right now.




Re: [Summary] Register stacks again

2004-10-19 Thread Matt Fowles
Leo~

Thanks for the detailed explanation.


On Tue, 19 Oct 2004 10:50:22 +0200, Leopold Toetsch [EMAIL PROTECTED] wrote:
 Until around Parrot 0.0.3 there were chunked stacks *with* an
 indirection for the register frame pointers. During development of the
 JIT system these indirections got dropped to be able to use absolute
 addresses for registers in JIT code and for about 3% of more
 performance.
 
 To support continuations the chunks were first copied then COWed, and
 later replaced by the single frame stack we now have.
 
 What I want to achieve is to find the best combination of all these
 variations. That is:
 
 - again one indirection for register access. The cost is near zero
   because almost all JIT subsystems are already using register indirect
   addressing.
 - but only one frame stack, not 4 to be able to have the frame pointer
   in a CPU register
 - no COW copying of frames because that is expensive too. Instead the
   register chunks are compacted occasionally during GC.

Could we have the chunks only hold one frame and avoid much of the
compaction work?  If we return to the indirect access mechanism, we
can switch register frames by changing one pointer.  But if we keep
the one frame per chunk, we do not need to compact frames; standard
DOD/GC will be able to reclaim frames.  I recall there being
efficiency issues with frames being allocated/deallocated
too frequently, so we could have a special free list for frames.

This proposal feels to me like a slightly simpler version of yours. 
Thus I would argue for it on the grounds of do the simple thing first
and compare its efficiency.
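
A rough sketch of the one-frame-per-chunk idea with a free list, in Python for illustration only. The names FramePool, Interp and NUM_REGS are hypothetical, and the register count is an arbitrary choice. The interpreter reaches its registers through a single reference, so switching frames is one assignment, and released frames are reused instead of being reallocated on every call.

```python
NUM_REGS = 32  # illustrative frame size, an assumption for this sketch


class FramePool:
    """Hands out register frames, reusing released ones via a free list."""

    def __init__(self):
        self.free_list = []
        self.allocated = 0  # number of frames ever created

    def alloc(self):
        if self.free_list:
            frame = self.free_list.pop()
            frame[:] = [0] * NUM_REGS  # clear stale register contents
            return frame
        self.allocated += 1
        return [0] * NUM_REGS

    def release(self, frame):
        self.free_list.append(frame)


class Interp:
    def __init__(self, pool):
        self.pool = pool
        self.regs = pool.alloc()  # the one pointer to the current frame

    def call(self, sub, *args):
        caller = self.regs
        self.regs = self.pool.alloc()      # switch frames: one pointer change
        self.regs[:len(args)] = args       # pass arguments in low registers
        try:
            return sub(self)
        finally:
            self.pool.release(self.regs)   # safe only if nothing captured it
            self.regs = caller


pool = FramePool()
interp = Interp(pool)
results = [interp.call(lambda i: i.regs[0] + i.regs[1], 2, 3) for _ in range(100)]
print(results[0], pool.allocated)  # prints 5 2
```

Releasing a frame on return is only valid while no continuation holds on to it; a real design would leave captured frames to DOD/GC, as the message says.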

Matt
-- 
Computer Science is merely the post-Turing Decline of Formal Systems Theory.
-???


Re: Pathological Register Allocation Test Generator

2004-10-19 Thread Bill Coffman
Hello All,

This is my first post to the parrot list, but I hope that many will
follow.  Thanks to all of you for working so diligently on building
this wonderful new toy for all us geeks to play with!

I am currently working on a fix to the large subroutine register
allocation bug, aka, massive spilling not yet implemented.  The
problem, is that the register allocation code is complex, and I'm not
all that familiar with it, or even with working with compilers at the
coding level.

I am coming at the problem as a former graph theorist, who was a
consultant to a compiler project (at UC Davis, in the late 90's where
some of the work was applied toward what is the current Intel
compiler).  I talked a lot, and learned a lot, but didn't actually
contribute any coding to the project.  I also spent a lot of time
coding graph coloring heuristics with C++.

I expect to produce a patch soon, that will make register allocation
better.  To that end, I felt it was important to come up with some
metrics.  The first attached program (gen3.pl), which is based on Greg
Purdy's gen-pra.pl, gathers data on compilation performance for
varying numbers of variables and compares this to GCC.  Mine also uses
gnuplot to make pretty pictures.  If everything is installed okay on
your system, a graph will pop up displaying runtime results.  I'm
using Debian Linux, so it was pretty easy to set up.

The second program, gen4.pl, is a randomized, automated, blackbox
tester.  It works similar to the above, but the corresponding parrot
and GCC outputs are compared.
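
The gen4.pl script itself is not reproduced in this archive, so the following is only a sketch of the general technique it describes: generate a random program, run it through two independent implementations, and compare the results. Here a direct interpreter and Python's exec on generated source stand in for parrot and GCC; all names are hypothetical, and results are reduced modulo 2**31 purely to keep the numbers bounded.

```python
import random

MOD = 2 ** 31  # keep values bounded; an arbitrary choice for this sketch
OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}


def generate(rng, num_vars, num_stmts):
    """Return initial values and a list of (dst, op, src1, src2) statements."""
    init = [rng.randint(1, 99) for _ in range(num_vars)]
    stmts = [(rng.randrange(num_vars), rng.choice(sorted(OPS)),
              rng.randrange(num_vars), rng.randrange(num_vars))
             for _ in range(num_stmts)]
    return init, stmts


def run_interpreted(init, stmts):
    """First implementation: evaluate the statements directly."""
    v = list(init)
    for dst, op, a, b in stmts:
        v[dst] = OPS[op](v[a], v[b]) % MOD
    return v


def run_generated(init, stmts):
    """Second implementation: emit source text and execute it."""
    lines = [f"V_{i} = {val}" for i, val in enumerate(init)]
    lines += [f"V_{d} = (V_{a} {op} V_{b}) % {MOD}" for d, op, a, b in stmts]
    env = {}
    exec("\n".join(lines), env)
    return [env[f"V_{i}"] for i in range(len(init))]


def differential_test(seed, num_vars=8, num_stmts=50):
    rng = random.Random(seed)
    init, stmts = generate(rng, num_vars, num_stmts)
    return run_interpreted(init, stmts) == run_generated(init, stmts)


print(all(differential_test(seed) for seed in range(20)))
```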

And now, for the philosophical note ...

One of the biggest wins of RISC over CISC was not that RISC was
smaller, but that architects looked at actual code that would run on
their machines.  They removed an instruction from the instruction set
if it wasn't used that frequently.  They considered whether to
implement features in the hardware, or the compiler, and they tested
each, to find the optimum balance.  This approach may be helpful in
building parrot as well, and it's what I've tried to do, in a very
modest way, in my scripts.  Thanks to Greg Purdy for getting the ball
rolling in this general direction.

Thanks,
Bill Coffman


On Mon, 11 Oct 2004 09:54:31 -0400, Dan Sugalski [EMAIL PROTECTED] wrote:
 At 4:58 PM -0700 10/2/04, Gregor N. Purdy wrote:
 Dan et al. --
 
 I made a new version of the script that creates gen.cpp and gen.imc
 (attached). You can run it like this:
 
perl gen-pra.pl 1000 1
 
 (for 1000 labels and 1 variables) and it will create equivalent
 gen.imc and gen.cpp files. You can test-compile them with these
 commands:
 
 g++ -c gen.cpp
 
 and
 
imcc -o x.x gen.imc
 
 on my system, the g++ compiler does eventually finish, but the imcc
 compiler is eventually killed.
 
 Maybe this could be used to drive out the underlying problems that
 are keeping parrot from compiling Dan's really large subs?
 
 Mmmm, degenerate behaviour! Cool, and thanks. Should help judge how
 we're doing with the nastier code.
 --
 Dan
 
 --it's like this---
 Dan Sugalski  even samurai
 [EMAIL PROTECTED] have teddy bears and even
teddy bears get drunk

#!/usr/bin/perl -w

use strict;

die "Usage: $0 starting_num_variables ending_num_variables\n" unless @ARGV == 2;
my ($start,$stop) = @ARGV;

open COMP, ">compile.dat" or die $!;
print COMP "#vars   gcc   imc\n";

for (my $vars=$start;$vars<=$stop;$vars++) {
  my ($total_locals) = $vars;
  my $total_labels = int ($total_locals / 2);

  my $labels_so_far = 1;
  my $locals_so_far = 0;

  open IMC, ">gen.imc";
  open CPP, ">gen.cpp";

  print IMC ".sub __MAIN\n\n";
  printf IMC "\n_L_%d:\n", 1;
  print CPP "#include <stdio.h>\n\n";
  print CPP "int main(int argc, char * arg[])\n{\n";
  printf CPP "\n_L_%d:\n", 1;

  my %action_table = qw(l label v variable a arithmetic c control p print);
  my @actions = (('l')x2, ('v')x4, ('a')x9, ('c')x3, ('p')x0);

  while ($labels_so_far < $total_labels or $locals_so_far < $total_locals) {
    my $action = $action_table{$actions[rand @actions]};

    if ($action eq 'label' and $labels_so_far < $total_labels) {
      my $this_label = ++$labels_so_far;

      printf IMC "\n_L_%d:\n", $this_label;
      printf CPP "\n_L_%d:\n", $this_label;
    }
    elsif ($action eq 'variable' and $locals_so_far < $total_locals) {
      my $this_local = ++$locals_so_far;
      my $this_value = int(rand(99)) + 1;

      printf IMC "  .local int V_%d\n", $this_local;
      printf IMC "  V_%d = %d\n", $this_local, $this_value;

      printf CPP "  int V_%d;\n", $this_local;
      printf CPP "  V_%d = %d;\n", $this_local, $this_value;
    }
    elsif ($action eq 'arithmetic' and $locals_so_far > 0) {
      my $result = 1 + int(rand($locals_so_far));
      my $arg1   = 1 + int(rand($locals_so_far));
      my $arg2   = 1 + int(rand($locals_so_far));
      next if $arg1 == $arg2;

 my $op = 

Re: Perl 6 Summary for 2004-10-01 through 2004-10-17

2004-10-19 Thread Michele Dondi
On Sun, 17 Oct 2004, Matt Fowles wrote:
Google groups has nothing for Perl6.language between October 2 and 14.
Is this really the case?  (I had not signed up until shortly before
Yes: no traffic at all for quite a while...
Michele
--
Except people don't actually read the documentation, and when they
do read it, they don't understand it, and when they do understand it,
they'll write it wrong anyway out of habit.  You might as well write
your warning in Russian for all the good it'll do.  :-)
- Larry Wall in perl6-language ML, 9 Jul 2004


Re: Perl 6 Summary for 2004-10-01 through 2004-10-17

2004-10-19 Thread Joshua Gatcomb
--- Matt Fowles [EMAIL PROTECTED] wrote:

 Joshua Gatcomb  accidentally introduced a dependency
 on
 Config::IniFiles.  Since it is implemented in pure
 perl he offered to
 add it to the repository.  Warnock applies.
 
 http://xrl.us/div3

In the note offering to fix it, I also listed numerous
other scripts with non-core dependencies.  Dan, in
IRC, indicated that they all should have tickets on
them.  Before fixing parrotbench.pl with one of the
following solutions:
1.  inline Config::IniFiles with the author's
permission
2.  Use some other core module if possible
3.  Roll my own
4.  Revert back to previous non-module version

I want to find out what the general guidance is and
try to be in line with that - warnock still applies

 = Threads on Cygwin
 
 Joshua Gatcomb discovered some trouble with threads
 on Cygwin.  It
 seems that there are problems with both the thread
 implementation, and
 the tests not being generous enough in accepting out of
 order input. 
 Still unresolved, I think.
 
 http://xrl.us/div5

The threading issue is resolved by upgrading
cygwin1.dll (see PLATFORMS).  The test output being
controlled using sleep statements instead of using
regexes is still unresolved.

 = Cygwin bugs
 
 Joshua Gatacomb has been fighting with Cygwin
 getting Parrot to work. 
 Apparently we trip a few of its bugs.  Read more if
 you like.
 
 http://xrl.us/diwz
 http://xrl.us/diw2
 http://xrl.us/diw3

Gatacomb ne Gatcomb.  That is what my drill
sergeant in the Army used to call me though ;-)  There
are plenty of Cygwin issues but I have swatted
every one I have discovered so far.  That being said -
I am going to be giving mingw a go and leaving Cygwin
alone for a while.


 == Perl 6 Summaries
 
 Piers raised the white flag after several years as a
 wonderful
 summarizer.  Having now just finished my first
 summary

Thank you for stepping up to the plate - good job

Joshua Gatcomb
a.k.a. Limbic~Region



Re: Perl 6 Summary for 2004-10-01 through 2004-10-17

2004-10-19 Thread Austin Hastings
Michele Dondi wrote:
On Sun, 17 Oct 2004, Matt Fowles wrote:
Google groups has nothing for Perl6.language between October 2 and 14.
Is this really the case?  (I had not signed up until shortly before

Yes: no traffic at all for quite a while...
Does this mean that we're done?   :)


Register stacks, return continuations, and speeding up calling

2004-10-19 Thread Dan Sugalski
Okay, since my calendar's off and it's apparently time to rehash this 
*again* (I had this down for next month, but I guess it's a chaotic 
cycle). Normally I'd just let it spin out, but we *do* have an issue 
with sub call speeds, and I don't see how fiddling with this will do 
any harm otherwise.

We've two big issues with sub calling.
The first is that the CPS style chews through continuation objects at a 
massive rate, and using them means that everything needs to be COW.

The second is that with the register file, we're doing a lot of 
copying of data on each sub call and, while code can (though at the 
moment really doesn't) restrict which sets of registers get saved, 
that doesn't work for indirect calls to bytecode, via vtable 
functions and whatnot.

I've solutions of a sort for both problems.
For the first, we've got the infrastructure in place to do what we 
need. A return continuation can be recycled *if* we know it hasn't 
been used. With the return continuation register, that's now easy. 
First, we remove the return continuation from the registers the 
called sub/method sees. The continuation's still there, in the return 
register, but it's not in the general register set. Next, we 
distinguish between actual and potential continuations.

A return continuation is a *potential* continuation. That is, it 
*can* be a full continuation, but until something actually does 
something to it, most of its continuation-ness can be deferred. That 
is, unless something takes a real continuation or fetches the return 
continuation out of the return continuation register (thus turning it 
from potential to actual) the continuation can be safely recycled 
once invoked and doesn't have to go COW-marking stacks or anything.

This does mean that we'd want to encourage people to use the 
invokecc ops to call a function or method (as they're the only ones 
that can safely create a potential continuation) and use the return 
op to invoke the return continuation in the return continuation 
register (so it can then go recycle the continuation after extracting 
out all the good bits)

Taking a continuation can *also* mark the current interpreter 
structure as dirty. That way when return is invoked, if the 
interpreter *isn't* dirty we can immediately recycle the register 
file if we choose to hang the current register file off a pointer.
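
A minimal sketch of the potential/actual distinction described above, in Python for illustration. Only the name invokecc comes from this message; RetContinuation, fetch_ret_cont and the free list are hypothetical. A return continuation made by invokecc stays potential, and is recycled on return, unless something fetches it out of the return continuation register, at which point it becomes actual and must be left alone.

```python
class RetContinuation:
    def __init__(self):
        self.actual = False  # starts out as a *potential* continuation


class Interp:
    def __init__(self):
        self.free_conts = []   # recycled return continuations
        self.ret_cont = None   # the return continuation register
        self.created = 0       # how many continuation objects were allocated

    def invokecc(self, sub):
        saved = self.ret_cont
        if self.free_conts:
            cont = self.free_conts.pop()
        else:
            cont = RetContinuation()
            self.created += 1
        cont.actual = False
        self.ret_cont = cont
        result = sub(self)
        # "return": recycle only if nobody turned it into a real continuation
        if not cont.actual:
            self.free_conts.append(cont)
        self.ret_cont = saved
        return result

    def fetch_ret_cont(self):
        """Taking the continuation out of the register makes it actual."""
        self.ret_cont.actual = True
        return self.ret_cont


interp = Interp()
for _ in range(50):
    interp.invokecc(lambda i: 42)          # plain calls reuse one object
after_plain_calls = interp.created
captured = interp.invokecc(lambda i: i.fetch_ret_cont())
interp.invokecc(lambda i: 42)              # needs a fresh object now
print(after_plain_calls, interp.created)   # prints 1 2
```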

For the register copying problem, we've a couple of options. At the 
moment I'm leaning towards re-abstracting out the register file and 
hanging it off a pointer in the interpreter structure. We can 
allocate a new register file when making a sub call and only copy the 
relevant bits into it as a sort of extra bonus for speed. (And copy 
back only the relevant bits on return) The two downsides to this are 
that it does slow access to the actual registers, which'll impact the 
interpreter by a bit, and it will completely invalidate all the 
existing JIT code.
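
The register-file-off-a-pointer option can be sketched like this. It is an illustration only: the function name, the register numbers and the file size are all made up, and real code would get its argument and return sets from the calling conventions. The callee gets a fresh register file, only the listed argument registers are copied in, and only the listed return registers are copied back.

```python
class Interp:
    def __init__(self, size=32):
        self.regs = [None] * size  # reached through one reference


def call_with_new_file(interp, sub, arg_regs, ret_regs):
    """Run sub in a fresh register file, copying only the relevant registers."""
    caller = interp.regs
    callee = [None] * len(caller)
    for r in arg_regs:             # copy in only what the callee needs
        callee[r] = caller[r]
    interp.regs = callee
    try:
        sub(interp)
    finally:
        interp.regs = caller       # the caller's file was never touched
    for r in ret_regs:             # copy back only the results
        caller[r] = callee[r]


def add_sub(interp):
    regs = interp.regs
    regs[5] = regs[5] + regs[6]
    regs[7] = "clobbered"          # invisible to the caller


interp = Interp()
interp.regs[5], interp.regs[6], interp.regs[7] = 20, 22, "keep"
call_with_new_file(interp, add_sub, arg_regs=(5, 6), ret_regs=(5,))
print(interp.regs[5], interp.regs[6], interp.regs[7])  # prints 42 22 keep
```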

I'll note that I'm not sure that the register copying issue will 
truly be an issue in most code, if proper save/restore sets are done, 
which could be helped by hints passed to the pir compiler by whatever 
compiler modules are being used. On the other hand, looking at the -t 
output from The Work Project, I see a near-insane amount of bytecoded 
vtable method calling, so I'm willing to accept that it'll be an 
issue for many people. (I'd not have the problem if my data types had 
their vtable functions written in C. I chose not to for 
implementation speed reasons and because the dynloading stuff was 
badly broken when I needed them. I'm reasonably sure that the big 
languages (perl/python/ruby/tcl (Hi, Will)) will have their basic PMC 
classes all done up in C.)

I think this makes some sense, but I'm kinda sick at the moment, so 
it may not. (OTOH, being ill is likely why I'm looking at this again, 
so we take the good with the bad...)
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Perl 6 Summary for 2004-10-01 through 2004-10-17

2004-10-19 Thread Matthew Walton
Austin Hastings wrote:
Michele Dondi wrote:
On Sun, 17 Oct 2004, Matt Fowles wrote:
Google groups has nothing for Perl6.language between October 2 and 14.
Is this really the case?  (I had not signed up until shortly before

Yes: no traffic at all for quite a while...

Does this mean that we're done?   :)
No, it means Larry's about to stun us with something seemingly bizarre 
and inexplicable which turns out to be a stroke of genius.

At least, I hope that's the case. Life's been so dull lately I've even 
applied to do a PhD.


Re: [perl #31919] [PATCH] Win32 perlnum test failure - test 36 (+- zero)

2004-10-19 Thread Ron Blaschke
 t\pmc\perlnumNOK 36#  got: '0
 # 0
 # '
 # expected: '0
 # -0.00
 # '

Visual C++ compiles -0.0 to 0.0, which leads to the error.  Attached
patch will fix this.
src/string.c







win32-perlnum-negzero.patch
Description: Binary data
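
The attached patch is not readable in this archive, so the following does not show what it does. It is only a short Python illustration of why negative zero is easy to lose: -0.0 compares equal to 0.0, so the sign has to be inspected explicitly, and a negative zero can be constructed at run time rather than relying on how a compiler treats the literal. The function names are hypothetical.

```python
import math


def is_negative_zero(x):
    """True only for -0.0; an equality test alone cannot tell the zeros apart."""
    return x == 0.0 and math.copysign(1.0, x) < 0.0


def make_negative_zero():
    """Build -0.0 at run time by copying a negative sign onto zero."""
    return math.copysign(0.0, -1.0)


print(-0.0 == 0.0)                             # the two zeros compare equal
print(is_negative_zero(make_negative_zero()))  # but the sign is still there
print(is_negative_zero(0.0))
```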


Win32 - Visual C++ 7.1 Test Results Summary

2004-10-19 Thread Ron Blaschke
Just to let people know how things are going on win32 (at least from
my perspective).

Failed Test          Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------
t\library\streams.t     2   512    21    2   9.52%  14 18
t\pmc\nci.t             2   512    47    2   4.26%  38 43
6 tests and 55 subtests skipped.
Failed 2/123 test scripts, 98.37% okay. 4/1941 subtests failed, 99.79% okay.

I'd like to look into [perl #31921] (streams.t) next, unless
someone else has already taken that.

Ron





Re: Problems with 0.1.1 release on x86-64

2004-10-19 Thread Brian Wheeler
Sigh.  I'll get this right sometime!

Brian

Index: config/auto/jit.pl
===
RCS file: /cvs/public/parrot/config/auto/jit.pl,v
retrieving revision 1.33
diff -u -r1.33 jit.pl
--- config/auto/jit.pl  8 Mar 2004 08:49:05 -   1.33
+++ config/auto/jit.pl  19 Oct 2004 18:38:41 -
@@ -171,9 +171,9 @@
   else {
 Configure::Data->set(
   jitarchname => 'nojit',
-  jitcpuarch  => 'i386',
-  jitcpu      => 'I386',
-  jitosname   => 'nojit',
+  jitcpuarch  => $cpuarch,
+  jitcpu      => $cpuarch,
+  jitosname   => $osname,
   jitcapable  => 0,
   execcapable => 0,
   cc_hasjit   => '',
Index: config/auto/memalign.pl
===
RCS file: /cvs/public/parrot/config/auto/memalign.pl,v
retrieving revision 1.10
diff -u -r1.10 memalign.pl
--- config/auto/memalign.pl 13 Oct 2004 14:37:59 -  1.10
+++ config/auto/memalign.pl 19 Oct 2004 18:38:41 -
@@ -42,6 +42,13 @@
Configure::Data->set('malloc_header', 'stdlib.h');
 }
 
+if (Configure::Data->get('ptrsize') == Configure::Data->get('intsize')) {
+   Configure::Data->set('ptrcast','int');
+  }
+else {
+   Configure::Data->set('ptrcast','long');
+  }
+
 cc_gen('config/auto/memalign/test_c.in');
 eval { cc_build(); };
 unless ($@ || cc_run_capture() !~ /ok/) {
Index: config/auto/memalign/test_c.in
===
RCS file: /cvs/public/parrot/config/auto/memalign/test_c.in,v
retrieving revision 1.4
diff -u -r1.4 test_c.in
--- config/auto/memalign/test_c.in  13 Jul 2003 18:52:37 -  1.4
+++ config/auto/memalign/test_c.in  19 Oct 2004 18:38:41 -
@@ -9,6 +9,6 @@
 
 int main(int argc, char **argv) {
void *ptr = memalign(256, 17);
-   puts(ptr && ((int)ptr & 0xff) == 0 ? "ok" : "nix");
+   puts(ptr && ((${ptrcast})ptr & 0xff) == 0 ? "ok" : "nix");
return 0;
 }
Index: config/auto/memalign/test_c2.in
===
RCS file: /cvs/public/parrot/config/auto/memalign/test_c2.in,v
retrieving revision 1.3
diff -u -r1.3 test_c2.in
--- config/auto/memalign/test_c2.in 13 Jul 2003 18:52:37 -  1.3
+++ config/auto/memalign/test_c2.in 19 Oct 2004 18:38:41 -
@@ -20,6 +20,6 @@
 *  arbitrary allocation size)
 */
int i = posix_memalign(&p, s, 177);
-   puts(((int)p & 0xff) == 0 && i == 0 ? "ok" : "nix");
+   puts(((${ptrcast})p & 0xff) == 0 && i == 0 ? "ok" : "nix");
return i;
 }
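
The selection logic in the memalign.pl hunk above can be tried out with a small Python sketch that uses ctypes to read the platform's sizes. The function names are hypothetical. One caveat worth keeping in mind: treating long as pointer-sized matches LP64 Unix systems such as x86-64 Linux, but it does not hold on 64-bit Windows, where long is 32 bits.

```python
import ctypes


def pick_ptrcast():
    """Mirror the patch: 'int' if pointers and ints have the same size, else 'long'."""
    ptrsize = ctypes.sizeof(ctypes.c_void_p)
    intsize = ctypes.sizeof(ctypes.c_int)
    return "int" if ptrsize == intsize else "long"


def is_aligned(address, boundary=256):
    """True if the low bits are zero; for 256 this is the `& 0xff` test."""
    return address & (boundary - 1) == 0


print(pick_ptrcast(), is_aligned(0x1000), is_aligned(0x1001))
```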




Re: [Summary] Register stacks again

2004-10-19 Thread Miroslav Silovic
[EMAIL PROTECTED] wrote:
Could we have the chunks only hold one frame and avoid much of the
compaction work?  If we return to the indirect access mechanism, we
can switch register frames by changing one pointer.  But if we keep
the one frame per chunk, we do not need to compact frames; standard
DOD/GC will be able to reclaim frames.  I recall there being
efficiency issues with frames being allocated/deallocated
too frequently, so we could have a special free list for frames.
This proposal feels to me like a slightly simpler version of yours. 
Thus I would argue for it on the grounds of do the simple thing first
and compare its efficiency.
 

Well, for the code that doesn't do call/cc, bigger chunks mean that 
you can use them as a classical stack. So you won't ever have to 
allocate them, and never have to run the compaction. For call/cc, you 
still don't have to compact them as often, since the non-captured 
continuations will get popped normally, and the watermark lowering will 
take care of temporarily captured continuations (between two GC's).

Basically bigger chunks mean that frames are allocated using the special 
scheme just for them. Considering that you're going to allocate one on 
each function call, I would agree with LT that the complexity is 
justified (and is not too bad - the way I understand the Parrot 
internals, which is to say, not too well ;), arrays of PMC pointers 
already get copy-collected; stack frame chunks are not too different 
from these).

  Miro



Re: [Proposal] JIT, exec core, threads, and architectures

2004-10-19 Thread Jeff Clites
On Oct 19, 2004, at 1:56 AM, Leopold Toetsch wrote:
Jeff Clites [EMAIL PROTECTED] wrote:
On Oct 17, 2004, at 3:18 AM, Leopold Toetsch wrote:

Nevertheless we have to create managed objects (a Packfile PMC) so
that we can recycle unused eval-segments.

True, and some eval-segments are done as soon as they run (eval "3 +
4"), whereas others may result in code which needs to stay around
(eval "sub {}"), and even in the latter case not _all_ of the code
generated in the eval would need to stay around. It seems that it may
be hard to determine what can be recycled, and when.
Well, not really. As long as you have a reference to the code piece,
it's alive.
Yes, that's what I meant. In the case of:
$sum = eval "3 + 4";
you don't have any such reference. In the case of:
$sub = eval "sub { return 7 }";
you do. In the case of:
$sub = eval "3 + 4; sub { return 7 }";
you've got a reference to the sub still, but the "3 + 4" code is no 
longer reachable, so wouldn't need to stay around.

But it's possible that Parrot won't be able to tell the difference, and 
will have to keep around more than is necessary.

And we have to protect the packfile dictionary with mutexes, when
this managing structure changes i.e. when new segments get chained into
this list or when they get destroyed.

Yes, though it's not clear to me if all eval-segments will need to go
into a globally-accessible dictionary. (e.g., it seems the "3 + 4"
case above would not.)
It probably depends on the generated code. If this code creates globals
(e.g. Sub PMCs) it ought to stay around.
Yes, that's what I meant by "not all"--some yes, some no.
JEff