Re: Some namespace notes
On Jan 15, 2004, at 9:52 AM, Dan Sugalski wrote:

At 10:13 AM -0800 1/13/04, Jeff Clites wrote: Here are some notes on namespaces, picking up a thread from about a month ago:

On Dec 11, 2003, at 8:57 AM, Dan Sugalski wrote: That does, though, argue that we need to revisit the global access opcodes. If we're going hierarchic, and we want to separate out the name from the namespace, that would seem to argue that we'd want it to look like:

find_global P1, ['global', 'namespace', 'hierarchy'], thingname

That is, split the namespace path from the name of the thing, and make the namespace path a multidimensional key.

I definitely agree that we should have separate slots for namespace and name, as you have above. So I think the discussion boils down to whether a namespace specifier is logically a string or an array of strings. Short version: I was originally going to argue for fully hierarchical namespaces, identified as above, but after turning this over in my head for a while, I came to the conclusion that namespaces are not conceptually hierarchical (especially as used in languages such as Perl5 and Java, at least), so I'm going to argue for a single string (rather than an array) as a namespace identifier.

Here's my big, and in fact *only*, reason to go hierarchical: We don't need to mess around with separator character substitution. Other than that I don't much care and, as you've pointed out, most of the languages don't really do a hierarchical structure as such. Going hierarchical, though, means we don't have to do ::/:///whatever substitutions to present a unified view of the global namespaces.

A key part of my argument (and it's fine if you understood this, and disagree--just wanted to make sure that it was clear) is that I think we shouldn't try to do any sort of cross-language unification.
That is, if we some day have a Parrot version of Java, and in Perl6 code I want to reference a global created inside of some Java class I've loaded in, it would be clearer to just reference this as java.lang.String.CASE_INSENSITIVE_ORDER, even inside of Perl6 code--rather than having to do something like java::lang::String::CASE_INSENSITIVE_ORDER. Parrot itself would be completely ignorant of any concept of a separator character--these would just be uninterpreted strings, and foo::bar and foo.bar would be separate namespaces, whatever the language.

I think it's confusing to try to unify namespaces across languages, and it doesn't buy us anything. I think it's much cleaner to say namespaces have names which are arbitrary strings, and if you want to put colons or periods in the name, so what--parrot doesn't care.

(That said, if the Perl6 creators and the Java-on-Parrot creators decided that it _is_ good to try to unify their namespaces, they could still do this at the compiler level--so maybe Perl6 would substitute . for :: in namespace names at compile time. But parrot itself wouldn't know or care. And, if the Python people decide it's better not to try to unify with this mega-namespace, that's up to them.)

So I'm arguing here against a unified view of the global namespaces. But, if we decide that's needed, then I definitely agree that it's best to avoid having some magic separator character--much cleaner to treat it as an array.

JEff
Re: JVM as a threading example (threads proposal)
Damien Neil [EMAIL PROTECTED] wrote:

On Thu, Jan 15, 2004 at 09:31:39AM +0100, Leopold Toetsch wrote: I don't see any advantage of such a model, especially as it doesn't guarantee atomic access to e.g. longs or doubles. The atomic access to ints and pointers seems to rely on the architecture but is of course reasonable.

You *can't* guarantee atomic access to longs and doubles on some architectures, unless you wrap every read or write to one with a lock. The CPU support isn't there.

Yes, that's what I'm saying. I don't see an advantage of the JVM's multi-step variable access, because it doesn't even provide such atomic access.

Parrot deals with PMCs, which can contain (let's consider scalars only) e.g. a PerlInt or a PerlNum. Now we would have atomic access (normally) to the former and very likely non-atomic access to the latter, just depending on the value which happened to be stored in the PMC. This implies that we have to wrap almost[1] all shared write *and* read PMC access with LOCK/UNLOCK.

[1] except plain ints and pointers on current platforms

- Damien

leo
Re: Problem during make test
Chromatic [EMAIL PROTECTED] wrote:

On Sun, 2004-01-04 at 12:09, Harry Jackson wrote: I tried that as well, it spits out identical PASM each time but on the odd occasion I need to use CTRL-C to get back to the shell.

I'm seeing the same thing on Linux PPC -- odd hangs from time to time when running PIR, while running the PASM emitted with -o works well. t/op/arithmetics 3 and 9 seem to be the big culprits in the test suite.

Could you attach gdb to the hanging parrot?

$ cat sl.pasm
sleep 1
end
$ parrot sl.pasm

[ in second term ]
$ ps ax | grep [p]arrot
28952 pts/0 S 0:00 parrot sl.pasm
28953 pts/0 S 0:00 parrot sl.pasm
28954 pts/0 S 0:00 parrot sl.pasm
$ gdb parrot 28952
GNU gdb 5.3
...
0x4011a391 in __libc_nanosleep () at __libc_nanosleep:-1
-1 __libc_nanosleep: No such file or directory.
in __libc_nanosleep
(gdb) bac
#0 0x4011a391 in __libc_nanosleep () at __libc_nanosleep:-1
#1 0x4011a31b in __sleep (seconds=1) at ../sysdeps/unix/sysv/linux/sleep.c:82
#2 0x08086792 in Parrot_sleep (seconds=1) at src/platform.c:47
#3 0x080f89c4 in Parrot_sleep_ic (cur_opcode=0x826e488, interpreter=0x824b0a8) at ops/sys.ops:151
#4 0x08082921 in runops_slow_core (interpreter=0x824b0a8, pc=0x826e488) at src/runops_cores.c:115
...

This is on linux; the lowest PID is the main thread. There should be some hints where it hangs.

-- c

leo
[FYI] Win32 SFU
I don't know if we should depend on that, but it would definitely help. Could some Windows guys have a look at: http://www.microsoft.com/windows/sfu/

"Interoperability. Integration. Extensibility. Windows Services for UNIX (SFU) 3.5 provides the tools and environment that IT professionals and developers need to integrate Windows with UNIX and Linux environments."

leo
Re: Some namespace notes
On Jan 15, 2004, at 8:26 PM, Benjamin K. Stuhl wrote:

Thus wrote Dan Sugalski: At 10:13 AM -0800 1/13/04, Jeff Clites wrote: Short version: I was originally going to argue for fully hierarchical namespaces, identified as above, but after turning this over in my head for a while, I came to the conclusion that namespaces are not conceptually hierarchical (especially as used in languages such as Perl5 and Java, at least), so I'm going to argue for a single string (rather than an array) as a namespace identifier. ...

Performance-wise, I would guesstimate that it's more-or-less a wash between parsing strings and parsing multidimensional keys, so as long as we precreate the keys (keep them in a constant table or something), I see no performance issues.

It turns out that it makes a big difference in lookup times--doing one hash lookup v. several. I did this experiment using Perl5 (5.8.0): Create a structure holding 1296 entries, each logically 12 characters long--either one level of 12 character strings, or 2 levels of 6 character strings, or 3 levels of 4 character strings, or 4 levels of 3 character strings--and look up the same item 10 million times. Here is the time it takes for the lookups:

1-level: 14 sec.
2-level: 20 sec.
3-level: 25 sec.
4-level: 32 sec.

Conclusion: It's faster to do one lookup of a single, longer string than several lookups of shorter strings. Of course, as Uri pointed out, even if we go with hierarchical namespaces, we could implement these internally as a single-level hash, behind the scenes, as an implementation detail and optimization.

JEff
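Jeff's experiment is easy to reproduce. Here is a rough Python analogue (the key shapes are mine, and absolute timings will of course differ from the Perl 5 numbers above): 1296 entries, each with a 12-character logical name, stored once as a flat dict keyed by the joined name and once as a 4-level nest of dicts keyed by 3-character segments.

```python
import timeit

# Six 3-character segments; 6^4 = 1296 entries, 12 chars per full name.
segs = ["s%02d" % i for i in range(6)]
flat = {}
nested = {}
for a in segs:
    for b in segs:
        for c in segs:
            for d in segs:
                flat[a + b + c + d] = 42
                nested.setdefault(a, {}).setdefault(b, {}) \
                      .setdefault(c, {})[d] = 42

joined = "s03s01s04s01"

# Sanity check: both layouts hold the same value.
assert flat[joined] == nested["s03"]["s01"]["s04"]["s01"] == 42

# One lookup of a long key vs. four lookups of short keys.
t_flat = timeit.timeit(lambda: flat[joined], number=100_000)
t_nest = timeit.timeit(
    lambda: nested["s03"]["s01"]["s04"]["s01"], number=100_000)
print(f"flat:   {t_flat:.3f}s")
print(f"nested: {t_nest:.3f}s")
```

On most runs the flat layout wins, matching Jeff's conclusion, though dict lookups are cheap enough in CPython that the gap is smaller than Perl 5's.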
Re: Problem during make test
On Thu, 2004-01-15 at 23:26, Leopold Toetsch wrote: Could you attach gdb to the hanging parrot?

This time, it's hanging at t/op/00ff-dos.t:

(gdb) bac
#0 0x0fd0e600 in sigsuspend () from /lib/libc.so.6
#1 0x0ff970ac in __pthread_wait_for_restart_signal () from /lib/libpthread.so.0
#2 0x0ff96cf8 in pthread_onexit_process () from /lib/libpthread.so.0
#3 0x0fd10bc8 in exit () from /lib/libc.so.6
#4 0x1008c750 in Parrot_exit (status=0) at src/exit.c:54
#5 0x100320b4 in main (argc=1, argv=0x75c0) at imcc/main.c:555

Here's another run, this time hanging at test #3 in t/op/arithmetics.t:

#0 0x0fd0e600 in sigsuspend () from /lib/libc.so.6
#1 0x0ff970ac in __pthread_wait_for_restart_signal () from /lib/libpthread.so.0
#2 0x0ff96cf8 in pthread_onexit_process () from /lib/libpthread.so.0
#3 0x0fd10bc8 in exit () from /lib/libc.so.6
#4 0x1008c750 in Parrot_exit (status=0) at src/exit.c:54
#5 0x100320b4 in main (argc=1, argv=0x75b0) at imcc/main.c:555

I can upgrade glibc to see if that helps.

-- c
Re: Some namespace notes
Jeff Clites writes:

On Jan 15, 2004, at 8:26 PM, Benjamin K. Stuhl wrote: Thus wrote Dan Sugalski: At 10:13 AM -0800 1/13/04, Jeff Clites wrote: Short version: I was originally going to argue for fully hierarchical namespaces, identified as above, but after turning this over in my head for a while, I came to the conclusion that namespaces are not conceptually hierarchical (especially as used in languages such as Perl5 and Java, at least), so I'm going to argue for a single string (rather than an array) as a namespace identifier. ...

Performance-wise, I would guesstimate that it's more-or-less a wash between parsing strings and parsing multidimensional keys, so as long as we precreate the keys (keep them in a constant table or something), I see no performance issues.

It turns out that it makes a big difference in lookup times--doing one hash lookup v. several. I did this experiment using Perl5 (5.8.0): Create a structure holding 1296 entries, each logically 12 characters long--either one level of 12 character strings, or 2 levels of 6 character strings, or 3 levels of 4 character strings, or 4 levels of 3 character strings--and look up the same item 10 million times. Here is the time it takes for the lookups:

1-level: 14 sec.
2-level: 20 sec.
3-level: 25 sec.
4-level: 32 sec.

Conclusion: It's faster to do one lookup of a single, longer string than several lookups of shorter strings. Of course, as Uri pointed out, even if we go with hierarchical namespaces, we could implement these internally as a single-level hash, behind the scenes, as an implementation detail and optimization.

JEff

My two cents: I don't care as long as we can toss symbol tables around as PMCs, and replace symbol tables with different (alternately implemented) PMCs. I think it's possible to do this using a clever ordered hash scheme even if we go one level, but it's something to keep in mind.

Luke
Re: JVM as a threading example (threads proposal)
On Jan 15, 2004, at 10:55 PM, Leopold Toetsch wrote:

Damien Neil [EMAIL PROTECTED] wrote: On Thu, Jan 15, 2004 at 09:31:39AM +0100, Leopold Toetsch wrote: I don't see any advantage of such a model, especially as it doesn't guarantee atomic access to e.g. longs or doubles. The atomic access to ints and pointers seems to rely on the architecture but is of course reasonable.

You *can't* guarantee atomic access to longs and doubles on some architectures, unless you wrap every read or write to one with a lock. The CPU support isn't there.

Yes, that's what I'm saying. I don't see an advantage of the JVM's multi-step variable access, because it doesn't even provide such atomic access.

What I was expecting that the Java model was trying to do (though I didn't find this) was something along these lines: Accessing the main store involves locking, so by copying things to a thread-local store we can perform several operations on an item before we have to move it back to the main store (again, with locking). If we worked directly from the main store, we'd have to lock for each and every use of the variable. The reason I'm not finding it is that the semantic rules spelled out in the spec _seem_ to imply that every local access implies a corresponding access to the main store, one-to-one. On the other hand, maybe the point is that it can save up these accesses--that is, lock the main store once, and push back several values from the thread-local store. If it can do this, then it is saving some locking.

Parrot deals with PMCs, which can contain (let's consider scalars only) e.g. a PerlInt or a PerlNum. Now we would have atomic access (normally) to the former and very likely non-atomic access to the latter, just depending on the value which happened to be stored in the PMC. This implies that we have to wrap almost[1] all shared write *and* read PMC access with LOCK/UNLOCK.
[1] except plain ints and pointers on current platforms Ah, but this misses a key point: We know that user data is allowed to get corrupted if the user isn't locking properly--we only have to protect VM-internal state. The key point is that it's very unlikely that there will be any floats involved in VM-internal state--it's going to be all pointers and ints (for offsets and lengths). That is, a corrupted float won't crash the VM. JEff
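The batching Jeff describes can be sketched as a toy model (not the actual JVM memory model, and all names here are mine): a thread works on a local copy and takes the main-store lock only when it syncs, instead of on every access. Counting lock acquisitions makes the saving visible.

```python
import threading

class MainStore:
    """Toy shared store: every direct access takes the lock."""
    def __init__(self):
        self._lock = threading.Lock()
        self._vars = {}
        self.lock_count = 0           # how often we had to lock

    def load(self, name):
        with self._lock:
            self.lock_count += 1
            return self._vars.get(name, 0)

    def store(self, name, value):
        with self._lock:
            self.lock_count += 1
            self._vars[name] = value

class WorkingCopy:
    """Thread-local copies: lock once per sync, not per access."""
    def __init__(self, store):
        self.store = store
        self.local = {}

    def read_in(self, *names):        # one lock for many loads
        with self.store._lock:
            self.store.lock_count += 1
            for n in names:
                self.local[n] = self.store._vars.get(n, 0)

    def write_back(self):             # one lock for many stores
        with self.store._lock:
            self.store.lock_count += 1
            self.store._vars.update(self.local)

# Five increments, touching the store directly: 2 locks per iteration.
direct = MainStore()
for _ in range(5):
    direct.store("x", direct.load("x") + 1)
assert direct.lock_count == 10

# Same work through a working copy: 2 locks total.
batched = MainStore()
wc = WorkingCopy(batched)
wc.read_in("x")
for _ in range(5):
    wc.local["x"] += 1                # no locking here
wc.write_back()
assert batched.lock_count == 2
assert batched._vars["x"] == direct._vars["x"] == 5
```

Of course, as the thread notes, the saving only materializes if the semantics actually let the VM defer the write-back; a strict one-to-one mapping between local and main-store accesses buys nothing.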
Re: Problem during make test
On Jan 15, 2004, at 10:42 PM, chromatic wrote:

On Sun, 2004-01-04 at 12:09, Harry Jackson wrote: I tried that as well, it spits out identical PASM each time but on the odd occasion I need to use CTRL-C to get back to the shell.

I'm seeing the same thing on Linux PPC -- odd hangs from time to time when running PIR, while running the PASM emitted with -o works well. t/op/arithmetics 3 and 9 seem to be the big culprits in the test suite. Perl 5.8.2, gcc version 3.2.3 20030422. I've checked out a fresh source tree and still see this behavior. Removing -DHAVE_JIT from the Makefile (since I didn't find the configure argument) had no effect.

Yeah, I think JIT is a red herring--I don't see how JIT problems can be involved when not running with the JIT core.

JEff
Q: thread function
Should a thread function always be non-prototyped, or do we allow prototyped ones too? On preparing a thread, the parameters of the thread function are copied/cloned into the new thread interpreter. So I'd like to know which registers get the state from the calling interpreter (all, or according to prototyped or non-prototyped calling conventions).

leo
Re: Optimization brainstorm: variable clusters
At 19:52 -0500 1/15/04, Melvin Smith wrote:

At 04:26 PM 1/15/2004 -0700, Luke Palmer wrote: I can see some potential problems to solve with regards to some languages where variables are dynamic and can be undefined, such as Perl6, but the optimization would certainly work for constants in all languages.

The only problem with Perl6 would be if a global or package variable's address changed after it was stored in the register group at bytecode load time (which could probably happen).

Which is very hard not to happen as soon as you get into Exporter land. ;-( For example:

use Scalar::Util qw(blessed weaken reftype);
use POSIX;

Anytime we cache something dynamic, we have to make sure the caches know about changes. I think that is where notifications might help. For constants it is easy. IMCC might say: this routine requires us to initialize at least 3 registers with a constant value, let's make it into a register block. This may be a premature optimization, but for certain cases I think it's pretty nifty.

This smells like premature optimization to me for languages such as Perl[\d]. The number of times a variable occurs in a program may have _no_ relation to how many times it will be accessed. So what's the optimization then? If you're thinking about this, then maybe a better heuristic would be to group globals into groups that are _only_ referenced within a specific scope, and fetch them on scope entry and store them on scope exit. But then, anything like eval or the equivalent of a glob assignment (or even worse: an event) within that scope will cause problems.

But please, people around me always tell me that I'm way too negative. That I'm always saying why things _can't_ happen. I'd like to be proven wrong... ;-)

Liz
Re: [DOCS] POD Errors
They're already committed.

On 16 Jan 2004, at 00:21, chromatic wrote: On Thu, 2004-01-15 at 15:02, Michael Scott wrote: So, after migrating from Pod::Checker to Pod-Simple, I've cleared up all the pod errors and done a rudimentary html tree.

Do you have patches to fix the errors in CVS or are they even necessary?

-- c
Re: Problem during make test
Chromatic [EMAIL PROTECTED] wrote:

On Thu, 2004-01-15 at 23:26, Leopold Toetsch wrote: Could you attach gdb to the hanging parrot?

This time, it's hanging at t/op/00ff-dos.t:

(gdb) bac
#0 0x0fd0e600 in sigsuspend () from /lib/libc.so.6
#1 0x0ff970ac in __pthread_wait_for_restart_signal () from /lib/libpthread.so.0
#2 0x0ff96cf8 in pthread_onexit_process () from /lib/libpthread.so.0
#3 0x0fd10bc8 in exit () from /lib/libc.so.6
#4 0x1008c750 in Parrot_exit (status=0) at src/exit.c:54
#5 0x100320b4 in main (argc=1, argv=0x75c0) at imcc/main.c:555

Ugly. Parrot starts the event thread as detached, so that *should* not cause problems. But maybe I'm doing something stupid somewhere. The event thread is waiting on a queue condition, but when main exits, it should just terminate AFAIK. I can send a special event_loop_terminate event though.

I can upgrade glibc to see if that helps. -- c

That's another possibility.

leo
Re: Problem during make test
Chromatic [EMAIL PROTECTED] wrote: This time, it's hanging at t/op/00ff-dos.t:

I've checked in now:

* terminate the event loop thread on destroying the last interp
* this could help against the spurious hangs reported on p6i

Could you please check if that helps. Thanks,
leo
Re: Optimization brainstorm: variable clusters
Elizabeth Mattijsen [EMAIL PROTECTED] wrote: If you're thinking about this, then maybe a better heuristic would be to group globals into groups that are _only_ referenced within a specific scope and fetch them on scope entry and store them on scope exit. But then, anything like eval or the equivalent of a glob assignment (or even worse: an event) within that scope, will cause problems.

Liz

Storing lexicals or globals isn't needed:

$ cat g.pasm
new P0, .PerlInt
set P0, 4
store_global "$a", P0
# ...
find_global P1, "$a"
inc P1
find_global P2, "$a"
print P2
print "\n"
end
$ parrot g.pasm
5

So the optimization is to just keep lexicals/globals in registers as long as we have some. Where currently spilling is done, we just forget about that register (but not *reuse* it; C<new> or such is ok) - and refetch the variable later. So the *only* current optimization is: we need HLL directives for lexicals and globals so that the spilling code and register allocator can use this information. That is: we can always cut the life range of lexicals/globals, *if* we refetch where we now fetch from the spill array.

leo
Re: Unicode, internationalization, C++, and ICU
Maybe we can use someone else's solution... http://lists.ximian.com/archives/public/mono-list/2003-November/016731.html

On 16 Jan 2004, at 00:33, Jonathan Worthington wrote:

- Original Message - From: Dan Sugalski [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Thursday, January 15, 2004 8:09 PM Subject: Unicode, internationalization, C++, and ICU

Now, assuming there's still anyone left reading this message... We've been threatening to build ICU into parrot, and it's time for that to start happening. Unfortunately there's a problem--it doesn't work right now. So, what we need is some brave soul to track ICU development and keep us reasonably up to date. What I'd really like is: 1) ICU building and working 2) ICU not needing any C++

I've done some testing, and I hate to be the bearer of bad news, but I believe we have something of a problem. :-( The configure script turns out to be a shell script which, unless I'm mistaken, means we're currently unable to build ICU anywhere we don't have bash or similar. Win32 for starters, which is where I'm testing.

A possible solution might be to re-write the configure script in Perl - though we'd have to keep it maintained as we do ICU updates. Another one, for Win32 at least, is that we *might* be able to use UNIX Services For Win32 and run configure under that, generate a Win32 makefile and just copy it in place with the configure script. Less portable to other places with the same problem, though, and again we have to maintain it as we update ICU. There is also a problem with the configure stage on Win32, but that's an aside until the above issue is sorted out.

I also gave it a spin in cygwin, where the configure script for ICU runs OK, but there's no C++ compiler so it doesn't get built.

Thoughts?

Jonathan
[PATCHish] Food for thought about allocation
I did a little benchmark test today that had some enlightening results. I'll explain what my goal was, then show you said results, and finally a patch (not meant to be committed) and the benchmark programs themselves.

I dearly wanted real Continuations to be faster, because I loathe RetContinuations. They suck. Honest. They're of no use as or with continuations -- they do little more than put a return value on the proverbial stack. And they can't be promoted without serious danger of the state being corrupted before this happens.

So, I implemented a new register stack scheme without chunks. The register stack is now a linked list of single frames. This has the advantage that you *never* have to copy frames, nor do you have to mess with marking things COW. I figured it would be an improvement. I wrote it using Parrot's default small object allocator for each frame. I ran benchmark_1 (see below), and got these results (or around there, I idiotically didn't keep them around):

% time parrot-sync benchmark_1.imc   # Parrot from CVS
parrot-sync benchmark_1.imc 0.87s user ...
% time parrot benchmark_1.imc        # My modified version
parrot benchmark_1.imc 2.04s user ...

Benchmark 1 is about raw save/restore speed, without respect for anything fancy. This was clearly unacceptable. It doesn't matter how much faster it makes continuations; a speed hit this large can't be accepted. I thought the problem might be in the small object allocator, as I couldn't imagine how the simple algorithms in my modification could possibly take so long. So I wrote my own small object allocator for the register stacks, and got:

% time parrot benchmark_1.imc
parrot benchmark_1.imc 1.11s user ...

Much better. Didn't make me entirely happy, but a great improvement indeed. This could probably be improved with some careful and attentive coding. Here are the times for the other benchmarks:

% time parrot-sync benchmark_2.imc
parrot-sync benchmark_2.imc 0.78s user ...
% time parrot benchmark_2.imc
parrot benchmark_2.imc 0.89s user ...

Benchmark 2 calls a sub many times, creating a RetContinuation each time. And finally the one with real continuations:

% time parrot-sync benchmark_3.imc
parrot-sync benchmark_3.imc 1.58s user ...
% time parrot benchmark_3.imc
parrot benchmark_3.imc 0.45s user ...

Benchmark 3 does the same as benchmark 2, except that it creates a real Continuation each time. Note the number of iterations is half that of benchmark 2.

Well, I achieved my goal for sure. Side results weren't so great. But the most enlightening thing is how much things improved by writing my own small object allocator -- even a quick and dirty one. Last time I was fishing through that subsystem, it seemed to have a lot of overhead. Are there things about it that require this overhead, or could it take an optimization run bringing it near the overhead level of this little one? If so, we might have an opportunity to boost parrot's usual speed by a fine degree.

Here are the benchmarks:

benchmark_1.imc
---
.sub _main
    $I0 = 50
again:
    unless $I0 goto quit
    savetop
    restoretop
    dec $I0
    goto again
quit:
    end
.end

benchmark_2.imc
---
.sub _main
    $I0 = 10
    newsub P0, .Sub, _other
again:
    unless $I0 goto quit
    saveall
    newsub P1, .RetContinuation, back
    invokecc
back:
    restoreall
    dec $I0
    goto again
quit:
    end
.end

.sub _other
    invoke P1
.end

benchmark_3.imc
---
.sub _main
    $I0 = 5
    newsub P0, .Sub, _other
again:
    unless $I0 goto quit
    saveall
    newsub P1, .Continuation, back
    invokecc
back:
    restoreall
    dec $I0
    goto again
quit:
    end
.end

.sub _other
    invoke P1
.end

And the exemplary patch:

Index: config/gen/config_h/config_h.in
===
RCS file: /cvs/public/parrot/config/gen/config_h/config_h.in,v
retrieving revision 1.20
diff -u -r1.20 config_h.in
--- config/gen/config_h/config_h.in 24 Dec 2003 14:54:16 - 1.20
+++ config/gen/config_h/config_h.in 16 Jan 2004 10:08:56 -
@@ -100,15 +100,8 @@
 /* typedef INTVAL *(*opcode_funcs)(void *, void *) OPFUNC; */

-#define FRAMES_PER_CHUNK 16
-
 /*
Default amount of memory to allocate in one whack */
 #define DEFAULT_SIZE 32768
-
-#define FRAMES_PER_PMC_REG_CHUNK FRAMES_PER_CHUNK
-#define FRAMES_PER_NUM_REG_CHUNK FRAMES_PER_CHUNK
-#define FRAMES_PER_INT_REG_CHUNK FRAMES_PER_CHUNK
-#define FRAMES_PER_STR_REG_CHUNK FRAMES_PER_CHUNK

 #define JIT_CPUARCH ${jitcpuarch}
 #define JIT_OSNAME ${jitosname}
Index: imcc/pcc.c
===
RCS file: /cvs/public/parrot/imcc/pcc.c,v
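The hand-rolled allocator Luke describes amounts to a free list: freed frames are chained together and handed back on the next allocation, so the hot path is a pointer swap rather than a trip through a general-purpose allocator. A minimal Python sketch of the idea (class and field names are mine, not Parrot's):

```python
class Frame:
    """A register frame; 'next' doubles as the free-list link."""
    __slots__ = ("regs", "next")
    def __init__(self):
        self.regs = [0] * 32
        self.next = None

class FrameAllocator:
    """Tiny free-list allocator: pop a recycled frame if one exists,
    otherwise construct a fresh one; free() just pushes onto the list."""
    def __init__(self):
        self.free_list = None

    def alloc(self):
        f = self.free_list
        if f is not None:
            self.free_list = f.next   # pop: O(1), no real allocation
            f.next = None
            return f
        return Frame()                # slow path: fresh frame

    def free(self, f):
        f.next = self.free_list       # push: O(1)
        self.free_list = f

pool = FrameAllocator()
a = pool.alloc()
pool.free(a)
b = pool.alloc()
assert a is b                         # the frame was recycled
```

The question Luke raises is exactly whether Parrot's general small object allocator has overhead beyond this push/pop core that could be optimized away.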
Vtables organization
PMCs use vtables for almost all their functionality *and* for stuff that in Perl5 terms is magic (or they should, AFAIK). E.g. setting the _ro property of a PMC (that supports it[1]) swaps in the Const$PMC vtable, where all vtable methods that would change the PMC throw an exception. Or: setting a PMC shared would swap in a vtable that locks e.g. internal aggregate state on access. That is, a non-shared PMC doesn't suffer any locking slowdown. Tieing will very likely swap in just another vtable, and so on.

The questions are:
- Where and how should we store these vtables?
- Are these PMC variants distinct types (with class_enum and name)?
- Or are these sub_types (BTW, what is vtable->subtype)? E.g. hanging off from the main vtable?

Comments welcome, leo

[1] This still needs more work: Real constant PMCs are allocated in a separate arena which isn't scanned during DOD, *but* all items that the PMC may refer to have to be constant too, including Buffers it may use. But swapping in the vtable is working.
Events and JIT
Event handling currently works for all run cores[1] except JIT. The JIT core can't use the schemes described below, but we could:

1) explicitly insert checks for whether events are to be handled, either 1a) everywhere, or 1b) in places like those described below under [1] c)
2) patch the native opcodes at these places with e.g. an int3 (SIG_TRAP, debugger hook) cpu instruction and catch the trap. Running the event handler (sub) from there should be safe, as we are in a consistent state in the run loop.
3) more ideas?

1) of course slows down execution of all JIT code; 2) is platform/architecture dependent, but JIT code is that anyway.

Comments welcome, leo

[1] a) Run cores with an opcode dispatch table get a new dispatch table, where all entries point to the event handling code.
b) The switch core checks at the beginning of the switch statement.
c) Prederefed run cores get the opcode stream patched, where backward branches and invoke or such[2] are replaced with event-check opcodes.

While a) and c) are run async from the event thread, this shouldn't cause problems, because (assuming atomic word access) either the old function table/opcode pointer is visible or already the new one; there is no inconsistent state. Events using a) or b) are handled instantly, while c) events get handled some (probably very short) time after they were scheduled.

[2] Explicit hints from the ops files, where to check events, would simplify that. E.g.

op event invoke()
op event_after sleep(..)
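Scheme [1] a) -- atomically swapping in a dispatch table whose entries all route through the event checker -- can be sketched with a toy interpreter loop (this is an illustration of the table-swap idea only, not Parrot's actual core; all names are mine):

```python
# Toy op-dispatch core. Real Parrot indexes a table of C function
# pointers; here the "table" is a list of Python callables.
events = []

def op_add(state):   state["acc"] += 1
def op_print(state): state["out"].append(state["acc"])

NORMAL_TABLE = [op_add, op_print]

def check_events(opnum, state):
    """Handle pending events, then run the op from the normal table."""
    while events:
        state["handled"].append(events.pop(0))
    NORMAL_TABLE[opnum](state)

# The event-table entries all route through check_events first.
EVENT_TABLE = [lambda s, n=n: check_events(n, s)
               for n in range(len(NORMAL_TABLE))]

state = {"acc": 0, "out": [], "handled": []}
program = [0, 0, 1]        # add, add, print

# The event thread "schedules" an event by swapping the whole table:
# a single pointer store, so the run loop sees either the old table
# or the new one -- never an inconsistent mix.
table = EVENT_TABLE
events.append("timer")

for opnum in program:
    table[opnum](state)

assert state["handled"] == ["timer"]   # event handled on next dispatch
assert state["out"] == [2]             # program semantics unchanged
```

After the handler drains the queue it could swap the normal table back in, so the per-op cost returns to zero when no events are pending.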
Re: Some namespace notes
Here's my proposal:

* Basics: Parrot uses nested hashes for namespaces (like perl does). The high-level language splits namespace strings using whatever its separator is ('::', '.' etc) to generate an array of strings for the namespace lookup.

* Relative roots: Namespace lookup starts from a 'root' namespace (think root directory). Here the P2 argument holds the root namespace to start the lookup from:

find_global P1, P2, ['global', 'namespace', 'hierarchy'], thingname

If it's null then the interpreter's default root namespace is used. This scheme allows chroot() style shifting of the effective root. (It's a key part of how the perl Safe module works, for example.)

* Per-language root: Each HLL could use a 'root' that's one level down from the true root. Using a directory tree for illustration:

/perl/Carp/carp       perl sees Carp at top level
/java/java/lang/...   java sees java at top level

* Backlinks:

/perl/main -.   main points back to perl's own root
           ^--' (so $main::main::main::foo works as it should)
/perl/parrot -.   parrot points back to the true root
             ^-'

* Accessing namespaces of other languages: Given the above, accessing the namespace of other languages is as simple as /perl/parrot/java/java/lang/String/..., eg $parrot::java::java::lang::String::CASE_INSENSITIVE_ORDER for perl, and parrot.perl.Carp.carp for Java (perhaps, I don't claim to know any Java).

* Summary:
- Nested hashes allow chroot() style shifting of the root.
- That requires the 'effective root' to be passed to find_global.
- Each HLL could have its own 'root' to avoid name clashes.
- Backlinks can be used to provide access to other HLL namespaces.
- This combination of unification (all in one tree) and isolation (each HLL has a separate root) offers the best of all worlds.

Tim.
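Tim's scheme maps naturally onto nested dicts. A rough sketch of the roots and backlinks (the function and key names are mine, not a proposed Parrot API):

```python
# Nested-dict namespace tree with per-language roots and backlinks.
root = {}
root["perl"] = {"Carp": {"carp": "<perl sub carp>"}}
root["java"] = {"java": {"lang": {"String":
    {"CASE_INSENSITIVE_ORDER": "<cmp>"}}}}

# Backlinks: perl's 'main' points at perl's own root, and 'parrot'
# points back at the true root (cycles are fine in a dict).
root["perl"]["main"] = root["perl"]
root["perl"]["parrot"] = root

def find_global(ns_root, path, name):
    """Walk 'path' down from ns_root, then look up 'name' there."""
    for seg in path:
        ns_root = ns_root[seg]
    return ns_root[name]

# Perl code sees Carp at top level (its effective root is /perl)...
assert find_global(root["perl"], ["Carp"], "carp") == "<perl sub carp>"

# ...and reaches Java's globals through the parrot backlink.
assert find_global(root["perl"],
                   ["parrot", "java", "java", "lang", "String"],
                   "CASE_INSENSITIVE_ORDER") == "<cmp>"

# The $main::main::main::foo-style self-reference also works:
assert root["perl"]["main"]["main"] is root["perl"]
```

Passing a different `ns_root` is exactly the chroot() style shift: a Safe-like sandbox just gets handed a fresh subtree as its effective root.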
Ops file hints
IMHO we need some more information in the ops files:

1) If an INT argument refers to a label/branch destination
2) For the event-checking code
3) For the safe run core

ad 1) e.g.

inline op eq(in INT, in INT, inconst INT) {
inline op eq(in INT, in INT, in_branch_const INT) {

The C<in_branch_const> translates during ops file mangling to C<inconst> *and* sets an appropriate flag in OpLib/core.pm, which is carried on into the op_info structure. Currently imcc just estimates if an opcode would branch and where the label is, which is bad (and error prone) - or more precisely, branching ops are known, but not all have an associated label.

ad 2) e.g.

op event invoke()
op event_after sleep()

would mean to check at or after that opcode if events are to be handled. This would be a flag (or 2) in the op_info.

ad 3) The safe run-core will very likely need to disallow opcodes per category, similar as in perldoc Opcode. e.g.

op :base_io print(...)

Comments welcome (and takers, if we agree)

leo
Re: [PATCHish] Food for thought about allocation
Luke Palmer wrote: I did a little benchmark test today that had some enlightening results.

First of all: did you compile Parrot optimized? Because:

I thought the problem might be in the small object allocator,

You actually have a similar thing, but mainly inlined. Object allocation and DOD need optimized compiles.

saveall
newsub P1, .RetContinuation, back
invokecc

You are creating *two* RetContinuations with that.

leo
Re: Some namespace notes
Tim Bunce [EMAIL PROTECTED] wrote: Here's my proposal: * Basics: Parrot uses nested hashes for namespaces (like perl does). * Relative roots: Namespace lookup starts from a 'root' namespace (think root directory). Here the P2 argument holds the root namespace to start the lookup from: find_global P1, P2, ['global', 'namespace', 'hierarchy'], thingname

Tim.

I like that, except: *again*, the above syntax sucks.

find_global P1, P2 ['global'; 'namespace'; 'hierarchy'; thingname]

P2 can be a namespace PMC or the interpreter itself.

find_global P3, P2 ['global'; 'namespace'; 'hierarchy']

returns another namespace, and ...

find_global P1, P3 [thingname]

is the same as the first. The original syntax would need heavy modifications in the assembler; the latter fits nicely.

leo
Re: Some namespace notes
On Fri, Jan 16, 2004 at 12:49:09PM +0100, Leopold Toetsch wrote:

Tim Bunce [EMAIL PROTECTED] wrote: Here's my proposal: * Basics: Parrot uses nested hashes for namespaces (like perl does). * Relative roots: Namespace lookup starts from a 'root' namespace (think root directory). Here the P2 argument holds the root namespace to start the lookup from: find_global P1, P2, ['global', 'namespace', 'hierarchy'], thingname

I like that, except: *again*, the above syntax sucks.

find_global P1, P2 ['global'; 'namespace'; 'hierarchy'; thingname]

P2 can be a namespace PMC or the interpreter itself.

find_global P3, P2 ['global'; 'namespace'; 'hierarchy']

returns another namespace, and ...

find_global P1, P3 [thingname]

is the same as the first. The original syntax would need heavy modifications in the assembler; the latter fits nicely.

Sure. Sounds good. (I'm not well placed to talk about syntax as I've not yet written any parrot code, though that may be about to change; it's the principles of a unified hierarchy, chroot, and backlinks that are important.)

Tim.
Re: Optimization brainstorm: variable clusters
On Thu, Jan 15, 2004 at 06:27:52PM -0500, Melvin Smith wrote: At 06:13 PM 1/15/2004 -0500, Melvin Smith wrote: At 10:02 PM 1/15/2004 +0100, Elizabeth Mattijsen wrote: At 15:51 -0500 1/15/04, Melvin Smith wrote: Comments/questions welcome. Why am I thinking of the register keyword in C? I have no idea and I can't see the relationship. :) I just realized my response sounded crass, and wasn't meant to be. I welcome comments, I just didn't understand what relation you were getting at. Feel free to point it out to me. The context: Jonathan was asking about importing constants at runtime and/or constant namespaces. Dan and I were discussing the issues and how routines with lots of package globals or constants would spend a significant part of their time retrieving symbols. Jonathan did not want compile time constants, Dan did not want importable constants that mutate the bytecode at runtime, so I was trying to come up with a compromise, ugly as it may be. Aren't constant strings going to be saved in a way that lets the address of the saved string be used to avoid string comparisons? (As is done for hash keys in perl5.) Perhaps that's already done. Then bytecode could 'declare' all the string constants it contains. The byteloader could merge them into the global saved strings pool and 'fixup' references to them in the bytecode. If the namespace lookup code knew when it was being given saved string pointers it could avoid the string compares as it walks the namespace tree. Maybe all that's been done. Here's an idea that builds on that: Perhaps a variant of a hash that worked with large integers (pointers) as keys could be of some use here. The namespace could be a tree of these 'integer hashes'. Most namespace lookups use constants which can be 'registered' in the unique string pool at byteloader time. To look up a non-constant string you just need to check if it's in the unique string pool and get that pointer if it is. If it's not then you know it doesn't exist anywhere.
If it is, you do the lookup using the address of the string in the pool. The JudyL functions (http://judy.sourceforge.net/) provide a very efficient 'integer hash'. Tim.
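Tim's unique-string-pool idea can be modeled with Python's own string interning. This is an analogy, not Parrot's implementation: sys.intern stands in for the "unique string pool", and object identity stands in for pointer comparison.

```python
# Sketch (assumption, not Parrot internals): once two equal strings are
# interned into a shared pool, they are the same object, so an identity
# (pointer) comparison replaces a character-by-character compare -- the
# trick Tim describes for constant namespace keys.
import sys

pool_a = sys.intern("some::namespace::key")
# Same contents built "at runtime" from pieces:
pool_b = sys.intern("some::" + "namespace" + "::key")

# Equal contents collapse to one shared object after interning:
assert pool_a is pool_b

# A table keyed on object identity stands in for the 'integer hash'
# (JudyL-style, keyed on pointers) from the proposal:
by_id = {id(pool_a): "the value"}
assert by_id[id(pool_b)] == "the value"
```

A non-constant key is handled the way Tim says: intern it first (one hash lookup into the pool), and if it is absent from the pool it cannot exist in any namespace.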
postgres lib
hello i noticed that harry jackson is writing a postgres interface to wrap the nci functions. i'd like him to show what he's written so far and give his comments -- guess i'm not the only one hoping to test it. LF
Re: Optimization brainstorm: variable clusters
At 9:30 AM +0100 1/16/04, Elizabeth Mattijsen wrote: At 19:52 -0500 1/15/04, Melvin Smith wrote: At 04:26 PM 1/15/2004 -0700, Luke Palmer wrote: I can see some potential problems to solve with regards to some languages where variables are dynamic and can be undefined, such as Perl6, but the optimization would certainly work for constants in all languages. The only problem with Perl6 would be if a global or package variable's address changed after it was stored in the register group at bytecode load time, (which could probably happen). Which is very hard not to happen as soon as you get into Exporter land. ;-( Well... sorta. A lot of that stuff's known at compile time. Anytime we cache something dynamic, we have to make sure the caches know about changes. I think that is where notifications might help. For constants it is easy. IMCC might say, this routine requires us to initialize at least 3 registers with a constant value, let's make it into a register block. This may be a premature optimization, but for certain cases I think it's pretty nifty. This smells like premature optimization to me for languages such as Perl[\d]. To some extent, yes. I just had a really nasty thought, and I think the compiler writers need to get Official Rulings on behavior. With perl, for example, it's distinctly possible that this: our $foo; # It's a global $foo = 12; if ($foo > 10) { print $foo; } will require fetching $foo's PMC out of the global namespace three times, once for each usage. I don't know offhand if this is how perl 5 works (I think it might be) and we should check for perl 6, python, and ruby. This is mainly because of the possibility of tied or overridden namespaces, which would argue for a refetch on each use. *Not* refetching is a perfectly valid thing, and not specifying is also perfectly valid, but we need to check. -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Optimization brainstorm: variable clusters
Tim Bunce [EMAIL PROTECTED] wrote: Aren't constant strings going to be saved in a way that lets the address of the saved string be used to avoid string comparisons? Constant strings get an entry in the constant_table. So when comparing 2 constant strings of the *same* code segment, the strings differ if their addresses differ. (As is done for hash keys in perl5.) Perhaps that's already done. Not yet, but it's worth a look. The byteloader could merge them into the global saved strings pool and 'fixup' references to them in the bytecode. That's not possible generally. E.g. eval()ing a piece of code with varying string constants would grow the global string constants forever. Perhaps a variant of a hash that worked with large integers (pointers) as keys could be of some use here. That doesn't play well with dynamic code AFAIK. Namespace keys can be string vars too. The JudyL functions (http://judy.sourceforge.net/) provide a very efficient 'integer hash'. I had a look at that some time ago, but the internals are horribly complex and it was leaking memory too. Tim. leo
Re: cvs commit: parrot/src dynext.c extend.c
Dan Sugalski [EMAIL PROTECTED] wrote: +++ nci.pmc 16 Jan 2004 13:29:52 - 1.22 +STRING* name () { +return SELF->vtable->whoami; All classes inherit the C<name> method from default.pmc. Did it not work without this addition? leo
Re: The todo list
At 12:45 PM -0500 1/15/04, Dan Sugalski wrote: What I'd like is for a volunteer or two to manage the todo queue. Nothing fancy, just be there to assign todo list items to the folks that volunteer, make sure they're closed out when done, and reassign them if whoever's handling a task needs to bail for whatever reason. Okay, I've two volunteers, Dave Pippenger and Stephane Peiry. (Make sure you get me your perl.org logins if you've not done so yet (I've not checked (Wheee, parenthesis!))) When Robert's back from vacation we'll get you installed with access to the RT queues for parrot and see what we can do. In the mean time, if anyone else has todo list items, send them (*ONE* per e-mail!) to bugs-parrot at bugs6.perl.org to get 'em in the queue and we'll start sorting them out from there. If we're lucky and have sufficient web luck we might even get 'em into a web-accessible TODO list (so make sure the subjects are descriptive too!) -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Optimization brainstorm: variable clusters
At 3:51 PM -0500 1/15/04, Melvin Smith wrote: While sitting on IRC with Dan and Jonathan discussing how to optimize a certain construct involving how we handle globals/package variables, etc., I came to the conclusion that it would be valuable to not have to fetch each and every global, lexical or package variable by name, individually, but instead fetch them in clusters (4-16 at a time). We already have register frames which are saved and restored very efficiently. Two things: 1) Lexicals should be reasonably fast, as they're integer indexable in most cases. (The only time we *need* hashlike access is when we're doing symbolic lookup, which is pretty uncommon. Downright impossible for many languages) 2) I can easily see having something equivalent to the lookback op for register stacks. Won't help for ints and floats, but it'll work fine for PMCs and strings. -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Some namespace notes
At 12:49 PM +0100 1/16/04, Leopold Toetsch wrote: Tim Bunce [EMAIL PROTECTED] wrote: Here's my proposal: * Basics: Parrot uses nested hashes for namespaces (like perl does). * Relative roots: Namespace lookup starts from a 'root' namespace (think root directory). Here the P2 argument holds the root namespace to start the lookup from: find_global P1, P2, ['global', 'namespace', 'hierarchy'], thingname I like that except: *again* above syntax sucks. find_global P1, P2 ['global'; 'namespace'; 'hierarchy'; thingname ] No. The thing will be a separate parameter. The original syntax would need heavy modifications in the assembler, the latter fits nicely. We can cope. The assembler needs a good kick with regards to keyed stuff anyway, I expect, and we're going to need this for constructing keys at runtime, something we've not, as yet, addressed. -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Unicode, internationalization, C++, and ICU
At 10:40 AM +0100 1/16/04, Michael Scott wrote: Maybe we can use someone else's solution... http://lists.ximian.com/archives/public/mono-list/2003-November/016731.html Could be handy. We really ought to detect a system-installed ICU and use that rather than our local copy at configure time, if it's of an appropriate version. That'd at least avoid having two copies, and potentially get us some system-wide runtime memory savings. - Original Message - From: Dan Sugalski [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Thursday, January 15, 2004 8:09 PM Subject: Unicode, internationalization, C++, and ICU Now, assuming there's still anyone left reading this message... We've been threatening to build ICU into parrot, and it's time for that to start happening. Unfortunately there's a problem--it doesn't work right now. So, what we need is some brave soul to track ICU development and keep us reasonably up to date. What I'd really like is: 1) ICU building and working 2) ICU not needing any C++ I've done some testing, and I hate to be the bearer of bad news but I believe we have something of a problem. :-( The configure script turns out to be a shell script which, unless I'm mistaken, means we're currently unable to build ICU anywhere we don't have bash or similar. Win32 for starters, which is where I'm testing. A possible solution might be to re-write the configure script in Perl - though we'd have to keep it maintained as we do ICU updates. Another one, for Win32 at least, is that we *might* be able to use UNIX Services For Win32 and run configure under that, generate a Win32 makefile and just copy it in place with the configure script. Less portable to other places with the same problem, though, and again we have to maintain it as we update ICU. There is also a problem with the configure stage on Win32, but that's an aside until the above issue is sorted out. 
I also gave it a spin in cygwin, where the configure script for ICU runs OK, but there's no C++ compiler so it doesn't get built. Thoughts? Jonathan -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Some namespace notes
At 11:00 PM -0800 1/15/04, Jeff Clites wrote: A key part of my argument (and it's fine if you understood this, and disagree--just wanted to make sure that it was clear) is that I think we shouldn't try to do any sort of cross-language unification. I saw that and wasn't really looking to deal with it, but I should've. I think we should have the potential for cross-language unification. It shouldn't be obligatory, but it should be easy, and I think we're going to see perl 5, perl 6, ruby, and python at least sharing a global namespace once we get things going sufficiently. -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Optimization brainstorm: variable clusters
Simon Cozens [EMAIL PROTECTED] wrote: [EMAIL PROTECTED] (Dan Sugalski) writes: if ($foo > 10) { print $foo; } This is mainly because of the possibility of tied or overridden namespaces, which would argue for a refetch on each use. No, come on, Dan. It's far worse than that. It'll be possible, from Perl-space, to override either < or print, and it should well be possible for them to, for instance, tie their operands. Wee! That's not the problem. But if the overridden < op (or the print) changes $foo's namespace on the fly, then a refetch would be necessary. I'd prefer to have a hint: our $foo is volatile; The normal case would be to fetch $foo exactly once - before any loop. Honestly an overridden op could insert new rules in the code and recompile everything. If we have to always check for such nasty things, then we can forget all performance and optimizations. leo
Re: Some namespace notes
At 11:07 AM + 1/16/04, Tim Bunce wrote: Here's my proposal: I like it all except for the backlink part, and that only because I'm not sure the names are right. I'm tempted to use reasonably unavailable characters under the hood (yeah, I'm looking at NUL (ASCII 0) and maybe SOH (ASCII 1) for language root and global root). Otherwise it looks good, and I think it's the way to be going. Languages can have the option of sharing a common root if they so choose, and set their search paths, since we're going to allow that sort of thing with nested namespaces. The default global space can be a two-level nest with the language level coming before the generic one in the search space. -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Optimization brainstorm: variable clusters
At 2:37 PM + 1/16/04, Simon Cozens wrote: [EMAIL PROTECTED] (Dan Sugalski) writes: if ($foo > 10) { print $foo; } This is mainly because of the possibility of tied or overridden namespaces, which would argue for a refetch on each use. No, come on, Dan. It's far worse than that. It'll be possible, from Perl-space, to override either < or print, and it should well be possible for them to, for instance, tie their operands. Wee! Yeah, but we still control what they get. We've not gone so completely mad that we hand in strings and code snippets for runtime evaluation... -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Numeric formatting
At 12:21 AM + 1/16/04, Tim Bunce wrote: On Thu, Jan 15, 2004 at 05:15:05PM -0500, Dan Sugalski wrote: At 2:39 PM -0500 1/15/04, Dan Sugalski wrote: At 8:31 PM +0100 1/15/04, Michael Scott wrote: Is this relevant? http://oss.software.ibm.com/icu/userguide/formatNumbers.html I'm still not clear in my mind what the plan is with regard to ICU. Is it intended eventually to be: a) an always-there part of parrot, or b) just a sometimes-there thing that gets linked in if you mess with unicode? A) is the case. I didn't realize that the ICU library did numeric formatting. And then I realized, somewhat belatedly, that this won't necessarily work. Won't work, or work but potentially have too much overhead? Too much overhead. It means yanking in ICU and initializing parrot's Unicode subsystem just to do numeric formatting, which seems like a bit of overkill just to make: format foo, 9, .00 stick 0009.00 in foo. (Except for the case where it ought to stick 0009,00, I suppose) I think I'd rather not have to yank things into Unicode just to format numbers. I'm not quite sure what you mean here. Well, what I'm trying hard to do is make Unicode as optional for Parrot as any other encoding. While I know that in some likely common cases (perl 6) it'll be required, I'd prefer it not be mandatory everywhere. For code that requires unicode it certainly makes sense to yank it in, but I'm not sure numeric formatting really does. Using ICU in this case seems more a matter of convenience than need. OTOH, it brings in locale issues, and Jarkko's warned me about those often enough, though I remain in blissful ignorance about them for the moment. :) -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
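Dan's point that zero-padded numeric formatting shouldn't require ICU can be illustrated with plain string formatting. This is a Python sketch for illustration only; the function name format_num and the naive separator swap are assumptions, and the swap is exactly the kind of locale shortcut Jarkko's warnings are about.

```python
# Sketch: zero-padded numeric formatting with no Unicode machinery.
# The "0009,00" variant is the locale issue Dan mentions; here it is a
# naive decimal-separator substitution, which real locale handling
# (ICU's territory) does far more carefully.
def format_num(value, width=7, decimals=2, decimal_sep="."):
    s = f"{value:0{width}.{decimals}f}"   # e.g. 9 -> "0009.00"
    return s.replace(".", decimal_sep) if decimal_sep != "." else s

assert format_num(9) == "0009.00"
assert format_num(9, decimal_sep=",") == "0009,00"
```

So the simple case really is just convenience; it's the locale-aware cases that would justify pulling in a full formatting library.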
Re: cvs commit: parrot/src dynext.c extend.c
At 3:07 PM +0100 1/16/04, Leopold Toetsch wrote: Dan Sugalski [EMAIL PROTECTED] wrote: +++ nci.pmc 16 Jan 2004 13:29:52 - 1.22 +STRING* name () { +return SELF->vtable->whoami; All classes inherit the C<name> method from default.pmc. Did it not work without this addition? Ah, this was a junk leftover thing. I was wedging in debugging information, so I needed a local copy, and when I was cleaning up my tree I just ripped out the debug code and not the mainline code. This chunk can get tossed. -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Optimization brainstorm: variable clusters
At 4:02 PM +0100 1/16/04, Leopold Toetsch wrote: Honestly an overridden op could insert new rules in the code and recompile everything. If we have to always check for such nasty things, then we can forget all performance and optimizations. That, at least, we don't have to worry about. None of the existing languages (well, Tcl might, but I don't care there) require going quite so mad, and if Larry tries, well... we have ways to deal with that. :) -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: [PATCHish] Food for thought about allocation
At 3:34 AM -0700 1/16/04, Luke Palmer wrote: I dearly wanted real Continuations to be faster, because I loathe RetContinuations. They suck. Honest. There really ought to be no difference between a regular continuation and a return continuation. (I certainly can't think of a reason they should be different) But the most enlightening thing is how much things improved by writing my own small object allocator -- even a quick and dirty one. Last time I was fishing through that subsystem, it seemed to have a lot of overhead. Are there things about it that require this overhead, or could it take an optimization run bringing it near the overhead level of this little one? If so, we might have an opportunity to boost parrot's usual speed by a fine degree. True enough. It's certainly worth poking around with, and since this is all (pleasantly) internal stuff, it's a good place for experimentation and research without affecting user visible behaviour. I fully expect to tweak the object allocator and frame sizes as we get closer to release time. I looked at the patch and I don't think this is an issue, but we should probably have separate allocation functions for different sized objects, even if they're just macro'd to be the generic allocator function with a hard-coded parameter. That'll let us tweak the allocation system as we go along. Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
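Dan's closing suggestion -- a generic allocator with size-specific entry points layered on top -- can be sketched abstractly. This is a Python model for illustration, not Parrot's allocator; the class and the 32/64-byte size classes are invented for the example, and the lambdas stand in for the "macro'd" C functions.

```python
# Sketch: one generic free-list allocator, with per-size entry points
# that are just the generic one with the size baked in. Callers use the
# named entry points, so the underlying scheme can be swapped later
# without touching them -- Dan's point about tweakability.
class SmallObjectAllocator:
    def __init__(self):
        self.free_lists = {}              # size -> recycled slots

    def allocate(self, size):
        free = self.free_lists.setdefault(size, [])
        return free.pop() if free else bytearray(size)

    def release(self, size, obj):
        self.free_lists[size].append(obj)

alloc = SmallObjectAllocator()

# The "macro'd" size-specific allocation functions:
allocate_pmc    = lambda: alloc.allocate(32)   # hypothetical PMC size
allocate_string = lambda: alloc.allocate(64)   # hypothetical string header size

p = allocate_pmc()
alloc.release(32, p)
assert allocate_pmc() is p   # recycled from the free list, no fresh allocation
```

Recycling through a per-size free list is the usual way a quick small-object allocator beats a general-purpose one, which matches what Luke observed in his benchmark.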
Re: Q: thread function
At 9:13 AM +0100 1/16/04, Leopold Toetsch wrote: Should a thread function be always non-prototyped or do we allow prototyped ones too? On preparing a thread, the parameters of the thread function are copied/cloned into the new thread interpreter. I'm not sure I want to do it that way. (Actually, I think I don't, at least in some cases) Give me a bit -- I'm working on the proposed thread spec, and it gets this low-level. -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Ops file hints
At 12:15 PM +0100 1/16/04, Leopold Toetsch wrote: IMHO we need some more information in the ops files: 1) If an INT argument refers to a label/branch destination 2) For the event-checking code 3) For the safe run core ad 1) e.g. inline op eq(in INT, in INT, inconst INT) { inline op eq(in INT, in INT, in_branch_const INT) { Works, go for it. ad 2) e.g. op event invoke() op event_after sleep() would mean checking at or after that opcode whether events are to be handled. This would be a flag (or 2) in the op_info. Works as well. (Though we need to change sleep to wait on an event to wake up, but that's separate) ad 3) The safe run-core will very likely need to disallow opcodes per category similar as in perldoc Opcode. Yep. I'd also like to have the ability to add in some other parameters to the ops file, so that when we're digging in we could wedge in callouts that by default are ignored, that'd be great. -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: postgres lib
At 2:59 PM +0100 1/16/04, LF wrote: hello i noticed that harry jackson is writing a postgres interface to wrap the nci functions. i'd like him to show what he's written so far and give his comments guess i'm not the only one hoping to test it :) Harry, if you're OK with this, go ahead and check it in. If you don't have checkin privs, grab a perl.org account and we'll get you some. -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Unicode, internationalization, C++, and ICU
On Jan 15, 2004, at 3:33 PM, Jonathan Worthington wrote: - Original Message - From: Dan Sugalski [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Thursday, January 15, 2004 8:09 PM Subject: Unicode, internationalization, C++, and ICU Now, assuming there's still anyone left reading this message... We've been threatening to build ICU into parrot, and it's time for that to start happening. Unfortunately there's a problem--it doesn't work right now. So, what we need is some brave soul to track ICU development and keep us reasonably up to date. What I'd really like is: 1) ICU building and working 2) ICU not needing any C++ I've done some testing, and I hate to be the bearer of bad news but I believe we have something of a problem. :-( The configure script turns out to be a shell script which, unless I'm mistaken, means we're currently unable to build ICU anywhere we don't have bash or similar. Win32 for starters, which is where I'm testing. This page gives instructions for building on Windows--it doesn't seem to require installing bash or anything: http://oss.software.ibm.com/cvs/icu/~checkout~/icu/readme.html#HowToBuildWindows I assume that on Windows you don't need to run the configure script. JEff
Re: Unicode, internationalization, C++, and ICU
snip This page gives instructions for building on Windows--it doesn't seem to require installing bash or anything: http://oss.software.ibm.com/cvs/icu/~checkout~/icu/readme.html#HowToBuildWindows I assume that on Windows you don't need to run the configure script. Thanks for that, I'll work on and test a patch for the Configure script to do this on Win32 later. It won't help with any compiler other than MSVC++, but it certainly helps. Thanks, Jonathan
Re: Ops file hints
Dan Sugalski [EMAIL PROTECTED] wrote: At 12:15 PM +0100 1/16/04, Leopold Toetsch wrote: op event_after sleep() Works as well. (Though we need to change sleep to wait on an event to wake up, but that's separate) I thought so too. It's mainly a workaround until we have all the events done, albeit a plain sleep should be easy. I'd also like to have the ability to add in some other parameters to the ops file, so if when we're digging in we could wedge in callouts that by default are ignored, that'd be great. Takers wanted. It's mainly a Perl task mangling the ops files. If more info is needed, please just ask. leo
Re: Numeric formatting
Well, that's sort of how I imagined this would play out. And from a code writer's ASCII-centric outlook I agree it makes sense to have unicode as a special case brought in only when the pesky data requires it. What about using the ICU API to make things easier when the formatter changes? Mike On 16 Jan 2004, at 16:18, Dan Sugalski wrote: At 12:21 AM + 1/16/04, Tim Bunce wrote: On Thu, Jan 15, 2004 at 05:15:05PM -0500, Dan Sugalski wrote: At 2:39 PM -0500 1/15/04, Dan Sugalski wrote: At 8:31 PM +0100 1/15/04, Michael Scott wrote: Is this relevant? http://oss.software.ibm.com/icu/userguide/formatNumbers.html I'm still not clear in my mind what the plan is with regard to ICU. Is it intended eventually to be: a) an always-there part of parrot, or b) just a sometimes-there thing that gets linked in if you mess with unicode? A) is the case. I didn't realize that the ICU library did numeric formatting. And then I realized, somewhat belatedly, that this won't necessarily work. Won't work, or work but potentially have too much overhead? Too much overhead. It means yanking in ICU and initializing parrot's Unicode subsystem just to do numeric formatting, which seems like a bit of overkill just to make: format foo, 9, .00 stick 0009.00 in foo. (Except for the case where it ought to stick 0009,00, I suppose) I think I'd rather not have to yank things into Unicode just to format numbers. I'm not quite sure what you mean here. Well, what I'm trying hard to do is make Unicode as optional for Parrot as any other encoding. While I know that in some likely common cases (perl 6) it'll be required, I'd prefer it not be mandatory everywhere. For code that requires unicode it certainly makes sense to yank it in, but I'm not sure numeric formatting really does. Using ICU in this case seems more a matter of convenience than need. OTOH, it brings in locale issues, and Jarkko's warned me about those often enough, though I remain in blissful ignorance about them for the moment.
:) -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Some namespace notes
Dan Sugalski [EMAIL PROTECTED] wrote: At 12:49 PM +0100 1/16/04, Leopold Toetsch wrote: find_global P1, P2 ['global'; 'namespace'; 'hierarchy'; thingname ] No. The thing will be a separate parameter. Why? Nested keys get you down the key chain until there is no more key. This can be a variable (above case) or another namespace PMC. Above lookup can be totally cached. When thingname is separate at least 2 hash lookups are necessary. Or if a separate thingname is there just append it - should be equivalent. The original syntax would need heavy modifications in the assembler, the latter fits nicely. We can cope. The assembler needs a good kick with regards to keyed stuff anyway, I expect, and we're going to need this for constructing keys at runtime, something we've not, as yet, addressed. We have:

$ cat k.pasm
new P1, .PerlHash
new P2, .PerlString
set P2, "hello\n"
set P1["b"], P2
new P3, .PerlHash
set P3["a"], P1
set P5, P3["a"; "b"]   # HoH access by key constants
print P5
new P6, .Key
set P6, "a"
new P7, .Key
set P7, "b"
push P6, P7
set P5, P3[P6]   # fully dynamic HoH access
print P5
end
$ parrot k.pasm
hello
hello

leo
[perl #24922] Need Ops file metadata/hints system
# New Ticket Created by Dan Sugalski # Please include the string: [perl #24922] # in the subject line of all future correspondence about this issue. # URL: http://rt.perl.org/rt3/Ticket/Display.html?id=24922 We need to revamp the ops file parsers to allow easier addition of ops metadata and parameter hints. -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
[PATCH] Fix imcpasm tests on Win32
Hi, The attached patch fixes a problem in imcc/TestCompiler.pm which was causing all imcpasm tests to fail on Win32. Jonathan imcpasmtests.patch Description: Binary data
Re: postgres lib
Dan Sugalski wrote: At 2:59 PM +0100 1/16/04, LF wrote: hello i noticed that harry jackson is writing a postgres interface to wrap the nci functions. i'd like him to show what he's written so far and give his comments guess i'm not the only one hoping to test it Just something to bear in mind: Please do not assume anything about this code with regards to its suitability as a driver etc. I only did it because Dan wanted something to ease the pain of using libpq via Parrot. It does this in _areas_ but has not really been tested yet; it's a bit of a mess as it stands at the moment but works for selects. It is probably OK to use it to play with or learn IMCC and to see how to use the NCI interface, but none of it will become production code! Some of the IMCC gurus may have some questions as to its suitability to learn anything from ;-) :) Harry, if you're OK with this, go ahead and check it in. If you don't have checkin privs, grab a perl.org account and we'll get you some. I have an account, username hjackson. Where is it going to go in the tree? I will try and get it tidied up a bit and check it in. Do you want insert/update to be fully working as well? For those of you thinking it's a silly question ;-) trust me, it's not as straightforward an operation as you might think ;-). I believe you want this for a demo (your post 12/02/03 18:35 ) What sort of thing is it for? Harry
Re: Events and JIT
On Fri, 16 Jan 2004, Dan Sugalski wrote: 2) Those that explicitly check for events ... Ops like spin_in_event_loop (or whatever we call it) or checkevent are in category two. They check events because, well, that's what they're supposed to do. Compilers should emit these with some frequency, though it's arguable how frequent they ought to be. I don't understand that part. Why the compiler? If the high-level code doesn't do anything with the events, then there's no point in checking. If it does use the events, then shouldn't developers call the event loop explicitly? Sincerely, Michal J Wallace Sabren Enterprises, Inc. - contact: [EMAIL PROTECTED] hosting: http://www.cornerhost.com/ my site: http://www.withoutane.com/ --
cygwin link failure
Hi, On cygwin, the final link fails with the following error:- gcc -o parrot.exe -s -L/usr/local/lib -g imcc/main.o blib/lib/libparrot.a -lcrypt blib/lib/libparrot.a(io_unix.o)(.text+0x87e): In function `PIO_sockaddr_in': /home/Jonathan/parrot_test/io/io_unix.c:468: undefined reference to `_inet_pton' inet_pton has not yet been implemented in cygwin, but it is being worked on... http://win6.jp/Cygwin/ Jonathan
Re: Vtables organization
At 11:53 AM +0100 1/16/04, Leopold Toetsch wrote: PMCs use Vtables for almost all their functionality *and* for stuff that in Perl5 terms is magic (or they should AFAIK). E.g. setting the _ro property of a PMC (that supports it[1]) swaps in the Const$PMC vtable, where all vtable methods that would change the PMC throw an exception. Or: setting a PMC shared would swap in a vtable that locks e.g. internal aggregate state on access. That is, a non-shared PMC doesn't suffer any locking slowdown. Tying will very likely swap in just another vtable and so on. The questions are: - Where and how should we store these vtables? - Are these PMC variants distinct types (with class_enum and name) - Or are these sub_types (BTW what is vtable-subtype)? E.g. hanging off from the main vtable? I was going to go on about a few ways to do this, but after I did I realized that only one option is viable. So, let's try this on for size: Vtables are chained. That means each vtable has a link to the next in the chain. It *also* means that each call into a vtable function has to pass in a pointer to the vtable the call came from so calls can be delegated properly. If we don't want this to suck down huge amounts of memory it also means that the vtable needs to be split into a vtable header and vtable function table body. Downside there is that we have an extra parameter (somewhat pricey) to all the vtable functions. -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
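The chained-vtable scheme Dan describes can be sketched in Python. This is a model for illustration only, not Parrot's C structs: the method names set_value/get_value and the dict-as-PMC representation are invented for the example; what it shows is the chain link and the delegation that the extra vtable parameter enables.

```python
# Sketch: each vtable layer holds a link to the next one in the chain,
# and delegates calls downward. A read-only layer (the Const variant
# swapped in by the _ro property) intercepts mutators and delegates reads.
class VTable:
    def __init__(self, next_vtable=None):
        self.next = next_vtable

    def set_value(self, pmc, value):       # base layer: really store it
        pmc["value"] = value

    def get_value(self, pmc):
        return pmc["value"]

class ConstVTable(VTable):
    def set_value(self, pmc, value):       # mutator: refuse
        raise TypeError("PMC is read-only")

    def get_value(self, pmc):              # reader: delegate down the chain
        return self.next.get_value(pmc)

pmc = {"value": 1}
plain = VTable()
ro = ConstVTable(next_vtable=plain)        # the "vtable swap": plain -> ro

assert ro.get_value(pmc) == 1
try:
    ro.set_value(pmc, 2)
except TypeError:
    pass
assert pmc["value"] == 1                   # mutation was blocked
```

A shared (locking) or tied variant would be another layer of the same shape, which is why chaining covers all three cases Leo lists.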
Re: Ops file hints
At 5:58 PM +0100 1/16/04, Leopold Toetsch wrote: Dan Sugalski [EMAIL PROTECTED] wrote: I'd also like to have the ability to add in some other parameters to the ops file, so if when we're digging in we could wedge in callouts that by default are ignored, that'd be great. Takers wanted. Its mainly a Perl task mangling the ops files. If more info is needed, please just ask. As you probably noticed, I threw a bug/todo item in for this. :) -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
RE: Events and JIT
Leopold Toetsch wrote: Event handling currently works for all run cores[1] except JIT. The JIT core can't use the schemes described below, but we could: 2) Patch the native opcodes at these places with e.g. an int3 (SIGTRAP, debugger hook) cpu instruction and catch the trap. Running the event handler (sub) from there should be safe, as we are in a consistent state in the run loop. I don't think that bytecode-modifying versions should fly; they're not threadsafe, and it would be nice to write-protect the instruction stream to avert that attack vector. 1) explicitly insert checks, if events are to be handled 1a) everywhere or 1b) in places like described below under [1] c) I like this (1b). With the JIT, an event check could be inlined to 1 load and 1 conditional branch to the event dispatcher, yes? (So long as interp is already in a register.) If that's done before blocking and at upward branches, the hit probably won't be a killer for most code. For REALLY tight loops (i.e., w/o branches or jumps, and w/ an op count less than a particular threshold), maybe unroll the loop a few times and then still check on the upward branch. Those branches will almost always fall straight through, so while there will be some load on the platform's branch prediction cache and a bit of bloat, there shouldn't be much overhead in terms of pipeline bubbles. The event-ready word (in the interpreter, presumably) will stay in the L1 or L2 cache, avoiding stalls. No, it's not zero-overhead, but it's simple and easy enough to do portably. Crazy platform-specific zero-overhead schemes can come later as optimizations. -- Gordon Henriksen IT Manager ICLUBcentral Inc. [EMAIL PROTECTED]
Re: Some namespace notes
I've used non-hierarchical file systems in the distant past, and it wasn't pleasant. I think aliases (symlinks) work much better in a hierarchy. So do inner packages, modules, and classes, which we plan to have in Perl 6. And package aliasing will be the basis for allowing different versions of the same module to coexist. And if Parrot makes people put /perl/parrot/java on the front of Java names, the first thing people will do is to alias them all to /java. Larry
Re: Events and JIT
On Fri, 16 Jan 2004, Dan Sugalski wrote: I don't understand that part. Why the compiler? Because we don't have the sort of control of the async environment that hardware does to deal with interrupts. And, realistically, all code has to deal with the possibility of interrupts. Even if they aren't doing any IO at all they're still potentially watching for keyboard (or other OS-initiated) breaks/interrupts. I see your point. In Python if you press ^C, it should raise a KeyboardInterrupt exception. But rather than always monitoring for that, I'd want to register a listener and then have parrot handle the polling for me. Maybe by default, parrot does the check every N instructions... And then you could turn that off if you wanted more speed. Well... There's the issue of signals, which is the big one. If we could skip signals, that'd be great, but we can't, even on systems that don't do them in the true Unix-y sense. Windows programs should respond to breaks from the keyboard (or close-window requests in a terminal-esque environment if we build one) and have a chance to shut down cleanly, so... that's an event. This is probably a dumb question but: what if signals threw exceptions instead? I mean, they're pretty rare, aren't they? They seem like a completely different kind of thing than watching a mouse or socket... Different because signals have nothing to do with the program itself but come entirely from the outside. (Whereas with the regular events, the data comes from outside but the choice to listen for the data was made inside the program.) Sincerely, Michal J Wallace Sabren Enterprises, Inc. - contact: [EMAIL PROTECTED] hosting: http://www.cornerhost.com/ my site: http://www.withoutane.com/ --
RE: Events and JIT
Michal, But rather than always monitoring for that, I'd want to register a listener and then have parrot handle the polling for me. This is precisely what's being discussed. This is probably a dumb question but: what if signals threw exceptions instead? I'd hope that the event handler for a signal event could elect to throw an exception; it could even be the default. But the exception has to get into the thread somehow-- exceptions don't autonomously happen, and they require considerable cooperation from the thread on which the exception occurs. High-priority events are the mechanism through which the code that will throw the exception can interrupt normal program execution. -- Gordon Henriksen IT Manager ICLUBcentral Inc. [EMAIL PROTECTED]
Re: Events and JIT
Dan Sugalski [EMAIL PROTECTED] wrote: At 11:38 AM +0100 1/16/04, Leopold Toetsch wrote: Event handling currently works for all run cores[1] except JIT. What I'd planned for with events is a bit less responsive than the system you've put together for the non-JIT case, and I think it'll be OK generally speaking. Ops fall into three categories: 1) Those that don't check for events 2) Those that explicitly check for events 3) Those that implicitly check for events Yep, those are the cases. I think I have boiled that scheme down to no cost for non-JIT run cores[1], that is, in the absence of events there is no overhead for event checking. Event delivery (which I consider rare in terms of CPU cycles) takes a bit more instead - but not much. But the JIT core has to deal with event delivery too. So we have to decide which JITted ops are 3) - (case 2) the explicit check op is already available, that's no problem - we need hints for 3). Ops in the third category are a bit trickier. Anything that sleeps or waits should spin on the event queue Ok, the latter is the simple part - all IO or event related ops. But the problem remains: What about the loop[2] of mops.pasm? Only integers in registers, running at one Parrot op per CPU cycle. The big thing to ponder is which ops ought to go in category three. I can see the various invoke ops doing it, but beyond that I'm up in the air. Yes. First: do we guarantee timely event handling in highly optimized loops like the one in mops.pasm? Can we use schemes like my proposal of using the int3 x86 instruction... leo [1] the switched core currently checks after the switch statement, but it's not simple to optimize that [2]
jit_func+116: sub %edi,%ebx
jit_func+118: jne 0x81c73a4 <jit_func+116>
Re: Problem during make test
Chromatic [EMAIL PROTECTED] wrote: Yes, that's better. (Upgrading glibc didn't help -- I was worried that this was an NPTL issue that Parrot couldn't fix.) Cool. Now it hangs on t/pmc/timer: 0x10090b30 in Parrot_del_timer_event (interpreter=0x10273e88, Ah yep. When committing the first (trial) fix, I thought about such a problem, which is related: - if it seems to hang on a condition variable (still AFAIK: it shouldn't) - but anyway - it could depend on objects that need destruction, like a timer event, so ... I moved killing the event loop a bit further down in the interpreter destroy sequence. By the time that point is reached, timers should be removed from the queue. HTH, and thanks for your valuable feedback in tracking things down. -- c leo
Re: Events and JIT
Michal Wallace [EMAIL PROTECTED] wrote: On Fri, 16 Jan 2004, Dan Sugalski wrote: interrupts. Even if they aren't doing any IO at all they're still potentially watching for keyboard (or other OS-initiated) breaks/interrupts. I see your point. In Python if you press ^C, it should raise a KeyboardInterrupt exception. But rather than always monitoring for that, I'd want to register a listener Ahem: that's what we are talking about. There is a listener already running, that's the event thread. It currently doesn't do much, but it listens e.g. for a timer event, that is, it waits on a condition. This doesn't take any CPU time during waiting, except for a bit of overhead when the kernel awakens that thread and it executes (for a very short time). So this event thread sees: Oh, the kernel just notified me that the cookie for interpreter #5 is ready, so I'll tell that interpreter. That's what the event thread is doing. Now interpreter #5 - currently busy running in a tight loop - doesn't see that his cookie is ready. *If* this interpreter checked at every step whether there is a cookie, it would be a big slowdown. To minimize the overhead, the event thread throws a big piece of wood in front of the fast-running interpreter #5, which finally, after a bit of stumbling, realizes: Oh, my cookie has arrived. This is probably a dumb question but: what if signals threw exceptions instead? We will (AFAIK) convert signals to events, which dispatch further. I mean, they're pretty rare aren't they? Async is the problem. Michal J Wallace leo
Re: Events and JIT
Gordon Henriksen [EMAIL PROTECTED] wrote: Leopold Toetsch wrote: 2) Patch the native opcodes at these places with e.g. int3 I don't think that bytecode-modifying versions should fly; they're not threadsafe, Why? The bytecode is patched by a different thread *if* an event is due (which in CPU cycles is rare). And I don't see a thread safety problem. The (possibly different) CPU reads an opcode and runs it. Somewhere in the meantime, the opcode at that memory position changes to the byte sequence 0xCC (on Intel: int3). One byte changes, the CPU executes the trap or not (of course changing that memory position is assumed to be atomic, which AFAIK works on i386) - but next time in the loop the trap is honored. ... and it would be nice to write-protect the instruction stream to avert that attack vector. We did protect it, so we can un- and reprotect it, that's not the problem. 1b) in places like described below under [1] c) I like this (1b). With the JIT, an event check could be inlined to 1 load and 1 conditional branch to the event dispatcher, yes? Yep. That's the plain average slower case :) It's a fallback if there are no better and faster solutions. (So long as interp is already in a register.) Arghh, damned i386 with *zero* registers, where zero is around 4 (usable, general... ) ;) So no interpreter in registers here - no. It's at least 3(?) cycles + branch prediction overhead, so a lot compared to zero overhead... ... If that's done before blocking and at upward branches, the hit probably won't be a killer for most code. For REALLY tight loops (i.e., w/o branches or jumps, and w/ op count less than a particular threshold), maybe unroll the loop a few times and then still check on the upward branch. Yep, loop unrolling would definitely help; that was the most likely working solution in my head. 
Those branches will almost always fall straight through, so while there will be some load on the platform's branch prediction cache and a bit of bloat, there shouldn't be much overhead in terms of pipeline bubbles. The event-ready word (in the interpreter, presumably) will stay in the L1 or L2 cache, avoiding stalls. Yep. Still I like these numbers: $ parrot -j examples/assembly/mops.pasm M op/s: 790.105001 # on AMD 800 No, it's not zero-overhead, but it's simple and easy enough to do portably. Crazy platform-specific zero-overhead schemes can come later as optimizations. s(Crazy)(Reasonable), but later is ok :) leo
Re: cygwin link failure
Hi First of all, yet_another_shy_lurker++; On cygwin, the final link fails with the following error:- gcc -o parrot.exe -s -L/usr/local/lib -g imcc/main.o blib/lib/libparrot.a -lcrypt blib/lib/libparrot.a(io_unix.o)(.text+0x87e): In function `PIO_sockaddr_in': /home/Jonathan/parrot_test/io/io_unix.c:468: undefined reference to `_inet_pton' I had that problem when I tried to compile parrot on one of our school machines (cygwin). inet_pton is an address-family-independent version of inet_aton that works with normal IP addresses as well as IPv6 addresses, but is mostly only defined on machines that support IPv6. inet_pton has not yet been implemented in cygwin, but it is being worked on... http://win6.jp/Cygwin/ Indeed, but I think there might be other unix-like environments that (do not yet|will never) provide the inet_pton function. So I tried to add an inet_pton implementation for the cases where the platform does not provide it. Apache 2.0 goes that way, http://lxr.webperf.org/source.cgi/srclib/apr/network_io/unix/inet_pton.c I already managed to adapt that piece of source slightly so that it compiles during the parrot build process. Now I'm trying to understand Parrot's configuration system in order to compile this only if there is no inet_pton defined. But then, I'm only a shy_lurker, so this might take some time... Thomas
Re: JVM as a threading example (threads proposal)
On Thu, Jan 15, 2004 at 11:58:22PM -0800, Jeff Clites wrote: On Jan 15, 2004, at 10:55 PM, Leopold Toetsch wrote: Yes, that's what I'm saying. I don't see an advantage of JVMs multi-step variable access, because it even doesn't provide such atomic access. You're missing the point of the multi-step access. It has nothing to do with threading or atomic access to variables. The JVM is a stack machine. JVM opcodes operate on the stack, not on main memory. The stack is thread-local. In order for a thread to operate on a variable, therefore, it must first copy it from main store to thread- local store (the stack). Parrot, so far as I know, operates in exactly the same way, except that the thread-local store is a set of registers rather than a stack. Both VMs separate working-set data (the stack and/or registers) from main store to reduce symbol table lookups. What I was expecting that the Java model was trying to do (though I didn't find this) was something along these lines: Accessing the main store involves locking, so by copying things to a thread-local store we can perform several operations on an item before we have to move it back to the main store (again, with locking). If we worked directly from the main store, we'd have to lock for each and every use of the variable. I don't believe accesses to main store require locking in the JVM. This will all make a lot more sense if you keep in mind that Parrot-- unthreaded as it is right now--*also* copies variables to working store before operating on them. This isn't some odd JVM strangeness. The JVM threading document is simply describing how the stack interacts with main memory. - Damien
RE: Events and JIT
Leopold Toetsch wrote: Why? The bytecode is patched by a different thread *if* an event is due (which in CPU cycles is rare). And I don't see a thread safety problem. The (possibly different) CPU reads an opcode and runs it. Somewhere in the meantime, the opcode at that memory position changes to the byte sequence 0xCC (on intel: int3 ) one byte changes, the CPU executes the trap or not (or course changing that memory position is assumed to be atomic, which AFAIK works on i386) - but next time in the loop the trap is honored. Other threads than the target could be executing the same chunk of JITted code at the same time. -- Gordon Henriksen IT Manager ICLUBcentral Inc. [EMAIL PROTECTED]
[PATCH] Configure ICU for building on Win32
Hi, The attached patch adds support to configure for building ICU with MSVC++, as recommended in:- http://oss.software.ibm.com/cvs/icu/~checkout~/icu/readme.html#HowToBuildWindows Unfortunately, the VC++ project/workspace files are missing from our ICU tree in CVS, see:- http://oss.software.ibm.com/cvs/icu/~checkout~/icu/source/allinone/ http://oss.software.ibm.com/cvs/icu/~checkout~/icu/source/allinone/all/ Once they are there, and with this patch, it should (unless I've messed up) work. Jonathan
Re: cygwin link failure
From: Seiler Thomas [EMAIL PROTECTED] First of all, yet_another_shy_lurker++; Welcome. :-) On cygwin, the final link fails with the following error:- gcc -o parrot.exe -s -L/usr/local/lib -g imcc/main.o blib/lib/libparrot.a -lcrypt blib/lib/libparrot.a(io_unix.o)(.text+0x87e): In function `PIO_sockaddr_in': /home/Jonathan/parrot_test/io/io_unix.c:468: undefined reference to `_inet_pton' I had that problem when I tried to compile parrot on one of our school machines (cygwin). inet_pton is an address-family-independent version of inet_aton that works with normal IP addresses as well as IPv6 addresses, but is mostly only defined on machines that support IPv6. inet_pton has not yet been implemented in cygwin, but it is being worked on... http://win6.jp/Cygwin/ Indeed, but I think there might be other unix-like environments that (do not yet|will never) provide the inet_pton function. So I tried to add an inet_pton implementation for the cases where the platform does not provide it. Apache 2.0 goes that way, http://lxr.webperf.org/source.cgi/srclib/apr/network_io/unix/inet_pton.c This was the kinda solution I had in mind, but my network programming knowledge is way under par. I already managed to adapt that piece of source slightly so that it compiles during the parrot build process. Now I'm trying to understand Parrot's configuration system in order to compile this only if there is no inet_pton defined. You may want to take a look at config/auto/memalign.pl, which I believe is one of a number of scripts that generates a C file and attempts to compile it, then does something based upon the success of that attempt. But then, I'm only a shy_lurker, so this might take some time... Thanks for having a crack at it. Jonathan
Re: [PATCH] Configure ICU for building on Win32
From: Jonathan Worthington [EMAIL PROTECTED] The attached patch... Which I forgot to attach. Sorry. Jonathan icuwin32.patch Description: Binary data
Re: JVM as a threading example (threads proposal)
On Jan 16, 2004, at 1:01 PM, Damien Neil wrote: On Thu, Jan 15, 2004 at 11:58:22PM -0800, Jeff Clites wrote: On Jan 15, 2004, at 10:55 PM, Leopold Toetsch wrote: Yes, that's what I'm saying. I don't see an advantage of JVMs multi-step variable access, because it even doesn't provide such atomic access. You're missing the point of the multi-step access. It has nothing to do with threading or atomic access to variables. The JVM is a stack machine. JVM opcodes operate on the stack, not on main memory. The stack is thread-local. In order for a thread to operate on a variable, therefore, it must first copy it from main store to thread- local store (the stack). Parrot, so far as I know, operates in exactly the same way, except that the thread-local store is a set of registers rather than a stack. Both VMs separate working-set data (the stack and/or registers) from main store to reduce symbol table lookups. ... This will all make a lot more sense if you keep in mind that Parrot-- unthreaded as it is right now--*also* copies variables to working store before operating on them. This isn't some odd JVM strangeness. The JVM threading document is simply describing how the stack interacts with main memory. I think the JVM spec is actually implying something beyond this. For instance, section 8.3 states, A store operation by T on V must intervene between an assign by T of V and a subsequent load by T of V. Translating this to parrot terms, this would mean that the following is illegal, which it clearly isn't:

find_global P0, V
set P0, P1          # assign by T of V
find_global P0, V   # a subsequent load by T of V w/o an intervening store operation by T on V

I think it is talking about something below the Java-bytecode level--remember, this is the JVM spec, and it constrains how an implementation of the JVM must behave when executing a sequence of opcodes, not the rules a Java compiler must follow when generating a sequence of opcodes from Java source code. 
What I think it's really saying, again translated into Parrot terms, is this:

store_global foo, P0 # internally, may cache value and not push to main memory
find_global P0, foo  # internally, can't pull value from main memory if above value was not yet pushed there

and I think the point is this:

find_global P0, foo  # internally, also caches value in thread-local storage
find_global P0, foo  # internally, can use cached thread-local value

And, as mentioned in section 8.6, any time a lock is taken, cached values need to be pushed back into main memory, and the local cache emptied. This doesn't make any sense if the thread's working memory is interpreted as the stack. JEff
Re: Events and JIT
On Jan 16, 2004, at 1:20 PM, Leopold Toetsch wrote: Gordon Henriksen [EMAIL PROTECTED] wrote: Leopold Toetsch wrote: 2) Patch the native opcodes at these places with e.g. int3 I don't think that bytecode-modifying versions should fly; they're not threadsafe, Why? The bytecode is patched by a different thread *if* an event is due (which in CPU cycles is rare). And I don't see a thread safety problem. The (possibly different) CPU reads an opcode and runs it. Somewhere in the meantime, the opcode at that memory position changes to the byte sequence 0xCC (on Intel: int3). One byte changes, the CPU executes the trap or not (of course changing that memory position is assumed to be atomic, which AFAIK works on i386) - but next time in the loop the trap is honored. Where in the stream would we patch? If not in a loop, you may never hit the single patched location again, but the program still might not end for a very long time. If we are patching all locations with branches and such, in large bytecode this could take a long time, and the executing thread might outrun the patching thread. Also, once the handler is entered, we'd have to fix all of the patched locations, which again could be time-consuming for large bytecode. It could work, but seems problematic. JEff
Re: JVM as a threading example (threads proposal)
On Friday, January 16, 2004, at 02:58 , Jeff Clites wrote: On Jan 15, 2004, at 10:55 PM, Leopold Toetsch wrote: Damien Neil [EMAIL PROTECTED] wrote: On Thu, Jan 15, 2004 at 09:31:39AM +0100, Leopold Toetsch wrote: I don't see any advantage of such a model. The more so as it doesn't guarantee any atomic access to e.g. longs or doubles. The atomic access to ints and pointers seems to rely on the architecture but is of course reasonable. You *can't* guarantee atomic access to longs and doubles on some architectures, unless you wrap every read or write to one with a lock. The CPU support isn't there. Yes, that's what I'm saying. I don't see an advantage of JVMs multi-step variable access, because it even doesn't provide such atomic access. What I was expecting that the Java model was trying to do (though I didn't find this) was something along these lines: Accessing the main store involves locking, so by copying things to a thread-local store we can perform several operations on an item before we have to move it back to the main store (again, with locking). If we worked directly from the main store, we'd have to lock for each and every use of the variable. I think the real purpose of the model was to say thread-local values may be committed to main memory (perhaps significantly) after the local copy is logically assigned. Thus: In the absence of explicit synchronization, threads may manipulate potentially inconsistent local copies of variables. This model addresses: copies of variables in registers, copies on the JVM stack, copies in the stack frame, thread preemption prior to store (which occurs on uniprocessors and multiprocessors alike), and delayed write-back caches in SMP systems.[*] In short, this portion of the spec provides bounds for the undefinedness of the behavior that occurs when programs do not use Java's synchronization primitives. It does so realistically, in a manner that contemporary computer systems can implement efficiently. 
(In fact, the spec is far more descriptive than it is proscriptive.) Or, as an example, it allows the natural thing to happen here:

; PPC[**] implementation of:
;     var = var + var;
lwz  r30, 0(r29)     ; load var (1 JVM load)
addi r30, r30, r30   ; double var (2 JVM uses, 1 JVM assign)
                     ; if your thread is preempted here
stw  r30, 0(r29)     ; store var (1 JVM store)

And allows obvious optimizations like this:

; PPC implementation of:
;     var = var + var;
;     var = var + var;
lwz  r30, 0(r29)
addi r30, r30, r30
                     ; imagine your thread is preempted here
addi r30, r30, r30
stw  r30, 0(r29)

But it explicitly disallows that same optimization for a case like this:

var = var + var;
synchronized (other) { other++; }
var = var + var;

That--that, and the whole cache coherency/delayed write thing:

; CPU 1                     ; CPU 2
loadi r29, 0xFFF8           loadi r29, 0xFFF8
loadi r30, 0xDEAD           loadi r30, 0xBEEF
stw   r30, 0(r29)           stw   r30, 0(r29)
lwz   r28, 0(r29)           lwz   r28, 0(r29)
; r28 is probably 0xDEAD    ; r28 is probably 0xBEEF
; (but could be 0xBEEF)     ; (but could be 0xDEAD)
sync                        noop
lwz   r28, 0(r29)           lwz   r28, 0(r29)
; r28 matches on both CPUs now, either 0xDEAD or
; 0xBEEF (but not 0xDEEF or 0xBEAD or 0x).

[* - On many SMP systems, processors do not have coherent views of main memory (due to their private data caches) unless the program executes explicit memory synchronization operations, which are at least expensive enough that you don't want to execute them on every opcode.] [** - Forgive my rusty assembler.]
Parrot deals with PMCs, which can contain (let's consider scalars only) e.g. a PerlInt or a PerlNum. Now we would have atomic access (normally) to the former and very likely non-atomic access to the latter, just depending on the value which happened to be stored in the PMC. This implies that we have to wrap almost[1] all shared write *and* read PMC accesses with LOCK/UNLOCK. [1] except plain ints and pointers on current platforms Ah, but this misses a key point: We know that user data is allowed to get corrupted if the user isn't locking properly--we only have to protect VM-internal state. The key point is that it's very unlikely that there will be any floats involved in VM-internal state--it's going to be all pointers and ints (for offsets and lengths). That is, a corrupted float won't crash the VM. On one
Re: JVM as a threading example (threads proposal)
On Friday, January 16, 2004, at 08:38 , Jeff Clites wrote: On Jan 16, 2004, at 1:01 PM, Damien Neil wrote: On Thu, Jan 15, 2004 at 11:58:22PM -0800, Jeff Clites wrote: On Jan 15, 2004, at 10:55 PM, Leopold Toetsch wrote: Yes, that's what I'm saying. I don't see an advantage of JVMs multi-step variable access, because it even doesn't provide such atomic access. You're missing the point of the multi-step access. It has nothing to do with threading or atomic access to variables. ... it has everything to do with allowing multiprocessors to operate without extraneous synchronization. The JVM is a stack machine. JVM opcodes operate on the stack, not on main memory. The stack is thread-local. In order for a thread to operate on a variable, therefore, it must first copy it from main store to thread-local store (the stack). Parrot, so far as I know, operates in exactly the same way, except that the thread-local store is a set of registers rather than a stack. Both VMs separate working-set data (the stack and/or registers) from main store to reduce symbol table lookups. ... This will all make a lot more sense if you keep in mind that Parrot--unthreaded as it is right now--*also* copies variables to working store before operating on them. This isn't some odd JVM strangeness. The JVM threading document is simply describing how the stack interacts with main memory. I think the JVM spec is actually implying something beyond this. For instance, section 8.3 states, A store operation by T on V must intervene between an assign by T of V and a subsequent load by T of V. Translating this to parrot terms, this would mean that the following is illegal, which it clearly isn't:

find_global P0, V
set P0, P1          # assign by T of V
find_global P0, V   # a subsequent load by T of V w/o an intervening store operation by T on V

This rule addresses aliasing. 
It says that this (in PPC assembly):

; presume &(obj->i) == obj+12
lwz  r29, 12(r30)   ; read, load
addi r29, r29, 1    ; use, assign
lwz  r28, 12(r30)   ; read, load
addi r28, r28, 1    ; use, assign
stw  r29, 12(r30)   ; store, eventual write
stw  r28, 12(r30)   ; store, eventual write

... is an invalid implementation of this: j.i = j.i + 1; k.i = k.i + 1; ... where the JVM cannot prove j == k to be false. The rule states that the stw of r29 must precede the stw of r28. Why this is under threading... beyond me. Let me briefly highlight the operations as discussed and digress a little bit as to why all the layers:

main memory --read+load-- working copy (register file, stack frame, etc.) --use-- execution engine (CPU core) --assign-- working copy (register file, stack frame, etc.) --write+store-- main memory

(I paired the read+load and write+store due to the second set of rules in 8.2.) The spec never says where a read puts something such that a load can use it, or where a store puts something such that a write can use it. A store with its paired write pending is simply an in-flight memory transaction (and the same for a read+load pair). Possible places the value could be: in-flight on the system bus; queued by the memory controller; on a dirty line in a write-back cache; somewhere in transit on a NUMA architecture. Store-write and read-load are just different ends of the underlying ISA's load and store memory transactions. The read and write operations specify the operations from the memory controller's perspective; load and store specify them from the program's perspective. Note that reads and writes are performed by main memory, not by a thread. That distinction is crucial to reading the following section from the spec: 8.2 EXECUTION ORDER AND CONSISTENCY The rules of execution order constrain the order in which certain events may occur. 
There are four general constraints on the relationships among actions: * The actions performed by any one thread are totally ordered; that is, for any two actions performed by a thread, one action precedes the other. * The actions performed by the main memory for any one variable are totally ordered; that is, for any two actions performed by the main memory on the same variable, one action precedes the other. ... The extra read/write step essentially allows main memory (the memory controller) to order its operations with bounded independence of any particular thread. Careful reading of the other rules will show that this is only a useful abstraction in the case of true concurrency (e.g., SMP), as the other rules ensure that a single processor will always load variables in a state consistent with what it last stored. I think it is talking about something below the Java-bytecode level--remember, this is the JVM spec, and constrains how an implementation of the JVM must behave when executing a sequence of opcodes, not the rules a Java compiler must