Re: [perl #17731] [PATCH] Integration of Lea like allocators

2002-10-05 Thread Mike Lambert

> And additionally, for + 10 % more generations in life.pasm
> - tossed one instruction in the fast path of Buffer_headers

I don't believe this is valid. bufstart needs to be set to 0 when you free
an object. When the stackwalk runs, it could "liven" a dead buffer. When
the copying collector runs, it could see a dead buffer and want to copy
it. So we'd better make sure that buffer's not pointing to garbage memory
or it'll be copying random stuff from who-knows-where and likely GPF. :)
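The invariant argued for above can be sketched in C. Note that Buffer, BUFFER_live_FLAG, and free_buffer here are simplified stand-ins for illustration, not parrot's actual definitions:

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative, simplified buffer header: only the fields the
 * argument above relies on. */
typedef struct {
    void  *bufstart;   /* start of the buffer's data */
    size_t buflen;
    int    flags;
} Buffer;

#define BUFFER_live_FLAG 0x01

/* On free, clear bufstart so that if a conservative stackwalk later
 * "livens" this header, or the copying collector scans it, neither
 * follows a dangling pointer into freed memory. */
static void free_buffer(Buffer *b)
{
    b->flags   &= ~BUFFER_live_FLAG;
    b->bufstart = NULL;   /* the store the patch tossed */
    b->buflen   = 0;
}
```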

Mike Lambert




Re: RFC: library entry name collision

2002-09-29 Thread Mike Lambert

> I was beating my head on the wall yesterday trying to figure out why
> an intlist test was failing on a freshly updated source tree. (I
> rarely use 'make clean', because that's almost always just covering up
> dependency problems.) I'll leave out the gory details, but the problem
> boiled down to parrot/intlist.o and parrot/classes/intlist.o being
> treated as identical by ar. Upon further reading, it appears that for
> portability, we can only depend on the basename of ar entries.

Two things...

First:
One dependency problem that comes up all the time is that classes/Makefile
doesn't have any dependencies upon GENERAL_H_FILES. These .o files aren't
updated if I change parrot headers, etc. The best way to solve this is to put
the logic into the base parrot Makefile, although that could make makefile
generation a bit more difficult.

Second:
intlist is not the only culprit. ./classes/key.c and ./key.c have a
similar problem.

Mike Lambert




Re: Of PMCs Buffers and memory management

2002-09-29 Thread Mike Lambert
le, however, to have a pool of unsized-header
*pointers*, and that's exactly what extra_buffer_headers is.

Currently, we group all headers of the same size in the same header pool,
although only constant string headers currently have their own pool,
mainly because we don't really have constants of anything else
implemented yet. :)

> > ... But there is
> > no compelling reason to do so, at this point in time. (I have some ideas
> > that would require it, tho)
>
> Could you elaborate on these ideas?

I guess I will need to write up those ideas. :)

> > ... I don't think we want interpreters appearing and
> > disapppearing with references...they should be explicitly created and
> > destroyed.
>
> Actually, it's not a big difference how they are destroyed, but we have
>   already a "newinterp" opcode, so an interpreter PMC class just needs a
> custom destroy method - that gets called too ;-)
> Though, if nested structures inside the interpreter are all buffers,
> destroying them would seamlessly fit into the framework.

Yes, it would. But a lot of the interpreter's structures have data fields,
and those don't work too well as buffer data. They could work as part of a
sized buffer header, I suppose. I think it would be much easier to make
the interpreter PMC-ish, or at least give it a PMC wrapper. Then this PMC
can have an active-destroy method, which would properly clean up
everything that needed to be cleaned up. Since the interpreter memory
would be malloc-allocated, it wouldn't be copied or cleaned on its own.
The PMC would become an interface for the GC system to control the
lifetime of the allocated interpreter memory, since the GC system would
control the PMC.
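That wrapper arrangement might be sketched like this; all names here are hypothetical simplifications, not parrot's real types. The interpreter stays plain malloc'd memory, and the GC's ownership of the PMC is what ultimately drives the cleanup:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified types for illustration. */
typedef struct Interp { int flags; } Interp;

typedef struct PMC {
    void (*destroy)(struct PMC *);  /* active-destroy hook */
    void *data;                     /* malloc'd interpreter memory */
} PMC;

/* Active-destroy method: called when the GC decides the PMC is dead,
 * freeing the interpreter memory the PMC owns. */
static void interp_pmc_destroy(PMC *self)
{
    free(self->data);
    self->data = NULL;
}

/* Wrap a malloc'd interpreter in a PMC so the GC controls its
 * lifetime indirectly, through the PMC. */
static PMC *wrap_interpreter(Interp *interp)
{
    PMC *p = malloc(sizeof *p);
    p->destroy = interp_pmc_destroy;
    p->data    = interp;
    return p;
}
```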


Mike Lambert





Re: Of PMCs Buffers and memory management

2002-09-27 Thread Mike Lambert
method, so that fields of the
sized buffer interpreter header could be marked() and buffer_lives()
themselves. (Currently, this is done in dod.c).

If they were unified, the PMC would be an interpreter referencing a sized
buffer header. Or if we had sized PMCs, the fields could be part of
it, avoiding the need for a buffer.

However, as far as leaking memory goes, there is no reason that interpreters
have to be PMCs/buffers. Just as we have a make_interpreter to create an
interpreter, we can have an unmake_interpreter that destroys the
interpreter. I don't think we want interpreters appearing and
disappearing with references...they should be explicitly created and
destroyed. But that's a discussion for another thread. My point is that
not all things need to be traced, and some stuff can be handled
manually, as long as the perl programmer doesn't see it directly.

Hope this helps answer your questions,
Mike Lambert




Re: [INFO] parrot with Lea malloc - 50% faster life

2002-09-24 Thread Mike Lambert

> The whole resource.c is replaced by calls to calloc/realloc (res.c),
> add_free_buffer does free().

I think that's the problem right there. What exactly are you changing
to use this new calloc/malloc implementation? One approach is to modify
memory.c to use the new versions, although since parrot does a lot of its
own memory management of large chunks, it's not likely to give any speed
improvement over the OS.

If you're replacing resources.c and headers.c with this calloc/malloc, you
have to make sure to catch every occurrence, which can be difficult. At what
level are you trying to hook in this new implementation?

You can bind every buffer's contents to calloc/realloc calls, but then
there will be no copying or collection going on, because we're not
allocating out of pools. You'll need to make compact_pool a no-op, and
update the checks in mem_allocate so that they aren't dependent upon it
being called.
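A minimal sketch of that hooking level, under assumed, simplified signatures (the real mem_allocate and compact_pool in resources.c take interpreter and pool structures): once buffer contents come straight from the system allocator, compaction has nothing to move and must become a no-op.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for parrot's pool bookkeeping. */
struct Memory_Pool {
    size_t total_allocated;
};

/* Allocate buffer contents directly from the system allocator
 * instead of carving them out of a pool block. */
static void *mem_allocate(struct Memory_Pool *pool, size_t size)
{
    pool->total_allocated += size;
    return calloc(1, size);   /* zeroed, like fresh pool memory */
}

/* With no contiguous pool to copy live buffers into, compaction is
 * necessarily a no-op. */
static void compact_pool(struct Memory_Pool *pool)
{
    (void)pool;
}
```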

If you want the actual headers to be allocated from calloc/realloc, you'll
need to change add_free_object and get_free_object in smallobject.c, and
(add|get)_free_(buffer|pmc) in headers.c. Then you'll need to disable DOD
because all of the headers will no longer be consecutive in large pools
like they were before. At this point, we're screwed because we will never
free any memory. You could reimplement DOD to use a pool of pointers to
headers, but that's just going to be diminishing returns, especially with
the random memory dereferences (we had relatively good cache coherency
before).

> make test shows currently gc_2-4 broken (no statistics) and 2 tests from
> src/intlist.t, which is probably my fault.

I'm not surprised, to be honest. :) gc tests some GC behaviors that aren't
tested through the normal code (ie, it will actually trigger a DOD and/or
collection run), so if they're broken, you've likely done something wrong.
(changing just add_free_buffer without get_free_buffer or dod is one thing
that classifies as "something wrong" ;)

> I didn't look into these further, but this is probably due to more
> broken string/COW code + continuations in 8.t.
>
> string_substr / unmake_COW currently does highly illegal (WRT malloc)
> things, which might or might not cause problems for the current GC
> implementation. s. patch. There are probably more things like this.

Those illegal things are likely illegal for malloc, but they work
perfectly fine for our pool-based system. Feel free to remove some of that
code (it's mostly optimizations) if you like. However, I sincerely doubt
that unmake_COW or string_substr is the cause of the GC bugs you
mentioned, since they shouldn't be used by those tests at all. (Those
tests merely verify that collections and DODs are being run; they don't
really check that the GC runs don't destroy important data. So
gc_2 failing is basically saying that the pool compaction is failing.)

> I didn't look further into memory usage or such, though top seems to
> show ~double the footprint of CVS.

I think Dan would disallow such a patch for this reason alone. We're
already taking a 2x hit (peak) by using a copying collector. No need to
make it worse. :)

Mike Lambert




Re: [perl #17495] [PATCH] for a faster life

2002-09-22 Thread Mike Lambert

>   Now, trace_system_stack walks a ~1300 entries deeper stack in CGoto
> run mode, because of the jump table in cg_core. Don't ask me about this
> difference to 900 ops, gdb says so.

Ahh, good observation. (I'm more of a non-cgoto person myself ;).

> Attached patch now sets interpreter->lo_var_ptr beyond this jump table,
> reducing the time of trace_system_stack to 0.04s for above case.

Unfortunately, this doesn't work too well as a solution. There are a few
pmcs and buffers that appear before the call to runops. These must be
traced by trace_system_stack or else get prematurely freed (think ARGV,
pbc filename, etc). Try running with GC_DEBUG to see this happen.

What I don't understand, however: the ops_addr jump table appears to be
a static variable. Shouldn't the contents of a static variable stay off
the stack? Alternately, is it possible to allocate this jump
table in the system heap (malloc et al.) and store only a pointer to it on
the stack?
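The two layouts in question can be sketched as follows; the names and the entry count are assumptions for illustration, not cg_core's real layout. A file-scope table lives in the data segment, so a conservative stack walk never sees its entries, while a heap-allocated table costs the stack only a single pointer:

```c
#include <assert.h>
#include <stdlib.h>

#define NUM_OPS 1300   /* assumed size, per the ~1300 entries above */

/* File-scope: the table itself sits in the data segment, so none of
 * its NUM_OPS entries land on the C stack during a stack walk. */
static void *ops_addr_static[NUM_OPS];

/* Alternative from the text: allocate the table in the system heap
 * and keep only one pointer to it in any stack frame. */
static void **make_heap_jump_table(void)
{
    return calloc(NUM_OPS, sizeof(void *));
}
```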

Mike Lambert




Re: Tinderbox "TD-ParkAvenue" not working

2002-09-19 Thread Mike Lambert

> First, a thank you to whoever it is who is running these test-drive
> machines (there's no name in the build log).  Also, a thanks to Compaq
> for setting them up.

You're welcome. It's basically just a script on my linux box that uploads
a tar file (the servers don't have gzip, dammit! ;) to the remote machine,
then telnets in and manually builds them. This is because the machines
don't have any outgoing connections (so they can't grab ftp or cvs or rsh
stuff).

> There's a problem with the NetBSD machine.  There's no 'perl' in the
> $PATH being used, so the log file looks like this:
>
> ...
>
> Obviously it's not going to work.  If there is a perl installed on
> that machine, then perhaps the full pathname to perl should be used.
> If there's no perl, then perhaps the machine should be removed from the
> tinderbox.

Yes, I had noticed that. And that struck me as strange, particularly
because that machine had worked before but isn't working now. I'll remove
it from the list of machines it connects to.

Mike Lambert




Re: Current Perl6 on MS Win32 status

2002-09-06 Thread Mike Lambert

> Perl6 on Win32 MS VC++ gives:
>
> Failed TestStatus Wstat Total Fail  Failed  List of Failed
> --
>
> t/compiler/8.t 1   256 61  16.67%  6
> t/compiler/a.t 1   256 31  33.33%  2
> t/rx/call.t1   256 21  50.00%  1

After the recent BUFFER_external_FLAG fixes, I now get:

Failed TestStatus Wstat Total Fail  Failed  List of Failed
--
t/compiler/1.t 1   256121   8.33%  11
t/compiler/3.t 1   256 71  14.29%  7
t/rx/call.t1   256 21  50.00%  1

Specifically,
t/compiler/1NOK 11#  got: ''
# expected: '1003.10
# 1031.00
# 1310.00
# 4100.00
# '

and:
t/compiler/3NOK 7#  got: 'Wrong type on top of stack!
# '
# expected: '678910
# 1112131415
# '

The first was a GPF, the second was just incorrect output.

I'm not sure if this is progress or not, but I believe it might adversely
affect other platforms. I don't have time to look into the issue now, but
I'll try to do so tomorrow.

Mike Lambert




Re: [perl #16855] [PATCH] uselessly optimize print()

2002-09-06 Thread Mike Lambert

> > In tracking down a gc bug, I realized that the current throwaway
> > implementation of the print op could be replaced with a faster
> > throwaway implementation that avoids doing a string_to_cstring.
> >
> > Note that both the original and new implementations are still buggy
> > with respect to supporting different encodings. I don't know if
> > printf("%s") is any better than fwrite in terms of at least vaguely
> > paying attention to your locale or whatever. If so, don't apply it.
> >
> > (all tests pass)

Applied, thanks,

Mike Lambert




Re: [perl #16852] [PATCH] Eliminate empty extension

2002-09-05 Thread Mike Lambert

> > This patch trims off the period at the end of executable filenames for
> > C-based tests on unix. (It compiled "t/src/basic_1.c" ->
> > "t/src/basic_1."; this patch makes that "t/src/basic_1")
>
> This patch should also update languages/perl6/P6C/TestCompiler.pm, since
> it hijacks lib/Parrot/Test.pm to get its functionality. I'll probably
> apply this after the code opens up again, but if someone beats me to it,
> please be sure to update the affected file above.

Applied, thanks,

Mike Lambert




Current Perl6 on MS Win32 status

2002-09-05 Thread Mike Lambert

Perl6 on Win32 MS VC++ gives:

Failed TestStatus Wstat Total Fail  Failed  List of Failed
--

t/compiler/8.t 1   256 61  16.67%  6
t/compiler/a.t 1   256 31  33.33%  2
t/rx/call.t1   256 21  50.00%  1



t/compiler/8NOK 6#  got: 'Wrong type on top of stack!
# ed 1
# 1
# 2
# a.1: 3
# b.1
# foo
# '
# expected: '1
# 2
# a.1: 3
# b
# 4
# 5
# Survived 1
# 1
# 2
# a.1: 3
# b.1
# foo
# '
# Looks like you failed 1 tests of 6.

This one is known, and is waiting on a BUFFER_external patch.
Now that parrot works on win32 again, I'll try to clear out my
patch queue.


t/compiler/aok 1/3Couldn't find global label '__setup' at line 1.
Couldn't find global label '_main' at line 3.
Couldn't find operator 'bsr' on line 1.
Couldn't find operator 'bsr' on line 3.
# Failed test (t/compiler/a.t at line 51)
t/compiler/aNOK 2#  got: ''
# expected: '1
# 1.1
# 2
# --
# 1.1
# 2.1
# --
# 1
# 1.1
# 2
# 2.1
# 3.1
# 4
# 4.1
# 5.1
# 6.1
# --
# 1
# 1.1
# 2.1
# 3.1
# 4
# 4.1
# 5.1
# '

This error was in imc->pasm, specifically:

last token = []
(error) line 63: parse error
Didn't create output asm.


t/rx/call...NOK 1#  got: 'ok 1
# ok 2
# ok 3
# ok 4
# ok 5
# ok 6
# ok 7
# ok 8
# ok 9
# '
# expected: 'ok 1
# ok 2
# ok 3
# ok 4
# ok 5
# ok 6
# ok 7
# ok 8
# ok 9
# ok 10
# '

No idea on where the missing "ok 10" went.



If people would like the p6/imcc/pasm/pbc files,
I can provide them. Just let me know.

Mike Lambert




Re: Conditional makefile generation (Was Re: [perl #16856] [PATCH]various changes to imcc)

2002-09-01 Thread Mike Lambert

> > Is there any fundamental reason why we *cannot* just enter a generated
> > imcparser.c and imcparser.h into CVS and save users the step of building
> > them on these platforms?
>
>
> Ack, so we should just delete the lines:
> imclexer.c
> imcparser.c
> imcparser.h
>
> from .cvsignore

Yep, although one also needs to adjust the makefile to skip
those rules. The attached patch gets IMCC building on MSVC without
cygwin (lex/bison/yacc/etc.). It assumes you add the generated imclexer.c,
imcparser.c, and imcparser.h to cvs as well.

Current perl6 test gives:

Failed TestStatus Wstat Total Fail  Failed  List of Failed
--
t/compiler/8.t 1   256 61  16.67%  6
t/rx/basic.t   2   512 52  40.00%  3-4
t/rx/call.t1   256 21  50.00%  2

Any idea on how to go about fixing the rx ones? They're failing on
imc->pasm, with msgs like "NO Op rx_pushmark_ic (rx_pushmark<1>)"

Mike Lambert

Index: config/gen/makefiles/imcc.in
===
RCS file: /cvs/public/parrot/config/gen/makefiles/imcc.in,v
retrieving revision 1.1
diff -u -r1.1 imcc.in
--- config/gen/makefiles/imcc.in27 Aug 2002 05:02:28 -
1.1
+++ config/gen/makefiles/imcc.in2 Sep 2002 02:57:49 -
@@ -33,10 +33,12 @@
 all : imcc
cd ../.. && $(MAKE) shared && $(RM_F) parrot${exe} && $(MAKE)

-imcparser.c imcparser.h : imcc.y
+grammar : yacc_file lex_file
+
+yacc_file : imcc.y
$(YACC) -d -o imcparser.c imcc.y

-imclexer.c : imcc.l $(HEADERS)
+lex_file : imcc.l $(HEADERS)
$(LEX) imcc.l

 .c$(O):
Index: languages/imcc/.cvsignore
===
RCS file: /cvs/public/parrot/languages/imcc/.cvsignore,v
retrieving revision 1.2
diff -u -r1.2 .cvsignore
--- languages/imcc/.cvsignore   27 Aug 2002 08:07:45 -  1.2
+++ languages/imcc/.cvsignore   2 Sep 2002 02:58:00 -
@@ -1,6 +1,3 @@
 imcc
-imclexer.c
-imcparser.c
-imcparser.h
 imcparser.output
 Makefile




Re: [perl #16895] [PATCH] core.ops, ops2c.pl

2002-08-31 Thread Mike Lambert

> 4) set P[k], i
>
> Here probably P should be a IN argument, because P itself is neither
> created nor changed, just the array/hash contents is changed.
> Currently only the JITters and imcc are using this flags, so, it should
> be discussed, what OUT really means.

I disagree. If P contains no keys, then set P[k] will create a new key or
pair element in the P hashtable. This means that P is being modified. Of
course, our meaning of IN and OUT is not completely nailed down just yet,
especially since the best meaning of them will probably relate to how the
JIT and IMCC want them to act. As such, the above argument could be
correct or incorrect depending upon exactly how they are defined. :)

Mike Lambert





Re: [PATCH] in makefile, move libparrot.a from "test" to "all"

2002-08-31 Thread Mike Lambert

Mr. Nobody wrote:

> Date: Fri, 30 Aug 2002 18:13:27 -0700 (PDT)
> From: Mr. Nobody <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Subject: [PATCH] in makefile, move libparrot.a from "test" to "all"
>
> libparrot.a is not really related to testing, it should belong in "all". This
> patch does so, and as a side effect, t/src/basic will now work with "make testj".

I thought so as well, at first. And currently, that might be an
okay thing to do.

However, it might help if I explain the purpose of the t/src/* tests. The
They originate from ticket 468:
http://bugs6.perl.org/rt2/Ticket/Display.html?id=468

I believe the eventual intent is to set up the t/src/* tests to test:
a) functions in parrot which aren't testable via opcodes, and thus can't
be tested with our pasm files.
b) the embedding system, to ensure that a static interface doesn't change
behavior on us, etc.

Currently however, neither a nor b is implemented, and so the t/src/*
tests have no direct dependency upon libparrot.a/lib and libparrot.so/dll,
and so can probably be removed. If it helps make 0.0.8 build on more
platforms, it might be a "good thing" to do.

At least, that's my understanding of the situation.

Mike Lambert





Re: [BUG] GC collects argv aka P0

2002-08-31 Thread Mike Lambert

> > $ perl6 -k examples/life-ar.p6 5
> > Running  generations
>
> This problem is due to the fact that the argument strings are
> created with the external flag set, which is not properly
> supported by the string and GC modules. Steve posted
> some patches recently that might well fix the problem,
> but I have been leaving those for Mike to look at.

Yes yes, I've been a lazy bum. :) It's currently the summer-to-school
transition for me, so I might be a bit spotty for the next few days, too.
You're right in that this bug is probably just a result of unimplemented
external-ness, in which case applying Steve's fixes should make this
problem go away.

Mike Lambert






Re: Conditional makefile generation (Was Re: [perl #16856] [PATCH]various changes to imcc)

2002-08-31 Thread Mike Lambert

> > However, the intermediate filename 'y.tab.c' isn't necessarily portable,
>
> > if I remember my Windows and VMS lore correctly.  However, those platforms
> > probably have bison, which understands the simpler -o imcparser.c output
> > option.
>
>
> So the first question actually is, is there a platform, parrot will
> support, where there is/will be.. no bison.

The better question to ask is: is there any platform where we will need to
run bison/yacc on that platform in *order* to compile Parrot? I believe
the answer is no.

Is there any fundamental reason why we *cannot* just enter a generated
imcparser.c and imcparser.h into CVS and save users the step of building
them on these platforms? It's just an additional parrot dependency which
doesn't need to be there, and removing it may come in handy when trying to
build on a lot of the more arcane platforms.

Mike Lambert




Re: [perl #16874] [BUG] Concatenation failing

2002-08-31 Thread Mike Lambert

> > I have a weird bug where concatenation is sometimes failing, and I
> > have no idea why. See the attached pasm. I fully expect both works and
> > weird to output "foo", "bar", "quux" with various levels of spacing,
> > but weird doesn't output quux.
>
> Patch below should fix the problem. This is not an optimal solution,
> as the unmake_COW is probably not required if the string_grow is
> going to happen anyway, but it seems to follow the general spirit of
> the current code.

Ah, this would be my bug. Thanks for finding it, Peter. Unfortunately, I
fail to see why this actually fixes any bug. string_grow should unmake_COW
itself. So the old code essentially looked like this:

    /* make sure A's big enough for both */
    if (a->buflen < a->bufused + b->bufused) {
        unmake_COW(interpreter, a);
        /* Don't check buflen; if we are here, we already checked. */
        Parrot_reallocate_string(interpreter, a,
                                 a->bufused + b->bufused + EXTRA_SIZE);
    }
    unmake_COW(interpreter, a);

While the new code, with your patch, should look like:

    unmake_COW(interpreter, a);
    /* make sure A's big enough for both */
    if (a->buflen < a->bufused + b->bufused) {
        unmake_COW(interpreter, a);
        /* Don't check buflen; if we are here, we already checked. */
        Parrot_reallocate_string(interpreter, a,
                                 a->bufused + b->bufused + EXTRA_SIZE);
    }

Since unmake_COW is a no-op if the string is not COW, I fail to see what
the functional difference is between these two snippets of code, although
I don't doubt that there is one if your fix solves Leon's code.

Also, I agree that unmake_COW is not an optimal function call if you're
going to grow the string afterwards. I wanted to get a simple interface
implemented for using COW, such that it would be easy to understand, and
then later optimize it for actual usage scenarios. I imagine that
unmake_COW could be extended to take 'pre' and 'post' byte arguments, and
would pad the resulting string by that much on either side when
uncowifying it. This would help string_grow optimally uncowify things. I
just haven't gotten around to that yet.
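That 'pre'/'post' extension might look roughly like this, under heavy assumptions: a minimal STRING stand-in, plain malloc, and no pool bookkeeping. The point is only that un-COW-ing and padding can happen in one allocation:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Minimal stand-in for parrot's STRING; real code would also track
 * the pool and COW flags. */
typedef struct {
    char  *bufstart;
    size_t bufused;
    size_t buflen;
    int    is_cow;
} STRING;

/* Un-COW the string, padding the new private buffer with 'pre' spare
 * bytes before the data and 'post' after it, so that a following
 * string_grow needs no second allocation.  (Sketch only: the base
 * pointer bufstart - pre is what would eventually get freed.) */
static void unmake_COW_padded(STRING *s, size_t pre, size_t post)
{
    char *mem = malloc(pre + s->bufused + post);
    memcpy(mem + pre, s->bufstart, s->bufused);
    s->bufstart = mem + pre;
    s->buflen   = pre + s->bufused + post;
    s->is_cow   = 0;
}
```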

Thanks,
Mike Lambert





Re: [perl #16859] [PATCH] Fix BUFFER_external_FLAG for strings

2002-08-31 Thread Mike Lambert

> This patch is the real fix for strings with the BUFFER_external_FLAG.
> It requires the previous dod.c and resources.c patches to be applied.
>
> Strings marked with the BUFFER_external_FLAG point to the external
> memory rather than making a copy for themselves. Side effects of this
> are (1) less memory usage, (2) external updates to the string are
> reflected in the Parrot STRING (except for length changes!), and (3)
> these strings are skipped over when memory is getting moved around
> during a compaction.

Fixing BUFFER_external_FLAG is probably a good thing, and I'm up for
applying them after 008 goes out the door. However, BUFFER_external_FLAG
and BUFFER_selfpoolptr_FLAG almost seem to have complementary purposes
now. (The latter was something that I introduced along with the
current COW version.)

selfpoolptr indicates that the header references memory in the local pool.
This allows a non-constant-header string to point cowingly towards a
constant buffer in the constant pool. The selfpoolptr flag would be false,
and it would avoid collecting/copying/destroying the data. This is rather
similar in nature to external's behavior, and I imagine we could either:
a) make them act identically, like this patch does
b) just have people who want external data to unset selfpoolptr.

Thoughts?
Mike Lambert





Re: [perl #16857] [PATCH] minor refactoring in dod.c

2002-08-31 Thread Mike Lambert

> Small cleanups, but also a piece of my attempted fixes to the
> BUFFER_external_FLAG. (The CVS version currently doesn't work anyway,
> so don't worry about only getting parts of my fixes in; nothing
> depends on it.)

I'm curious about your dod/external "fix". If I understood the purpose of
BUFFER_external_FLAG correctly, it indicates that the memory pointed to by
this header is external to our local memory pool, and thus should not be
collected, etc.

However, if I understand your patch correctly, it makes all external
buffers immune to being collected. I agree with the resources.c patch to
fix external, but I'm not sure about this one.

Mike Lambert






Re: [perl #16855] [PATCH] uselessly optimize print()

2002-08-31 Thread Mike Lambert

> In tracking down a gc bug, I realized that the current throwaway
> implementation of the print op could be replaced with a faster
> throwaway implementation that avoids doing a string_to_cstring.
>
> Note that both the original and new implementations are still buggy
> with respect to supporting different encodings. I don't know if
> printf("%s") is any better than fwrite in terms of at least vaguely
> paying attention to your locale or whatever. If so, don't apply it.
>
> (all tests pass)

Yay, I like this one. I was looking for a way to get rid of
string_to_cstring at one point due to its ugly habit of uncowifying
strings when we go to print them. I tried a temporary solution that used
parrot io, which worked, except that the rest of the print ops in parrot
still used stdio, and so I ended up with out-of-order printing.
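The trick the patch relied on can be sketched with a minimal STRING stand-in (not parrot's real struct): a parrot string's buffer is a counted run of bytes, not a NUL-terminated C string, so writing bufused bytes with fwrite avoids the copy (and the historical un-COW-ing) that string_to_cstring did:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Minimal stand-in for parrot's STRING. */
typedef struct {
    const char *bufstart;
    size_t      bufused;   /* byte count; buffer is not NUL-terminated */
} STRING;

/* Write the string's bytes directly, with no intermediate C string. */
static size_t print_string(FILE *out, const STRING *s)
{
    return fwrite(s->bufstart, 1, s->bufused, out);
}
```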

I'll apply this once Parrot 009 starts up again. Although at some point,
we should go 100% in converting the IO in core.ops to use parrot io, or
convert the tests/programs over to using io.ops. (Not sure which way we
want to go.)

Mike Lambert





Re: [perl #16852] [PATCH] Eliminate empty extension

2002-08-31 Thread Mike Lambert

> This patch trims off the period at the end of executable filenames for
> C-based tests on unix. (It compiled "t/src/basic_1.c" ->
> "t/src/basic_1."; this patch makes that "t/src/basic_1")

This patch should also update languages/perl6/P6C/TestCompiler.pm, since
it hijacks lib/Parrot/Test.pm to get its functionality. I'll probably
apply this after the code opens up again, but if someone beats me to it,
please be sure to update the affected file above.

Thanks,
Mike Lambert





Re: [perl #16820] [PATCH] Build libparrot.a with ar and ranlib

2002-08-28 Thread Mike Lambert

> The use of ar rcs to build a library and reconstruct the symbol table
> is non-portable.  (Mac OS X, for example, doesn't support the
> 's' option to ar.)

Yes, I had noticed that, and was hoping someone more knowledgeable would
help out with our build problems. :)

> The following patch changes the main makefile to use ar and ranlib, just
> as perl5 has successfully done for years.

Applied, thanks.

Mike Lambert




Re: [perl #16818] [PATCH] Build cleanup

2002-08-28 Thread Mike Lambert

> I discovered 'make languages' yesterday.  The enclosed patch cleans up a
> lot of small nits I found in the build process.  In a number of cases, the
> Makefiles were running perl scripts as
>   ./script
> rather than as
>   $(PERL) script
>
> A few other places called a plain 'perl' instead of $(PERL).

Thanks for cleaning these up. Applied.

> Second, Configure.pl was putting the wrong flags in to build a shared
> library.  (Or, more precisely, it was apparently unconditionally using
> flags that work for GNU binutils.)  I have replaced ld_shared by what I
> suspect is the appropriate perl5 Config variable.  I left ld_shared_flags
> empty because I don't know what is supposed to go there, but the value
> Configure.pl used to use is definitely not right for Solaris's linker.

I applied the ld_shared_flags portion of this. When I went to get things
working on win32/cygwin, I didn't know what "-Wl,-soname,libparrot$(SO)"
was for, so I left it in. Taking it out, as your patch did, seems to work
fine.

However, I did not apply the following:

> -ld_shared => '-shared',
> +ld_shared => $Config{lddlflags},

With that bit applied, I get the following error during "make shared"

make[1]: Leaving directory `/cygdrive/d/p/parrot-manfree/parrot/classes'

gcc -s -L/usr/local/lib  -s -L/usr/local/lib  -o blib/lib/libparrot.so
exceptions.o ...(lots of .o files)... chartypes/usascii.o -lcrypt

/usr/lib/libcygwin.a(libcmain.o)(.text+0x6a): undefined reference to `WinMain@16'

collect2: ld returned 1 exit status

Where
LD_SHARED = -shared -L/usr/local/lib

It seems that cygwin GCC does not like the -s, and needs
-shared to work properly. The Makefile was built using cygwin perl
(that's why it's using cygwin GCC), so perhaps cygwin perl's
$Config{lddlflags} is incorrect? Any ideas on how to resolve this?

Thanks,
Mike Lambert




Re: DOD etc

2002-08-27 Thread Mike Lambert

Let me ask a somewhat obvious question here.

Why is deterministic destruction needed?

The most often-used example is that of objects with external resources
like filehandles or network sockets. Let me take that argument for the
duration of this email, but please feel free to bring up other reasons
that deterministic destruction is needed.

For the most part, the programmer should be perfectly aware of when a
deterministic-destruction object should be destroyed. 90% of the cases
involve the object being on the stack and going out of scope. The
remaining 10%, in my mind, are the ones where the programmer passes a
filehandle on to some code which will do stuff with the filehandle later in
the program, and so it needs to hold a reference to it.

This tells me that if we make an attribute stack_collected, the user could
use that when they are sure they are done with the filehandle.
{
  my $fh is stack_collected = new IO::FileHandle(..)
  print $fh whatever;
} # $fh is collected here


The other reason for ref-counted (I think) objects is to avoid pushing
certain system limits, like 64 filehandles, etc. This mirrors the
situation of headers, where we have a limited number of headers, and try
to avoid allocating new ones.

If we are able to define a new type of precious resource, we can make the
GC handle them efficiently. On allocation of a new PMC with type
PRECIOUS_filehandle, we can check how many PRECIOUS_filehandles exist,
and if there's no room to allocate any more, we can trigger a DOD run to
attempt to free some up.

This particular system would allow us to avoid over-allocating certain
system resources like filehandles and network sockets, while not placing a
burden upon the code that doesn't care for such precious resources.
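The precious-resource scheme above, as a sketch: the quota value and all names are invented for illustration. Allocation of a scarce handle type first tries a DOD run when its quota is hit, and only fails if that reclaims nothing:

```c
#include <assert.h>

#define MAX_PRECIOUS_FILEHANDLES 64   /* example system limit */

static int live_filehandles = 0;

/* Stand-in for a real dead-object-detection pass; here it simply
 * pretends one unreferenced filehandle was reclaimed. */
static void dod_run(void)
{
    if (live_filehandles > 0)
        live_filehandles--;
}

/* Allocate a PRECIOUS_filehandle PMC: if the quota is exhausted,
 * trigger a DOD run to try to free some up before failing. */
static int allocate_filehandle(void)
{
    if (live_filehandles >= MAX_PRECIOUS_FILEHANDLES)
        dod_run();
    if (live_filehandles >= MAX_PRECIOUS_FILEHANDLES)
        return -1;   /* still over quota */
    live_filehandles++;
    return 0;
}
```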

Is there still a need for deterministic destruction, even in light of the
alternative approaches mentioned above?

Thanks,
Mike Lambert





Re: [perl #16755] imcc requires Parrot_dlopen but HAS_DLOPEN is never defined

2002-08-26 Thread Mike Lambert

> It currently works on my version of MSVC with nmake and friends. A few
> minutes ago, it worked on cygwin/GCC as well. Unfortunately, I broke
> something, I'm not sure what, and it doesn't work on cygwin anymore. I'm
> going to sleep now, and will probably pick up again on this tomorrow
> night.

Okay, with a bit more rejiggering tonight of some defines that were being
misused, I got dlopen to work on cygwin. I've also committed the revised
patch, in an attempt to help force out any remaining issues before we
release 0.0.8.

So currently, if one does a CVS checkout on win32, and is using cygwin or
msvc, they can do:

Configure.pl && cd languages\perl6 && make && make test

And it should proceed to properly pass all of the compiler tests, aside
from 8_5 and 8_6, which are a bug with the perl6 compiler somewhere
(verified by sean and leo).

Mike Lambert




Re: [perl #16755] imcc requires Parrot_dlopen but HAS_DLOPEN is never defined

2002-08-26 Thread Mike Lambert

> > "make shared" dies with 'missing .h files'
>
> More competent and/or Windows-savvy hands than mine are working on this as
> we speak.

I believe the proper term is stubbornly persistent. Attached is a patch to
fix up parrot shared-libraries, imcc, and perl6 all to work on win32.

It currently works on my version of MSVC with nmake and friends. A few
minutes ago, it worked on cygwin/GCC as well. Unfortunately, I broke
something, I'm not sure what, and it doesn't work on cygwin anymore. I'm
going to sleep now, and will probably pick up again on this tomorrow
night.

Is getting perl6 working on win32 a priority for 0.0.8? I wouldn't want to
commit code to fix known problems the night before 0.0.8 ships, since
there are some people who are adamantly against that.

This code will probably only work on unix/win32 when it is done. From what
I can tell, it only worked on unix to begin with (due to
shared-library/dynamic-loading usage), so I believe this is an
improvement. :)

Some of the steps taken in this patch could be deemed hacks. Some people
are of the opinion that this is okay if it gets perl6 working on win32 for
0.0.8. I'm rather unfamiliar with how cross-platform makefile weirdness
should be resolved, so I'd appreciate any advice on how to fix up some of
the issues. (see imcc.y MSC_VER ifdef's, root.in's LD_SHARED_FLAGS,
root.in's ${blib_lib_libparrot_a, and libparrot.def})

Any of the win32 folk out there want to try this patch and see if it
resolves any issues for you?

Thanks,
Mike Lambert


Index: config/gen/makefiles.pl

===

RCS file: /cvs/public/parrot/config/gen/makefiles.pl,v

retrieving revision 1.2

diff -u -r1.2 makefiles.pl

--- config/gen/makefiles.pl 29 Jul 2002 04:41:24 -  1.2

+++ config/gen/makefiles.pl 26 Aug 2002 08:14:15 -

@@ -17,6 +17,7 @@

   genfile('config/gen/makefiles/miniperl.in',  'languages/miniperl/Makefile');

   genfile('config/gen/makefiles/scheme.in','languages/scheme/Makefile');

   genfile('config/gen/makefiles/perl6.in', 'languages/perl6/Makefile');

+  genfile('config/gen/makefiles/imcc.in',  'languages/imcc/Makefile');

 }

 

 1;

Index: config/gen/makefiles/root.in

===

RCS file: /cvs/public/parrot/config/gen/makefiles/root.in,v

retrieving revision 1.24

diff -u -r1.24 root.in

--- config/gen/makefiles/root.in25 Aug 2002 23:39:15 -  1.24

+++ config/gen/makefiles/root.in26 Aug 2002 08:14:16 -

@@ -1,12 +1,13 @@

 O = ${o}

-SO = .so

-A = .a

+SO = ${so}

+A = ${a}

 RM_F = ${rm_f}

 RM_RF = ${rm_rf}

 AR_CRS = ar crs

 LD = ${ld}

 LD_SHARED = ${ld_shared}

 LD_OUT = ${ld_out}

+LD_SHARED_FLAGS=${ld_shared_flags}

 

 INC=include/parrot

 

@@ -158,10 +159,6 @@

 

 mops : examples/assembly/mops${exe} examples/mops/mops${exe}

 

-# XXX Unix-only for now

-libparrot$(A) : $(O_DIRS) $(O_FILES)

-   $(AR_CRS) $@ $(O_FILES)

-

 $(TEST_PROG) : test_main$(O) $(GEN_HEADERS) $(O_DIRS) $(O_FILES) 
lib/Parrot/OpLib/core.pm lib/Parrot/PMC.pm

$(LD) ${ld_out}$(TEST_PROG) $(LDFLAGS) $(O_FILES) test_main$(O) $(C_LIBS)

 

@@ -180,50 +177,60 @@

 #

 # Shared Library Targets:

 #

-# XXX This target is not portable to Win32

-#

 ###

 

 blib :

-   mkdir -p blib

-

-blib_lib :

-   mkdir -p blib/lib

+   -mkdir blib

 

-shared : blib_lib blib/lib/libparrot$(SO) blib/lib/libcore_prederef$(SO) 
$(TEST_PROG_SO)

+blib_lib : blib

+   -mkdir blib${slash}lib

 

-blib/lib/libparrot$(SO).${VERSION} : blib_lib $(O_DIRS) $(O_FILES)

-   $(LD) $(LD_SHARED) -Wl,-soname,libparrot$(SO).${MAJOR} $(LDFLAGS) 
$(LD_OUT)blib/lib/libparrot$(SO).${VERSION} $(O_FILES)

+shared : blib_lib blib/lib/libparrot$(SO) ${blib_lib_libparrot_a} $(TEST_PROG_SO)

 

-blib/lib/libparrot$(SO).${MAJOR}.${MINOR} : blib/lib/libparrot$(SO).${VERSION}

-   $(RM_F) $@

-   cd blib/lib; ln -s libparrot$(SO).${VERSION} libparrot$(SO).${MAJOR}.${MINOR}

-

-blib/lib/libparrot$(SO).${MAJOR} : blib/lib/libparrot$(SO).${MAJOR}.${MINOR}

-   $(RM_F) $@

-   cd blib/lib; ln -s libparrot$(SO).${MAJOR}.${MINOR} libparrot$(SO).${MAJOR}

-

-blib/lib/libparrot$(SO) : blib/lib/libparrot$(SO).${MAJOR}

-   $(RM_F) $@

-   cd blib/lib; ln -s libparrot$(SO).${MAJOR} libparrot$(SO)

-

-blib/lib/libcore_prederef$(SO).${VERSION} : blib_lib core_ops_prederef$(O)

-   $(LD) $(LD_SHARED) -Wl,-soname,libparrot$(SO).${MAJOR} $(LDFLAGS) 
$(LD_OUT)blib/lib/libcore_prederef$(SO).${VERSION} core_ops_prederef$(O)

+# XXX Unix-only for now

+blib/lib/libparrot$

Re: [perl #16269] [PATCH] COW...Again and Again

2002-08-22 Thread Mike Lambert

> On Wed, Aug 21, 2002 at 04:17:30AM -0400, Mike Lambert wrote:
> > Just to complete this thread, I have committed the current version of my
> > COW code, as I promised earlier this week.
>
> Did you try running tests with GC_DEBUG on? I get numerous failures.
> Here's a patch with a couple of fixes (not all of them gc-related),
> though I should warn you that it is rather roughly carved out of my
> local copy, which has far too many modifications at the moment.

Look at the hypocrite! He writes GC_DEBUG code to make others fix GC
problems, then forgets to do the same for his own GC problems. I have
another GC_DEBUG patch in the wings which should make this easier to test
with if you compile with --debugging, when I get around to properly
cleaning it up.

> With this patch, things work for me, but I punted in one place: if you
> look at unmake_COW() in string.c, I just disabled garbage collection
> around the reallocation. The problem seems to be that you change where
> s->bufstart points to, then call Parrot_reallocate_string() on s. But
> that can trigger a collection, and it gets confused by the
> inconsistent state.

Hrm, yeah. When I did that in unmake_COW, it seemed like a neat hack.
Unfortunately, all hacks are bad hacks with GC. :) Your patch looks good
for now, although I'll have to think about a better way to solve the
problem than blocking GC.

> This patch also contains a debugging aid that I was speculating on
> earlier. Whenever a buffer is marked as being live, it checks to see
> if the buffer is on the free list. If so, it whines and complains.
> (This is for finding premature killing of newborn Buffers; they'll go
> through one sweep and get put on the free list, then get anchored to
> the root set somehow. The next sweep will find them.)

I see your clever use of a version tag to find the source of the problem.
This should certainly help with buffers, and I imagine something similar
can be done to PMCs as well.

I'll take care of applying this patch, as I've other changes to submit.
(See below.)

> This is not 100% accurate, because the stackwalk is conservative: it
> assumes that anything whose pointer is in the appropriate range and
> otherwise smells right must be a live PMC or Buffer. I thought that
> finding old Buffers on the stack would be rare, but I was wrong -- the
> problem is that if they were buried in some deeply nested call chain,
> then because stack frames are not zeroed out upon allocation (which
> would be horribly slow), a later call chain will dig them back up. And
> the stackwalking code itself is deep enough and has large enough stack
> frames to dig up quite a bit of junk.

This problem is theoretically identical to there being perfectly valid
data on the stack that appears to be a pointer to a freed header. Even if
we could memset(0), it would not eliminate this particular problem.

> Note that this is a real bug; we really shouldn't be poking into dead
> Buffers. There's no telling what the current state of decomposition
> is. A seg fault might jump out and bite us, or worse, because that
> pointer may have been used by something else for its own twisted
> purposes. And that something else could get very upset when it returns
> to find that we've jammed flags into the corpse and used its liver for
> a link in the ->next_for_GC chain.

I would mostly disagree about it being a bug. I originally thought it was
a problem as well, until I talked it out loud on IRC once. First, free
buffer/pmcs have one field you can't touch: bufstart/vtable. This first
field of the structure is used as a pointer to create the freed-header
linked list.

However, every other bit of data in the header is part of allocated, valid
memory. Parrot will dole out this memory in their current form to code
requesting headers later. We can't possibly segfault by messing with the
flag fields. Other than that, dead headers just sit around as part of this
linked list waiting for someone to request them.

So...if we find a buffer pointer on the stack, we modify its flags in
buffer_lives. This is perfectly harmless. When we perform the DOD
free_unused_buffers, we only add it to the free list (and modify bufstart)
if it's not BUFFER_on_free_list_FLAG or BUFFER_live_FLAG. So stuff already
on the free list will stay that way.

> It is unlikely to cause problems accidentally, I suppose -- pointers
> are checked to make sure they're within one of the pools, so the only
> way to run into problems is to have a pointer on the stack to
> somewhere within a Buffer's memory, kill off the Buffer by forgetting
> about it, then have DOD add the bogus Buffer back onto the free list.
> Or have the COW code chase down a bogus tail pointer. Or use a bogus
> PMC instead -- then you have ->vtable->mar

Re: [perl #16269] [PATCH] COW...Again and Again

2002-08-21 Thread Mike Lambert

> Some final 5000 life results from my system, and a few improvements
> I believe are still possible:
>
> Before COW: 172 seconds
> After COW: 121 seconds
> A 30% improvement in performance is not too bad, I suppose.
> Well done Mike!

Thanks!

> CVS/COW with stack pointer alignment = four: 93 seconds
> Above plus pre-mask for PMC/Buffer alignment = four: 90 seconds
>
> The first of these improvements is achieved by determining
> the alignment with which pointers are actually placed on the
> stack, versus PARROT_PTR_ALIGNMENT, which is the
> minimum alignment permitted by the underlying system.
> On an Intel x86 platform running linux, I have been unable to
> persuade any pointer to live on the stack other than on a
> four-byte alignment, except by placing it in a struct, and
> telling the compiler to pack structs. A simple C program is
> included below which illustrates this point.

Jason Gloudon has also said that x86 has a four-byte pointer alignment. I
seem to recall a pointer aligned to an odd value that I found in a stack
walk once, but I'm unable to reproduce it in extensive fiddling with your
test program. As such, it's probably worthwhile to implement such a
change, although I'm not quite sure the best way to do it.

Should this be a configure.pl-determined constant? Should we hardcode it
to sizeof(void*)? Is this behavior guaranteed by the C spec? Can we
assume it across all platforms even if it is not guaranteed?

> > If you don't mind, please feel free to continue your work on parrot-grey.
> The problem arises with trying to do new experimental development,
> which still keeping sufficiently in sync with cvs parrot that I can do a
> 'cvs update' from time to time without getting dozens of conflicts.
> A case in point is the new 'strstart' field - grey doesn't need it, but to
> leave it out would create a large number of differences between the
> two versions, with code having to be changed every time somebody
> writes a new reference to it - therefore if I do continue with grey, I will
> just probably just leave strstart in, and ignore the memory overhead.
> The next item on the list for grey was paged memory allocation - this
> may be usable to some extent without the buffer linked lists; so I will
> probably give that a spin anyway.

I think a union in the string header might do quite nicely in your case. I
had the chance to look into your next/prev buffer linking code the other
night. Interesting approach, but I have a few questions. :)

In your collection phase, you give up header pool cache-coherency in favor
of the memory pool. Your headers are organized by bufstart, essentially.
Likewise, your use of the circular linked list of headers to add stuff to
the front and ends of the header list as necessary is also interesting,
and threw me for a loop for a little while. :)

The current CVS approach is mostly cache-coherent.
It iterates over ALL (not just live ones, as you do) buffers in header
pools. And since the last collection, we can assume that most of the data
hasn't changed (a harder assumption if we have a generational collector),
and so the pool locality should follow the header locality, due to the
nature of the copying. I'm not trying to argue which one is better, but
merely try and state the differences in implementations to see if I got it
straight.

Might I ask what your motivation was for the header linked list? I can see
that it solves the problem of:

set S0, some_large_data_file_contents
substr S1, S0, 0, 1 #get first character as COW
set S0, ""
sweep
collect

In current CVS, the large data file is kept around, whereas in your
implementation, it would only copy the single character. However, there is
an easy way to achieve nearly the same behavior as above in the current
CVS. When we copy a COW string, it's initially marked as non-COW. In the
subsequent collection, we have a really large buffer with a strstart and
bufused that are quite small in total usage. If we only copy necessary
data for non-COW strings, then the second sweep performed would eliminate
the wasted memory copy.

Not quite as fast in eliminating the memory usage as the above solution,
but since we are guaranteed of collections happening throughout the
lifetime of any program that does something with strings, I think it's an
okay tradeoff. Were there any other reasons for implementing the above
linked list technique that I missed?

Thanks,
Mike Lambert






Re: Possible bug in new string COW code

2002-08-21 Thread Mike Lambert

> Reading through the latest version of string.c, pondering the
> best way to integrate the grey and (what colour is cvs parrot?)
> versions, I came across the following line in unmake_COW:
> s->buflen = s->strlen;
> which got be a little confused - I seem to recall buflen as being
> in bytes, and strlen as being in encoding-defined characters.
> Did something change when I wasn't looking, or is this a bug
> just waiting for somebody to actually implement Unicode?

Yep, you're right, that's definitely a bug waiting for unicode. My
intention there was to only copy as much data as was needed when we
uncowify a buffer. I believe changing strlen to bufused is the proper fix,
and have committed said change.

Thanks,
Mike Lambert





Re: DOD etc

2002-08-21 Thread Mike Lambert

> In this Brave New World of DOD and GCC, what guarantees (if any)
> will we be making at the Perl 6 language level for the timely calling of
> object destructors at scope exit?

From the straight GC perspective, none. There might be workarounds at
higher-levels, however.

> ie the classic
>
> { my $fh = IO::File->new(...);  }
>
> I know there's been lots of discussion on this over the months,
> but I'm slighly confused as to what the outcome was.

I'm not sure if there ever was a consensus. A few ideas that I recall
being brought up were:

a) allow the ability to force a DOD run at block exit. This would emulate
perl5 behavior, and would be necessary when porting perl5 code with
DESTROY methods. I can imagine having a "block-exit-var-in-scope" flag
somewhere, that's set when we create a magic filehandle var, and possibly
unset with each dod run if the variable goes out of scope. When this flag
is set, the interpreter can force a DOD on on some block_exit() opcode, or
whatever the interface.

b) We can make a special property for these variables:
my $fh is stack_freed = IO::File->new(...);
When this variable's stack frame goes out of scope, it automatically has
it's destructor called, regardless of other references, since it can't
detect them. It would leave the actual PMC header as "live" until the next
DOD pass, when it would be truly freed. If the next DOD pass finds it
alive, it could barf. This isn't entirely safe, but it does offer the best
performance, I think.

c) similar to b, but more programmer-directed. I believe .NET has two
concepts of destruction. IO filehandles can have an active destroy method
called directly on them with 'delete someobject', leaving the actual
memory hanging around until the next GC (dod) run, at which point it
really deletes the header.

.NET improves upon Java's inability to give timely destructors to
objects, by allowing the user to manually delete things when they know
there are no other references for things that need to free resources.


Other than the above, I'm not sure what other methods could be used to
force destruction. And I'm not sure if a decree has been made about what
Perl6 will do.

Mike Lambert




Re: [perl #16269] [PATCH] COW...Again and Again

2002-08-21 Thread Mike Lambert

Just to complete this thread, I have committed the current version of my
COW code, as I promised earlier this week. Below is my response to Peter's
most recent email.

> > Note that the comparison against parrot-grey is not
> > exactly fair, because it dodn't use system stackwalking.
>
> Note that I have only commented out the call to the stackwalk
> function - for COW benchmarking purposes you could always
> reinstate it. But that is beside the point now - your COW has
> been fixed, and the benchmarks confirm that gc_generations
> is equally unfriendly to all cows. There will always be programs
> that don't benefit and therefore only get the overhead - but in
> typical perl usage, I would expect that the majority of programs
> will benefit significantly, for example regex capture will be able
> to use COWed substrings.

Yeah, regex capture should benefit *big* from COW. It also technically
helps make strings act more perl5-like, where you may easily chop
characters off the front and end of the string without reallocation. We
could even have the non-COW copy collection use strstart and strlen to
compact memory usage, giving the best of both worlds for those kinds of
applications.

> This should finally bring about the demise of grey, as I don't
> believe there is room for two totally different implementations
> of COW, and my buffer linked list, which is already expensive,
> gets absurdly so with the addition of strstart also.

This saddens me, and I hope it's not a permanent death. Grey was a very
good sanity check for me, at least. It caused us to get a 20% performance
improvement in stackwalking (I think), motivated me to improve parrot's COW abilities
and performance, and was in general a good wake-up call that some of our
decisions were having a negative impact, and should be reconciled.

In reality, all that differs between grey's cow and mine, is that mine
allows for COWing of substrings with constant strings, and has a modified
string.c interface that improves clarity, imo. Fundamentally, it's the
original COW you provided a long time ago. I'd hate to make you
discontinue your side project because I committed a different
implementation of COW that wasn't directly compatible.

If you don't mind, please feel free to continue your work on parrot-grey.
I'd love to see the other ideas you had mentioned in your previous emails
that hadn't yet made it to grey, as some of them didn't sound entirely
illegal. You said that parrot-grey was a fun project to play around
with performance numbers, and I'd hate to be the reason you stopped having
fun. :)

Thanks,
Mike Lambert






Re: [perl #16274] [PATCH] Keyed access

2002-08-21 Thread Mike Lambert

Tom Hughes wrote:
> Index: basicvar.pasm
> ===
...
> Index: instructions.pasm
> ===
...

Fixes the bug, and wumpus plays yet again.

Applied, thanks.

Mike Lambert





Re: GC generation?

2002-08-20 Thread Mike Lambert

> At 6:16 PM -0400 8/20/02, John Porter wrote:
> >Dan Sugalski wrote:
> >>  I expect a UINTVAL should be sufficient to hold the counter.
> >
> >Why?  Because you don't expect a perl process to run longer
> >than a couple hours?  Or because rollover won't matter?
>
> Rollover won't really matter much, if we're careful with how we
> document things. Still, a UINTVAL should be at least 2^32--do you
> really think we'll have that many GC generations in a few hours?

Currently, 5000 iterations of life execute in 6 seconds, with 42K DOD
runs. At that rate, we have a rollover every week. Not really a problem,
but if we have code which doesn't allow for rollover, it is a problem.

I can see using the generations value to handle code that is dependent
upon things "changing". However, as Steve mentioned, it's probably
easiest and fastest to just always re-dereference the bufstart.

It might be useful to specify the generation within a pool, with the
assumption that the GC would track it and promote it to a different
generational pool before an overflow occurs. But it'd make more sense to
use a byte/short for this, and reset it to 0 with each promotion. (Or in
the case of the final generation, ignore rollover.)

A dod generation count doesn't buy us much. Because we don't track
inter-pool pointers, we need to do a full dod every time we need to
determine the root set. However, copy collection can be localized to a
given pool, and as long as we copy every header into that pool, we can
avoid copying a lot of data.

If we have more DOD's than collections, it would make sense to just
iterate over the header list with each collection to search for pool
pointers, and hope the generational overhead is outweighed by the ability
to avoid re-copying stuff. This will probably be more apparent with
real-world programs than test programs that can keep every bit of memory
in the cache.

If collections are more frequent than DODs, we might be able to set up
lists of pointers on a DOD run, organized into generational pool, and just
use those during collection. That's effectively one additional pointer per
header, however. And there are better uses for such space.

Finally, it's possible to do wholesale generational promotion of an
entire pool, and avoid a generations count altogether. I forget the
details exactly, but it involves separating each generation into two
pools, and storing the generation count in the pool itself. It introduces
some error into the promotion rates (some stuff is promoted too early,
some too late), but it avoids the extra generational count.


So in conclusion, generational systems can be done using at most a byte or
a short, and it's even possible to do them with nothing at all. So until
the need arises, I don't think the generations count would be worth it.
Especially since I plan to try and prove the need for a header pool
pointer at some point. :)

Mike Lambert






Re: [perl #16274] [PATCH] Keyed access

2002-08-20 Thread Mike Lambert

> I have a clean version that's up to date, and as everybody seems to
> be happy with it I'm going to go ahead and commit it now.

Ah-ha! I found a showstopper! Oh, it's a little late for that, isn't it?
:)

Anyways, cd to languages/BASIC, run basic.pl, type "LOAD wumpus", and
watch it die on "Not a string!". It could be that basic is using keys in
weird ways, or it could be that the key patch is borked...I haven't looked
into it enough to determine the true cause here.

Thanks,
Mike Lambert




Re: [perl #16308] [PATCH] logical right shift

2002-08-20 Thread Mike Lambert

> This adds logical shift right opcodes. They are essential for bit shifting
> negative values without sign extension getting in the way.

Applied, thanks.

Mike Lambert




Re: [perl #16300] [BUG] hash clone hangs

2002-08-19 Thread Mike Lambert

> recent changes in hash.c seems to hang[1] hash_clone.
> This patch works, but I don't know, if it is the correct way to solve
> the problem.

Even if it is the correct way to solve the problem (which I don't know),
it uses C++-style comments which are a no-no for Parrot's C target.

Secondly, can you please turn off strip-trailing-whitespace in your
editor? Your patches are reflecting the stripped spaces, which makes it
hard to discern intentional changes from accidental ones.

Thanks,
Mike Lambert




Re: [perl #16269] [PATCH] COW...Again and Again

2002-08-18 Thread Mike Lambert
asm at label getout, so the reported
> active buffers and memory use are as accurate as we can
> make them. Which makes me think of something - grey is

Yes, this solved some of the problem. Adding a sweep&collect dropped the
active header usage down to reasonable levels, which helped me realize
that I wasn't actually allocating any additional headers due to COW...the
GC was just inefficient in its management of them.

> ignoring reclaimable to get the size to allocate for the
> post-compaction pool, therefore the memory usage is always
> going to be higher than is actually needed - are we simply
> looking at excess allocation here, rather than excess usage?
> If so, grey will fix it in the next release with paged memory
> allocation; and I'm sure you'll think of a solution also.

That's also a distinct possibility. The current COW implementation ignores
reclaimable altogether, piecing together a proper total_size using the
code I posted in a previous email.



Aaanyways,

From my current benchmarks, a lot of my worries about COW have been
nullified. BASIC wumpus loading now takes place in 1/3 the time, and
the worst-case performance is gc_generations of 20%. And that test
exclusively uses "repeat" to create lots of strings of varying lifetimes,
so it's unreasonable to expect any better performance on it.

So, now that the major objections to the previous patch have been
addressed, does anyone have any reasons against this patch going in?

Thanks,
Mike Lambert



Index: core.ops

===

RCS file: /cvs/public/parrot/core.ops,v

retrieving revision 1.199

diff -u -r1.199 core.ops

--- core.ops18 Aug 2002 23:57:37 -  1.199

+++ core.ops19 Aug 2002 02:28:35 -

@@ -166,9 +166,9 @@

   }

 

   $1 = string_make(interpreter, NULL, 65535, NULL, 0, NULL);

-  memset(($1)->bufstart, 0, 65535);

-  fgets(($1)->bufstart, 65534, file);

-  ($1)->strlen = ($1)->bufused = strlen(($1)->bufstart);

+  memset(($1)->strstart, 0, 65535);

+  fgets(($1)->strstart, 65534, file);

+  ($1)->strlen = ($1)->bufused = strlen(($1)->strstart);

   goto NEXT();

 }

 

@@ -354,7 +354,7 @@

   UINTVAL len = $3;

 

   s = string_make(interpreter, NULL, len, NULL, 0, NULL);

-  read($2, s->bufstart, len);

+  read($2, s->strstart, len);

   s->bufused = len;

   $1 = s;

   goto NEXT();

@@ -418,7 +418,7 @@

 op write(in INT, in STR) {

   STRING * s = $2;

   UINTVAL count = string_length(s);

-  write($1, s->bufstart, count);

+  write($1, s->strstart, count);

   goto NEXT();

 }

 

@@ -2256,7 +2256,7 @@

 t = string_make(interpreter, buf, (UINTVAL)(len - s->buflen), NULL, 0, NULL); 


 $1 = string_concat(interpreter, $1, s, 1);

 } else {

-t = string_make(interpreter, s->bufstart, (UINTVAL)len, NULL, 0, NULL); 

+t = string_make(interpreter, s->strstart, (UINTVAL)len, NULL, 0, NULL); 

 }

 $1 = string_concat(interpreter, $1, t, 1);

 

@@ -2281,7 +2281,7 @@

 }

 

 /* XXX this is EVIL, use string_replace */

-n = $1->bufstart;

+n = $1->strstart;

 t = string_to_cstring(interpreter, s);

 for (i = $4; i < $4 + $2; i++)

 n[i] = t[i - $4]; 

@@ -3891,7 +3891,7 @@

   switch ($3) {

 case STRINGINFO_HEADER:   $1 = PTR2UINTVAL($2);

   break;

-case STRINGINFO_BUFSTART: $1 = PTR2UINTVAL($2->bufstart);

+case STRINGINFO_STRSTART: $1 = PTR2UINTVAL($2->strstart);

   break;

 case STRINGINFO_BUFLEN:   $1 = $2->buflen;

   break;

@@ -4162,13 +4162,13 @@

   void (*func)(void);

   string_to_cstring(interpreter, ($2));

   string_to_cstring(interpreter, ($1));

-  p = Parrot_dlopen($1->bufstart);

+  p = Parrot_dlopen($1->strstart);

   if(p == NULL) {

  const char * err = Parrot_dlerror();

  fprintf(stderr, "%s\n", err);

  PANIC("Failed to load native library");

   }

-  func = D2FPTR(Parrot_dlsym(p, $2->bufstart));

+  func = D2FPTR(Parrot_dlsym(p, $2->strstart));

   if (NULL == func) {

 PANIC("Failed to find symbol in native library");

   }

Index: debug.c

===

RCS file: /cvs/public/parrot/debug.c,v

retrieving revision 1.25

diff -u -r1.25 debug.c

--- debug.c 18 Aug 2002 23:57:37 -  1.25

+++ debug.c 19 Aug 2002 02:28:37 -

@@ -692,7 +692,7 @@

 constants[pc[j]]->string->strlen)

 {

 escaped = PDB_escape(interpreter->code->const_table->

- constants[pc[j]]->str

Re: [perl #16283] parrot dandruff

2002-08-18 Thread Mike Lambert

> > > Tru64 finds the following objectionable spots from a fresh CVS checkout:
> >
> > Does this patch fix it? (Though even if it does, I wouldn't be at all
> > surprised if some other compiler choked on it.)
>
> Works okay in Tru64 and IRIX which are known for their pointer pickiness.

Applied, thanks.

> On IRIX, though, I get these, where probably NO_STACK_ENTRY_TYPE is
> meant instead.

Applied as well.

Mike Lambert




Re: [perl #15907] [PATCH] Make warnings configurable

2002-08-18 Thread Mike Lambert

> In the quest for removing warnings, I added an option --ccwarn to
> Configure.pl. With this option I could selectivly turn on and off
> warnings, and especially compile with -Werror, so I don't miss any
> warnings. The simple warnings (the missing return values) were already
> fixed before I was able to submit a patch.

Looking at the patch, it seems rather GCC-specific. The checking for
"no-X" versus "X" in the warnings flags seems to be rather non-portable to
compilers like MSVC.

Unfortunately, I don't believe this is easily fixable.

Mike Lambert




Re: [perl #16048] [PATCH] Eliminate alignment warning in packfile.c

2002-08-18 Thread Mike Lambert

> The following patch eliminates an alignment warning in packfile.c, and
> adds a comment to packfile.h about alignment assumptions underlying the
> size of the packfile header.

Applied, thanks.

> I wonder if we ought to have a Configure "sanity section" wherein various
> assumptions are tested prior to build time.  Two candidates for such a
> section would be
>
>   sizeof(INTVAL) >= sizeof(void *)
>   PACKFILE_HEADER_BYTES % sizeof(opcode_t) == 0
>
> I'm sure there are other assumptions too.

Anyone else have any ideas on where the best place to put the above?
Configure currently doesn't know about PACKFILE_HEADER_BYTES, since it's a
macro in packfile.h. We could check them in Parrot's initialization, but I
don't know if that's a good idea.

We could create a C file which contains the above assumptions as asserts,
and includes parrot.h. Then the main() function could assert on all of the
necessary conditions. Configure would compile and run this program to
ensure correctness.

Thoughts? Anyone want to take a crack at it?

Mike Lambert




Re: [perl #16269] [PATCH] COW...Again and Again

2002-08-18 Thread Mike Lambert
o CVS levels,
with memory usage lower, but still higher than CVS.

It seems that while COW might save memory due to sharing, it also makes
when-to-collect logic break, and break our balance of collection
frequency and new-block-size, leading to an apparent memory usage
increase. I can't really think of any other cause.

Personally, I find that COW logic makes things a bit more complex, and
somewhat harder to debug. And it certainly requires some more discipline
to be sure you copy data before modifying it, etc. So while I've been
pushing for COW for a long time, if it turns out to be horribly broken in
memory usage, I'm going to have to sideline my work on it and continue
with other stuff. :|

Thanks,
Mike Lambert




Re: [perl #16269] [PATCH] COW...Again and Again

2002-08-18 Thread Mike Lambert

> Elapsed times for 'time parrot hanoi.pbc 14 > /dev/null' are:
> CVS: 52.81, 52.05, 52.33
> CVS + grey COW: 51.53, 52.06, 51.67
> CVS + Mike's COW: 44.31, 44.48, 44.55
> CVS + grey1: 35.89, 36.48, 36.60 (+COW +cyclecount -stackwalk)
> End June grey: 30.14, 29.35, 29.53
>
> And 5000 generations of life tested again:
> CVS: 170.22, 169.01, 168.70
> CVS + grey COW: 162.65, 161.44, 163.61
> CVS + Mike's COW: 156.86, 157.78, 157.67
> CVS + grey1: 80.38, 80.74, 80.69
> End June grey: 59.21, 59.41, 59.42
> CVS 14th July: 81.22, 81.17 (last timings I recorded before stack walking)
>
> So I get an improvement on Hanoi of about 15% using your
> COW patch, and your COW is better on both tests than mine.

Wow, that's cool, if strange. COW+Hanoi was definitely slower for me.

I have another interesting test to try.
Run languages/basic/basic.pl.
Type "LOAD WUMPUS", and hit return.
Type "RUN", and hit return.
Type "N" and hit return.
It then builds a wumpus maze and does other intensive stuff in the basic
interpreter.

The patch I sent out runs 4x slower than CVS on the above test. It has a
peak memory usage of 22MB, versus straight CVS's 2MB, and your
parrot-grey's usage of 6MB. (I didn't compare parroy-grey speed because it
wouldn't be fair. ;)

My current theory is that it is due to its conservative increase of
reclaimable (only if it's guaranteed to be reclaimable), versus grey's
always-increment-reclaimable. Technically, mine is correct, since in
theory you could make a bunch of COW strings, free them all, have
reclaimable be quite large, and have the total_size calculation in
resources.c come out negative. :)
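The hazard can be shown with toy accounting (a sketch only; these names
are illustrative and nothing like this exists in resources.c):

```c
#include <assert.h>

typedef struct { long total_size; long reclaimable; } PoolStats;

/* Naive: every freed COW header credits the full shared buffer size,
 * so the same shared bytes can be counted more than once. */
static long live_after_naive(PoolStats s, long bufsize, int copies) {
    s.reclaimable += bufsize * copies;
    return s.total_size - s.reclaimable;   /* can go negative */
}

/* Clamped: never credit more than the pool actually holds. */
static long live_after_clamped(PoolStats s, long bufsize, int copies) {
    s.reclaimable += bufsize * copies;
    if (s.reclaimable > s.total_size)
        s.reclaimable = s.total_size;
    return s.total_size - s.reclaimable;
}
```

With a 100-byte pool and two freed COW clones of a 60-byte buffer, the
naive count reports negative live data, while the clamped count bottoms
out at zero.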

One idea I had was that because I don't have an accurate reclaimable
figure, the collection pool size was growing with each collection due to
the under-estimating of reclaimable. Changing this to use a non-increasing
number calculated from the pool's free space brought my usage down to 6MB:
closer, but still not there.

In addition, my COW code was getting roughly 4x slower times after I hit
"N". I'm assuming it's related to the memory usage. I was able to get it
down to 1.5x the straight CVS code when I dropped the memory usage, and
I'd like to hope we could make it drop even further by fixing this memory
leak.

The fact that both COWs require roughly 3x *more* memory is quite
surprising. If you (or anyone else) feels like attempting to figure out
where our memory is going, I would greatly appreciate it. I'm stumped over
here, and am getting frustrated. (That's why I'm writing this email and
going to sleep ;)


Thanks,
Mike Lambert




Re: [perl #16274] [PATCH] Keyed access

2002-08-17 Thread Mike Lambert

> Attached is my first cut of a patch to address the keyed access issues.

First, thanks for spending the time to implement and clean up the keyed
code. Hopefully this'll clean the floor so that when this list has key
discussions, we'll all be arguing about the same thing. :)

> This patch doesn't do everything, but it does bring things more or less
> in line with Dan's recent specification I hope. I'm sure there are also
> problems with it so if we can get some eyeballs on it before I commit it
> that would be good.

Here are the things I noticed when going through your patch.

- assemble.pl:
shouldn't the code:
elsif ($_->[0] =~ /^([snpk])c$/) { # String/Num/PMC/Key constant
include support for "kic" somewhere?

the magic numbers in _key_constant, I'm assuming they are supposed to map
to the constants in key.h? Perhaps a note mentioning that correspondence
would be useful. Also, it seems the number usage is broken. You use
1,1,1,2,4,7. Shouldn't it be 1,1,1,2,3,5? And shouldn't s/inps/ be
s/insp/? Or maybe the constants in key.h need rearranging?

- dod.c:
Near the comment, "Mark the key constants as live". Constants shouldn't
need to be marked live, as constants are prevented from being GC'ed, if
PMC_constant_FLAG is set. At least, in theory. Did it not work for you?

- core.ops
Looking at the set functions, shouldn't the "Px[ KEY ] = Bx"
set of functions have $1 defined as inout instead of out in most
circumstances?

In your definition of the groups of set functions, can you change "Ax =
Px[ KEY ]" to "Ax = Px[ INTKEY ]" where appropriate?

- key.pmc
the mark() function needs to return a value. Namely, the return value of
key_mark.

- random
Your use of registers for key atom values is interesting, and I think it
will create problems. It's not a problem with your patch so much as a
problem with an aspect of the key design, I think.

The plan is to allow parrot functions to implement vtable methods in
parrot. If I have a key [I0,I1], and pass it to this vtable method, it
could be passed to a function implemented in parrot, with all of parrot's
calling conventions. This means that by the time it gets to the person
implementing the key, it's extremely possible that the registers have been
overwritten.

I'm not sure how to resolve this one. Alternatives are:
a) don't allow register references in keys. Instead, force people to use
the key modification ops to reset the key to the correct values each time
they want to use it.
b) handle auto-generated .ops files, such that if they receive a KEY as a
parameter, it calls key_fixup_registers, which grabs the current values of
the registers and sticks them into the key structure. This could cause
problems with constant keys, so you might need to create a key copy.
c) any other ideas? Or should we mark this as a 'known limitation'?
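To make option (b) concrete, here is a sketch of what a
key_fixup_registers might do. The structures are entirely hypothetical
(Parrot's real key atoms differ); this only shows the register-snapshot
idea:

```c
#include <assert.h>

typedef enum { ATOM_INT_CONST, ATOM_INT_REG } AtomKind;

typedef struct Atom {
    AtomKind kind;
    int      value;   /* constant value, or I-register number */
} Atom;

/* Snapshot the current I-register contents into the key's atoms, so
 * that later register clobbering can't change what the key refers to. */
static void key_fixup_registers(Atom *atoms, int n, const int *int_regs) {
    int i;
    for (i = 0; i < n; i++) {
        if (atoms[i].kind == ATOM_INT_REG) {
            atoms[i].value = int_regs[atoms[i].value];
            atoms[i].kind  = ATOM_INT_CONST;
        }
    }
}
```

As noted above, a constant key would need to be copied before fixup
rather than mutated in place.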

Overall, though, the patch looks extremely complete. Tracing support,
disassemble.pl support, debug.c support, etc. You even reduced macro
usage. Rather impressive. :)

Thanks,
Mike Lambert





Re: [perl #16219] [PATCH] stack direction probe

2002-08-17 Thread Mike Lambert

Applied, thanks.

Mike Lambert

> This is a config test for the direction of stack growth that makes
> the direction a compile time constant.
>
> --
> Jason




Re: [perl #16278] [PATCH] Quicker pointer checking for stack walking

2002-08-17 Thread Mike Lambert

Applied, thanks.

> Moved the static prototype to dod.c
>
> Jason




Stack Walk Speedups?

2002-08-17 Thread Mike Lambert

As Peter has pointed out, our stackwalk code is rather slow.

The code that's in there was my first-attempt at the need for stack
walking code. There's one optimization in place, but the algorithm behind
the optimization could use some work.

Basically, it finds the min and max values of all headers. It does a check
(for quick failure purposes) to see if the data on the stack is in that
range, and then proceeds to do the accurate check. The accurate check
consists of walking through each header pool in an attempt to see if this
pointer-sized data could be interpreted as being in that pool.

Currently, this is a linear walk over the header pools. I imagine there
are many better algorithms for determining a root set from a stack. The
Boehm collector probably has decent code in this regard. However, given
that the walk is O(N) in the size of the stack, I'm not sure how we'll be
able to alleviate this in the long run.
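The quick-failure check plus accurate per-pool check described above can
be sketched as follows; the pool layout and helper names are illustrative
only, not Parrot's actual structures:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pool descriptor, one per header pool. */
typedef struct Pool {
    char  *start;           /* first header in this pool's arena */
    size_t header_size;     /* size of one header */
    size_t num_headers;
    struct Pool *next;
} Pool;

/* Quick failure check: rejects most stack words in O(1). */
static int in_global_range(const void *p, const void *lo, const void *hi) {
    const char *c = (const char *)p;
    return c >= (const char *)lo && c < (const char *)hi;
}

/* Accurate check: linear walk over the pools, as the current code does. */
static Pool *find_pool(Pool *pools, const void *p) {
    Pool *pool;
    for (pool = pools; pool; pool = pool->next) {
        const char *c = (const char *)p;
        if (c >= pool->start &&
            c < pool->start + pool->header_size * pool->num_headers &&
            (size_t)(c - pool->start) % pool->header_size == 0)
            return pool;  /* p could be a live header in this pool */
    }
    return NULL;
}
```

A faster scheme would replace the linear pool walk with something sorted
or hashed, which is exactly the adventure being proposed.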

Anyone feeling adventuresome and want to attempt to speed this up? It
should be an easy introduction to the GC code in general. Just start out
in trace_system_stack, and work your way down.

Mike Lambert




Re: [INFO] The first pirate parrot takes to the air

2002-08-17 Thread Mike Lambert

Peter Gibbs wrote:

> > How much of the speed win is from the cycle count instead of stack
> > walking? Unless you've solved the problem of recursive interpreter
> > calls and setjmp, it's not a valid solution, no matter what the speed
> > win might be.

> According to my notes the progression (for 5000 lives) was:
> CVS: 172 seconds
> Cycle count instead of stack walk: 97 seconds
> COW with stack walk: 158 seconds
> Cycle count + COW: 81 seconds

Just for fun, can you run Hanoi on CVS versus CVS+COW?

I just got COW implemented here, and while I get a 17% speedup on
life, I get a 5% loss on hanoi. Since you only posted life, it's a
bit hard to see if the drop on hanoi is just my fault, or the fault of
COW in general.

(More benchmarks will appear in my soon-to-be-sent COW patch email.)

Thanks,
Mike Lambert




Re: [INFO] The first pirate parrot takes to the air

2002-08-16 Thread Mike Lambert

> For purely academic purposes, I have re-synchronised some of my
> forbidden code with the latest CVS version of Parrot. All tests pass
> without gc debug; and with gc_debug plus Steve Fink's patch.
> Benchmarks follow, on a 166MHz Pentium running linux 2.2.18.
>
>  Parrot  African Grey
> life (5000 generations)   172 seconds  81 seconds
> reverse /dev/null193 seconds 130 seconds
> hanoi 14 >/dev/null51 seconds  37 seconds

Rather impressive. Except that it makes me look bad. :)

> The differences between the two versions are:
> 1) Use of the interpreter cycle-counter instead of stack walking.
> 2) Linked lists of buffer headers sorted by bufstart
> 3) COW-supporting code in GC (for all buffer objects)
> 4) Implementation of COW for string_copy and string_substr

1) Yeah, the approach of cycle-counter is a nice one. I also had a similar
solution involving neonate flag usage, somewhere in the archives. Both
have *significant* speed advantages versus the curent codebase's stack
walking.

I tried to convince Dan of the merit, but they failed for various reasons:

Your solution, (ignoring the extra cycle counter byte for now), cannot
handle vtable methods implemented in Parrot. The current system to
implement this involves the interpreter recursively calling runops_core to
handle the vtable method. If you increment cycles on the inner loop, you
risk pre-collection of stuff on the stack of the vtable method calling
stuff.  If you don't increment cycles, you prevent any of the memory
allocated inside of this vtable method from ever being collected during
the method execution...bad stuff when your vtable methods are multiplying
gigantic matrices or somesuch.

My neonate buffers solution fails only in the presence of longjmp.

Granted, we don't do any of this yet, so these solutions will mop the
floor with my current stackwalk code, and pass tests to boot. But it's the
planned introduction of these requirements which is the reason for making
these solutions 'forbidden'.

One of Nick's solutions was to fallback on stack-walking to handle the
cases where our faster solutions fail. I can definitely see that working
with neonate buffers to handle the fixups needed after a longjmp call. But
it doesn't seem as attractive in the presence of your solution, for which
it would require stackwalking for all re-entrant runops calls. Do you have
another solution in mind for handling re-entrant runops calls?

As far as the extra byte in the buffer, I don't mind that one at all.
There are a lot of restrictions on the GC code in the interest of making
stuff "lightweight". Unfortuantely, GC code takes a significant portion of
the execution time in any realistic application. Hopefully we can convince
Dan to allow extra fields in the buffers in the interest of speed, but I
don't think we can reduce parrot/perl6's feature set in the interest of
speed...might as well use C if that's what you want. :)

2) Currently, we use linked list of buffer headers for freeing and
allocating headers. I'm not sure what you mean by saying that they are
sorted by bufstart? What does this buy over the current system?

3) Definitely a good one. I've been trying to merge your original COW
patch into my code here. Without GC_DEBUG, it fails one test. With
GC_DEBUG, it fails the traditional set plus that one test. The test case
is rather large unfortunately, I haven't been able to narrow down the
problem further or I'd have committed it.

4) Isn't this really the same thing as item 3? I'm basing my knowledge off
your old COW patches. Has additional work been done on the string function
integration since then, or do #3 and #4 both come from those patches?

> Some of the changes I made before the memory management
> code was totally reorganised  have not yet been re-integrated.
> My last version prior to that reorganisation ran 5000 lives in
> 61 seconds, and I hope to get back to somewhere close to
> that again.

I'm not sure how much of the new code you've merged with. Which of the new
files are you planning to integrate/merge with, and which have you thrown
out in favor of older versions? I'm specifically referring to any of
resources/dod/smallobject/headers.c.

Regardless of whether or not it goes in, I'd be interested in seeing a
patch. I can work on integrating a lot of your non-forbidden code into the
current codebase.

Thanks for spending the time to generate these numbers...they're a nice
eyeopener on what can be done without the current restrictions. Hopefully
they'll allow us to reconsider each restriction in the context of
the speed of our GC.

Mike Lambert




Re: [COMMIT] GC_DEBUG, Some GC Fixes, and Remaining GC Bugs

2002-08-13 Thread Mike Lambert

> Somebody gimme a cookie.

/me hands Steve a cookie.

> If the rx info object is going away, then obviously those parts of the
> patch need not be applied. But in the meantime, it's nice to have a
> Parrot that doesn't crash.

I agree. My disclaimer about the regex code in my original email was to
suggest that we didn't need to focus on the rx issues, but if you've
already done it... :)

> I'm not going to apply this patch yet because I'm sure someone will
> disagree with how it fixes some or all of these bugs. So would that
> someone please speak up? Thanks.

I suppose that someone is me, although there might be other someones.

> In summary, this patch
>
>  - Adds an OUT parameter to new_hash() so the hash is anchored to the root set
>while it is being constructed.
>  - Adds an OUT parameter to hash_clone() for early anchoring.
>  - Adds an OUT parameter to rx_allocate_info() for early anchoring.
>  - Briefly disables DOD while a stack is being created so allocating the contents
>of the stack buffer doesn't destroy the unanchored buffer header.

These are needed for now. However, when we get that buffer/pmc unification,
we should be able to make mark() methods in the header pools. Then, with
support for non-pmc-wrapped buffers, we can find references to them
on the system stack, and call their mark() method directly, avoiding
the above hoops. At least, that's my hope. Is it possible to mark the
above code with some XXX tag so that we can re-address it when we get the
unification in place?

>  - Makes a major change to the Pointer PMC: the previously unused ->cache area
>is now used to hold a pointer to a custom mark routine that will get fired
>during PMC traversal. Previously, Pointers had the PMC_private_GC_FLAG set,
>but nothing ever looked at it. With this change, Pointers behave as they
>always did unless something externally sets the ->cache.struct_val field
>(in other words, there is no vtable entry for setting the mark routine,
>and the PMC's custom mark routine does nothing if that field is NULL.)
>
>  - Reorders the rx_allocinfo opcode to assign things in the correct order and
>fill in the ->cache.struct_val field of the Pointer PMC it creates.

These are a bit hackish, but I agree they are needed to solve our GC_DEBUG
problems (and by extension, "real-world Parrot programs" ;). Both of these
should also be able to "go away" with the unification, so see previous
paragraph. :)

I think I'm going to make GC_DEBUG a parameter of the interpreter, and
allow it to be turned on/off via opcodes. Then we could force our test
suite to use GC_DEBUG to root out GC problems a lot sooner than they
otherwise would. Fixing all GC_DEBUG problems would help allow this kind
of testing to be part of the standard test suite.

>  - In interpreter.c, asserts that a few of the early buffer creations do not
>return the same buffer (provides early warning of GC mischief)

Oooh, nice! :)

The rest of the things you listed, which I didn't comment on are, imo,
perfectly fine.

In conclusion, I don't have any objections to this patch, although it
would be nice if "XXX Unification" markers were included in places that
needed to be addressed later.

Mike Lambert






Re: [COMMIT] GC_DEBUG, Some GC Fixes, and Remaining GC Bugs

2002-08-12 Thread Mike Lambert

> > Anyone more well-versed in these departments than I care to take a look at
> > the potential problems? Just change GC_DEBUG in parrot.h, and you can be
> > on your way. :)
>
> I can't get to it because parrot doesn't survive past initialization
> for me. When it creates the Array PMC for userargv, it allocates the
> PMC first and then the buffer for the array's data. During the
> buffer's creation, it does a collection that wipes out the PMC. My
> lo_var_ptr and hi_var_ptr are set to reasonable-sounding values at the
> top of trace_system_stack(), but I haven't been able to track it
> farther yet. Oh, and I do have your recent patch to set
> interpreter->lo_var_ptr early.
>
> The userargv PMC is not anchored other than in the C stack, because it
> dies in the pmc_new() creation process before the assignment to P0 can
> run.

Weird. I had to move the lo_var_ptr initialization code to runcode instead
of runops, in order to avoid collecting the ARGV pmc. The new code looks
like:

void *dummy_ptr;
PMC *userargv;

Is it possible that some systems might put dummy_ptr higher in memory than
userargv, thus causing userargv to become prematurely collected? If
so, there are three options:
1) make two dummy ptrs, and choose the lesser of the two.
2) set the dummy ptr to userargv, and hope we don't add two
header variables. ;)
3) force the setting of lo_var_ptr upon the 'main' code in test_main.c,
above all possible functions.

I think 1 is easiest, but 3 does have the advantage of allowing the user
to do GC stuff outside of the parrot execution loop, like allocating
global variables (like argv, but app-specific), etc. Of course, it also
imposes additional coding overhead on the embedding programmer.
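Option 1 can be sketched in a couple of lines. The helper name here is
hypothetical, not anything in test_main.c:

```c
#include <assert.h>

/* Take the lesser of the two dummy pointers' addresses, so the chosen
 * lo_var_ptr bound covers both locals whatever order the compiler lays
 * them out in. (This assumes a downward-growing stack, which the config
 * probe already determines.) */
static void *lesser_ptr(void *a, void *b) {
    return (char *)a < (char *)b ? a : b;
}
```

In runcode this would look something like
`interpreter->lo_var_ptr = lesser_ptr(&dummy_ptr, &userargv);` -- again,
just a sketch of option 1, not committed code.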

Mike Lambert




[COMMIT] GC_DEBUG, Some GC Fixes, and Remaining GC Bugs

2002-08-12 Thread Mike Lambert

Hey,

I re-added the GC_DEBUG define today, and weeded out a bunch of issues.
For those who don't remember, GC_DEBUG (currently in parrot.h) causes
various limits and settings and logic to be setup such that GC bugs occur
relatively soon after the offending code. It allocates one header at
a time, and performs DOD and collection runs extremely frequently
(effectively, anywhere they could possibly occur if GC_DEBUG weren't
defined.) It's goal is to make GC bugs which appear only in complex
programs...appear in simpler ones as well.

Check the cvs-commit traffic if you're interested in what issues I've
fixed already. From what I can tell, two things remain:
- regexes (these are known to be broken. angel's latest patch should fix
these in theory. Probably not worth spending time on fixing these.)
- hashes (these were recently rewritten to use indices, a step forward,
but they aren't 100% clean yet)
- lexicals (there's one remaining issue on the last test I didn't look
into)
- subs (likely includes all variety of them. Basically, I got the wrong
result on one test, instead of GPF failures like I received on the above
bugs.)
- possibly others that got lost in the noise of the above issues

Anyone more well-versed in these departments than I care to take a look at
the potential problems? Just change GC_DEBUG in parrot.h, and you can be
on your way. :)

Thanks,
Mike Lambert




Re: Unifying PMCs and Buffers for GC

2002-08-04 Thread Mike Lambert

Peter Gibbs wrote:

> I am very much in agreement with this concept in principle. I would like you
> to consider adding a name/tag/id field to all pool headers, containing a
> short text description of the pool, for debugging purposes.

I don't have a problem with that. And yes, it'd definitely help debugging
(as opposed to printing out the various pool addresses and comparing them ;)

> > One idea, which is most closely in line with the current semantics, is to
> > add a pool pointer to every header. I've found a few times in the past
> > where such a pointer would have come in handy. This would allow us to call
> > the pool's mark() function, to handle stuff like pointing-to-buffers, etc.
> This is something I have done in my personal version, for buffer headers
> only at present (I have been mainly ignoring PMCs, as I believe they are
> still immature). I use it for my latest version COW code, as well as to
> allow buffer headers to be returned to the correct pool when they are
> detected as free in code that is not resource-pool driven.

Re: DOD immaturity: Yeah, I agree to some extent. It's somewhat difficult
to test DOD efficiency because every string is directly traceable from the
root, thus avoiding mark_used for the most part. Perhaps some GC-PMC
benchmarks are needed to weed out remaining issues.

Re: COW code. Ooohh! You've kept it up date with the current code? I was
working on applying your old patch (ticket 607 at
http://bugs6.perl.org/rt2/Ticket/Display.html?id=607), but if you've got
COW code in the current build, that's even better.

One question: does your current code utilize bufstart as the beginning of
the buffer, or the beginning of the string?

> > b) it allows us to make new types of buffer-like headers on par with
> > existing structures.
> On this subject, I would like to see the string structure changed to include
> a buffer header structure, rather than duplicating the fields. This would
> mean a lot of changes (e.g. all s->bufstart to s->buffer.bufstart), but
would be safer and more consistent. Of course, strings may not even
> warrant existence outside of a generic String pmc any more.

Again, I agree. If the COW code forces all the string usage to use
strstart and strlen, then bufstart and buflen essentially are used a *lot*
less. This should make the mental transition easier.

> One option would be to use a limited set of physical sizes (only multiples
> of 16 bytes or something) and have free lists per physical size, rather than
> per individual pool. This would waste some space in each header, but may
> be more efficient overall.

I suppose this allows us to mix and match entries of different types in
the same pools, since each header would have a pointer to its own pool,
regardless of its neighbors. However, the number 16 could be tuned to 4 or
1 to achieve slightly better mem usage. (Or even POINTER_ALIGNMENT).
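A minimal sketch of that size-class idea, assuming the 16-byte
granularity mentioned above (none of these names exist in Parrot):

```c
#include <assert.h>
#include <stddef.h>

#define SIZE_CLASS_GRAIN 16   /* the tunable: could be 4, 1, or POINTER_ALIGNMENT */

/* Index of the free list a header of this size would live on. */
static size_t size_class(size_t header_size) {
    return (header_size + SIZE_CLASS_GRAIN - 1) / SIZE_CLASS_GRAIN;
}

/* Physical size actually allocated: the rounding is the wasted space
 * per header that the coarser granularity trades for shared free lists. */
static size_t class_bytes(size_t header_size) {
    return size_class(header_size) * SIZE_CLASS_GRAIN;
}
```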

> > Finallythe unification of buffers and PMCs means that buffers can now
> > point to things of their own accord, without requiring that they be
> > surrounded by an accompanying PMC type.
> How about the other way round? If the one-size-fits-all PMCs were to be
> replaced by custom structures, then everything could be a PMC, and
> buffer headers as a separate resource could just disappear!

I think you misunderstood me here. I agree that making the buffer headers
a distinct resource is unnecessary. However, this does mean that all
headers need to be traced now. For pure strings, this can hurt
performance, although one can argue that it helps performance in the
general case of the PMC containing buffer data (a couple fewer
indirections needed on usage).

We could make a new header flag, BUFFER_has_pointers_FLAG, which specifies
that this buffer contains pointers to other data structures, and should be
traced. If this is unset, the buffer doesn't get added onto the free list.

Since adding it to the free list requires adjusting next_for_GC, it's
already going to reference memory there. Checking the flag would merely
prevent traversing the memory again in the 'process' portion.
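A sketch of how that flag check might look during the trace; the flag
bit and Buffer layout here are made up purely for illustration:

```c
#include <assert.h>
#include <stddef.h>

#define BUFFER_has_pointers_FLAG (1u << 5)   /* hypothetical bit position */

typedef struct Buffer {
    unsigned flags;
    void    *bufstart;
} Buffer;

/* Only buffers that declare pointer contents get their bodies traced
 * in the 'process' portion; plain string data is skipped. */
static int needs_trace(const Buffer *b) {
    return (b->flags & BUFFER_has_pointers_FLAG) != 0;
}
```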

Thanks for the quick reply,
Mike Lambert




Re: Unifying PMCs and Buffers for GC

2002-08-04 Thread Mike Lambert

Mike Lambert wrote:

> One idea, which is most closely in line with the current semantics, is to
> add a pool pointer to every header. I've found a few times in the past
> where such a pointer would have come in handy. This would allow us to call
> the pool's mark() function, to handle stuff like pointing-to-buffers, etc.

Oh, I meant to mention an alternative to the pool pointer, but forgot...

At one point, we had a mem_alloc_aligned, which guaranteed the start of a
block of memory given any pointer into the contents of the block. If we
store a pointer to the pool at the beginning of each set of headers, then
we avoid the need for a per-header pool pointer, at the cost of a bit
more math and an additional dereference to get at it.
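The "bit more math" is just a mask, assuming power-of-two aligned blocks;
here is a sketch (the block size is illustrative):

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_SIZE 4096u   /* must be a power of two for the mask to work */

/* Recover the start of the aligned block containing any interior
 * pointer; the pool pointer would then be read from the 'header header'
 * stored at that address. */
static void *block_base(void *interior) {
    return (void *)((uintptr_t)interior & ~(uintptr_t)(BLOCK_SIZE - 1));
}
```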

The benefits to this are the drawbacks to the aforementioned approach, but
the drawbacks include:

- additional cpu, and/or cache misses in getting to the pool. for dod,
this might be very inefficient.

- it imposes additional memory requirements in order to align the block of
memory, and imposes a bit more in this 'header header' at the beginning of
the block of headers.

Mike Lambert




Re: Unifying PMCs and Buffers for GC

2002-08-04 Thread Mike Lambert
ecific memory to handle the various pointers that are
required for DODbut the point remains that this further increases the
memory footprint of buffers, and I wanted to verify that it was okay.


Comments and/or suggestions, please?

Thanks,
Mike Lambert




Re: [perl #15942] UNICOS/mk new unhappiness: hash.c

2002-08-03 Thread Mike Lambert

Hey, I was going through the RT system looking to resolve issues.

It looks like the offending lines of code are still there. A quick look at
the problem, and I see the following patch:

Index: hash.c
===
RCS file: /cvs/public/parrot/hash.c,v
retrieving revision 1.10
diff -u -r1.10 hash.c
--- hash.c  2 Aug 2002 02:58:27 -   1.10
+++ hash.c  4 Aug 2002 07:09:33 -
@@ -437,7 +437,7 @@
 HASHBUCKET * b = table[i];
 while (b) {
 /* XXX: does b->key need to be copied? */
-hash_put(interp, ret, b->key, key_clone(interp,
&(b->value)));
+hash_put(interp, ret, b->key, b->value);
 b = b->next;
 }
 }


Unfortunately, this causes different semantics depending on whether you
are storing primitives or pointers (primitives are copied, whereas
pointers are shallow-copied). Of course, one could argue that the previous
version didn't work at all. :)

Thoughts?
Mike Lambert



Sean O'Rourke wrote:

> Date: Fri, 2 Aug 2002 08:20:58 -0700 (PDT)
> From: Sean O'Rourke <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Subject: Re: [perl #15942] UNICOS/mk new unhappiness: hash.c
>
> That's me.  Will fix.
>
> /s
>
> On Fri, 2 Aug 2002, Jarkko Hietaniemi wrote:
>
> > # New Ticket Created by  Jarkko Hietaniemi
> > # Please include the string:  [perl #15942]
> > # in the subject line of all future correspondence about this issue.
> > # http://rt.perl.org/rt2/Ticket/Display.html?id=15942 >
> >
> >
> > The subroutine.pmc and sub.pmc problems ([perl #15920]) are gone now
> > that Dan checked in the patches but now new discontent has appeared:
> >
> > CC-167 cc: ERROR File = hash.c, Line = 440
> >   Argument of type "KEY_ATOM *" is incompatible with parameter of type "KEY *".
> >
> >   hash_put(interp, ret, b->key, key_clone(interp, &(b->value)));
> >   ^
> >
> > CC-167 cc: ERROR File = hash.c, Line = 440
> >   Argument of type "KEY *" is incompatible with parameter of type "KEY_ATOM *".
> >
> >   hash_put(interp, ret, b->key, key_clone(interp, &(b->value)));
> > ^
> >
> > 2 errors detected in the compilation of "hash.c".
> >
> > --
> > $jhi++; # http://www.iki.fi/jhi/
> > # There is this special biologist word we use for 'stable'.
> > # It is 'dead'. -- Jack Cohen
> >
> >
>
>




Re: [perl #15943] [PATCH] UNICOS/mk vs dynaloading continues

2002-08-03 Thread Mike Lambert

Applied, thanks.

Mike Lambert

Jarkko Hietaniemi wrote:

> Date: Fri, 02 Aug 2002 15:03:21 GMT
> From: Jarkko Hietaniemi <[EMAIL PROTECTED]>
> Reply-To: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: [perl #15943] [PATCH] UNICOS/mk vs dynaloading continues
> Resent-Date: 2 Aug 2002 15:03:21 -
> Resent-From: [EMAIL PROTECTED]
> Resent-To: [EMAIL PROTECTED]
>
> # New Ticket Created by  Jarkko Hietaniemi
> # Please include the string:  [perl #15943]
> # in the subject line of all future correspondence about this issue.
> # http://rt.perl.org/rt2/Ticket/Display.html?id=15943 >
>
>
> Sorry, I missed this patch hunk from #15880 (but I still think
> eventually the dynaloading should be separated from the generic
> "platform"):
>
> --- config/gen/platform/generic.c.dist2002-08-02 17:58:47.0 +0300
> +++ config/gen/platform/generic.c 2002-08-02 17:59:24.0 +0300
> @@ -4,7 +4,9 @@
>
>  #include 
>  #include 
> -#include 
> +#ifdef HAS_HEADER_DLFCN
> +#   include 
> +#endif
>
>  #include "parrot/parrot.h"
>
> --
> $jhi++; # http://www.iki.fi/jhi/
> # There is this special biologist word we use for 'stable'.
> # It is 'dead'. -- Jack Cohen
>
>
>




Re: [perl #15953] [PATCH] More GC tests

2002-08-03 Thread Mike Lambert

Applied, thanks.

Mike Lambert

Simon Glover wrote:

> Date: Fri, 02 Aug 2002 21:40:51 GMT
> From: Simon Glover <[EMAIL PROTECTED]>
> Reply-To: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: [perl #15953] [PATCH] More GC tests
> Resent-Date: 2 Aug 2002 21:40:52 -
> Resent-From: [EMAIL PROTECTED]
> Resent-To: [EMAIL PROTECTED]
>
> # New Ticket Created by  Simon Glover
> # Please include the string:  [perl #15953]
> # in the subject line of all future correspondence about this issue.
> # http://rt.perl.org/rt2/Ticket/Display.html?id=15953 >
>
>
>
>  A few more tests for the GC ops.
>
>  Simon
>
> --- t/op/gc.t.old Fri Aug  2 17:03:13 2002
> +++ t/op/gc.t Fri Aug  2 17:39:17 2002
> @@ -1,6 +1,70 @@
>  #! perl -w
>
> -use Parrot::Test tests => 1;
> +use Parrot::Test tests => 5;
> +
> +output_is( <<'CODE', '1', "sweep" );
> +  interpinfo I1, 2   # How many DOD runs have we done already?
> +  sweep
> +  interpinfo I2, 2   # Should be one more now
> +  sub I3, I2, I1
> +  print I3
> +  end
> +CODE
> +
> +output_is( <<'CODE', '1', "collect" );
> +  interpinfo I1, 3   # How many garbage collections have we done already?
> +  collect
> +  interpinfo I2, 3   # Should be one more now
> +  sub I3, I2, I1
> +  print I3
> +  end
> +CODE
> +
> +output_is( <<'CODE', <<'OUTPUT', "collectoff/on" );
> +  interpinfo I1, 3
> +  collectoff
> +  collect
> +  interpinfo I2, 3
> +  sub I3, I2, I1
> +  print I3
> +  print "\n"
> +
> +  collecton
> +  collect
> +  interpinfo I4, 3
> +  sub I6, I4, I2
> +  print I6
> +  print "\n"
> +
> +  end
> +CODE
> +0
> +1
> +OUTPUT
> +
> +output_is( <<'CODE', <<'OUTPUT', "Nested collectoff/collecton" );
> +  interpinfo I1, 3
> +  collectoff
> +  collectoff
> +  collecton
> +  collect   # This shouldn't do anything...
> +  interpinfo I2, 3
> +  sub I3, I2, I1
> +  print I3
> +  print "\n"
> +
> +  collecton
> +  collect   # ... but this should
> +  interpinfo I4, 3
> +  sub I6, I4, I2
> +  print I6
> +  print "\n"
> +
> +  end
> +CODE
> +0
> +1
> +OUTPUT
>
>  output_is(<<'CODE', <  print "starting\n"
>
>
>
>




Re: [perl #15952] [PATCH] Minor doc fix in core.ops

2002-08-03 Thread Mike Lambert

Applied, thanks.

Mike Lambert

Simon Glover wrote:

> Date: Fri, 02 Aug 2002 21:39:13 GMT
> From: Simon Glover <[EMAIL PROTECTED]>
> Reply-To: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: [perl #15952] [PATCH] Minor doc fix in core.ops
> Resent-Date: 2 Aug 2002 21:39:13 -
> Resent-From: [EMAIL PROTECTED]
> Resent-To: [EMAIL PROTECTED]
>
> # New Ticket Created by  Simon Glover
> # Please include the string:  [perl #15952]
> # in the subject line of all future correspondence about this issue.
> # http://rt.perl.org/rt2/Ticket/Display.html?id=15952 >
>
>
>
>  mem_allocs_since_last_collect is the number of new blocks allocated,
>  not the total memory allocated.
>
>  Simon
>
> --- core.ops.old  Fri Aug  2 17:32:26 2002
> +++ core.ops  Fri Aug  2 17:33:32 2002
> @@ -3797,7 +3797,7 @@ structures.
>  =item 8 The number of headers (PMC or buffer) that have been allocated
>  since the last DOD run.
>
> -=item 9 The amount of memory allocated since the last GC run.
> +=item 9 The number of new blocks of memory allocated since the last GC run.
>
>  =item 10 The total amount of memory copied during garbage collections.
>
>
>
>
>
>
>




Re: [perl #15951] [BUG] header_allocs_since_last_collect neverupdated

2002-08-03 Thread Mike Lambert

Fixed, thanks.

Mike Lambert

Simon Glover wrote:

> Date: Fri, 02 Aug 2002 21:19:29 GMT
> From: Simon Glover <[EMAIL PROTECTED]>
> Reply-To: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: [perl #15951] [BUG] header_allocs_since_last_collect never
> updated
> Resent-Date: 2 Aug 2002 21:19:29 -
> Resent-From: [EMAIL PROTECTED]
> Resent-To: [EMAIL PROTECTED]
>
> # New Ticket Created by  Simon Glover
> # Please include the string:  [perl #15951]
> # in the subject line of all future correspondence about this issue.
> # http://rt.perl.org/rt2/Ticket/Display.html?id=15951 >
>
>
>
>   The title says it all really: the counter in the interpreter structure
>  that tracks recent header allocations is initialized to 0 when the
>  interpreter is set up, but isn't incremented when headers are allocated.
>  Consequently, this:
>
>   interpinfo I1, 8
>   print I1
>
>  always prints zero.
>
>  Simon
>
>
>
>




Re: [perl #15949] [PATCH] Silence warning in hash clone

2002-08-03 Thread Mike Lambert

Applied, thanks.

Mike Lambert

Simon Glover wrote:

> Date: Fri, 02 Aug 2002 21:00:19 GMT
> From: Simon Glover <[EMAIL PROTECTED]>
> Reply-To: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: [perl #15949] [PATCH] Silence warning in hash clone
> Resent-Date: 2 Aug 2002 21:00:19 -
> Resent-From: [EMAIL PROTECTED]
> Resent-To: [EMAIL PROTECTED]
>
> # New Ticket Created by  Simon Glover
> # Please include the string:  [perl #15949]
> # in the subject line of all future correspondence about this issue.
> # http://rt.perl.org/rt2/Ticket/Display.html?id=15949 >
>
>
>
>  hash->num_buckets is unsigned, so we were getting a "comparison between
>  signed and unsigned" warning. Patch below fixes.
>
>  Simon
>
> --- hash.c.old    Fri Aug  2 16:51:05 2002
> +++ hash.c        Fri Aug  2 16:52:28 2002
> @@ -432,7 +432,7 @@ HASH *
>  hash_clone(struct Parrot_Interp * interp, HASH * hash) {
>  HASH * ret = new_hash(interp);
>  HASHBUCKET ** table = (HASHBUCKET **)hash->buffer.bufstart;
> -int i;
> +UINTVAL i;
>  for (i = 0; i < hash->num_buckets; i++) {
>  HASHBUCKET * b = table[i];
>  while (b) {
>
>
>
>
>
>




Re: [perl #15948] [PATCH] Configure broken on windows 9x

2002-08-03 Thread Mike Lambert

Applied, thanks.

Mr. Nobody wrote:

> Date: Fri, 02 Aug 2002 20:57:57 GMT
> From: Mr. Nobody <[EMAIL PROTECTED]>
> Reply-To: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: [perl #15948] [PATCH] Configure broken on windows 9x
> Resent-Date: 2 Aug 2002 20:57:57 -
> Resent-From: [EMAIL PROTECTED]
> Resent-To: [EMAIL PROTECTED]
>
> # New Ticket Created by  "Mr. Nobody"
> # Please include the string:  [perl #15948]
> # in the subject line of all future correspondence about this issue.
> # http://rt.perl.org/rt2/Ticket/Display.html?id=15948 >
>
>
> I sent this patch before but it got the wordwraps
> messed up, its enclosed as an attachment this time so
> it will be unchanged.
>
> __
> Do You Yahoo!?
> Yahoo! Health - Feel better, live better
> http://health.yahoo.com
>
> -- attachment  1 --
> url: http://rt.perl.org/rt2/attach/32707/26971/8b1cd1/diff
>
>




Re: [perl #15845] [BUG] GC segfault

2002-07-31 Thread Mike Lambert

Applied with some modification, thanks.

Mike Lambert

Richard Cameron wrote:

> Date: Wed, 31 Jul 2002 22:24:55 +0100
> From: Richard Cameron <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Subject: Re: [perl #15845] [BUG] GC segfault
>
>
> On Tuesday, July 30, 2002, at 07:20 PM, Simon Glover (via RT) wrote:
>
> >  This code segfaults:
> >
> >   sweepoff
> >   set I0, 0
> >
> > LOOP: new P0, .PerlString
> >   set P0, "ABC"
> >   save P0
> >   inc I0
> >   lt I0, 127, LOOP
> >
> >   end
>
> This is a fairly straightforward fix.
>
> Parrot_do_dod_run() ordinarily updates pool->num_free_objects as a side
> effect of looking for unused headers. If dod is disabled with sweepoff,
> then num_free_objects doesn't get updated. This confuses a piece of code
> later on which decides that it doesn't need to allocate any new buffers
> after all (although all other evidence point to the contrary).
>
> Parrot segfaults soon after.
>
> I've attached two patches, the first fixes the problem by telling the
> allocator to ignore the value of num_free_objects if it's unknown; the
> second adds the erstwhile crashing code to the test suite (although I'm
> not convinced I've put it in the best place).
>
> Richard.
>
>
>





Re: [perl #15731] [PATCH] Silence warning

2002-07-28 Thread Mike Lambert

>  Patch below kills a couple of warnings that cropped up because
>  alloc_more_objects was renamed to alloc_objects in the code but
>  not the headers. Also updates the comments.
>
>  Simon

Applied, thanks.

Mike Lambert





Re: [perl #15730] [PATCH] Fix typos

2002-07-28 Thread Mike Lambert

>  Fixes a few typos and tidies up capitalization in dod.dev
>
>  Simon

Applied, thanks.

Mike Lambert




Re: [perl #15724] [BUG] GC bugs

2002-07-28 Thread Mike Lambert

>  1. As far as I can make out, start_arena_memory & end_arena_memory are
>  never initialized before being used in alloc_objects (in dod.c).
>  Consequently, there's no guarantee that they ever do get initialized
>  properly, and hence any or all of pmc_min, pmc_max, buffer_min &
>  buffer_max in trace_system_stack (also in dod.c) may contain garbage.

Ahh, good catch, thanks. Although it won't cause any problems, it still
is a bug. This code was just an optimization which made the
trace_system_stack code MUCH faster, as compared to checking in every
buffer pool for *each* potential pointer on the stack.

Since alloc_objects includes sanity checks to change *_arena_memory, it's
guaranteed to contain all potential buffers, although it could be wildly
over-zealous with its definition of the min and max.

>  2. In get_min_buffer_address (in header.c): shouldn't all of the
>  references to end_arena_memory actually be to start_arena_memory ?

Ah yes, now that one could cause problems. :) The only reason that I
believe it didn't cause problems in the GC benchmark suite is that all
buffers on the system stack were "recently" allocated, and likely to be
later in the system memory and thus *after* the end of the first
allocated buffer pool (which is what the code would currently return).

The redundancy in the code for the min/max stuff on pmc/header pools does
bother me (and look, it's prone to error ;), but I'm not really sure of
any cleaner ways to do it. If you have any suggestions, please feel
free to mention them.

Anyways, I've committed fixes for both of these issues.

Thanks again.
Mike Lambert






Keyed multiplication

2002-07-27 Thread Mike Lambert

Can I propose a simply-phrased question?

I have an IntArray in @P1 and a NumArray in @P2. How would I do the
equivalent of:

S1 = P1[5] * P2[5]

I'm not asking about how to do it currently, but rather how it should be
done in the 'final keyed interface'. When explaining, I'd appreciate
sample pasm code and/or rough pseudo-code for the unimplemented ops and
methods you're using. Also, it prevents any magical hand-waving. :)

Beware, as I believe this is a very tricky question that will delve into
areas of mm-dispatch. However, if answered, I think it will give great
insights into "the way things (should) work".

Of course, the possibility exists that I am completely missing something,
and this truly is a simple question. :)

Thanks,
Mike Lambert




Re: [PATCH] Reduce array.pmc keyed code

2002-07-25 Thread Mike Lambert

Scott Walters wrote:

> Part of the beauty of PMCs is that you can have very compact
> storage given a dedicated eg int array type. Generating these
> would not be a bad thing. The typical case still remains, that
> arrays will contained mixtures of all datatypes.

Yep, I agree. Thus, array.pmc would be 'the typical case', since it's 100%
PMCs. Stored integers and strings would be converted to PMCs for storage,
much like how perl 5 works.

> I proposed several approaches. Taking just one of them:
>
> Requests to operate on PMCs should not be propogated down to the PMC
> through one of dozens of methods. Instead, the PMC should be fetched
> into a scratch register and operated on directly. In the case of
> primitive atomic datatypes, the recurisve multiply, add, div, etc
> operations should be disposed of, and references to the atomic types
> should be fetched to a scratch register, where they are operated on
> directly (as a reference).

Okay, I'll try and think this through. For normal aggregate PMCs, nothing
more is required. The ops to fetch and store in scratch registers is
available. And the ops to operate on them are there, too.

For atomic types, it gets a bit stranger. We need to return references to
these atomic types. This is problematic because:

We don't have anywhere to store them. We can store them in INT registers
(which I believe are guaranteed as large as pointers, but not sure), but
then this gives us the ability to break the nonexistant "safe"
interpreter by operating on the pointer. Finally, a whole complement set
of ops needs to be generated. For every op like 'add N, N', we need an
'add Np, Np', where Np is NUM pointer. Or we need to "get" the NUM
reference into an Nx register, operate on it, and copy it back into the
INT pointer to the one in the aggregate.

However, returning references does have one major advantage: indicating
the *lack* of something. If I retrieve a PMC which doesn't exist, I can
get NULL back. But if I retrieve the 5th index of an integer array for
which the length is 3, what am I supposed to get back? NaN? In this
context, retrieving a pointer to the referred element makes sense, and
allows us to "do things" to the keyed element, that aren't necessarily
supported by the vtable methods. I'm beginning to agree with you, here. :)

> Restating, if you're going to work on this, please work with me. I'd be
> glad to help whatever you're doing, but I *hate* duplication of effort.
> I've got plenty of other things I should be working on.

Oh, I hate duplication of effort, too. But the patch I sent in took 10-15
minutes to do, and I wanted to try it out and see how it reduced the code
size. Besides, if you're complaining about how to do something, and you
provide patches that start on it, I think people take you more seriously.
:)

> The 33k implementation of Array is less then 1/4 complete. This beast
> will be 125k before its done, mark my words. Low power chips have
> quarter meg or half meg L2 caches. I'm a firm believer that a VM should fit
> in cache and leave room for some data to be cached, too. A seperate
> fetch and operation would add a few more instructions to the implementation,
> but compared to the cost of a cache stall, this is beans. The best thing we
> could do is remove some of this bloat now.

I won't argue that it'll be 125K before it's done. In fact, I think it
will be more. At least, if we continue on our current path of keyed
versions of the various array methods. In terms of convincing Dan, I think
we just need to be clearer in the argument:

- the keyed approach is fine
- get_keyed, set_keyed, are fine
- the existing .ops for keys are fine, although more are needed

The main changes that I see are:
- the elimination of the keyed versions for all the mathematical vtable
methods
- the addition get_keyed_ref method
- the addition of mathematical keyed *ops* that use get_keyed_ref to "do
their thing" ?
- perhaps some method for storing the keyed_ref result into a register?
The mathematical keyed ops might make this unnecessary, however.

Thoughts on this hopefully more concrete explanation of what could be
changed?

Thanks,
Mike Lambert

> Mike, I can't mail you directly. I'm sharing a netblock with a known spammer
> and you're using spews.org. I didn't mean to send my rant to the entire list
> again. Sorry, folks =(

Sorry about that. I don't have control over the machine receiving email
for me, and won't be back to serving my own mail until the fall semester
starts up again.






Re: [COMMIT] GC Speedup

2002-07-24 Thread Mike Lambert

> > If performance has to halve in order to implement such features, I hope
> > somebody plans to write Parrot::Lite!
>
> I'm not sure if I understand the problem properly, but is the problem with
> using exceptions (or using longjmp) and the like to unwind that we can't
> trust external extensions to play ball with our rules when we need to unwind
> through them? And that if we didn't have to call out to external code, we
> could use the faster methods without needing stack walking?

Well, I only knew about the first problem, but I suppose the "external
code" problem is another valid one. :)

> If so, is it possible to make a hybrid scheme, whereby if we know that
> between the two marks on the stack we've not called out to any external
> code we use the faster mechanism to check for leaks, but if we know that
> we entered external code (and must have come back in because we're now back
> in the parrot garbage collection system called by another parrot call) use
> a stack walk. Obviously we'd carry the overhead of more bookkeeping, but it
> might win if it saves the stackwalk. (and thrashing the cache in the process)

I do like this idea, although we'll have to run it by Dan first. I know
that both Peter and I had different solutions to the pointers-on-the-stack
problem, and all such ones were rejected. I'd have to do benchmarks
comparing my own (neonate buffers) to the current stackwalk code, but I'm
sure both Peters and mine would win, hands down.

That's mainly because we:
a) don't call external code
b) don't use longjmp

So our solutions, and the bookkeeping they entail, would never be wasted,
because there is nothing to invalidate them yet. To really determine
their worth, we need real-world programs, and figure out how often would
they be using longjmp, and external code, to determine how often the
bookkeeping is wasted. But that brings me back around to the point in my
previous email, about the mythical "real world program". :)

Mike Lambert




Re: [COMMIT] GC Speedup

2002-07-24 Thread Mike Lambert

> With the call to trace_system_stack commented out in dod.c, I get 48.5
> generations per second. The full stats are:
> 5000 generations in 103.185356 seconds. 48.456488 generations/sec
> A total of 36608 bytes were allocated
> A total of 42386 DOD runs were made
> A total of 6005 collection runs were made
> Copying a total of 72819800 bytes
> There are 21 active Buffer structs
> There are 1024 total Buffer structs
>
> This compares to the 14th July CVS version:
> 5000 generations in 81.172149 seconds. 61.597482 generations/sec
> A total of 58389 bytes were allocated
> A total of 160793 DOD runs were made
> A total of 1752 collection runs were made
> Copying a total of 1228416 bytes
> There are 81 active Buffer structs
> There are 192 total Buffer structs

I guess this means the examples/benchmarks I was using to test were not
too representative of real-world programs. Or maybe that's the case for
life.pasm. :)

Looking at the above results, I think I can see part of the problem.
What's really annoying is that the more I play with the benchmarks, the
more I realize they are useless. The new parrot has an initial buffer
count of 256, which helped performance on my system, when compared to the
pre-GC commit. The old version has 64 or so. Just this small tuning
difference means:

a) more buffers to dod and collect
b) less of a need to DOD since we can "live" longer without it
c) more memory usage because we can't collect data in old PMCs until
they've been DOD'ed

Doing minor adjustments like inlining functions, etc (which I did the
other day) can give maybe a 1-4% performance across the board, each.
However, changing a number like HEADERS_PER_ALLOC can affect performance
+/-8%, program-depending.

This makes it rather difficult to optimize the GC, since
optimizing for one program *easily* messes up the performance on other
programs.

Setting *_HEADERS_PER_ALLOC back to the original of 16 improves
performance on life.pasm by 5%, although it causes a corresponding hit on
the examples/benchmarks.

Changing UNITS_PER_ALLOC_GROWTH_FACTOR either way causes a big speed hit.
Changing REPLENISH_LEVEL_FACTOR either way causes a big speed hit.
Changing the logic on when we DOD relative to collection, in any manner,
causes a speed hit.

This leads me to believe that we have a GC that's tuned for life.pasm,
which makes a lot of sense. Before examples/benchmarks, there was only
life, and all GC performance changes were compared on that. In my attempts
to tune for examples/benchmarks, I undoubtedly caused life performance to
suffer. Parrot doesn't have any real-world programs, which makes it
difficult to do any sort of worthwhile tuning.

Hopefully, with sean's (and everyone else's) work on the Perl6 grammar, we
can start taking these perl programs (like qsort), and running them
through and benchmarking them against the parrot VM. Unfortunately, until
we have a wide test suite of programs, or start implementing adaptive
adjustment of GC parameters, I have a feeling we're just going to travel
around in circles.

Mike Lambert




Re: [PATCH] Reduce array.pmc keyed code

2002-07-24 Thread Mike Lambert

> This patch is rather questionable, and thus I did not commit it
> directly. However, it illustrates a point I wanted to make.

Doh! Hopefully my previous post will make a bit more sense now. :)

Mike Lambert


Index: array.pmc
===
RCS file: /cvs/public/parrot/classes/array.pmc,v
retrieving revision 1.28
diff -u -r1.28 array.pmc
--- array.pmc   24 Jul 2002 07:32:46 -  1.28
+++ array.pmc   25 Jul 2002 03:24:31 -
@@ -146,46 +146,16 @@
 }
 
 INTVAL get_integer_keyed (KEY* key) {
-KEY_ATOM* kp;
-INTVAL ix;
-PMC* value;
-
-if (!key) {
-return 0;
-}
-
-kp = &key->atom;
-ix = atom2int(INTERP, kp);
+PMC *value = SELF->vtable->get_pmc_keyed(INTERP, SELF, key);
 
-if (ix >= SELF->cache.int_val || ix < 0) {
-internal_exception(OUT_OF_BOUNDS, "Array element out of bounds!\n");
-}
-
-value = ((PMC**)(((Buffer *)SELF->data)->bufstart))[ix];
-
-if(key->next != NULL) {
+if(key->next != NULL)
 return value->vtable->get_integer_keyed(INTERP, value, key->next);
-}
-else {
+else
 return value->vtable->get_integer(INTERP, value);
-}
 }
 
 INTVAL get_integer_keyed_int (INTVAL* key) {
-INTVAL ix;
-PMC* value;
-
-if (!key) {
-return 0;
-}
-
-ix = *key;
-
-if (ix >= SELF->cache.int_val || ix < 0) {
-internal_exception(OUT_OF_BOUNDS, "Array element out of bounds!\n");
-}
-
-value = ((PMC**)(((Buffer *)SELF->data)->bufstart))[ix];
+   PMC *value = SELF->vtable->get_pmc_keyed_int(INTERP, SELF, key);
 return value->vtable->get_integer(INTERP, value);
 }
 
@@ -194,46 +164,16 @@
 }
 
 FLOATVAL get_number_keyed (KEY* key) {
-KEY_ATOM* kp;
-INTVAL ix;
-PMC* value;
+PMC *value = SELF->vtable->get_pmc_keyed(INTERP, SELF, key);
 
-if (!key) {
-return 0;
-}
-
-kp = &key->atom;
-ix = atom2int(INTERP, kp);
-
-if (ix >= SELF->cache.int_val || ix < 0) {
-internal_exception(OUT_OF_BOUNDS, "Array element out of bounds!\n");
-}
-
-value = ((PMC**)(((Buffer *)SELF->data)->bufstart))[ix];
-
-if(key->next != NULL) {
+if(key->next != NULL)
 return value->vtable->get_number_keyed(INTERP, value, key->next);
-}
-else {
+else
 return value->vtable->get_number(INTERP, value);
-}
 }
 
 FLOATVAL get_number_keyed_int (INTVAL* key) {
-INTVAL ix;
-PMC* value;
-
-if (!key) {
-return 0;
-}
-
-ix = *key;
-
-if (ix >= SELF->cache.int_val || ix < 0) {
-internal_exception(OUT_OF_BOUNDS, "Array element out of bounds!\n");
-}
-
-value = ((PMC**)(((Buffer *)SELF->data)->bufstart))[ix];
+   PMC *value = SELF->vtable->get_pmc_keyed_int(INTERP, SELF, key);
 return value->vtable->get_number(INTERP, value);
 }
 
@@ -243,46 +183,16 @@
 }
 
 BIGNUM* get_bignum_keyed (KEY* key) {
-KEY_ATOM* kp;
-INTVAL ix;
-PMC* value;
+PMC *value = SELF->vtable->get_pmc_keyed(INTERP, SELF, key);
 
-if (!key) {
-return 0;
-}
-
-kp = &key->atom;
-ix = atom2int(INTERP, kp);
-
-if (ix >= SELF->cache.int_val || ix < 0) {
-internal_exception(OUT_OF_BOUNDS, "Array element out of bounds!\n");
-}
-
-value = ((PMC**)(((Buffer *)SELF->data)->bufstart))[ix];
-
-if(key->next != NULL) {
+if(key->next != NULL)
 return value->vtable->get_bignum_keyed(INTERP, value, key->next);
-}
-else {
+else
 return value->vtable->get_bignum(INTERP, value);
-}
 }
 
 BIGNUM* get_bignum_keyed_int (INTVAL* key) {
-INTVAL ix;
-PMC* value;
-
-if (!key) {
-return 0;
-}
-
-ix = *key;
-
-if (ix >= SELF->cache.int_val || ix < 0) {
-internal_exception(OUT_OF_BOUNDS, "Array element out of bounds!\n");
-}
-
-value = ((PMC**)(((Buffer *)SELF->data)->bufstart))[ix];
+   PMC *va

[PATCH] Reduce array.pmc keyed code

2002-07-24 Thread Mike Lambert

This patch is rather questionable, and thus I did not commit it
directly. However, it illustrates a point I wanted to make.

As mentioned in my recent PARROT QUESTIONS email, a lot of the clutter in
the PMC aggregates can be removed with the use of redirecting functions.

The below patch reduces the resulting array.c from 40K to 20K, though the
.obj file grows from 25K to 33K (not that much).

It introduces another layer of recursion into the code, while at the same
time eliminating lots of duplicate code. If we disallow subclassing of the
PMC, a lot of the vtable redirects in this code could be replaced with
straight function calls (and those function calls could subsequently be
inlined within the same .c file).

After doing this, I wonder if it's not useful to allow an aggregate PMC to
declare its inner type (in this case, PMC). The pmc2c.pl would then
generate stub functions which converted from that base type to each of the
requested types. This would allow PerlIntArray to be a base of INT, yet
perform auto-conversions to num, string, pmc, etc in the generated code.

Is the patch here too recursive to be efficient, despite the reduction in
actual code? Should all SELF->vtable methods on *.pmc files be made to
call the appropriate function directly, and assume no subclassing?

Another possibility is to regenerate the functions for the subclasses, so
that parent inlinings of SELF->vtable->get_pmc_keyed will not interfere
with a child's redefinition of their own get_pmc_keyed ?

Thoughts, comments?
Mike Lambert




Re: PARROT QUESTIONS: Keyed access: PROPOSAL

2002-07-24 Thread Mike Lambert
keys.

> > >Given your objectives of speed, generality and elegance,
> >
> > I should point out here that elegance appears third in your list
> > here. (It's fourth or fifth on mine)
>
> Ooops.

Yes, Dan's coding objectives are somewhat of a mystery to me as
well. :)

> > >   * function calls consume resources
> > Generally incorrect. Function calls are, while not free, damned cheap
> > on most semi-modern chips.
>
> Your inner loop is a few lines of code. If every inner loop execution triggers
> a cascade of function calls, this is lost. It may be small, but certain cases
> do warrant changing extremely frequently used recursive structures to
> iterative structures. I'm not saying this happens - I'm just saying that there
> is a certain point when this value does become significant.

However, if that inner loop references a multi-dim array, a
standard implementation of a recursive keyed access would fail to
work, no?

And if you're that concerned about the recursive key lookup on a heavily
nested loop, I'm sure you could hoist some of the key lookups out of the
appropriate loops (or maybe not, depending upon the Perl code in
question). However, given that Perl is hard to optimize, there's not much
you can do to optimize [$a][$b][$c] access because any one of the PMCs
might be tied, changing the behavior of the system. As such, you might
very well need to perform the full keyed lookup each time.

> I agree with Dan now that I understand better. My complaints have been addressed,
> with the one exception of refactoring code bloat. I feel this is a small change
> in implementation, and shouldn't impact design. I hope Dan will (pending time)
> consider it, and I'll be happy to hash it out with him on IRC to make sure
> both parties understand exactly what is being said and that I don't continue
> to miss things ;)

Hopefully I've helped to explain some of the things you said you were
unsure about. Of course, since I'm not Dan, you might very well hear
something completely different when he gets back from TPC. :)

Mike Lambert




Re: [patch] win32 io

2002-07-24 Thread Mike Lambert

> * win32 can flush its file buffers (FlushFileBuffers())
> * SetFilePointer knows about whence, win32 constants (values, not names) are the
> same as in linux.

Applied, thanks.

Mike Lambert






Re: [COMMIT] GC Speedup

2002-07-24 Thread Mike Lambert

Hey Peter,

Sorry about not replying to your earlier message...I completely forgot
until I saw this message. :)

> Thanks Mike, those changes do indeed help. Current numbers on my system for
> 5000 generations of life with various versions of Parrot (using CVS tags)
> are:
> 0.0.5  47.99 generations per second
> 0.0.6  57.41
> 0.0.7  20.18
> current   21.18 (an improvement of 4.7% over 0.0.7)

These do look pretty bad. Unfortunately, these numbers are not directly
comparable. Between 0.0.6 and 0.0.7, two major things changed in the GC
code:
- addition of stack-walk code to avoid child collection
- the GC refactoring I commited

I suspect the former is what is causing your speed hit, although I'm not
ruling out the possibility that my changes caused a problem as well. Can
you disable the trace_system_stack call and re-run these numbers?

I know that there are faster solutions to the problem of child collection,
but Dan doesn't want to use them due to the problems that occur when we
start using exceptions (and longjmp, etc). Perhaps, if the above
performance hit is due to trace_system_stack, it might give reason to
reconsider the chosen solution?

Thanks,
Mike Lambert




Re: [perl #15317] [PATCH] Recursive keyed lookups for array.pmc

2002-07-24 Thread Mike Lambert

Applied, thanks.

If someone wants to mark this ticket as resolved, I'd appreciate it.

Mike Lambert


Scott Walters wrote:

> Date: Mon, 22 Jul 2002 08:49:33 GMT
> From: Scott Walters <[EMAIL PROTECTED]>
> Reply-To: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: [perl #15317] [PATCH] Recursive keyed lookups for array.pmc
> Resent-Date: 22 Jul 2002 08:49:33 -
> Resent-From: [EMAIL PROTECTED]
> Resent-To: [EMAIL PROTECTED]
>
> # New Ticket Created by  Scott Walters
> # Please include the string:  [perl #15317]
> # in the subject line of all future correspondence about this issue.
> # http://rt.perl.org/rt2/Ticket/Display.html?id=15317 >
>
>
>
> When a KEY * key datastructure is passed to a keyed method in array.pmc,
> and key->next is true...:
> array.pmc will recurse into the keyed lookup method of the PMC that it
> contains, passing it key->next.
> This implements the recursive indexing behavior as described in PDD08.
>
> -scott
>
>
> -- attachment  1 --
> url: http://rt.perl.org/rt2/attach/30940/25927/407316/array.pmc.diff
>
>





More Keyed Questions

2002-07-23 Thread Mike Lambert

Heya,

After seeing all the brouhaha on the list about keyed-ness, I thought I'd
try my hand at it, to see if I could get any farther before running up
into the wall. :)

Here's my own list of questions...first, the main problem with keys, as I
see it, is that there is no guiding PDD or spec that describes how they
should work. As such, people can only learn from the code. And that's a
mess. While basic key support is there, lots of things are
half-implemented, or incorrectly implemented, and it's quite hard to get a
coherent picture. At least, imho.

Where are keys going to be stored? Currently, they exist only on the
system stack, and key_new and key_clone are unused. Will we eventually
have free-floating keys? Should we create support that causes them to be
stored in small-object pools?

I see that certain ops accept a type called KEY, which acts exactly like
INT. And I mean *exactly*. It pulls data from INT registers, and even the
'k' and 'kc' type translate into accesses that look exactly like 'i' and
'ic'.

Are we planning on having key creation and mutation operations? Where will
these keys be stored in order to operate on them? INT registers? (Sorta
how it is now, although it looks wrong). STR registers? (Means they need
to be headered, DOD'ed, and a larger per-key size). Will they get their
own registers, maybe only 8 deep?

What about the possibility for constructing/operating on keys using a Key
PMC. We could convert to/from real KEYs by using the PMC. This is
basically just a sidestep of the above problem.

Currently, I see:
set_keyed_integer:
PK*
PI*
*PI
*PK

set_keyed:
PS*
*PS
PN*
*PN

set:


What's the point of set_keyed versus set_keyed_integer naming? There
doesn't seem to be any overlap at all, so set_keyed_integer could safely
be named set_keyed.

Can we safely remove "set ", due to the relative inefficiency in
constructing dummy PMCs to call it? Wouldn't it be more efficient to split
the call into two "set PP*" and "set P*P" calls?

Thanks,
Mike Lambert




Re: [PATCH] genclass.pl patch

2002-07-23 Thread Mike Lambert

Josef Höök wrote:

> I've added an if case in genclass so it will print
> "return whoami;" for "name" function so that no one need to grep parrot
> source for an hour or two trying to figure out why it segfaults when
> registering pmc class in init_world... ( grumble :-) )

Applied, thanks.

Mike Lambert




[COMMIT] GC Speedup

2002-07-23 Thread Mike Lambert

I've just committed some changes tonight that improved performance
on the GC-heavy tests in examples/benchmarks/ by about 8%.

Results on each of the GC benchmark tests are, scaled against 1.0 as the
old version, are:
old new
gc_alloc_new.pbc1.000   0.969
gc_alloc_reuse.pbc  1.000   0.957
gc_generations.pbc  1.000   0.899
gc_header_new.pbc   1.000   0.991
gc_header_reuse.pbc 1.000   0.871
gc_waves_headers.pbc1.000   0.867
gc_waves_sizeable_data.pbc  1.000   0.987
gc_waves_sizeable_headers.pbc   1.000   0.855

Overall:
old 1.000
new 0.925

Details of what were done to accomplish this can be found in my email to
the cvs-parrot list. It was pretty much 4 or 5 distinct things that each
gave a couple percentage points' improvement.

Thanks,
Mike Lambert




[COMMIT] Major GC Refactoring

2002-07-18 Thread Mike Lambert

Last night I committed the GC refactoring I submitted the other day, then
spent a couple hours putting out fires on the tinderbox.

The last thing I attempted was to align my pointer accesses, because Tru64
was giving lots of warnings about
Unaligned access pid=246428  va=0x1400b7364 pc=0x12005e408
ra=0x120037228 inst=0xb52c0010

After attempting to solve them for myself unsuccessfully, I went to:
http://csa.compaq.com/Dev_Tips/unalign.htm
and
http://csa.compaq.com/Dev_Tips/unalign_example.htm
which give instructions on tracking them down. Turns out set_keyed_string,
and plenty of other parrot code, has the same problems I did. I believe
there's a way to turn this off in the compilation, but I'm not sure if we
want to do that.

Finally, it appears that there are still 64-bit issues with the code I
committed last night, mostly in regards to the GC failing on the more
intensive tests. I will try to look into this tomorrow night, but I'm not
sure how much progress I'll be able to make, since I'm quite unfamiliar
with gdb, and 64-bit platforms (and each individually, for that matter. :)

Worst comes to worst, and DrForr needs to make 0.0.7, I can undo the
changes to get the tests passing on all platforms, again. And then try it
with JUST the stackwalking code to avoid neonate problems.

Thanks,
Mike Lambert





Re: [perl #823] [PATCH] put numeric comparisons in the right order

2002-07-15 Thread Mike Lambert

> > Um, I don't think it's right to *always* do the comparison
> > floating point.  Specifically, if both operands are ints,
> > the comparison should be int.
>
> I thought about this, but all it buys you is a few bits of precision when
> your ints and floats are both 64-bit, and slower comparisons all the time.
> IMHO it's a wash, so I did it this way.

If A = 2^30 and B = 2^30+1, won't they be the identical value when
converted to IEEE floats on a 32-bit platform? IEEE floats have a 23 bit
mantissa, which isn't enough to differentiate between 1^30 and 1^30+1^0,
since 30 - 0 > 23.

Am I missing something here?

Thanks,
Mike Lambert





Re: [perl #814] [PATCH] fix PMC type morphing

2002-07-14 Thread Mike Lambert

Food for thought...

There currently is a 'morph' vtable entry, which I believe is intended to
morph from one vtable type to another. I think it'd be better to implement
this function properly than to use macros (talk to Robert ;), especially
considering that certain vtables might have special morphing
requirements, such as setting PMC_is_buffer_ptr_FLAG.

Of course, morph seems to be unimplemented, and my attempt at
implementing it ran into a problem, which I brought up here:
http:[EMAIL PROTECTED]/msg09317.html

There are two problems:

a) morph will break horribly when we deal with tied variables, since it
will have to reimplement *every* PMC method to avoid any morphing.

b) Since it's possible that dest == src, we need to make a copy of our
data (be it a buffer ptr, or regular number) on the local stack, call
morph() to morph the PMC and initialize data, and then set the new value.
This pattern is currently utilized in the string PMC methods, but not with
the number-related methods.

So in conclusion, while I don't have any reservations about your
patch, I do have a preference that it be done differently. :)

Mike Lambert




Re: Adding the system stack to the GC

2002-07-12 Thread Mike Lambert

> >a) Can I assume the stack always extends into larger-addressed memory, or
> >must I handle both cases?
>
> Both cases. If you want to add configure probing to determine
> direction, go ahead.

I'm currently doing it dynamically. Get it working, then someone can do
nice configure probing. :) Turns out win32 does it 'backwards', which I
found interesting, at least.

> >b) What's the largest alignment guaranteed for pointers? Byte-level?
>
> I think we can safely assume natural alignment, so you can treat the
> stack as an array of pointers. If the start and end point are both
> pointers you can just iterate that way.

Seems to be byte-level, as I've seen a pointer between two other pointers
that wasn't itself pointer-aligned. Not sure what kind of padding MSVC is doing, but
it seems that from runops to a given DOD call in a string* function, that
there's about 1KB of stack. That's 1000 checks to see if any of them are
pointers to pmcs/buffers. Luckily, this number shouldn't grow with program
size, but it might be a cause for worry.


Also, I think I've discovered a situation which might not be
resolvable. How do we determine if a given buffer is alive? We walk the
stack and check for potential pointers into buffer memory. If the stack
contains garbage pointers, we might have bad references into buffer
memory. I can check for alignment within the buffer pool, and so it should
be safe to set the BUFFER_live_FLAG on them.

However, when we perform a collection, that means that we could be taking
a garbage pointer in the buffer, and attempting to copy the memory it
pointed to, into new memory.

This could give us GPFs if we access bad memory, I think. Even if we check
to ensure the buffer points into valid collectable memory (making it
slower), we still have the issue of buflen being set to MAX_INT or
something, and killing the system. :|

The same caveats apply to pmc headers which happen to have
PMC_buffer_ptr_FLAG set.

How should we get around this particular problem, or is it spelling the
doom of this particular solution?

Thanks,
Mike Lambert




Re: coders: add doco

2002-07-12 Thread Mike Lambert

> Melvin Smith wrote:
> > What parts particularly bug you? Maybe we can address a few.
>
> Well, basically, AFAICT, virtually none of the parrot code
> is adequately documented.  So, pick a random entry point. :-)

First, you have to understand that what you are saying is quite
inflammatory, regardless of its veracity. Saying "I'm not flaming here"
does not make it so. :) There are certainly many places in your original
email where you could have been less inflammatory towards the people that
have contributed code and documentation to the Parrot project.

There have been many requests for additional documentation in the past.
Patches have even been refused for lack of documentation. Did you check
the p6i mail archive to see if this issue has been brought up before? I'm
sure you won't find anyone arguing the point that parrot needs more
documentation. However, if it was as simple a matter as telling people
that more documentation was needed, Parrot would have had ample
documentation looong ago.

If you want to have a request listened to, you should be direct in what
you request. When Melvin asked you for particular places that we could
improve upon, you responded with 'all of the above'. That's quite a big
task to address, and not telling anyone any more than they already knew.

Did you have problems learning any particular aspect of parrot? If so,
that might be a good area to request additional documentation. From my
vantage point, documentation is bad only if someone attempts to learn a
particular area of the code and has trouble because of the lack of, or
inadequacy of, the documentation for that task.

Have you attempted to learn every aspect of parrot, such that you
can verifiably say that all of parrot's documentation is lacking?

Thanks for understanding,
Mike Lambert




Re: Adding the system stack to the GC

2002-07-12 Thread Mike Lambert

I'll take a stab at it. Got a few questions, tho:

a) Can I assume the stack always extends into larger-addressed memory, or
must I handle both cases?

b) What's the largest alignment guaranteed for pointers? Byte-level?

c) Where should this code go, such that it can be replaced for the
OS/platforms which need it differently? resources.c? .c? Maybe
in resources.c with each .c calling the generic one in
resources.c (since win32, generic, darwin, etc are all likely to share the
same logic.)

Mike Lambert


Dan Sugalski wrote:

> Date: Fri, 12 Jul 2002 13:05:54 -0400
> From: Dan Sugalski <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Subject: Adding the system stack to the GC
>
> Okay, anyone up for this? Should be fairly trivial--take the address
> of an auto variable in runops, store it in the interpreter, take the
> address of *another* auto variable in the GC, and walk the contiguous
> chunk of memory between, looking for valid PMC and Buffer pointers.
>
> Anyone?
> --
>  Dan
>
> --"it's like this"---
> Dan Sugalski  even samurai
> [EMAIL PROTECTED] have teddy bears and even
>teddy bears get drunk
>
>





Re: vtables and multimethod dispatch

2002-07-07 Thread Mike Lambert

> We need a multimethod dispatch for vtable calls. Right now we're
> working on a "left side wins" scheme and, while we're going to keep
> that (sort of) we really need a way to choose the method to call
> based on the types on both sides of binary operators. (Unary
> operators, luckily, are easier)

Woot. Woot. I'm glad to see that we're going to get multi-dispatch in the
parrot core.

There's a few main methodologies: Lookup logic:

a) Have the dispatch logic be intelligent, and lookup the appropriate
dispatch in order of increasing generality.

b) Have the dispatch just look up in a table, and generate that table at
load/bind/etc. time with the logic of inheritance, generality, etc. You can
generate a large table that's N^2 for N = number of PMCs. Easy on lookup
logic, bad on cache and memory usage.

With option a) above, there's a few techniques you can use:

a) Dan's preferring (I think he is, anyway) a two-level lookup, so that if
you don't need the multi-dispatch, it can be nearly as fast as regular
dispatch.

So we dispatch to left-side. It can be greedy and handle it, or perform
the second-level lookup itself, giving full multi-method dispatch.
(Extending to trinary dispatch and more should we want that.)

b) You can probably use a hashtable as a sparse matrix. Perform repeated
lookups until you get a non-null method. Most PMCs will only have
specially-designed interactions with a small subset of family PMCs, if
anything at all.
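The sparse-matrix idea in (b) can be sketched as follows; the type ids, handler ids, and wildcard are all invented for illustration (a linear scan stands in for the hashtable lookup):

```c
#include <stddef.h>

enum { ANY_TYPE = -1, NO_HANDLER = -1 };  /* '*' wildcard, miss marker */

typedef struct { int left; int right; int handler; } DispatchEntry;

/* Probe (l, r), then (l, *), then (*, r): the most specific entry
   wins, so a PMC only registers the few pairings it cares about. */
int find_binop(const DispatchEntry *tab, size_t n, int l, int r) {
    int probes[3][2] = { { l, r }, { l, ANY_TYPE }, { ANY_TYPE, r } };
    for (size_t p = 0; p < 3; p++)
        for (size_t i = 0; i < n; i++)
            if (tab[i].left == probes[p][0] && tab[i].right == probes[p][1])
                return tab[i].handler;
    return NO_HANDLER;  /* caller falls back to default vtable dispatch */
}
```

A left-side-wins scheme is then just a table whose every entry has `right == ANY_TYPE`, which is the special-case argument made below.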


I'm sure there are other techniques... I posted a bunch of 'make
multi-dispatch fast' links back when I was arguing about this topic a while
ago. But I admit to not having read them, so there are likely to be many
other techniques that my current brain dump doesn't cover. :)

> We can do this with the current vtable scheme as it is, since we
> already have a slot to put this in, and I think we're going to have
> languages that still do a left-side-win scheme.

Well, for one, the current vtable scheme does a lot more than operators,
so I don't think anyone would argue for it going away. And even
left-side-win schemes are merely a special case of generic multi-dispatch,
with the right side being a '*' to match all PMCs.

Of course, if Perl is going to be multi-dispatch to the core, is
there a valid argument for trying to optimize the single-dispatch
case? Granted we already have that implemented, but some multi-dispatch
schemes impose the same penalty for single- as they do for multi-. Would
these schemes be allowed, or explicitly disallowed?


Finally, my last item that I'd like to see included in any multi-dispatch
scheme that gets implemented, is the ability to register methods to be
called, that aren't in either PMC. While this is infringing a bit on
p6-language territory, I still believe we need a mechanism for it
internally.

It would give a way for different mathematical libraries to interoperate
in code, by merely writing operators which could handle the conversion
from one type to another (or faster, dealing with the internals of
both).

Finally, I'd like to be able to use multi-dispatch for the purpose of
conversion/casting. While the _as_int methods handle the simple ones fine,
PMC->PMC conversions are essentially multi-dispatch, and imo, should be
treated as such. This might only matter with strong typing, but it might
also help with the differently-organized mathematical libraries: assuming
no binary operators are written, one only needs to write conversions, to
allow them to interoperate, if slowly.

Thoughts? Am I taking it too far?
Mike Lambert




Re: GC Benchmarking Tests

2002-05-29 Thread Mike Lambert

> > > gc_alloc_new.pbc
> > > gc_alloc_reuse.pbc
> >
> > I don't think these two tests are very interesting. They allocate
> > quite large strings, so they don't put much strain on the GC.
> > Instead, they measure how fast Parrot is at copying strings.
>
> I believe that's a very good thing to be testing. If the pool allocates
> more memory than it thinks it will need, it will perform fewer overall
> copies at the expense of larger callocs and worse performance in other
> cases.

Erg. Seems that for every email I send, I send out another one
correcting/clarifying it. :(

I will agree with you that gc_alloc_new isn't really useful as it
currently stands. While what it is testing is good, it currently only has
five iterations due to the huge amount of memory it is allocating. As
such, I plan to give it a slower rampup that should allow for more
iterations, while still testing the same thing.

Mike Lambert




Re: GC Benchmarking Tests

2002-05-29 Thread Mike Lambert

> > gc_alloc_new.pbc
> > gc_alloc_reuse.pbc
>
> I don't think these two tests are very interesting. They allocate
> quite large strings, so they don't put much strain on the GC.
> Instead, they measure how fast Parrot is at copying strings.

I believe that's a very good thing to be testing. If the pool allocates
more memory than it thinks it will need, it will perform fewer overall
copies at the expense of larger callocs and worse performance in other
cases.

If the GC is generational, it will be able to detect the early
gc_alloc_new headers as 'old', and promote them into an older pool. It
should exhibit better performance.

Finally, gc_alloc_reuse has a header turnover. So while it allocates tons
of memory, the old memory is able to be marked unused and it has a
mostly-constant 'total_used' at any given time. If DOD runs aren't done,
this will demonstrate poor performance due to the old headers not getting
marked as unused, and it needing to allocate more memory blocks.

I think it is *because* they measure how fast parrot is at copying string
that these are good tests. GCs which avoid copying will perform better on
these.

Mike Lambert




Re: [netlabs #636] GC Bench: out-of-pool-memory logic

2002-05-29 Thread Mike Lambert

> Results are:
>   before  after
> gc_alloc_new.pbc              4.155999   0.18
>
> gc_alloc_new seems to have improved a *lot*. This is because
> gc_alloc_new allocates a lot of memory using the same headers. It
> doesn't tear through headers quickly enough to trigger any dod runs on
> its own, so these headers stay live and allocate tons of memory that gets
> continually copied between generations.

Okay. I guess I was a bit optimistic with that. The gc_alloc_new statistic
above is false. After getting confused when applying this patch to my
local safe GC and not seeing an equivalent speedup, I investigated a bit
more.

Normal allocation sizes are:
45
980
19600
392000
784
15680

With this patch, the allocation sizes are:
980
19600
392000
784
0

Obviously, not allocating 156 MB of memory is quite efficient. It also
makes me realize that a slower rampup here would probably be a better
test.

I guess it also raises some suspicion about the other test results. Ah
well, we need a fix for our GC problems anyway...

Mike Lambert

PS: I'm currently operating under the assumption that these kinds of
emails are a good thing. There will probably be a lot more in this style,
so if you want to change something about how I'm doing this, please let
me know.




Re: [netlabs #642] GC Bench: Collection Pool Bounds

2002-05-29 Thread Mike Lambert

> I've modified his patch to remove some unnecessary calculations.
>
>                               before     after
> gc_alloc_new.pbc              4.155999   3.756002
> gc_alloc_reuse.pbc            16.574     9.423002
> gc_generations.pbc            4.025      5.278002
> gc_header_new.pbc             3.686      3.615
> gc_header_reuse.pbc           5.577999   4.908003
> gc_waves_headers.pbc          3.815002   3.675001
> gc_waves_sizeable_data.pbc    8.383002   9.403999
> gc_waves_sizeable_headers.pbc 5.668      6.268999

Yet another correction to my results...the 'after' benchmarks are from a
completely different build of parrot. Unfortunately, the new results
aren't any easier to explain.

Correct results are:
                                before     after
gc_alloc_new.pbc                4.155999   3.836001
gc_alloc_reuse.pbc              16.574     12.318001
gc_generations.pbc              4.025      4.186
gc_header_new.pbc               3.686      4.166
gc_header_reuse.pbc             5.577999   4.345999
gc_waves_headers.pbc            3.815002   3.796001
gc_waves_sizeable_data.pbc      8.383002   7.27
gc_waves_sizeable_headers.pbc   5.668      5.617998


gc_waves_resizeable_data improves by 1.1
gc_header_reuse improves 1.2
gc_alloc_new improves 0.3
gc_alloc_reuse improves 4.2 (as it does for every benchmark)
gc_header_new worsens 0.5

gc_waves_resizeable_data improves because it closely follows the shape of
the curve, instead of allocating lots of extra memory all the time. (Not
sure how closely... it depends upon when it runs out of pool memory and
compacts: at the bottom of the curve or at the top.)

Not really sure how to explain the other results, unfortunately. My
imagination has blown its cover and been exposed for what it is, and it's
having some trouble recovering. :)

With these new stats, this patch looks like it *does* provide an
improvement, and so I think this one is worthwhile (although my comments
still stand about looking for an adaptive pool sizing system).

Thanks for bearing with me on this,
Mike Lambert




GC Benchmarking Tests

2002-05-28 Thread Mike Lambert

Hey all,

After finding out that life.pasm only does maybe 1KB per collection, and
Sean reminding me that there's more to GC than life, I decided to create
some pasm files testing specific behaviors.

Attached is what I've been using to test and compare running times for
different GC systems. It's given a list of builds of parrot, a list of
tests to run, and runs each four times and takes the sum of them as the
value for that test. Then it prints out a simple table for comparing the
results. It's not really robust or easily workable in a CVS checkout
(since it operates on multiple parrot checkouts).

Included are five tests of certain memory behaviors. They are:

gc_alloc_new.pbc
allocates more and more memory
checks collection speed, and the ability to grow the heap

gc_alloc_reuse.pbc
allocates more memory, but discards the old
checks collection speed, and the ability to reclaim the heap

gc_header_new.pbc
allocates more and more headers
checks DOD speed, and the ability to allocate new headers

gc_header_reuse.pbc
allocates more headers, but discards the old
checks DOD speed, and the ability to pick up old headers

gc_waves_headers.pbc
total headers (contain no data) allocated is wave-like
no data, so collection is not tested
tests ability to handle wavelike header usage patterns

gc_waves_sizeable_data.pbc
buffer data (pointed to by some headers) is wave-like
a few headers, so some DOD is tested
mainly tests ability to handle wavelike buffer usage patterns

gc_waves_sizeable_headers.pbc
total headers (and some memory) allocated is wave-like
sort of a combination of the previous two
each header points to some data, so it tests the collectors
  ability to handle changing header and small-sized memory usage

gc_generations.pbc
me trying to simulate behavior which should perform exceptionally
  well under a generational collector, even though we don't have one :)
each memory allocation lasts either
  a long time, a medium time, or a short time


Please let me know if there are any other specific behaviors which could
use benchmarking to help compare every aspect of our GCs? Real-world
programs are too hard to come by. :) Results of the above test suite on my
machine comparing my local GC work and the current parrot GC are coming
soon...

Enjoy!
Mike Lambert

PS: If you get bouncing emails from me because my email server is down, I
apologize, and I do know about it. My email server is behind cox's
firewall which prevents port 25 access. It should be relocated and online
again in a few days.



gc_bench.zip
Description: gc_bench.zip


Re: [netlabs #629] [PATCH] Memory manager/garbage collector -majorrevision

2002-05-28 Thread Mike Lambert

>  STRING * concat (STRING* a, STRING* b, STRING* c) {
>PARROT_start();
>PARROT_str_params_3(a, b, c);
>PARROT_str_local_2(d, e);
>
>d = string_concat(a, b);
>e = string_concat(d, c);
>
>PARROT_return(e);
>  }

Yet more ideas. Woohoo! :)

I considered this kind of approach myself, but
discarded it due to the ton of extraneous code you have to write to do the
simplest of things. :( I'm not sure if the other people have considered
it, discarded it, or are still considering it.

As far as the pros/cons...

First, it requires you write in a pseudo-language to define your local PMC
headers and how to return data. I'm sure the macro freaks that have been
scarred by perl5 will jump on here and beat you down in a few hours or so.
:)

Can you provide an implementation of the macros you described above? I
have a few concerns which I'm not sure if they are addressed. For example:

PARROT_str_local(d)
I'm assuming it puts a reference to d onto the rooted stack. It would also
need to initialize d to NULL to avoid pointing at garbage buffers.

PARROT_str_params_3(a, b, c);
What's the point of this? With rule 5 that prevents function call nesting,
you're guaranteed of all your arguments being rooted. I think you can lose
either the nesting requirement or the str_params requirement.

PARROT_return(e);
I'm assuming this backs the stack up to the place pointed to by
PARROT_start(), right? This means during a longjmp, the stack won't be
backed up properly until another PARROT_return() is called, somewhere
farther up the chain, right?
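For concreteness, here is one hypothetical implementation consistent with the reading above: a shadow stack of rooted STRING* slots that the GC would scan. Every name here is invented for illustration; this is not Parrot's actual API, just the semantics the questions assume (locals initialized to NULL and pushed as roots, PARROT_return popping the frame's roots).

```c
#include <stddef.h>

typedef struct { int len; } STRING;   /* stand-in for Parrot's STRING */

static void *root_slots[256];         /* addresses of rooted locals   */
static int   root_top;

#define PARROT_start()      int root_base_ = root_top
#define PARROT_str_local(d) STRING *d = NULL; root_slots[root_top++] = &d
#define PARROT_return(e)    do { root_top = root_base_; return (e); } while (0)

/* Usage shaped like the concat example quoted above. */
static STRING *demo(STRING *a) {
    PARROT_start();
    PARROT_str_local(d);
    d = a;             /* d is rooted while we work with it   */
    PARROT_return(d);  /* pops only this frame's root entries */
}
```

Under this reading, a longjmp past the function would indeed leave root_top stale until some caller's PARROT_return ran, which is exactly the hazard raised above.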

Finally, I think Dan has already outlawed longjmp due to problems with
threading, but he'll have to elaborate on that. I agree my most recently
stated approach is not longjmp safe since it could leave neonate set on
certain buffers/pmcs.


Finally, in response to my original post, you asked:

> Suppose your C code builds a nested datastructure.  For instance,
> it creates some strings and add them to a hash-table.  The hash-table is
> then returned.  Should it clear the neonate flag of the strings?

I think I'd have to say...don't do that. Ops and functions shouldn't be
building large data structures, imo. Stuff like building large hashes
and/or arrays of data should be done in opcode, in perl code, or whatever
language is operating on parrot.

If you *really* need to operate on a nested datastructure, and you're
going to hold it against my proposal, then there are two options.

a) write code like:
base = newbasepmc    # neonate pmc
other = newchildpmc  # also neonate
base->add(other)     # even if collecting/DOD'ing, can't collect the above two
done_with_pmc(other) # un-neonates it, since it's now attached to a root
repeat...

It works, and then you just need to worry about what to do with your
'base' at the end of the function (to un-neonate it or not).

b) make a done_with_children_of_pmc() style function. it hijacks
onto the tracing functionality inherent in the DOD code, and searches for
a contiguous selection of neonate buffers and pointers emanating from
the pmc we pass in, and un-neonates them, leaving the passed-in-pmc
neonated. Since everything we do in the function is nenoate, everything we
construct into this base pmc should be contiguously neonate, if that
makes sense.

Granted, it's a little bit expensive to do the tracing, but you shouldn't
need to trace too deep at all, and its time is proportional to the size of
the nested data structure you are creating.

Does that help?
Mike Lambert

PS: Oh, and I forgot to mention in my previous proposal about the need for
nenonating pmc headers, and to look into what functions need to un-neonate
pmc headers. That should be localized to the vtable methods, which are
sort of a mess right now anyway with the transmogrification of vtables and
have other GC problems.





Re: [netlabs #629] [PATCH] Memory manager/garbage collector -majorrevision

2002-05-28 Thread Mike Lambert

Okay. I have yet another idea for solving our infant mortality problem,
which I think Dan might like. :)

The neonate idea originally was intended to be set on *all* headers
returned by the memory system, and they'd be reset by a clear_neonate op.
At least, that's how I understood it. A straightforward implementation of
the above is about 50% slower than it was before, so I think that rules
this option out.

The current code (without this patch), adds neonate wherever it discovers
that it is needed, and turns it off when it is done. This was quite
efficient, but required the user to constantly think about what functions
could cause GC, etc. It was rather error-prone.

If I understood Dan correctly on IRC yesterday, he was proposing that our
current approach of handling infant mortality everywhere it can occur, is
the 'correct' approach. It definitely buys us speed, but as mentioned
above, it's somewhat error prone. The below is an attempt to try and
convince Dan that in lieu of hardcore GC-everywhere programming, there is
a middle ground. I believe we need a middle ground because forcing users
to learn the quirks of our GC system makes parrot programming less fun,
and raises Parrot's barrier to entry.

As I was working on my revised GC system, I came up with a relaxation of
the above that should be easier on programmers, and yet still be fast.
It's not revolutionary by any means, but rather grabbing bits and pieces
of different people's solutions. When you call new_*_header, the neonate
flag is automatically turned on for you. As a programmer writing a
function, you explicitly turn off the neonate flag when you attach it to
the root set, or let it die on the stack. If you return it, you don't do
anything, as it becomes the caller's job to handle.

Neonate guarantees that it won't be collected, avoiding infant mortality.
The programmer does not have to explicitly turn it on. Just turn it off.
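The proposed discipline amounts to very little code; a minimal sketch, with flag and function names chosen for illustration rather than taken from Parrot's actual headers:

```c
#include <stdlib.h>

enum { BUFFER_neonate_FLAG = 1 << 0 };

typedef struct { int flags; void *bufstart; size_t buflen; } Buffer;

/* The allocator itself sets neonate: the new header is immune to DOD
   from birth, with no action required from the caller. */
Buffer *new_buffer_header(void) {
    Buffer *b = calloc(1, sizeof *b);
    b->flags |= BUFFER_neonate_FLAG;
    return b;
}

/* Called when the header is attached to the root set, or is about to
   die on the stack; only from this point may a DOD run reclaim it. */
void done_with_header(Buffer *b) {
    b->flags &= ~BUFFER_neonate_FLAG;
}
```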

From a cursory glance over string.c, only string_concat and string_compare
create strings which die within the scope of that function, and thus need
to be modified.

This approach would complicate many of our string .ops, however. Stuff
like "$1 = s" needs to turn off the neonate flag. Perhaps we can encode
logic into the ops2c converter to turn off the neonate flag for things
that it can detect, or perhaps we can require the user to do it because
automated converters are guaranteed to fail. Core.ops requires a lot of
such modifications, however. Things like err, open, readline, print, read,
write, clone, set, set_keyed, the various string ops (substr, pack, etc),
and savec, all require modification.

I think these guidelines make it easy for non-GC programmers to write
GC-safe code, since they do not need to be aware of what allocates memory,
and what does not.

What do people think of this approach?
Mike Lambert




Re: Hashtable+GC problems

2002-05-26 Thread Mike Lambert

> > Something about the whole setup just feels wrong. GC shouldn't be this
> > intrusive. And it definitely shouldn't slow down the common case by
> > making the compiler defensively reload a buffer pointer every few
> > instructions (it'll be cached, but it fouls up reordering.)
>
> Alright. Today I discovered tracked headers. :)
>
> What are wrong with these for hashtable buckets?  These are headers, and
> so are immobile. You can allocate lots of them without having to pay much
> of a price in terms of instructions.

Well, for one, this isn't the intended use of these tracked headers. From
my recent understading of how they should work:
- They must be larger than a Buffer, and be a buffer header.
- They will eventually get collected like regular headers.
- They will be DOD'ed in the same manner as regular headers.
All of which combines to mean that the above proposed use for tracked
headers is incorrect.

But perhaps there is a place for a small object allocator? Guidelines
for it would be:
non-moving (non-copying, non-compacting)
one size per pool
similar to tracked header support
headers can be implemented on top of the small object allocator
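A minimal free-list pool along those guidelines might look like this (all names invented; a real pool would carve objects out of larger blocks instead of calling malloc per object):

```c
#include <stdlib.h>

/* Freed objects are threaded into a singly linked free list, so
   obj_size must be at least sizeof(FreeNode). */
typedef struct FreeNode { struct FreeNode *next; } FreeNode;

typedef struct {
    size_t    obj_size;   /* one fixed object size per pool */
    FreeNode *free_list;
} SmallObjectPool;

void *soa_alloc(SmallObjectPool *p) {
    if (p->free_list) {               /* reuse a freed slot first */
        FreeNode *n = p->free_list;
        p->free_list = n->next;
        return n;
    }
    return malloc(p->obj_size);
}

void soa_free(SmallObjectPool *p, void *obj) {
    FreeNode *n = obj;                /* push back onto the free list; */
    n->next = p->free_list;           /* the address never moves        */
    p->free_list = n;
}
```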

Hashtables, with a bunch of small buckets with pointers between them,
could be implemented below buffer headers. They would avoid the overhead
of a full buffer header per bucket, but would require the hashtable to
maintain them manually (which the current hashtable code does anyway,
iirc).

Pros:
- not stuck with using headers for everything. keys could use a small
object allocator (SOA)
- see any traditional SOA for their advantages over generic memory
pools

Cons:
- see previous email...cache coherency
- see previous email...lack of automatic GC

Any others? Thoughts on why we should/shouldn't implement this kind of
thing in parrot, below the buffer level?

Thanks,
Mike Lambert

PS: In case you're confused...yes, I was replying to myself. :)




Re: Hashtable+GC problems

2002-05-26 Thread Mike Lambert

> Ok, I'll finish off the original conversion to indexed access that I
> began once, before giving up in disgust. The problem is not just that
> you have to use indices instead of pointers; it's also that you have
> to constantly go back to the buffer header before you can get
> anywhere. That needs to be hidden by a macro or (my preference) an
> inline function, and slows down the common case. Also, you lose the
> clean sentinel value NULL (index 0 is definitely valid; index -1
> introduces signedness problems.)

Dan says it won't be slow. So nyah! :P

> Let me know if you've already started a rewrite, though, so I don't
> just redo it.

Sorry, I forgot to reply earlier...no I hadn't started work on a rewrite.

> Something about the whole setup just feels wrong. GC shouldn't be this
> intrusive. And it definitely shouldn't slow down the common case by
> making the compiler defensively reload a buffer pointer every few
> instructions (it'll be cached, but it fouls up reordering.)

Alright. Today I discovered tracked headers. :)

What's wrong with these for hashtable buckets?  These are headers, and
so are immobile. You can allocate lots of them without having to pay much
of a price in terms of instructions.

One drawback of tracked headers is the loss of cache coherency over time
as the tracked headers end up getting spread out over memory, and then
large allocations get interspersed into the various holes, with no
locality. Hopefully we can get away with this due to the studies which
have shown that objects tend to live and die in groups (and thus allocate
and free up lots of memory all at once).

Another problem is that these tracked headers aren't DOD'ed at all. This
means you have to explicitly free them with add_to_free_pool (I'm not sure
what the design of tracked headers is supposed to be... who is the
'tracked' referring to? Is user code or GC code supposed to be tracking
them?). Since all of the buckets in your hashtable should be available
from the hash itself, it should be easy to manage them yourself.

In addition to not being DOD'ed themselves, they don't mark other objects
as live. So you'd have to handle all your tracked headers in
your PMC, going through them yourself and handling any buffers/PMCs they
might point at.

Now you have immobile memory that's efficient to allocate, good at
avoiding memory fragmentation, and good for you to do with what you
please.

Once we figure out how hashes are implemented well, we should probably
write up some guidelines on when to use what kinds of headers, etc.

Thoughts?
Mike Lambert




Re: GC design

2002-05-26 Thread Mike Lambert

> Add a counter to the interpreter structure, which is incremented every
> opcode (field size would not be particularly important)
> Store this counter value in every new object created, and set the 'new
> object' flag (by doing this on every object, we remove the requirement for
> the creating function to be aware of what is happening)
> If an object is encountered during DOD that claims to be new, but was not
> created during the current opcode, dispute the claim.
> If the counter has exactly wrapped in the meantime, an object might survive
> longer than it should.

I know Dan's proposed solution has already been committed, but I'd
like to add my support for this. In addition to providing a counter per
opcode to ensure that we don't prematurely GC data, this field could also
be reused to calculate the generation of a particular buffer, to help sort
it into a correct generational pool (if/when we do get that).
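The quoted counter scheme is small enough to sketch directly (structure and names invented for illustration, not Parrot's actual fields):

```c
#include <stdlib.h>

typedef struct { unsigned long op_count; } Interp;   /* ++ every opcode */
typedef struct { int flags; unsigned long birth_op; } Header;

enum { HDR_neonate_FLAG = 1 << 0 };

/* Every new header is stamped with the opcode that created it. */
Header *header_new(Interp *interp) {
    Header *h = calloc(1, sizeof *h);
    h->flags |= HDR_neonate_FLAG;
    h->birth_op = interp->op_count;
    return h;
}

/* During DOD: a header that claims to be new, but was not created
   during the current opcode, has its claim disputed. */
int neonate_claim_valid(const Interp *interp, const Header *h) {
    return (h->flags & HDR_neonate_FLAG)
        && h->birth_op == interp->op_count;
}
```

The only failure mode, as noted, is the counter wrapping back to exactly the same value, which merely lets an object survive one cycle longer than it should.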


Another proposal is to walk the C stack. In runops (or some similar
high-level function) we implement a dummy variable, and store a reference
to it in the interpreter. In do_dod, we create another dummy stack
variable. We then walk the memory byte by byte (or maybe some
larger amount, if that's guaranteed), check to see if it passes 'the
three rules', and then mark the buffer it points to. This is on the
conservative side in that we might accidentally mark things we shouldn't.
Then again, with our registers, it's very possible to reference old data
which the program never bothered to clear, which also would be overly
conservative.

The three rules were: (as defined in Jones and Lins' Garbage Collection, pg 233)
- Does p refer to a heap? (Is it within the low and high marks of all the
header pools)
- Has the heap block been allocated? (Go through the heaps, and check to
ensure that this pointer points into one of our header blocks)
- Is the offset a multiple of the object size of that block? (So we don't
get random memory pointers into the header list, but only aligned ones)
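The three rules translate almost directly into a validity check for a candidate pointer scraped off the stack; the pool descriptor below is hypothetical, with field names chosen for illustration:

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    char  *start;      /* first byte of the header block  */
    char  *end;        /* one past the last byte          */
    size_t obj_size;   /* size of each header in the pool */
} HeaderPool;

int looks_like_header(void *p, const HeaderPool *pools, size_t npools) {
    uintptr_t addr = (uintptr_t)p;
    for (size_t i = 0; i < npools; i++) {
        uintptr_t lo = (uintptr_t)pools[i].start;
        uintptr_t hi = (uintptr_t)pools[i].end;
        if (addr < lo || addr >= hi)   /* rules 1 and 2: inside an
                                          allocated header block?    */
            continue;
        /* rule 3: offset must be a multiple of the object size */
        return (addr - lo) % pools[i].obj_size == 0;
    }
    return 0;
}
```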


As long as the C stack is guaranteed to be contiguous, this should be
portable. I'm not sure if that is guaranteed by ANSI C, however.


Has this already been considered and explicitly rejected?

Thanks,
Mike Lambert




Re: [netlabs #607] [PATCH] COW strings (again)

2002-05-21 Thread Mike Lambert

> Actually, we don't. (Sez the man catching up on altogether too much
> mail) Since we're putting the COW stuff at the tail end, substrings
> of COW strings are fine. You set the bufstart to where the substring
> starts, buflen set so it goes to the end of the original buffer, and
> the string length bit holds the real string length. That way you
> start where you need to, but you can still find the COW marker off
> the end of the original buffer.
> [End quote]

I see one problem with this kind of approach...

We have two string headers, A, and B. A points to the second half of
string B, and the bufstart points into the middle of B.

We start the collection/compacting process. We first find header A, and
copy buflen chars after bufstart. We copied half the string. Now header B
is traversed, and it can't point to the same memory because only half of
the required amount was copied.

We could 'fix' this up by including the true buffer length in the buffer
footer, so that we ignore the header's buflen during the collection
process.

But I think the strstart is a better idea regardless. It's what perl5 did
anyway, isn't it?

Mike Lambert




Re: GC vtable method limitations?

2002-05-20 Thread Mike Lambert

> At 12:06 AM -0400 5/19/02, Mike Lambert wrote:
> >Is there a plan to make a freed method for when pmc header gets put
back
> >onto the free list? (This would require we call this method on all
pmc's
> >before moving anything to the freelist, in case of dependencies between
> >pmcs and buffers)
>
> Nope. I don't see a need--once the PMC's been destroyed, it belongs
> to the system.

Um. I see we have a destroy() vtable method, but it's only called when one
calls the destroy() op, and the PMC has PMC_active_destroy_FLAG. I don't
get this. What's the point of actively-destroying things? I thought since
we had a GC, we don't need to worry about this kind of stuff. I'd argue
that destroy() should get called when the PMC gets put back on to the free
list (similar to destructors in C++). That was the behavior I was
documenting when I discussed destruct below, at least.

> Collect's dead, I think. I'm not seeing the point anymore, and since
> we do collect runs through the buffers and not the PMCs, there's no
> place to find what needs calling.

Well, the hashtable could certainly use it. :) There is a hashtable pmc,
which stores a bunch of pointers into some internal buffer data. Every
time one of its functions is called, it calls restore_invariants to fix them up. It
might be better to do those fixups in the collect() method, so that they
could update their internal data pointers. Or perhaps it should be
rewritten to use indices. :)


Mike Lambert




Re: GC design

2002-05-19 Thread Mike Lambert

> >Most vtable methods, and/or people that call vtable methods, will end up
> >making themselves critical. This overhead will be imposed on most function
> >calls, etc. Lots of the string api will require the users to mark
> >themselves as critical.
>
> I don't think this is accurate. People calling vtable methods have no need
> to mark themselves as critical. The things that mark themselves critical
> are internals that are allocating and holding onto objects. I think very
> few vtable methods even fall into this category, but I'd have to survey
> the .pmc files before continuing this discussion.

Perl strings, arrays, and hashes all require buffer manipulation, and will
probably fall prey to this. I agree that I was probably generalizing a
bit, and that in theory PMCs can criticalize their own methods.

> >If I remember correctly, this did get hammered out with a directive from
> >Dan. ;)
>
> I've seen no evidence of that hammering. I still think we are having GC
> crashes on this issue.

I said Dan gave a directive. I didn't say anyone listened to Dan, or
implemented what he suggested. ;)

> >The advantages of this are that nobody needs to worry about the GC
> >implications of their code.
>
> Yes they do, they have to call an explicit routine, clear_uncollectable

Ah, but as internals designers, we don't need to worry about that. We get
to push it into the compiler writers. Isn't it fun? :)

I initially had the same opposition to Dan's idea that you have. I'm not
sure why I eventually gave in...perhaps the lack of any other solution?
Don't really recall. :)

> I'd just like to see someone implement a solution, and give benchmarks
> to back it up.
>
> I'd like to see both approaches compared, personally, and I think neither
> requires a whole lot of thought to implement. I also think we should reference
> existing research and implementations, since we aren't the first to do this.

That's true. I believe most implementations have taken advantage of the
ability to access the C stack, something we don't have the liberty of with
our wide-reaching compatibility goals.

I think this will be similar to the cost of a reference counting solution
versus a tracing system, where the former amortizes the cost over the
entire system, but ends up being slower in terms of total time used. Your
approach would involve lots of computation in lots of little functions
over parrot's execution, whereas Dan's would involve a full trace
(equivalent to a DOD) to be performed every now and again.

But that's just hypothetical posturing because I don't have any real
benchmarks, of course.

Mike Lambert




Dynamic register frames?

2002-05-19 Thread Mike Lambert

I may be approaching semi-radical territory here with this idea.

I've read all the FAQs and reasons on why we chose a register
architecture versus a stack architecture. However, I recently thought of a
combination idea which (although it was probably discovered sometime in
the '70s) I think provides the best of both worlds, and I would like
to propose it before I shove it off to the dust-bin.

Problems with register architecture:

with caller-save, we're saving 0.5KB (4types*32registers*4bytes) of data
per function call, which might add up with deeply-nested functions.

If we want more than 32 elements, we need to start doing stack-pushing to
get around limitations. One thing that I wanted to do in a regex
implementation was use the full set of registers to store off certain
points in the compiled regex. With 32 registers, longer regexes will
require stack pushing at a certain point, and that will make a certain
transition of the regex slower than the other portions of the regex.


Problems with stack architecture:

time spent pushing/popping ops (also happens with parrot registers if we
use >32 elements)

stack grows at runtime


Now, my proposal is simply that instead of hardcoding to 32 elements, we
allow the function to determine the number of elements in each register
type. This gives us:

each function uses a minimal amount of space, since it only uses as many
registers as it requires

leaf accessor functions have 0 or 1 registers for the most part, so we
don't need to allocate a whole new set of registers for them.
Caller-save is just pushing the current register frame onto the stack,
and allocating a properly-sized register frame for the current leaf
function, which is very small. Should be efficient.

the register stack only gets used for functions. For the most part,
functions allocate all the space they know they'll need, and that's it.
(Of course, I suppose there are probably legit reasons for functions to
use the register stack frames, however).

We can use more than 32 elements. This means a 120-node regex can allocate
120 int registers for its operation and execution. (Not saying that it's
1:1, more like 1:1 for non-greedy regex ops only, but still.)

No need to worry about register liveness/allocation to make things work.
We now only need to do it if we want to make things run faster by using
*less* than one register per C/perl stack variable.

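The proposal above might look something like this in practice. This is a sketch under stated assumptions: the struct, function names, and caller-save-by-linked-frames scheme are all hypothetical illustrations, not an actual Parrot design:

```c
#include <stdlib.h>

/* Hypothetical variable-sized register frame: each function declares
 * how many registers of each kind it needs, instead of always 32. */
typedef struct RegFrame {
    int     n_int, n_num;
    long   *int_regs;
    double *num_regs;
    struct RegFrame *caller;  /* saved frame, caller-save style */
} RegFrame;

/* On call: allocate exactly the registers the callee asked for and
 * chain the caller's frame behind it. */
static RegFrame *
push_frame(RegFrame *caller, int n_int, int n_num)
{
    RegFrame *f = malloc(sizeof *f);
    f->n_int    = n_int;
    f->n_num    = n_num;
    f->int_regs = calloc((size_t)n_int, sizeof *f->int_regs);
    f->num_regs = calloc((size_t)n_num, sizeof *f->num_regs);
    f->caller   = caller;
    return f;
}

/* On return: discard the callee's frame and restore the caller's. */
static RegFrame *
pop_frame(RegFrame *f)
{
    RegFrame *caller = f->caller;
    free(f->int_regs);
    free(f->num_regs);
    free(f);
    return caller;
}
```

A leaf accessor would push a frame of one or two registers; the 120-node regex example would push a single 120-int-register frame, with no stack traffic during matching.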

It's probably a bit late in the parrot development cycle to be proposing an
idea like this, but I suppose since this idea couldn't have been
implemented until we had functions which could store register-requirement
information, this might actually be a good time to suggest it. :)


Thoughts or comments? Thanks,
Mike Lambert




Re: GC design

2002-05-19 Thread Mike Lambert

> I would like an elegant, easy to use solution for making the GC
> play nicely.

So would we all. :)

> This creates a sliding scope window that GC must not peep through,
> and provides a clean interface for internals writers.

I think you've explained this idea before, but I complained about it
because I thought that the bottom_gen never got set to top_gen, and
figured a lot of stuff would end up permanently allocated.

Now that I see how it works, it seems to make a lot of sense. Problems
with your approach:

GC-sensitive functions must remember to mark themselves as critical.
This will be a source of bugs (whether that's a big enough of a
complaint is up for debate. ;)

Most vtable methods, and/or people that call vtable methods, will end up
making themselves critical. This overhead will be imposed on most function
calls, etc. Lots of the string api will require the users to mark
themselves as critical.


> Lets hammer this one design issue out for good because I'm tired of worrying
> about it and I think its hindering current Parrot developers and
> confusing potential newcomers.

> If it is not what I propose, lets at least discuss alternatives.

If I remember correctly, this did get hammered out with a directive from
Dan. ;)

His approach was:
the live flag is valid only within GC.
all newly-allocated headers are marked as uncollectable
there is a clear-uncollectable op, which iterates over the headers, and
  marks them all as collectable
Basically, you need to have assigned all your headers to something
traceable by the root set before your current op ends.

The advantages of this are that nobody needs to worry about the GC
implications of their code.

The disadvantage are:
- very expensive ops can allocate lots of uncollectable headers?
- you must explicitly allow for marking headers as collectable in your
opcode, at strategically placed locations. otherwise, nothing gets
collected and you have no dod results, although collection will still
occur normally.

Any other contenders to the ring? Anyone have any other major
dis/advantages they'd like to contribute about the above approaches?

FWIW, I feel confident enough about my understanding of Dan's idea to
implement that, should we choose it. Melvin's idea would require that
much more work on the multitude of functions, and so I can't imagine it
being as easy to implement. :)

Mike Lambert



