> For purely academic purposes, I have re-synchronised some of my
> forbidden code with the latest CVS version of Parrot. All tests pass
> without gc debug; and with gc_debug plus Steve Fink's patch.
> Benchmarks follow, on a 166MHz Pentium running linux 2.2.18.
>
>                                      Parrot      African Grey
> life (5000 generations)           172 seconds      81 seconds
> reverse <core_ops.c >/dev/null    193 seconds     130 seconds
> hanoi 14 >/dev/null                51 seconds      37 seconds

Rather impressive. Except that it makes me look bad. :)

> The differences between the two versions are:
> 1) Use of the interpreter cycle-counter instead of stack walking.
> 2) Linked lists of buffer headers sorted by bufstart
> 3) COW-supporting code in GC (for all buffer objects)
> 4) Implementation of COW for string_copy and string_substr

1) Yeah, the cycle-counter approach is a nice one. I also had a similar
solution involving neonate flag usage, somewhere in the archives. Both
have *significant* speed advantages versus the current codebase's stack
walking.

I tried to convince Dan of their merit, but both solutions failed for
various reasons:

Your solution (ignoring the extra cycle counter byte for now) cannot
handle vtable methods implemented in Parrot. The current system to
implement this involves the interpreter recursively calling runops_core to
handle the vtable method. If you increment cycles in the inner loop, you
risk premature collection of stuff held on the stack of the code that
called the vtable method. If you don't increment cycles, you prevent any
of the memory allocated inside of this vtable method from ever being
collected during the method execution...bad stuff when your vtable methods
are multiplying gigantic matrices or somesuch.
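
To make that dilemma concrete, here is a toy model in C. It is only a
sketch under assumptions: Buffer, Interp, new_buffer, collect and
nested_call are hypothetical names, not Parrot's real structures, and the
counter is a plain unsigned rather than a single byte. The collector
spares any buffer born in the current cycle instead of walking the stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names here are hypothetical; this is not Parrot's real layout. */
typedef struct Buffer {
    unsigned born;   /* interpreter cycle in which the buffer was allocated */
    bool     rooted; /* reachable from the root set */
    bool     live;   /* not yet freed */
} Buffer;

typedef struct Interp {
    unsigned cycle;  /* bumped once per op by the run loop */
} Interp;

static void new_buffer(const Interp *in, Buffer *b) {
    b->born   = in->cycle;
    b->rooted = false;
    b->live   = true;
}

/* Free every unrooted buffer that was not born in the current cycle.
 * Buffers born in the current cycle are assumed to belong to the op that
 * is running right now, so no stack walk is needed to protect them. */
static size_t collect(const Interp *in, Buffer *bufs, size_t n) {
    size_t freed = 0;
    for (size_t i = 0; i < n; i++) {
        if (bufs[i].live && !bufs[i].rooted && bufs[i].born != in->cycle) {
            bufs[i].live = false;
            freed++;
        }
    }
    return freed;
}

/* An outer op allocates bufs[0] and still needs it, then re-enters the
 * run loop (as a vtable method written in Parrot would). The inner loop
 * runs two ops; a collection is triggered inside the second one.
 * Returns the number of buffers the collection freed. */
static size_t nested_call(Interp *in, Buffer bufs[3], bool bump_in_inner_loop) {
    new_buffer(in, &bufs[0]);              /* outer op's unanchored temporary */
    if (bump_in_inner_loop) in->cycle++;   /* inner op 1 starts */
    new_buffer(in, &bufs[1]);              /* inner garbage, dropped at once */
    if (bump_in_inner_loop) in->cycle++;   /* inner op 2 starts */
    new_buffer(in, &bufs[2]);              /* inner op 2's own temporary */
    return collect(in, bufs, 3);
}
```

With bump_in_inner_loop set, the collection frees bufs[1] as intended but
also frees bufs[0], which the outer op is still using. With it clear,
nothing is freed at all, so bufs[1] stays around for as long as the
vtable method runs.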

My neonate buffers solution fails only in the presence of longjmp.
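
One way a neonate scheme could go wrong under longjmp, sketched in C. This
is a guess at the failure mode, not a description of the archived patch:
it assumes the flag is cleared by code that runs at the normal end of the
op, and Buffer, alloc_neonate, collect_one, op_body and
still_live_after_op are all hypothetical names.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

/* Hypothetical header: a neonate is a freshly allocated, unanchored buffer. */
typedef struct Buffer {
    bool neonate; /* set at allocation, cleared once protection is unneeded */
    bool rooted;  /* reachable from the root set */
    bool live;    /* not yet freed */
} Buffer;

static jmp_buf on_error;

static void alloc_neonate(Buffer *b) {
    b->neonate = true;
    b->rooted  = false;
    b->live    = true;
}

/* The collector never frees a neonate, so no stack walk is needed. */
static void collect_one(Buffer *b) {
    if (b->live && !b->rooted && !b->neonate)
        b->live = false;
}

/* An op that allocates a temporary and, on the normal path, clears the
 * neonate flag before returning. On failure it jumps out early. */
static void op_body(Buffer *tmp, bool fail) {
    alloc_neonate(tmp);
    if (fail)
        longjmp(on_error, 1);   /* skips the cleanup below */
    tmp->neonate = false;       /* assumed cleanup at normal op exit */
}

/* Run the op, then collect. Returns true if the (now unreachable)
 * temporary is still allocated afterwards. */
static bool still_live_after_op(Buffer *tmp, bool fail) {
    if (setjmp(on_error) == 0)
        op_body(tmp, fail);
    collect_one(tmp);
    return tmp->live;
}
```

On the normal path the temporary is collected. After a longjmp the flag is
never cleared, so in this model the buffer is pinned until something fixes
the flags up again.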

Granted, we don't do any of this yet, so these solutions will mop the
floor with my current stackwalk code, and pass tests to boot. But it's the
planned introduction of these requirements which is the reason for making
these solutions 'forbidden'.

One of Nick's solutions was to fall back on stack-walking to handle the
cases where our faster solutions fail. I can definitely see that working
with neonate buffers to handle the fixups needed after a longjmp call. But
it doesn't seem as attractive in the presence of your solution, for which
it would require stackwalking for all re-entrant runops calls. Do you have
another solution in mind for handling re-entrant runops calls?

As for the extra byte in the buffer, I don't mind that one at all.
There are a lot of restrictions on the GC code in the interest of making
stuff "lightweight". Unfortunately, GC code takes a significant portion of
the execution time in any realistic application. Hopefully we can convince
Dan to allow extra fields in the buffers in the interest of speed, but I
don't think we can reduce parrot/perl6's feature set in the interest of
speed...might as well use C if that's what you want. :)

2) Currently, we use a linked list of buffer headers for freeing and
allocating headers. I'm not sure what you mean by saying that they are
sorted by bufstart. What does this buy over the current system?
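
For reference, the two list shapes being compared might look like this in
C. It is a sketch only: Header, free_header, alloc_header and
insert_sorted are hypothetical names, and the sorted variant is just my
reading of "sorted by bufstart". It shows the mechanics and their cost,
not what the ordering is used for.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer header; the real one carries more fields. */
typedef struct Header {
    char          *bufstart; /* start of the memory this header owns */
    struct Header *next;
} Header;

/* Free list: push on free, pop on allocate, both O(1). */
static void free_header(Header **free_list, Header *h) {
    h->bufstart = NULL;
    h->next     = *free_list;
    *free_list  = h;
}

static Header *alloc_header(Header **free_list) {
    Header *h = *free_list;
    if (h != NULL) {
        *free_list = h->next;
        h->next    = NULL;
    }
    return h; /* NULL means the pool is empty */
}

/* Keep a list in ascending bufstart order: O(n) per insert.
 * Assumes every bufstart points into the same memory pool, so that
 * comparing the pointers is meaningful. */
static void insert_sorted(Header **list, Header *h) {
    while (*list != NULL && (*list)->bufstart < h->bufstart)
        list = &(*list)->next;
    h->next = *list;
    *list   = h;
}
```

The free list hands back the most recently freed header first; the sorted
list pays a linear scan on every insert to keep headers in memory order.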

3) Definitely a good one. I've been trying to merge your original COW
patch into my code here. Without GC_DEBUG, it fails one test. With
GC_DEBUG, it fails the traditional set plus that one test. The test case
is rather large, unfortunately, and I haven't been able to narrow down the
problem further, or I'd have committed it.

4) Isn't this really the same thing as item 3? I'm basing my knowledge off
your old COW patches. Has additional work been done on the string function
integration since then, or do #3 and #4 both come from those patches?
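
For anyone following along, here is roughly what COW for copy and substr
means at the string level, as a small C sketch. It is not taken from the
patches: Str, str_copy, str_substr and str_set_char are hypothetical
names, sharing is tracked with a single flag, and the caller supplies the
fresh storage where real code would ask the allocator (and leave the old
memory to the collector).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical string header. */
typedef struct Str {
    char  *start; /* first byte of this string's data */
    size_t len;   /* length in bytes */
    bool   cow;   /* data may be shared: copy before writing */
} Str;

/* Copy without copying bytes: both headers point at the same data. */
static Str str_copy(Str *src) {
    src->cow = true;
    return *src;
}

/* Substring without copying bytes: point into the middle of src.
 * Offset and length are clamped to the source string. */
static Str str_substr(Str *src, size_t off, size_t len) {
    Str s;
    if (off > src->len) off = src->len;
    if (len > src->len - off) len = src->len - off;
    src->cow = true;
    s.start  = src->start + off;
    s.len    = len;
    s.cow    = true;
    return s;
}

/* Write one byte. If the data may be shared, move this string to private
 * storage first; `fresh` must have room for at least s->len bytes. */
static void str_set_char(Str *s, size_t i, char c, char *fresh) {
    assert(i < s->len);
    if (s->cow) {
        memcpy(fresh, s->start, s->len);
        s->start = fresh;
        s->cow   = false;
    }
    s->start[i] = c;
}
```

A flag is conservative: once a string has been shared it keeps its cow
flag even after every other sharer has moved away, so its first write
still copies. That costs a copy now and then but never corrupts a sharer.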

> Some of the changes I made before the memory management
> code was totally reorganised  have not yet been re-integrated.
> My last version prior to that reorganisation ran 5000 lives in
> 61 seconds, and I hope to get back to somewhere close to
> that again.

I'm not sure how much of the new code you've merged with. Which of the new
files are you planning to integrate/merge with, and which have you thrown
out in favor of older versions? I'm specifically referring to any of
resources/dod/smallobject/headers.c.

Regardless of whether or not it goes in, I'd be interested in seeing a
patch. I can work on integrating a lot of your non-forbidden code into the
current codebase.

Thanks for spending the time to generate these numbers...they're a nice
eyeopener on what can be done without the current restrictions. Hopefully
they'll allow us to reconsider each restriction in the context of
the speed of our GC.

Mike Lambert
