> For purely academic purposes, I have re-synchronised some of my
> forbidden code with the latest CVS version of Parrot. All tests pass
> without gc debug; and with gc_debug plus Steve Fink's patch.
> Benchmarks follow, on a 166MHz Pentium running linux 2.2.18.
>
>                                   Parrot        African Grey
> life (5000 generations)           172 seconds   81 seconds
> reverse <core_ops.c >/dev/null    193 seconds   130 seconds
> hanoi 14 >/dev/null               51 seconds    37 seconds

Rather impressive. Except that it makes me look bad. :)

> The differences between the two versions are:
> 1) Use of the interpreter cycle-counter instead of stack walking.
> 2) Linked lists of buffer headers sorted by bufstart
> 3) COW-supporting code in GC (for all buffer objects)
> 4) Implementation of COW for string_copy and string_substr

1) Yeah, the cycle-counter approach is a nice one. I also had a similar
solution involving neonate flag usage, somewhere in the archives. Both
have *significant* speed advantages over the current codebase's stack
walking. I tried to convince Dan of their merit, but both were rejected
for various reasons:

Your solution (ignoring the extra cycle-counter byte for now) cannot
handle vtable methods implemented in Parrot. The current plan for
implementing these has the interpreter recursively call runops_core to
run the vtable method. If you increment cycles in the inner loop, you
risk premature collection of buffers held on the stack of whatever
called the vtable method. If you don't increment cycles, you prevent any
of the memory allocated inside the vtable method from being collected
while the method executes...bad stuff when your vtable methods are
multiplying gigantic matrices or somesuch.

My neonate buffers solution fails only in the presence of longjmp.

Granted, we don't do any of this yet, so these solutions will mop the
floor with my current stackwalk code, and pass tests to boot. But it's
the planned introduction of these requirements that is the reason for
making these solutions 'forbidden'.

One of Nick's solutions was to fall back on stack-walking to handle the
cases where our faster solutions fail. I can definitely see that working
with neonate buffers, to handle the fixups needed after a longjmp call.
But it doesn't seem as attractive with your solution, which would
require stackwalking for all re-entrant runops calls. Do you have
another solution in mind for handling re-entrant runops calls?
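
To make the re-entrancy problem concrete, here is a toy model in C. It is
only a sketch: the ToyInterp/ToyBuffer names and the rule "an unrooted
buffer survives only if it was born in the current cycle" are my own
simplification for illustration, not code from either tree.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: every name here is hypothetical, not Parrot's. */
typedef struct {
    unsigned born;   /* interpreter cycle in which the buffer was allocated */
    int      rooted; /* 1 if reachable from a register or other root */
    int      live;   /* 0 once the collector has reclaimed it */
} ToyBuffer;

typedef struct {
    unsigned cycle;  /* bumped once per op dispatched */
} ToyInterp;

typedef struct {
    int outer_temp_alive;    /* temp still in use by the calling op */
    int inner_garbage_alive; /* dead temp created inside the nested loop */
} ToyResult;

void toy_alloc(ToyInterp *in, ToyBuffer *b) {
    assert(in != NULL && b != NULL);
    b->born = in->cycle;
    b->rooted = 0;
    b->live = 1;
}

/* Assumed collector rule: unrooted buffers from earlier cycles are dead. */
void toy_collect(const ToyInterp *in, ToyBuffer *bufs, int n) {
    for (int i = 0; i < n; i++)
        if (bufs[i].live && !bufs[i].rooted && bufs[i].born != in->cycle)
            bufs[i].live = 0;
}

/* An op allocates a temp, then a vtable method runs in a nested run loop
 * of two ops; the second inner op triggers a collection. */
ToyResult run_nested(int bump_inner_cycles) {
    ToyInterp in = { 1 };
    ToyBuffer bufs[2];

    toy_alloc(&in, &bufs[0]);            /* outer op's temp, C stack only */

    if (bump_inner_cycles) in.cycle++;   /* inner op 1 */
    toy_alloc(&in, &bufs[1]);            /* garbage made by inner op 1 */
    if (bump_inner_cycles) in.cycle++;   /* inner op 2 */
    toy_collect(&in, bufs, 2);           /* collection inside inner op 2 */

    ToyResult r = { bufs[0].live, bufs[1].live };
    return r;
}
```

With bump_inner_cycles set, run_nested reports the outer temp as collected
even though its op has not finished. With it clear, the outer temp is safe
but the inner garbage is still alive when the collection runs.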

As far as the extra byte in the buffer goes, I don't mind that one at
all. There are a lot of restrictions on the GC code in the interest of
making stuff "lightweight". Unfortunately, GC code takes a significant
portion of the execution time in any realistic application. Hopefully we
can convince Dan to allow extra fields in the buffers in the interest of
speed, but I don't think we can reduce parrot/perl6's feature set in the
interest of speed...might as well use C if that's what you want. :)

2) Currently, we use a linked list of buffer headers for freeing and
allocating headers. I'm not sure what you mean by saying that they are
sorted by bufstart. What does this buy over the current system?

3) Definitely a good one. I've been trying to merge your original COW
patch into my code here. Without GC_DEBUG, it fails one test. With
GC_DEBUG, it fails the traditional set plus that one test. Unfortunately
the test case is rather large, and I haven't been able to narrow the
problem down any further, or I'd have committed it.

4) Isn't this really the same thing as item 3? I'm basing my knowledge
on your old COW patches. Has additional work been done on the string
function integration since then, or do #3 and #4 both come from those
patches?

> Some of the changes I made before the memory management
> code was totally reorganised have not yet been re-integrated.
> My last version prior to that reorganisation ran 5000 lives in
> 61 seconds, and I hope to get back to somewhere close to
> that again.

I'm not sure how much of the new code you've merged with. Which of the
new files are you planning to integrate/merge with, and which have you
thrown out in favor of older versions? I'm specifically referring to any
of resources/dod/smallobject/headers.c.

Regardless of whether or not it goes in, I'd be interested in seeing a
patch. I can work on integrating a lot of your non-forbidden code into
the current codebase.
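
For anyone following along who hasn't looked at the COW work, the general
idea behind items 3 and 4 can be sketched roughly as below. This is a toy
in C, not the patch: the ToyString/ToyBlock names are made up, and the
reference count is just a stand-in for the bookkeeping that the collector
would do in the real thing.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: hypothetical names, not Parrot's STRING API. */
typedef struct {
    size_t refs;    /* stand-in for what the GC would track */
    char   data[];  /* shared character storage */
} ToyBlock;

typedef struct {
    ToyBlock *block;  /* possibly shared storage */
    size_t    offset; /* where this string's view starts */
    size_t    len;    /* length of this string's view */
} ToyString;

ToyString toy_new(const char *text) {
    size_t len = strlen(text);
    ToyBlock *b = malloc(sizeof *b + len);
    assert(b != NULL);
    b->refs = 1;
    memcpy(b->data, text, len);
    ToyString s = { b, 0, len };
    return s;
}

/* Substring without copying: share the block, adjust offset and length. */
ToyString toy_substr(ToyString s, size_t start, size_t len) {
    assert(start <= s.len && len <= s.len - start);
    s.block->refs++;
    ToyString view = { s.block, s.offset + start, len };
    return view;
}

/* Copy without copying: a view over the whole string. */
ToyString toy_copy(ToyString s) {
    return toy_substr(s, 0, s.len);
}

char toy_char_at(ToyString s, size_t i) {
    assert(i < s.len);
    return s.block->data[s.offset + i];
}

/* The write is where the real copy happens, and only if storage is shared. */
void toy_set_char(ToyString *s, size_t i, char c) {
    assert(i < s->len);
    if (s->block->refs > 1) {
        ToyBlock *b = malloc(sizeof *b + s->len);
        assert(b != NULL);
        b->refs = 1;
        memcpy(b->data, s->block->data + s->offset, s->len);
        s->block->refs--;
        s->block = b;
        s->offset = 0;
    }
    s->block->data[s->offset + i] = c;
}

void toy_free(ToyString *s) {
    if (--s->block->refs == 0)
        free(s->block);
    s->block = NULL;
}
```

Copies and substrings cost a header and no character data until somebody
writes, which is why the two items look like one piece of work to me.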

Thanks for spending the time to generate these numbers...they're a nice
eye-opener on what can be done without the current restrictions.
Hopefully they'll allow us to reconsider each restriction in the context
of the speed of our GC.

Mike Lambert