from:"Michel J Lambert"

Re: [PATCH] Disable GC at startup

2002-04-12 Thread Michel J Lambert


> I thought the point of the discussion was turning off the GC until such time
> that it was ready to go.  I know what it *does* - what should it *do*?
>
> {Rest of the comments snipped.}

I don't know quite what you mean by what 'should it *do*'? *do*, as in
what it should do with my patch? Or *do*, as in do what dan thinks it
should?  Or something else? As such, below is just me rehashing what I've
said before, but hopefully in a clearer manner.

My patch *did* turn off the GC until it was ready to go. Dan said that
this wasn't necessary, since GC_DEBUG was the case that caused dod and
collection runs to occur during initialization. My response was to point
out that we're currently 'lucky' that we have enough memory to complete
the interpreter initialization. Magic numbers, as 32768 happens to be
bigger than 3 arenas of 1024. If/as more stuff starts occuring during
initialization, we might overflow our magic number 32768 and cause
problems.

You stated (at least my interpretation of what you stated was) that if we
can't provide enough memory to startup without causing a collection run,
we shouldn't bother starting up at all. If we didn't have enough memory to
start up, then I'd agree with you. But, since we give out requests from
whatever we initially allocate, THEN do a collection run, and THEN
allocate more, it's perfectly possible to request more memory than we have
allocated at the time, and Parrot *should* go and get more. Currently, if
this situation happens during initialization, Parrot will crash.

Mike Lambert

Profiling Parrot

2002-04-12 Thread Michel J Lambert


> compared to a current CVS version of:
> 5000 generations in 90.813806 seconds. 55.057708 generations/sec
> A total of 32768 bytes were allocated
> A total of 130932 DOD runs were made
> A total of 10930 collection runs were made
> Copying a total of 57801936 bytes
>
> so a 97% decrease in the number of collection runs only turns into an 11%
> improvement in total performance.

This prompted me to setup Parrot for profiling in VTune here on my
machine. What follows is a rather longish email, for which I
apologize. Lots of it is due to statistics, which I'm including to allow
people to draw their own conclusions. I know I'm not the only one that's
profiled Parrot, but I've never really seen any results of Parrot
profiling on the list. If anyone has a benchmark they'd like me to run for
them, I'm willing to do that. I think Brent might have wanted this before,
back before I had set this up.

in reverse order of frequency for the more popular functions:
Total Count Function
4050singlebyte_skip_forward
2025singlebyte_decode
1800Parrot_add_i_i_i
11260238add_header_to_free
11260073Parrot_allocate
11260071new_string_header
11245053Parrot_string_make
10130030Parrot_string_length
10130003Parrot_set_i_ic
10125000Parrot_substr_s_s_i_ic
10125000Parrot_string_substr
10125000Parrot_string_compare
900 Parrot_mod_i_i_i
900 Parrot_ne_s_sc_ic
2189577 Parrot_branch_ic

(yes, we perform the extremely simple singlebyte_skip_forward 40 million
times on a 5000 generation life.pasm)

In order of total time spent in the function ALONE, we have:
TimeFunction
3004runops_fast_core
2182Parrot_string_substr
2039singlebyte_skip_forward
2018Parrot_string_compare
1212Parrot_string_make
1125Parrot_add_i_i_i
1088Parrot_mod_i_i_i
925 new_string_header
904 Parrot_substr_s_s_i_ic
836 Parrot_ne_s_sc_ic
828 add_header_to_free
737 singlebyte_decode
688 Parrot_allocate
557 Parrot_set_i_ic
534 mark_PMCs_unused
453 free_unused_buffers
440 Parrot_string_concat
393 Parrot_string_length
354 mark_buffers_unused
202 Parrot_concat_s_sc


Finally, for total time taken spent in a function and it's called
functions, we have:
Total Time  Function
9469Parrot_substr_s_s_i_ic
8564Parrot_string_substr
5842Parrot_string_make
4145Parrot_ne_s_sc_ic
3679new_string_header
3668Parrot_string_compare
2753alloc_more_string_headers
2716Parrot_do_dod_run
2039singlebyte_skip_forward
1628Parrot_concat_s_sc
1425Parrot_string_concat
1281free_unused_buffers
1125Parrot_add_i_i_i
1088Parrot_mod_i_i_i
956 Parrot_allocate
828 add_header_to_free
737 singlebyte_decode
557 Parrot_set_i_ic


While life normally took 7 seconds to execute, performing a 'call graph
profiling session' took around 287 seconds. The numbers you see above are
in milliseconds, and are obviously inflated as compared to an unprofiled
execution. The profiler lists the total time as 21873 milliseconds.

Due to the amount of string manipulation that life.pasm performs, this is
obviously the bottleneck here. It's interesting to note that GC's DOD and
collection show up, but don't really rate that high, which corresponds to
Peter's findings.

The most expensive functions, in terms of time spent within that function,
are the ones that are called the most often, regardless of how simple
they are. I wasn't sure how much of this is due to profiling overhead
involved in tracking function calls, and how much is indicative of
performance problems.

To test this, since this example only uses singlebyte encoding, I
inlined the contents of the encoding functions (singlebyte_skip_forward
and singlebyte_decode) and tried it again.

Using encodings for string_substr and string_compare:
5000 generations in 7.071001 seconds. 707.113457 generations/sec

Using hardcoded singlebyte-ness for them:
5000 generations in 6.348999 seconds. 787.525716 generations/sec

Or roughly 10%, due to the need to support encodings. This is in
comparison to the profiler's listing 12% of the time being spent in the
those two encoding functions. So it seems pretty accurate, as far as where
time is going.


Few things immediately come to mind:
a) with the current encoding system, we're guaranteed to be slower than
without it. If we want Parrot to be as fast as Perl5, we're deluding
ourselves.
b) the results will probably be worse in more string-heavy applications,
such as regexes, etc.
c) there's gotta be a way to have multiple encodings without sacrificing
that much speed

One approach that came to mind was macros (gah! no! erg!). An equivalent
system would be preprocessing of .c files to produce .c files for the
compiler, via some perl script.

Imagine if we put substring, compare, etc, etc in the encodings vtable.
This would result in code duplication due

Re: [PATCH] Disable GC at startup

2002-04-12 Thread Michel J Lambert



> The current design never shrinks the free header pools, and indeed there is
> probably little point in doing so, so nothing seems to be gained from
> including them in the collection process.
>
> Using my favourite 5000-generation life.pasm as an example:
> A total of 10930 collection runs were made
> Copying a total of 57801936 bytes
>
> Since the three pools occupy 1024 bytes each, 33576960 bytes of the above
> copying is due to the pools.

Allocating the pools from system memory would also help with the
GC_DEBUG-enabled parrot failing on startup, since we'd never be calling
Parrot_allocate during interpreter initialization (although we still do
call pmc_new, which could call dod, and/or perform allocations).

Below is a patch which allocates the pools from system memory.
Unfortunately, it doesn't seem to provide any noticeable speed gains. I
get anywhere from a -1 to 15 extra generations per second. Current results
show:

Old:
A total of 10930 collection runs were made
Copying a total of 57801936 bytes

New:
A total of  collection runs were made
Copying a total of 27401216 bytes

System is a P3 1Ghz, 512MB ram, MSVC-compiled Parrot.

Mike Lambert


? life.pbc

Index: memory.c

===

RCS file: /cvs/public/parrot/memory.c,v

retrieving revision 1.30

diff -u -r1.30 memory.c

--- memory.c29 Mar 2002 19:13:08 -  1.30

+++ memory.c12 Apr 2002 09:15:59 -

@@ -110,31 +110,25 @@

 /* Init the string header pool */

 interpreter->arena_base->string_header_pool =

 mem_sys_allocate(sizeof(struct free_pool));

-interpreter->arena_base->string_header_pool->pool_buffer.bufstart =

-Parrot_allocate(interpreter, 1024);

-interpreter->arena_base->string_header_pool->pool_buffer.flags =

-BUFFER_live_FLAG;

-interpreter->arena_base->string_header_pool->pool_buffer.buflen = 1024;

+interpreter->arena_base->string_header_pool->bufstart =

+mem_sys_allocate(1024);

+interpreter->arena_base->string_header_pool->buflen = 1024;

 interpreter->arena_base->string_header_pool->entries_in_pool = 0;

 

 /* Init the buffer header pool */

 interpreter->arena_base->buffer_header_pool =

 mem_sys_allocate(sizeof(struct free_pool));

-interpreter->arena_base->buffer_header_pool->pool_buffer.bufstart =

-Parrot_allocate(interpreter, 1024);

-interpreter->arena_base->buffer_header_pool->pool_buffer.flags =

-BUFFER_live_FLAG;

-interpreter->arena_base->buffer_header_pool->pool_buffer.buflen = 1024;

+interpreter->arena_base->buffer_header_pool->bufstart =

+mem_sys_allocate(1024);

+interpreter->arena_base->buffer_header_pool->buflen = 1024;

 interpreter->arena_base->buffer_header_pool->entries_in_pool = 0;

 

 /* Init the PMC header pool */

 interpreter->arena_base->pmc_pool =

 mem_sys_allocate(sizeof(struct free_pool));

-interpreter->arena_base->pmc_pool->pool_buffer.bufstart =

-Parrot_allocate(interpreter, 1024);

-interpreter->arena_base->pmc_pool->pool_buffer.flags =

-BUFFER_live_FLAG;

-interpreter->arena_base->pmc_pool->pool_buffer.buflen = 1024;

+interpreter->arena_base->pmc_pool->bufstart =

+mem_sys_allocate(1024);

+interpreter->arena_base->pmc_pool->buflen = 1024;

 interpreter->arena_base->pmc_pool->entries_in_pool = 0;

 Parrot_new_pmc_header_arena(interpreter);

 }

Index: resources.c

===

RCS file: /cvs/public/parrot/resources.c,v

retrieving revision 1.41

diff -u -r1.41 resources.c

--- resources.c 11 Apr 2002 18:54:33 -  1.41

+++ resources.c 12 Apr 2002 09:16:00 -

@@ -25,13 +25,12 @@

   /* First, check and see if there's enough space in the free pool. If

we're within the size of a STRING pointer, we make it bigger */

   if (pool->entries_in_pool * sizeof(PMC *) >=

-  pool->pool_buffer.buflen - sizeof(PMC *)) {

+  pool->buflen - sizeof(PMC *)) {

 /* If not, make the free pool bigger. We enlarge it by 20% */

-pool->pool_buffer.bufstart = mem_realloc(interpreter,

- pool->pool_buffer.bufstart,

- pool->pool_buffer.buflen,

- (UINTVAL)(pool->pool_buffer.buflen * 
1.2));

-pool->pool_buffer.buflen = (UINTVAL)(pool->pool_buffer.buflen * 1.2);

+pool->bufstart = mem_sys_realloc(pool->bufstart,

+ pool->buflen,

+ (UINTVAL)(pool->buflen * 1.2));

+pool->buflen = (UINTVAL)(pool->buflen * 1.2);

 

   }

 #ifdef GC_DEBUG

@@ -40,7 +39,7 @@

 

   /* Okay, so there's space. Add the header on */

   ((PMC *)to_add)->flags = PMC_on_free_lis

Re: [PATCH] Disable GC at startup

2002-04-12 Thread Michel J Lambert


> > So you're saying that the calls to get memory during interpreter
> > initialization are somehow guaranteed to not require more memory (and thus
> > a dod or collection run)? Currently, this guarantee is not expressed in
>
> I don't understand the "thus."  Nothing states that requesting memory
> mandates a DOD/GC run.

If I call Parrot_allocate, it checks to see if it has enough allocated
memory to provide the request. If not, it performs a collection run, and
tries again. If not, it tries allocationg a new block of memory, and
returns that. If that fails, it gives up.

So if our initial allocation size for some reason isn't enough for all of
the parrot initializations, it will cause a collection run. This in turn
depends that the various *_pools were correctly initialized. If this is
not the case (say, we're allocating memory for the pools in the first
place), bad things happen.

Likewise, the construction of a new PMC for the stash_hash could cause a
DOD to occur if there aren't enough PMC headers available. Granted one
won't cause this, or even a few. But the possibility exists, if say the
stash_hash PMC initializes a bunch of default PMCs for its use, or
somesuch.

> > code, comments, or documentation, from what I can tell. And I personally
> > hate implicit assumptions like these...they're the cause of all that magic
> > voodo, far away from where it's obvious what the problem is. We have no
> > idea of knowing how much is 'enough' for the GC to allocate at startup,
> > aside from the following the code and tracking allocations, which can be
> > a pain.
>
> I guess it depends on what you consider to be initialization.  Little, if
> any, of what the interpreter itself demands in memory - including memory for
> the GC itself - is going to be reclaimable; it's all mostly persistent
> throughout the life of the interpreter.

Right. However, that's not to say that memory cannot grow. The interpreter
allocates the various *_pools from the interpreter's memory_pool, and it
gets copied with each collection run. This memory can grow and change in
size as more memory for pools are needed. Currently, we use the GC's
buffer system for tha. Alternately, we could use mem_sys_allocate, and
manually free and allocate memory for them. The latter would then allow
these pools to not be copied during collection runs, and would eliminate
that particular element of unsafety.

> If there's not enough memory to bootstrap the interpreter and the GC, then
> even if the GC *could* reclaim enough to satisfy subsequent requests,
> there's not going to be nearly enough memory left to do anything.  So why
> bother?

Because if there's not enough memory, it can still allocate *more* memory.
It just tries to reclaim enough memory before resorting to the 'extreme'
of getting more system memory. I hope that makes things clear.

> (Just) enough of the interpreter needs to start up to start up the memory
> allocator and the GC subsystem.  If it needs more memory to do so, it should
> simply be given more memory.  After the GC is up and running, the
> interpreter/bootstrapping process can trigger a GC run to free up as much
> memory as it can.  The remainder of the interpreter can then start up,
> running through the GC.

You're, right, it should be given more memory to do so. But since it
performs a collection run before getting more memory, and since it's not
safe to perform a collection run during initialization, bad things happen.

Mike Lambert

Re: [PATCH] Disable GC at startup

2002-04-11 Thread Michel J Lambert


Dan Sugalski wrote:
> That's because GC_DEBUG is fundamentally broken by design. It's doing
> things it shouldn't be doing--the system is such that it shouldn't
> ever trigger GC before the memory allocation system is set up.
>
> If this patch disables the GC only when GC_DEBUG is in place, I'll put it in.

So you're saying that the calls to get memory during interpreter
initialization are somehow guaranteed to not require more memory (and thus
a dod or collection run)? Currently, this guarantee is not expressed in
code, comments, or documentation, from what I can tell. And I personally
hate implicit assumptions like these...they're the cause of all that magic
voodo, far away from where it's obvious what the problem is. We have no
idea of knowing how much is 'enough' for the GC to allocate at startup,
aside from the following the code and tracking allocations, which can be
a pain.

Currently, we alloc_new_block a default-sized block, of size 32768, and
then request three blocks of 1024 byes in mem_setup_allocator. So we're
okay. But we do stuff like request a PMC header for the stash_hash.  Who
knows what a PerlSymbolTable PMC will allocate internally? Personally, I
think it's a fragile equilibrium at the moment, although admittedly it has
a large buffer zone. However, we're still just one greedy allocation away
from GPFing parrot during initialization.

Having made my case, I'm submitting an ifdef'ed version of the code, so
you can choose this one should my argument not convince you. :)

Thanks,
Mike Lambert


Index: interpreter.c

===

RCS file: /cvs/public/parrot/interpreter.c,v

retrieving revision 1.82

diff -u -r1.82 interpreter.c

--- interpreter.c   2 Apr 2002 06:24:14 -   1.82

+++ interpreter.c   11 Apr 2002 22:48:43 -

@@ -497,6 +497,11 @@

 interpreter->DOD_block_level = 0;

 interpreter->GC_block_level = 0;

 

+interpreter->flags = flags;

+#ifdef GC_DEBUG

+Interp_flags_CLEAR(interpreter,PARROT_ALLOW_GC_FLAG);

+#endif

+

 /* Set up the memory allocation system */

 mem_setup_allocator(interpreter);

 

@@ -506,8 +511,7 @@

 pmc_new(interpreter, enum_class_PerlHash);

 interpreter->perl_stash->parent_stash = NULL;

 

-/* Initialize interpreter's flags */

-interpreter->flags = flags;

+/* Initialize interpreter warnings */

 interpreter->warns = mem_sys_allocate(sizeof(struct warnings_t));

 memset(interpreter->warns, 0, sizeof(struct warnings_t));

 PARROT_WARNINGS_off(interpreter, PARROT_WARNINGS_ALL_FLAG);

@@ -557,6 +561,11 @@

 interpreter->pmc_reg_base->next = NULL;

 interpreter->pmc_reg_base->prev = NULL;

 Parrot_clear_p(interpreter);

+

+#ifdef GC_DEBUG

+/* We've got enough for full interpreter now, so allow gc */

+Interp_flags_SET(interpreter,PARROT_ALLOW_GC_FLAG);

+#endif

 

 /* Need a user stack */

 interpreter->user_stack = new_stack(interpreter);

Index: resources.c

===

RCS file: /cvs/public/parrot/resources.c,v

retrieving revision 1.41

diff -u -r1.41 resources.c

--- resources.c 11 Apr 2002 18:54:33 -  1.41

+++ resources.c 11 Apr 2002 22:48:45 -

@@ -565,6 +565,13 @@

 void

 Parrot_do_dod_run(struct Parrot_Interp *interpreter) {

 

+#ifdef GC_DEBUG

+  /* Exit early if we shouldn't be running */

+  if(!Interp_flags_TEST(interpreter,PARROT_ALLOW_GC_FLAG)) {

+ return;

+  }

+#endif

+

   /* First go mark all PMCs as unused */

   mark_PMCs_unused(interpreter);  

 

@@ -707,6 +714,13 @@

   char *cur_spot;   /* Where we're currently copying to */

   UINTVAL cur_size; /* How big our chunk is going to be */

   struct STRING_Arena *cur_arena; /* The string arena we're working on */

+

+#ifdef GC_DEBUG

+  /* Exit early if we shouldn't be running */

+  if(!Interp_flags_TEST(interpreter,PARROT_ALLOW_GC_FLAG)) {

+ return;

+  }

+#endif

 

   /* We're collecting */

   interpreter->mem_allocs_since_last_collect = 0;

Index: include/parrot/interpreter.h

===

RCS file: /cvs/public/parrot/include/parrot/interpreter.h,v

retrieving revision 1.40

diff -u -r1.40 interpreter.h

--- include/parrot/interpreter.h3 Apr 2002 04:01:41 -   1.40

+++ include/parrot/interpreter.h11 Apr 2002 22:48:46 -

@@ -23,7 +23,8 @@

 PARROT_BOUNDS_FLAG   = 0x04,  /* We're tracking byte code bounds */

 PARROT_PROFILE_FLAG  = 0x08,  /* We're gathering profile information */

 PARROT_PREDEREF_FLAG = 0x10,  /* We're using the prederef runops */

-PARROT_JIT_FLAG  = 0x20   /* We're using the jit runops */

+PARROT_JIT_FLAG  = 0x20,  /* We're using the jit runops */

+PARROT_ALLOW_GC_FLAG = 0x40   /* Allo

[PATCH] Disable GC at startup

2002-04-11 Thread Michel J Lambert


The attached patch disables GC at startup. If you turn on GC_DEBUG, you'll
see parrot crash at startup because dod runs and collection runs are being
performed before the interpreter is fully initialized, causing all sorts
of havoc.

This also means that if our default GC allocation values do not provide
enough room to completely start up Parrot, we'll die. Of course, this
still would happen prior to this patch...this only causes the
allocate_buffer routines to detect that they didn't have enough memory,
instead of causing it to perform dod and collection runs on
uninitialized data.

There are still plenty of problems with GC_DEBUG turned on, but at least
it passes all the non-pmc tests now. Has anyone done anything with
attempting to get a make tinder and tindertest set up?

Mike Lambert


? changes.txt

Index: interpreter.c

===

RCS file: /cvs/public/parrot/interpreter.c,v

retrieving revision 1.82

diff -u -r1.82 interpreter.c

--- interpreter.c   2 Apr 2002 06:24:14 -   1.82

+++ interpreter.c   11 Apr 2002 21:16:59 -

@@ -497,6 +497,9 @@

 interpreter->DOD_block_level = 0;

 interpreter->GC_block_level = 0;

 

+interpreter->flags = flags;

+Interp_flags_CLEAR(interpreter,PARROT_ALLOW_GC_FLAG);

+

 /* Set up the memory allocation system */

 mem_setup_allocator(interpreter);

 

@@ -506,8 +509,7 @@

 pmc_new(interpreter, enum_class_PerlHash);

 interpreter->perl_stash->parent_stash = NULL;

 

-/* Initialize interpreter's flags */

-interpreter->flags = flags;

+/* Initialize interpreter warnings */

 interpreter->warns = mem_sys_allocate(sizeof(struct warnings_t));

 memset(interpreter->warns, 0, sizeof(struct warnings_t));

 PARROT_WARNINGS_off(interpreter, PARROT_WARNINGS_ALL_FLAG);

@@ -557,6 +559,9 @@

 interpreter->pmc_reg_base->next = NULL;

 interpreter->pmc_reg_base->prev = NULL;

 Parrot_clear_p(interpreter);

+

+/* We've got enough for full interpreter now, so allow gc */

+Interp_flags_SET(interpreter,PARROT_ALLOW_GC_FLAG);

 

 /* Need a user stack */

 interpreter->user_stack = new_stack(interpreter);

Index: resources.c

===

RCS file: /cvs/public/parrot/resources.c,v

retrieving revision 1.41

diff -u -r1.41 resources.c

--- resources.c 11 Apr 2002 18:54:33 -  1.41

+++ resources.c 11 Apr 2002 21:17:00 -

@@ -565,6 +565,11 @@

 void

 Parrot_do_dod_run(struct Parrot_Interp *interpreter) {

 

+  /* Exit early if we shouldn't be running */

+  if(!Interp_flags_TEST(interpreter,PARROT_ALLOW_GC_FLAG)) {

+ return;

+  }

+

   /* First go mark all PMCs as unused */

   mark_PMCs_unused(interpreter);  

 

@@ -707,6 +712,11 @@

   char *cur_spot;   /* Where we're currently copying to */

   UINTVAL cur_size; /* How big our chunk is going to be */

   struct STRING_Arena *cur_arena; /* The string arena we're working on */

+

+  /* Exit early if we shouldn't be running */

+  if(!Interp_flags_TEST(interpreter,PARROT_ALLOW_GC_FLAG)) {

+ return;

+  }

 

   /* We're collecting */

   interpreter->mem_allocs_since_last_collect = 0;

Index: include/parrot/interpreter.h

===

RCS file: /cvs/public/parrot/include/parrot/interpreter.h,v

retrieving revision 1.40

diff -u -r1.40 interpreter.h

--- include/parrot/interpreter.h3 Apr 2002 04:01:41 -   1.40

+++ include/parrot/interpreter.h11 Apr 2002 21:17:02 -

@@ -23,7 +23,8 @@

 PARROT_BOUNDS_FLAG   = 0x04,  /* We're tracking byte code bounds */

 PARROT_PROFILE_FLAG  = 0x08,  /* We're gathering profile information */

 PARROT_PREDEREF_FLAG = 0x10,  /* We're using the prederef runops */

-PARROT_JIT_FLAG  = 0x20   /* We're using the jit runops */

+PARROT_JIT_FLAG  = 0x20,  /* We're using the jit runops */

+PARROT_ALLOW_GC_FLAG = 0x40   /* Allow dod and collection to occur */

 } Interp_flags;

 

 #define Interp_flags_SET(interp, flag)   (/*@i1@*/ (interp)->flags |= (flag))

Re: Worst-case GC Behavior?

2002-04-10 Thread Michel J Lambert


> >One option might be a
> >threshold - if, after the DOD run, there is still less than N headers
> >available, allocate more even though we can satisfy the immediate
> >requirement. This would improve performance by reducing the number of DOD
> >runs, but at the cost of additional memory - a classic tradeoff!
>
> Yep.
>
> That's one of the reasons I put the counters into the GC system--I
> expect some sort of feedback system that dynamically modifies the
> allocation quantities for memory and various interpreter structures
> is in order.
>
> Anyone care to write one? :) (I went for the simple and relatively
> naive way to start mainly just to get something done)

Well, this sounds like it all goes inline with the use less "memory"
pragmas. Should the interface to the GC be raw, low-level vars that this
pragma would manually set? Or should the interface be a high-level
optimize-for-this variable which the GC uses internally?

I believe this discussion has happened before, but no consensus was
reached.

(And no, this isn't me volunteering, at least not yet. ;)

Mike Lambert

Re: Avoiding the deadlands

2002-04-09 Thread Michel J Lambert


> So, I think #2 is the way to go. We'll add a new flag,
> (BUFFER|PMC)_stay_of_execution_FLAG or something, that gets added to
> allocated PMCs and Buffers. It'll be treated the same way as the
> constant/immortal flag is treated for DOD purposes, with one
> difference--there'll be an op which specifically clears the flags.
> Since the flag is *not* valid to be set across ops, we're fine.

Sounds good to me. This requires that we manually insert this
clear_soe_flag op in code, where we want it. If we don't insert it, we
effectively make everything immortal/immune, forever. Finding 'proper'
places to insert this could be hard to do for the compiler writer, imo.
The larger the amount of code in a loop, the more likely this op is to
exist inside the loop? Not sure what other heuristics there would be.

> Only those functions that *must* set it will, and they are also
> required to make sure the stay_of_execution_count interpreter
> variable is set to the proper number (or at least large enough) on
> abnormal exit. (That way we can keep the flag clearing op from
> actually running if there's no reason)

Aren't most functions going to be required to set it, indirectly?
new_*_header seems like an ideal place to set the SOE flag, and increment
the internal counter. But that's going to impact a ton of code. Am I
correct in assuming these three functions are what you meant by 'must set
it', or were you referring to something else?

Mike Lambert

Re: PMCs requiring a 'set' dest register?

2002-04-09 Thread Michel J Lambert


> >destination register. Why would we *ever* care what's in the destination
> >register, since it never gets it's vtable methods called.
>
> That's not true. The destination needs its set method called.
> Otherwise tying/overloading won't work right.

Fair enough. I still have the problem brought up in the email that started
this thread, however. I'll reiterate it, and a possibly better solution,
in the email below.

Let me ask how '$a = $b + $c' is supposed to work. Currently, we have an
'add' vtable method, which has self, value, and dest. $a=dest, $b=self,
and $c=value. So right now, we never call set, and we transmogrify the
destination register PMC directly.

So perhaps we should assume that the dest register is a dummy one.
We transmogrify and initialize it however we want, in add. Then we
use a seperate opcode to store it into the destination register,
with set.

This means the above code becomes:
fetchlex P0, "$b"
fetchlex P1, "$c"
new P2, DummyPMC
add P2, P0, P1
fetchlex P3, "$a"
set P3, P2

Now $b's vtable gets called to add, an $a's vtable gets called to set,
supporting tied vars and whatnot. This also means that "$a = $a + $a"
never modifies itself in the same op. (which was the problem in the
example I gave in the email that started this thread), so that the rhs is
valid when performing the computation, and once the result is completed,
it's then able to store into the lhs and overwrite $a. This requires we
create a constraint that one cannot have any of the source registers equal
the destination register.

If, however, the the below code...
fetchlex P0, "$a"
fetchlex P1, "$a"
fetchlex P2, "$a"
add P2, P0, P1

is supposed to work fine, we have two problems:
1) we transmogrify the destination register before we can set it, which
means we have to store the result of the add in a temp variable. Lots of
methods violate this. (See original email for more clarification on this.)
2) We can't dispatch based upon destination type *and* first-add-argument
type with a single op. We already ruled out multi-dispatch.. ;)

Mike Lambert

Re: string api

2002-04-08 Thread Michel J Lambert


> I agree we need an overall architectural solution. Setting and clearing
> bits manually is error-prone but fast, as you said. Its identical to
> the malloc()/free() situation which is one of the primary reasons we
> use garbage collection in the first place, so why reinvent the same
> situation with different syntax?

Generally, malloc/free are used in more complex situations than just
stack-based memory management. But I see your point.

> malloc/free is vulnerable to:
> 1) leakage (forgot to free)

If you remember to mark it as used, you're pretty much guaranteed to mark
it as unused at the end of the same function.

> 2) double deallocation (freed an already freed buffer)

In general, this can't happen with setting bits. We can't unset a bit
twice, since we're only doing this on stuff returned by a function, for
the duration of the function that got it. If we return it again,
they'll set it, and free it on their own. Agreed, it's confusing, and thus
the reason for this whole discussion. :)

> I suppose a variation of the scratch-pad that might be more on the
> performance line that you are thinking could be similar to the
> scope tracking that compilers do when gathering symbols into
> symbol tables.

Ahhh, that's what I missed. I was assuming that you'd either have to push
variables on to this stack in buffer_allocate, or in the place that's
allocating them, and pop them all off with a end_GC_function. Which I
considered to be 'just as much work'.

> So a GC-able buffer gets created with a intial scope of cur_interp->scope,
> hidden in the allocator, and the collector skips collect on any
> buffer with scope <= the cur_scope.

I'm a bit confused. Say I have function A call B and C. Function C will
have the same scope as B will. If C triggers a GC run, then anything
allocated in B will have the same scope as C. How will the GC system
know that it can mark those as dead? Granted your system is safe, but it
seems a little *too* safe.

> And there is no stack churn.

I like that part, tho. :)

Mike Lambert

Worst-case GC Behavior?

2002-04-08 Thread Michel J Lambert


I think I know of two potential performance problems with the GC code.
They could be problems in my head, or real problems, as I haven't done any
profiling. We also don't have any real test cases. :)



The first example is the following code, which calls parrot_allocate to
create the string each time.

for(1..1) {
  push @a, "a";
}

If we start out with no room, it calls Parrot_go_collect for each push,
but the go_collect does nothing because there's no free memory. This then
requires another allocation, fit exactly to the size of the block, one
character.

Repeat.

Since we're never freeing any memory, it continually is allocating a block
of size 56 (memory pool) + 1 (character)  from the underlying system api.



The second example involves headers. Say we have the following code:
loop:
new P0, PerlString
branch loop

Which might correspond to:

while(1) {
  my $dummy;
}

Each time through the array, it has to alloc a PMC header. When we
allocate the header, we store it into P0, and the old header is
essentially freed.

The next time through the loop, entries_in_pool is 0, and it triggers
alloc_more_string_Headers, and a dod run. This finds the PMC we just
freed, and uses it. Repeat. Each time through the loop, it triggers a dod
run.

The above example might be a bit contrived, due to the fact that it could
pull 'my $dummy' outside of the loop, assuming it isn't tied. (If it is
tied, we need to do it each time through, since it could be counting the
number of times we set the variable, or somesuch.)



Now, I know that any memory management system can have cases which cause
worst-case behavior. I'm not sure if the cases I presented are those kind
of cases, or whether they are common enough that we need to worry about
it.

The first problem can probably be solved by enforcing a minimum block
allocation size. I'm not sure of a good solution to the second problem,
however.

If we do have a minimum block allocation size, it will perform horribly
memory-wise on something like:

while (1) {
  push @a, "a";
  push @a, "aa"x200;
}

This example destroys any implementation that has a minimum block
allocation size. This could be alleviated with a linked list of blocks
that have available memory at the tail of the block. This could give very
bad performance whenever we have lots of half-filled blocks. (Say, when we
continually allocate blocks of size 0.51*minimum_block_allocation_size.)
I recall Dan saying he didn't like traversing linked lists when searching
for memory, but it shouldn't be that bad since it all gets cleaned up on a
call to parrot_go_collect.

Finally, another approach is to randomize things. Lots of algorithms
randomize their behavior to prevent test cases that exhibit worst-case
behavior. Of course, I'm not sure if a non-deterministic interpreter is a
good thing, since it'll just make GC bugs that much more annoying to track
down.

These might all be things that were considered, and discarded as not
important enough. But I didn't see these potential problems mentioned
anywhere, so I figured I'd bring them up here, just in case.

Mike Lambert

Re: string api

2002-04-08 Thread Michel J Lambert


> >This message does remind me of how empty the TODO list is. Surely we
> >can think of many more things to be done?
>
> Speaking of..
>
> 1) Bugfix release please, we banged quite a few stack and GC bugs out.
>  Don't we get any dessert?

Peter has already stated he'd like his parrot_reallocate_buffer patch to
go in first, as it does fix a reproducible bug with clint's evaluator. And
there's still a bunch of GC bugs. I know of three types:

1) The problem that's been brought up before (and below), of CPU-stack
vars not being traceable.
2) The GC initialization stuff could potentially trigger a GC (woops!). My
fix was to disable the GC during interpreter initialization, but I'm not
sure what we want to do for this.
3) Non-string buffers (ie, stuff in last_Buffer_Arena, not
last_String_arena) are pretty broken, I think. They aren't marked,
unmarked, freed, or copied. I could be mistaken on this one, as I don't
have any test cases that break it yet, just my understanding of the code.
This should be a lot easier to fix, with a little copy and paste. But I'd
like to get a valid test case before I attempt to fix this.

> 2) I'm thinking of an internal stack not visible to user code that we use
>  for temporary PMCs and Buffers and a simple macro for entry and
>  exit of GC sensitive routines. I think I might have mentioned this.

What defines a GC-sensitive routine? Anything that does string manip, pmc
manip, or any allocations, is marked GC-sensitive?

> I'd like something like this rather than hoping all developers can
> systematically set bits or handle references correctly because in
> reality we'd probably never catch all the cases.

Two things:

First, we now have a GC_DEBUG define that we can turn on to find all
places the GC could cause problems. In the current state, I think it
covers 90% of the problems (one problem is that if I conditionally
call string_make, this function isn't guaranteed of triggering GC in
a test case, like it should.)

However, if we can't find all the places we do buffer manipulation to mark
them immortal, how are we going to properly identify all the GC-sensitive
functions?

Secondly, setting a flag should be much quicker, speed-wise. We'd only be
setting flags on buffer headers that are already in the CPU cache, as
opposed to writing to memory in this stack, pushing and popping all the
time. And if we macro-ize the setting of the flags, I don't think it
should look nearly as bad. GC_MARK(Buffer), GC_UNMARK(Buffer), etc.


I know there's been little activity in the past week...as far as my
activity, I'm waiting for the Dan to come back tomorrow, and tell us
minions what the plan is for GC stuff. Peter and I fixed most of the GC
bugs that are easily fixed, but the rest require a more architectural fix,
something I think we all are deferring to Dan on.

Mike Lambert

Re: [PATCH] Parrot_(re)allocate_buffer

2002-04-04 Thread Michel J Lambert


> The concern is that both functions are still using the same arenas, which
> means the original problems still wouldn't be solved, and we'd still have to
> do all the workarounds around them.  ;-)

Yeah, but I can't go ahead changing this stuff, especially if all the code
isn't converted over to using it yet. One working step at a time, says
Refactoring Man. :)

> IIUC, the mem*alloc routines are for internal use only - it's how Parrot
> allocate huge chunks of memory from the system that it can then parcel out
> as it sees fit.  Nothing on top or outside of Parrot should be using them.
> The Parrot*alloc functions provide memory allocation by the Parrot VM for
> the programs running on Parrot - IOW, *all* requested VM memory must come
> out of the arenas, which means it all needs to be tracked by Parrot for GC.

What about register and regular stacks? They allocate bunches of stuff to
push data onto, using mem_sys_allocate, and mem_sys_free. Is it okay to
use the mem* functions when we *know* how to manage memory ourselves? I
think it'd be a pain to force copying all this buffered data when they
know how to manage it themselves. The GC, I thought, was more meant for
PMCs and Strings, where Perl5 had used refcounts. Why use the overhead of
a GC when we aren't going to require it's tracing and copy-collecting
abilities?

> I wish the allocation headers were a little more invisible to things outside
> the memory manager, like with XStrings.  But then again, all these levels of
> indirection confuses me to no end as to what's inside and what's not.

Yeah. Is there some reason *not* to make resources.h included only
inlcuded in resources.c and interpreter.c? That would provide this hiding
you are seeking, would it not? Of course, if we want stuff to use
mem_sys_* functions, per the above paragraph, this isn't really an option.

Mike Lambert

Re: [PATCH] Parrot_(re)allocate_buffer

2002-04-03 Thread Michel J Lambert



> Why add new functions instead of patching the current ones?

I didn't know if the original functions still had a purpose. Perhaps you
would want to Parrot_allocate for something non-bufferish? Thinking about
it, that seems evilly wrong, so I guess that is not a valid reason. Doing
this would make mem_realloc and Parrot_reallocate_buffer a little bit more
difficult, since they'd have to allocate a temporary buffer on the stack
to call the function, or stash away the values of the incoming buffer
before it gets overwritten, so it can later memcpy.

Should I resubmit a patch removing the _buffer suffix, and removing
mem_realloc?

Thanks,
Mike Lambert

[PATCH] Parrot_(re)allocate_buffer

2002-04-03 Thread Michel J Lambert


> My first thoughts on this are that we should go the whole way, and have
> Parrot_allocate take a Buffer* and a requested size, and let it fill in the
> bufstart and buflen parameters (as in the not-yet-implemented
> Parrot_reallocate patch).

Heh, I thought the exact same thing when I first saw your realloc patch.
Below is a cvs-relative patch (meaning it 'includes' Peter's patch
as well, since it is unapplied), that builds off of his Parrot_reallocate
to rename it to be Parrot_reallocate_buffer and adds a new function,
Parrot_allocate_buffer.  I did this over the weekend, but this looks like
an opportune time to submit it. :) This patch also cleans up parrot code
to use the defined memory interface in memory.c and resources.c
(mem_sys_realloc versus realloc, etc).

There's one small bug I noticed and fixed in the GC manager, that I came
upon while fixing up array.pmc and perlarray.pmc:
-for (i = 0; i < trace_buf->buflen; i++) {
+for (i = 0; i < trace_buf->buflen / sizeof(PMC *); i++) {

Finally, array.pmc and perlarray.pmc still remain to be cleaned up. I
originally did them, but then the keyed patch got submitted, and I was
unable to make a clean merge there. I'll do that later. For now, this
should be cleaner than what is in there now.

This patch cannot be applied cleanly on TOP of Peter's patch, due to my
renaming of the function he implemented. It can be used instead of his
however, as it just builds additional cleanup functionality on what he
submitted. If someone commits his (I believe it was already approved)
before this one gets approval, I will resubmit the patch as a diff
relative to his.

Mike Lambert



Index: embed.c

===

RCS file: /cvs/public/parrot/embed.c,v

retrieving revision 1.20

diff -u -r1.20 embed.c

--- embed.c 2 Apr 2002 20:35:52 -   1.20

+++ embed.c 3 Apr 2002 10:10:04 -

@@ -68,7 +68,7 @@

 

 program_size = 0;

 

-program_code = (char *)malloc((unsigned)program_size + 1024);

+program_code = (char *)mem_sys_allocate((unsigned)program_size + 1024);

 if (NULL == program_code) {

 fprintf(stderr,

 "Parrot VM: Could not allocate buffer to read packfile from 
stdin.\n");

@@ -79,7 +79,7 @@

 while ((read_result = read(0, cursor, 1024)) > 0) {

 program_size += read_result;

 program_code =

-realloc(program_code, (unsigned)program_size + 1024);

+mem_sys_realloc(program_code, (unsigned)program_size + 1024);

 

 if (NULL == program_code) {

 fprintf(stderr,

Index: interpreter.c

===

RCS file: /cvs/public/parrot/interpreter.c,v

retrieving revision 1.82

diff -u -r1.82 interpreter.c

--- interpreter.c   2 Apr 2002 06:24:14 -   1.82

+++ interpreter.c   3 Apr 2002 10:10:05 -

@@ -442,7 +442,7 @@

 if (!interpreter->prederef_code) {

 size_t N = interpreter->code->byte_code_size;

 size_t i;

-void **temp = (void **)malloc(N * sizeof(void *));

+void **temp = (void **)mem_sys_allocate(N * sizeof(void *));

 

 for (i = 0; i < N; i++) {

 temp[i] = (void *)(ptrcast_t)prederef;

Index: io.ops

===

RCS file: /cvs/public/parrot/io.ops,v

retrieving revision 1.6

diff -u -r1.6 io.ops

--- io.ops  26 Mar 2002 04:39:33 -  1.6

+++ io.ops  3 Apr 2002 10:10:05 -

@@ -51,16 +51,20 @@

 

 inline op open(out PMC, in STR, in STR) {

   ParrotIO * io;

-  char * path, * mode;

-  /* We probably need to make this into a String API */

-  path = Parrot_allocate(interpreter, ($2)->bufused + 1);

-  memcpy(path, ($2)->bufstart, ($2)->bufused);

-  *(path + ($2)->bufused) = 0;

-  mode = Parrot_allocate(interpreter, ($3)->bufused + 1);

-  memcpy(mode, ($3)->bufstart, ($3)->bufused);

-  *(mode + ($3)->bufused) = 0;

+  STRING *path, *mode, *nullstr;

 

-  io = PIO_open(interpreter, path, mode);

+  /* Not sure if we actually want usascii. It depends upon how 

+   * new_io_pmc interfaces with the native system. */

+  const CHARTYPE *usascii = chartype_lookup("usascii");

+  char nullchar = '\0';

+

+  nullstr = string_make(interpreter, &nullchar, 1, NULL, 0, usascii);

+  path = string_transcode(interpreter, ($2), NULL, usascii, NULL);

+  mode = string_transcode(interpreter, ($3), NULL, usascii, NULL);

+  path = string_concat(interpreter, path, nullstr, 0);

+  mode = string_concat(interpreter, mode, nullstr, 0);

+

+  io = PIO_open(interpreter, (char*)path->bufstart, (char*)mode->bufstart);

   if(!io) {

 /* FIXME: Handle error */

   }

Index: jit.c

Re: COW strings

2002-04-02 Thread Michel J Lambert


> I made two assumptions for my test implementation of COW strings:
Wow, you already have a COW implementation? Great to hear!

> 1) we need to be able to share substrings as well as complete strings
> 2) COW must survive garbage collection
> Without these two, I believe the overhead probably outweighs the
> benefits

COW can in certain cases, *not* survive garbage collection, specifically
if there used to be two pointers to the buffer (and it was COW), but now
there is only one (and should not be COW). The implementation you describe
below seemed to handle this case, but your 'assumptions' above seem to
say that it does not handle this? What did I miss?


> 1) The string header has to reference both the start of the buffer and the
> start of the string i.e. we need to add either a second pointer or an
> offset. This impacts on all code using bufstart as the start of the string.


Yeah, I but I believe this is fine. The impact on string size is just 4
bytes. We already have bufused (buflen), so we need abufbegin (bufstart).
I wonder what the Perl5 implementation overhead on strings is?

> 2) Some sort of flag or counter is required during garbage collection to
> ensure that a given buffer is only copied once. Since the current version of
> Parrot_allocate always makes the buffer larger than the requested size, I
> just used the byte after buflen. To make sure I always have space for the
> relocation address, I changed string_make to round the buffer size up to one
> less than the next multiple of 16; _allocate will then go one bigger, which
> will be the flag byte.

Ahh, nice. In my few minutes of thinking about it, I got stuck wondering
how to store information that belongs to the buffer data (the refcount,
for example).

This kind of hidden data needs to be documented. Perhaps we should define
a buffer-header struct (oh, wait, that name's taken) that we can put at
the head or tail of a buffer for this kind of data.

> The flag byte needs three values:
> a) Uncopied buffer (set when buffer is allocated)
> b) Copied once, not referenced again
> c) Copied once and referenced by at least one other header
>
> I was originally thinking of a separate pass to find buffers in state (b)
> and reset their (header's) COW flag; an alternative would be to store the
> address of the first header in the old buffer (need to ensure that a
> minimum-size buffer can hold two pointers as well as the flag byte!) and
> clear its COW flag regardless; if another reference to the same buffer is
> found, then set the COW flag in the first header. This removes the second
> pass at a slightly increased cost for the normal GC pass.

I like the second idea, but I'm a bit confused on how you're implementing
it. The one I see, but I'm not sure is the one you described, requires a
loop at the beginning of do_collect to set the buffer's seen_already to 0,
and the buffer_header's COW to 0. Then, as we go through the real copying
of the buffers:
-If seen_already is zero, we set the pointer in the buffer to point to our
buffer header.
-If seen_already is one, we follow the stored pointer back to the orig
buffer, set COW to one on it, and set COW to one on our own buffer.

This implementation requires a single bit for the buffer flags, and one
extra pointer (as opposed to the two you mention). It unfortunately, also
has an extra loop overhead in do_collect.

Can you clarify how yours works a bit more, please? Does yours do anything
in do_dod_run, or is it all in do_collect ("GC pass" was unspecific, at
least to me). I don't see where the two two pointers come from, or why you
need three values in your flag byte, and thus I have no idea how yours
works. :)

> Before Dan put his memory allocation / garbage collection system in place,
> COW gave a significant performance increase on string-intensive processing,
> due to the overheads of using the system memory allocator and to the
> ever-increasing memory utilisation. In my latest tests, COW seemed to make
> very little difference; however, I have not yet implemented the clearing of
> the COW flag as discussed above, so "once a cow, always a cow", which will
> have some negative impact.

Well, did you make the necessary changes to the various strings?
Currently, all the string operations are immutable operations, and so COW
won't provide much benefit, unless you fix substring to return the correct
'pointer to the middle of a string, but with COW' string_headers, and so
on. And likewise, make the mutable funcs (of which there is chopn,
currently), to handle COW properly.

Did you handle all that in your test code? Or is it currently benchmarking
functionality which doesn't get executed? :)

Mike Lambert

Re: PMCs requiring a 'set' dest register?

2002-04-01 Thread Michel J Lambert


> #$bar = \$foo;
> new P1, PerlRef
> stash_load P0, "foo"   #Not needed if P0 unchanged
> set P1, P0 #Make P1 reference P0 (not 'foo')
> stash_store P1, "bar"

My original example was like that, but I wasn't sure if that was a proper
use of 'set', since 'set' normally copies one into the other. Of course,
we don't have a 'set PMC PMC' yet, so I guess the behavior is undefined.
:)

> This code seems to be related to differences between documentation and
> implementation.

Doh, my bad! I've long since learned not to trust documentation, and
always use code as the authoritative source. :|

> Note that string_repeat and string_substr are documented in the same place
> as only creating a new string if the final parameter is NULL; the
> implementations again differ.

Has anyone decided whether strings are going to be mutable,
immutable, or COW? They are currently immutable for the most part, with
the exception of chopn, and maybe a few other functions. If we do support
COW, then we're probably going to need two versions of every immutable
string function, one to return a COW, and one to return a copy. This
would allow us to explicitly copy a substring of a large string when
we want to avoid having to COW-the large string, if we know we're going to
be making modifications to the COW-ed string in a little bit.

Implementing COW is a bit harder, although now that we have a DOD pass, a
lot easier. We can update counts in there...it's just not very easy to see
how we're going to keep track of refcounts.

> I agree that the whole issue of garbage collection relating to newborns and
> temporaries is problematic. One option might be to always flag newly-created
> objects in some way, then have a single function call to release them; at
> least this means just one function call at the end of each procedure. This
> still leaves a problem with nested procedures.

Yeah, all buffer_headers are returned with BUFFER_live_FLAG so that any
allocations won't cause them to be lost in the do_collect. But there's no
problem with accidentally leaving live on, as opposed to leaving an
immortal/immune flag on.

There's always macro tricks and/or post-processing of code files, but the
latter one is the only one that can handle this case properly, and I'm
not sure we want to go there. Perhaps we should delegate to Dan on how to
solve the GC-safifying ugliness problem? (distinct from the problem that
spawned this thread :)

Mike Lambert

Re: PMCs requiring a 'set' dest register?

2002-04-01 Thread Michel J Lambert


> Remember that the higher level (eg perl6) will expect to be able to find the
> PMC afterwards. For example:
>
> $foo = "abc";
> $bar = \$foo;
> print $$bar;
> $foo = 3;
> print $$bar;

This depends upon how variables and the registers interact. If, like a
real CPU, registers must be stored back into traditional memory as
variables, then there is no problem. Code looks something like:

#$foo = "abc";
new P0, PerlString
set P0, "abc"
stash_store P0, "foo";
#^ or however we store into the current stash, or global stash

#$bar = \$foo;
new P1, PerlRef
set P1, "foo"
#^ or whatever namespace we are in. it could also be a pointer
#to some particular point in the stash, or whatnot.

#print $$bar
deref P2, P1 #put the PMC pointed to by P1, into P2
print P2

#$foo = 3;
new P0, PerlInt
#^ not strictly necessary, but it makes a new PMC,
#which is necessary for this example
set P0, 3
stash_store P0, "foo";

#print $$bar
deref P2, P1
print P2

If instead, registers are aliased onto traditional memory variables, such
that a PMC pointed to by a register is *also* pointed to by a stash
somewhere, then it's a bit harder.

One approach that's an extension of my previous idea (where we obliterate
the destination PMC), is to change the PMC* of registers to PMC**. This
allows us to support the example you mentioned, and also allows us to
blindly obliterate whatever value is in the destination with our newly
constructed PMC, assued the old one will be properly GC'ed. This is
currently the approach I am leaning towards, if registers are aliased
onto variables. (Otherwise I prefer the approach mentioned above, which is
more inline with traditional register architectures.).

I thought about leaving PMC*'s and doing a memcpy over it, but that's a
'bad thing' in terms of support PMC-header custom DOD routines.

Another example of the ugliness in terms of supporting the GC for the
PMC's is:

STRING* s = string_copy(INTERP, (STRING*)SELF->cache.struct_val);
dest->cache.struct_val =
string_concat(INTERP,
  s,
  value->vtable->get_string(INTERP, value),
  0
 );
/* don't destroy s, as it is dest->cache.struct_val */


What's the point of making a copy of that string? I don't really know,
and I don't see the point. The comment at the end doesn't help any, either
.. I'm hoping it's just code written under false assumptions.

Right now, it's not GC safe since 's' is not traceable, and if a dod run
occurs, s will be GC'ed. Same thing for the 'get_string' result, which is
also not traceable from the root set. To make this GC-safe, we need to:

a) store them all in temp variables
b) mark them as immortal/immune
c) operate on everything, get the result, and store it
d) unmark them

If we had C++, we could do all sorts of nice tricks with objects and
deterministic destruction at code block end. But we don't. This leads to
verbose code that's not easy or intuitive to write, with every PMC method
needing to do the above abcd logic to operate on variables.

Thoughts or comments?
Mike Lambert

PS: If you want, I can submit a patch that fixes the GC bugs in the
current system, leaving the semantics the same. It's just going to be
extremely ugly, verbose, and not nice to look at. (And not fun for me to
write, either.)

PMCs requiring a 'set' dest register?

2002-03-31 Thread Michel J Lambert


I'm stuck at a fork in terms of how to fix this particular GC problem.

First, I changed all the direct vtable changes to use morph(). Morph does
a: destroy(), vtable change, and an init(). This is required because some
PMCs require GC-related initiailization, such as PerlStrings, which need
to stick a String in the data pointer, and set the buffer_pointer flag.

Now take 'concat P0, P0, P0', where P0 is a PerlString. It needs to be
implemented as:
dest->vtable->morph(interpreter,dest,enum_class_PerlString);
dest->data =
string_concat(interpreter,
  pmc->data,
  value->vtable->get_string(interpreter, value),
  0
 );

The problem with this, is that by the time it goes to concat, the string
has been lost due to morph'ings init(). The hack fix to solve this is to
store the result of the concat in a temp variable, morph dest, and then
set its data field. But this could cause problems where the temp var gets
GC'ed during the morphing init(). Immortal/immune could be of help here,
but I don't think this is heading in the right direction, since it's just
making the programmer's life too confusing.

A second approach is to throw out this weird transmogrifying class stuff,
and just construct a new PMC of the appropriate type to put into the
destination register. Why would we *ever* care what's in the destination
register, since it never gets it's vtable methods called. We just go in
and blow it's data away anyway. With the GC system we have now, we can
leave PMC headers sitting on the wayside, and it will get collected. So if
we construct a new PMC of the appropriate type to store into the
destination register, it becomes GC-safe, easier to program, and we don't
need to 'set' a new PMC into the dest register before we operate on it.

I'm not sure what the original reasoning was for the way things are now,
but I didn't want to go through and change the way the things worked
without checking to be sure it's the right thing to do. Anyone have any
objections to changing around the perl* classes to work this way?

Mike Lambert

GC Torture Torture Testing

2002-03-31 Thread Michel J Lambert


I'm sorry, this new weapon is going to give a quick advantage to the fools
side, but luckily, should help the fool-proofers in the long run.

Below patch should be safe to apply, as it does nothing. Turn on GC_DEBUG
if you want to see the hell that ensues. I'm working on a patch to fix the
issues that this causes, but am posting this now so that it can be applied
separately (or should it go as one big lump?)

On a related note, is it possible to make a 'make tinder' and 'make
tindertest' ? These would have things like GC_DEBUG turned on, which makes
Parrot run much slower, but is also more likely to expose problems in
Parrot. Is this a good idea, or would the possibility for divergance
between regular parrot and tinderbox parrot be too great? I'm not much of
a makefile hacker, so it'd be nice if someone else wanted to do this.

Mike Lambert

Index: resources.c
===
RCS file: /cvs/public/parrot/resources.c,v
retrieving revision 1.37
diff -u -r1.37 resources.c
--- resources.c 30 Mar 2002 05:57:32 -  1.37
+++ resources.c 31 Mar 2002 12:32:09 -
@@ -35,6 +35,9 @@
 pool->pool_buffer.buflen = (UINTVAL)(pool->pool_buffer.buflen * 1.2);

   }
+#ifdef GC_DEBUG
+Parrot_go_collect(interpreter);
+#endif

   /* Okay, so there's space. Add the header on */
   ((PMC *)to_add)->flags = PMC_on_free_list_FLAG;
@@ -119,6 +122,11 @@
   if (!interpreter->arena_base->pmc_pool->entries_in_pool) {
 alloc_more_pmc_headers(interpreter);
   }
+#ifdef GC_DEBUG
+  else {
+Parrot_do_dod_run(interpreter);
+  }
+#endif

 {
 /* A stupid temp variable. Our pointer into the pool */
@@ -221,6 +229,11 @@
   if (!interpreter->arena_base->buffer_header_pool->entries_in_pool) {
 alloc_more_buffer_headers(interpreter);
   }
+#ifdef GC_DEBUG
+  else {
+Parrot_do_dod_run(interpreter);
+  }
+#endif

   /* Okay, we do this the long, drawn-out, hard way. Otherwise I get
  really confused and things crash. This, generally, is a Bad
@@ -280,6 +293,9 @@
 pool->pool_buffer.buflen = (UINTVAL)(pool->pool_buffer.buflen * 1.2);

   }
+#ifdef GC_DEBUG
+  Parrot_go_collect(interpreter);
+#endif

   /* Okay, so there's space. Add the header on */
   ((Buffer *)to_add)->flags = BUFFER_on_free_list_FLAG;
@@ -635,6 +651,11 @@
   if (!interpreter->arena_base->string_header_pool->entries_in_pool) {
 alloc_more_string_headers(interpreter);
   }
+#ifdef GC_DEBUG
+  else {
+Parrot_do_dod_run(interpreter);
+  }
+#endif

   /* Okay, we do this the long, drawn-out, hard way. Otherwise I get
  really confused and things crash. This, generally, is a Bad
@@ -841,6 +862,9 @@
   if (NULL == interpreter) {
 return mem_sys_allocate(size);
   }
+#ifdef GC_DEBUG
+  Parrot_go_collect(interpreter);
+#endif

   /* Make sure we round up to a multiple of 16 */
   size += 16;

Re: Bugfix release?

2002-03-30 Thread Michel J Lambert


> "Dan Sugalski" <[EMAIL PROTECTED]> wrote
>
> > With the recent stack and GC patches, are we pretty much solid now?
> > If so, a 0.0.5 bugfix release may well be in order.
>
> The one outstanding issue that I know of is the mem_realloc problem in
> add_pmc_to_free and add_header_to_free. Since the problem is actually
> endemic to mem_realloc, the best solution seems to be a replacement for
> mem_realloc that takes a Buffer*, so that an intervening GC run still leaves
> it knowing where the source is.
> In line with a previous suggestion, I propose that such a routine be called
> Parrot_reallocate.
> A patch to create such a function and use it within add_pmc_to_free and
> add_header_to_free follows; since all uses of mem_realloc are potentially at
> risk, it might be a good idea to change any such places also.

This patch looks good to me. It fixes the test case that was posted
earlier, and it's much better from a design standpoint than the previous
patch, which was withheld. I'm also working on code that builds off this
patch, so I'd appreciate if it was applied.

Thanks,
Mike Lambert

Re: Warnings Cleanup

2002-03-29 Thread Michel J Lambert


> I've applied portions of this patch.  I omitted the parts which use the
> "byte" type, which isn't going to work on all platforms.

I've changed these to use 'char'. Hopefully that will be more portable.

Mike Lambert

Index: misc.c
===
RCS file: /cvs/public/parrot/misc.c,v
retrieving revision 1.19
diff -u -r1.19 misc.c
--- misc.c  17 Mar 2002 06:44:41 -  1.19
+++ misc.c  30 Mar 2002 01:15:30 -
@@ -89,38 +89,40 @@
 void int_to_str(char *, char *, HUGEINTVAL, INTVAL );
 */

-void gen_sprintf_call(char *, char *, SpfInfo, int);
+void gen_sprintf_call(char *, char *, SpfInfo, char);

 static void
-uint_to_str(char *buf1, char *buf2, UHUGEINTVAL num, INTVAL base)
+uint_to_str(char *buf1, char *buf2, UHUGEINTVAL num, char base)
 {
-int i = 0, cur;
+int i = 0, cur2;
+   char cur;

 do {
-cur = num % base;
+cur = (char)(num % base);

 if (cur < 10) {
-buf2[i] = '0' + cur;
+buf2[i] = (char)('0' + cur);
 }
 else {
-buf2[i] = 'a' + cur;
+buf2[i] = (char)('a' + cur);
 }

 i++;
 } while (num /= base);

-cur = i;
+cur2 = i;

-for (i = 0; i <= cur; i++) {
-buf1[i] = buf2[cur - i];
+for (i = 0; i <= cur2; i++) {
+buf1[i] = buf2[cur2 - i];
 }
 }

 static void
-int_to_str(char *buf1, char *buf2, HUGEINTVAL num, INTVAL base)
+int_to_str(char *buf1, char *buf2, HUGEINTVAL num, char base)
 {
 BOOLVAL neg;
-int i = 0, cur;
+int i = 0, cur2;
+   char cur;

 if (num < 0) {
 neg = 1;
@@ -131,13 +133,13 @@
 }

 do {
-cur = num % base;
+cur = (char)(num % base);

 if (cur < 10) {
-buf2[i] = '0' + cur;
+buf2[i] = (char)('0' + cur);
 }
 else {
-buf2[i] = 'a' + cur;
+buf2[i] = (char)('a' + cur);
 }

 i++;
@@ -147,10 +149,10 @@
 buf2[i++] = '-';
 }

-cur = i;
+cur2 = i;

-for (i = 0; i < cur; i++) {
-buf1[i] = buf2[cur - i - 1];
+for (i = 0; i < cur2; i++) {
+buf1[i] = buf2[cur2 - i - 1];
 }

 buf1[i] = 0;
@@ -186,7 +188,7 @@
 }

 void
-gen_sprintf_call(char *buf, char *buf2, SpfInfo info, int thingy)
+gen_sprintf_call(char *buf, char *buf2, SpfInfo info, char thingy)
 {
 int i = 0;
 buf[i++] = '%';
@@ -251,7 +253,7 @@

 for (i++; i < (INTVAL)string_length(pat)
  && info.phase != PHASE_DONE; i++) {
-char ch = string_ord(pat, i);
+INTVAL ch = string_ord(pat, i);

 switch (info.phase) {
 /*@fallthrough@ */ case PHASE_FLAGS:
@@ -411,7 +413,7 @@

 case 'f':
 dbl = va_arg(*args, double);
-gen_sprintf_call(t1, t2, &info, (char)'f');
+gen_sprintf_call(t1, t2, &info, 'f');
 sprintf(t2, t1, dbl);
 targ = string_concat(interpreter, targ,
  cstr2pstr(t2), 0);
Index: string.c
===
RCS file: /cvs/public/parrot/string.c,v
retrieving revision 1.64
diff -u -r1.64 string.c
--- string.c24 Mar 2002 06:57:28 -  1.64
+++ string.c30 Mar 2002 01:15:32 -
@@ -434,8 +434,6 @@
  * end of our piece */
 UINTVAL true_offset;
 UINTVAL true_length;
-UINTVAL new_length;
-UINTVAL new_size;
 INTVAL diff;

 true_offset = (UINTVAL)offset;
@@ -791,7 +789,7 @@
  * would approach 128 characters in the buffer.
  */
 do {
-*--ptr = '0' + i % 10;
+*--ptr = (char)('0' + i % 10);
 }
 while(i /= 10);

Re: [PATCH] stacks.c

2002-03-29 Thread Michel J Lambert


> It won't go in cleanly any more:

How about the below patch?

Mike Lambert


Index: stacks.c
===
RCS file: /cvs/public/parrot/stacks.c,v
retrieving revision 1.26
diff -u -r1.26 stacks.c
--- stacks.c29 Mar 2002 20:14:42 -  1.26
+++ stacks.c30 Mar 2002 01:15:31 -
@@ -149,7 +149,21 @@
void *thing, Stack_entry_type type, Stack_cleanup_method
cleanup)
 {
 Stack_chunk *chunk = stack_base->prev;
-Stack_entry *entry = &chunk->entry[chunk->used];
+Stack_entry *entry;
+
+/* Do we need a new chunk? */
+if (chunk->used == STACK_CHUNK_DEPTH) {
+/* Need to add a new chunk */
+Stack_chunk *new_chunk = mem_allocate_aligned(sizeof(Stack_chunk));
+new_chunk->used = 0;
+new_chunk->next = stack_base;
+new_chunk->prev = chunk;
+chunk->next = new_chunk;
+stack_base->prev = new_chunk;
+chunk = new_chunk;
+}
+
+entry = &chunk->entry[chunk->used];

 /* Remember the type */
 entry->entry_type = type;
@@ -184,17 +198,8 @@
"Invalid stack_entry_type!\n");
 break;
 }
-
-/* Register the new entry */
-if (++chunk->used == STACK_CHUNK_DEPTH) {
-/* Need to add a new chunk */
-Stack_chunk *new_chunk = mem_allocate_aligned(sizeof(Stack_chunk));
-new_chunk->used  = 0;
-new_chunk->next  = stack_base;
-new_chunk->prev  = chunk;
-chunk->next  = new_chunk;
-stack_base->prev = new_chunk;
-}
+
+chunk->used++;
 }

 /* Pop off an entry and return a pointer to the contents */
@@ -222,7 +227,10 @@
 internal_exception(ERROR_STACK_EMPTY, "No entries on stack!\n");
 }

-entry = &chunk->entry[chunk->used - 1];
+/* Now decrement the SP */
+chunk->used--;
+
+entry = &chunk->entry[chunk->used];

 /* Types of 0 mean we don't care */
 if (type && entry->entry_type != type) {
@@ -234,9 +242,6 @@
 if (entry->flags & STACK_ENTRY_CLEANUP_FLAG) {
 (*entry->cleanup)(entry);
 }
-
-/* Now decrement the SP */
-chunk->used--;

 /* Sometimes the caller doesn't care what the value was */
 if (where == NULL) {

Re: [PATCH] stacks.c

2002-03-29 Thread Michel J Lambert


The following was applied by Dan, but from what I can tell, seems to have
become unapplied since.

Mike Lambert

Bryan C. Warnock wrote:

> Date: Fri, 22 Mar 2002 01:47:02 -0500
> From: Bryan C. Warnock <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Subject: [PATCH] stacks.c
>
> Defer allocation as long as possible.  Make logic parallel.
>
> Index: stacks.c
> ===
> RCS file: /home/perlcvs/parrot/stacks.c,v
> retrieving revision 1.23
> diff -u -r1.23 stacks.c
> --- stacks.c8 Mar 2002 03:04:03 -   1.23
> +++ stacks.c22 Mar 2002 06:45:39 -
> @@ -108,7 +108,21 @@
> void *thing, INTVAL type, stack_cleanup_method_t cleanup)
>  {
>  Stack_Chunk chunk = stack->prev;
> -Stack_Entry entry = &chunk->entry[chunk->used];
> +Stack_Entry entry;
> +
> +/* Do we need a new chunk? */
> +if (chunk->used == STACK_CHUNK_DEPTH) {
> +/* Need to add a new chunk */
> +Stack_Chunk new_chunk = mem_allocate_aligned(sizeof(*new_chunk));
> +new_chunk->used = 0;
> +new_chunk->next = stack;
> +new_chunk->prev = chunk;
> +chunk->next = new_chunk;
> +stack->prev = new_chunk;
> +chunk = new_chunk;
> +}
> +
> +entry = &chunk->entry[chunk->used];
>
>  /* Remember the type */
>  entry->entry_type = type;
> @@ -139,16 +153,7 @@
>  break;
>  }
>
> -/* Register the new entry */
> -if (++chunk->used == STACK_CHUNK_DEPTH) {
> -/* Need to add a new chunk */
> -Stack_Chunk new_chunk = mem_allocate_aligned(sizeof(*new_chunk));
> -new_chunk->used = 0;
> -new_chunk->next = stack;
> -new_chunk->prev = chunk;
> -chunk->next = new_chunk;
> -stack->prev = new_chunk;
> -}
> +chunk->used++;
>  }
>
>  /* Pop off an entry and return a pointer to the contents */
> @@ -176,7 +181,10 @@
>  internal_exception(ERROR_STACK_EMPTY, "No entries on stack!\n");
>  }
>
> -entry = &chunk->entry[chunk->used - 1];
> +/* Now decrement the SP */
> +chunk->used--;
> +
> +entry = &chunk->entry[chunk->used];
>
>  /* Types of 0 mean we don't care */
>  if (type && entry->entry_type != type) {
> @@ -189,8 +197,6 @@
>  (*entry->cleanup) (entry);
>  }
>
> -/* Now decrement the SP */
> -chunk->used--;
>
>  /* Sometimes the caller doesn't care what the value was */
>  if (where == NULL)
> --
> Bryan C. Warnock
> [EMAIL PROTECTED]
>
>

RE: Complete, Mainly-GC Patch

2002-03-29 Thread Michel J Lambert


> So then the above line:
>
> ># +   SELF->data = value->cache.struct_val;
>
> Should be
>
> ># +   SELF->data = value->data;
>
> Correct?

Those look good. Thanks for the catch.

Steve: wrt to the immortal name choice, if you like, I can submit a patch
changing the name to immune...I'm not too attached to the word immortal.
:)

And thanks for applying it.

Mike Lambert

Memory Corruption Bug

2002-03-29 Thread Michel J Lambert


I've been using clintp's pasm test script for much of my testing, which is
supposed to be an infix expression evaluator. I've been stress-testing
parrot and finding bugs by putting increasingly-large expressions into the
pasm file for it to evaluate.

Attached is a .pasm file which causes some string data to be written into
the middle of the string_pool->pool_buffer list of entries, such that when
it tries to dereference foo in new_pmc_header, it's pointing to garbage
memory. 0x20202020 for me, which is four spaces. Changing the save/restore
of spaces in the pasm file to use periods causes the pointer to be
0x2e2e2e2e.  I tried for a bit on this, but couldn't really track it down
any more than that. Hopefully someone else can figure it out.

To change the expression to make it more stressful (on both your debugging
sessions, and parrot), add to the 'save ' at the very bottom of the
script. Currently, it's along the lines of:

"2" . ( "+(45*2+6)-3*5/(2+9/7-(5*6+3-5*7/2-2)*31+52/67*10-32-56)" x
$somelargenumberofpastes )

Mike Lambert

PS: Clint, I hope you don't mind me giving out the pasm file. I believe it
is in the best interest of your master plan. :)




gc_string_stress.pasm
Description: gc_string_stress.pasm

Re: [PATCH] discarding the unborn

2002-03-29 Thread Michel J Lambert


> Wow, you've got a couple of major patches floating around that are
> getting ignored. I don't feel so bad now. :-)

Yeah, that's why I'm combining all the GC patches into one big patch, in
the hopes it'll make it more appliable.

The only things I've sent out and am awaiting feedback for are the
computed-goto and GC patches, and the email about potential memory leaks.
The warnings patch I sent out was mostly applied, and I need to fix some
of the things in there, but there are issues I brought up in that email
which have not been addressed (like the IO warnings).

I'm just trying to get as much done during my week off from school, since
I dunno how much time I'll have when I get back to the grind. :)

Mike Lambert

Complete, Mainly-GC Patch

2002-03-29 Thread Michel J Lambert


The attached patch fixes a bunch of bugs. They are:

>From before, rolled into this patch:
+ Creates a new flag, immortal, which is intended for GC use only, so it
shouldn't be set in the init() function. This is used to prevent the GC
from dod'ing the object.
+ PerlString now stores the string pointer in data instead of
cache.struct_val .
+ buffer_ptrs work properly now, and check that they point to something
before calling buffer_lives on it.

New fixes in this patch:
+ new_string_header nulls out the bufstart before returning it. This was
causing problems when 'bufstart = parrot_allocate...' was causing a
do_collect to be run, and it was referencing invalid memory. I also made
new_pmc_header null out the data field as well, for consistency, and
removed the nulling-out in pmc_new*.
+ Changed a conditional in stack_entry so that it looks up entry in the
correct place in a borderline case.
+ Fixed the same thing that Peter Gibb's patched fixed wrt the GC linked
list bug, but in what I think is a more efficient way. the next_for_GC
linked list now *requires* that the tail element point to itself. This
hack (I think this is a good hack, not a bad one, if documented)
determines the 'end-of-list'. This allows the same logic to be used in
mark_used to determine if the element is on the linked list. It avoids the
need to add additional conditional branches into mark_used, which is
probably a hotspot of the memory manager (haven't verified this, however).

Please let me know if there are any issues with this that I should
address.

Mike Lambert



Index: parrot/pmc.c
===
RCS file: /cvs/public/parrot/pmc.c,v
retrieving revision 1.11
diff -u -r1.11 pmc.c
--- parrot/pmc.c10 Mar 2002 21:19:46 -  1.11
+++ parrot/pmc.c29 Mar 2002 08:09:23 -
@@ -39,8 +39,8 @@
 return NULL;
 }
 
-pmc->flags = 0;
-pmc->data = 0;
+/* Ensure the PMC survives DOD during this function */
+pmc->flags |= PMC_immortal_FLAG;
 
 pmc->vtable = &(Parrot_base_vtables[base_type]);
 
@@ -53,6 +53,9 @@
 }
 
 pmc->vtable->init(interpreter, pmc, 0);
+
+/* Let the caller track this PMC */
+pmc->flags &= ~PMC_immortal_FLAG;
 return pmc;
 }
 
@@ -67,8 +70,8 @@
 return NULL;
 }
 
-pmc->flags = 0;
-pmc->data = 0;
+/* Ensure the PMC survives DOD during this function */
+pmc->flags |= PMC_immortal_FLAG;
 
 pmc->vtable = &(Parrot_base_vtables[base_type]);
 
@@ -81,6 +84,9 @@
 }
 
 pmc->vtable->init(interpreter, pmc, size);
+
+/* Let the caller track this PMC */
+pmc->flags &= ~PMC_immortal_FLAG;
 return pmc;
 }
 
Index: parrot/resources.c
===
RCS file: /cvs/public/parrot/resources.c,v
retrieving revision 1.35
diff -u -r1.35 resources.c
--- parrot/resources.c  26 Mar 2002 16:33:01 -  1.35
+++ parrot/resources.c  29 Mar 2002 08:09:23 -
@@ -139,6 +139,8 @@
 interpreter->active_PMCs++;
 /* Mark it live */
 return_me->flags = PMC_live_FLAG;
+/* Don't let it point to garbage memory */
+return_me->data = NULL;
 /* Return it */
 return return_me;
   }
@@ -242,6 +244,8 @@
 interpreter->active_Buffers++;
 /* Mark it live */
 return_me->flags = BUFFER_live_FLAG;
+/* Don't let it point to garbage memory */
+return_me->bufstart = NULL;
 /* Return it */
 return return_me;
   }
@@ -348,6 +352,9 @@
 /* Now put it on the end of the list */
 current_end_of_list->next_for_GC = used_pmc;
 
+/* Explicitly make the tail of the linked list be self-referential */
+used_pmc->next_for_GC = used_pmc;
+
 /* return the PMC we were passed as the new end of the list */
 return used_pmc;
 }
@@ -355,20 +362,20 @@
 /* Do a full trace run and mark all the PMCs as active if they are */
 static void
 trace_active_PMCs(struct Parrot_Interp *interpreter) {
-PMC *last, *current; /* Pointers to the last marked PMC and the
-currently being processed PMC. */
+PMC *last, *current, *prev; /* Pointers to the last marked PMC, the
+   currently being processed PMC, and in
+   the previously processed PMC in a loop. */
 unsigned int i, j, chunks_traced;
 Stack_chunk *cur_stack, *start_stack;
 struct PRegChunk *cur_chunk;
+
 /* We have to start somewhere, and the global stash is a good
place */
 last = current = interpreter->perl_stash->stash_hash;
+
 /* mark it as used and get an updated end of list */
 last = mark_used(current, last);
 
-/* Wipe out the next for gc bit, otherwise we'll never get anywhere */
-last->next_for_GC = NULL;
-
 /* Now, go run through the PMC registers and mark them as live */
 /*

Re: [PATCH] discarding the unborn

2002-03-28 Thread Michel J Lambert


FWIW, I've already submitted a patch which fixes this bug. I'm also about
to submit a patch which fixes it in a slightly better way, along with a
few of other (mostly GC) bugs I've tracked down tonight with the help of
clint's wonderfully abusive code.

The original email was:
http:[EMAIL PROTECTED]/msg09217.html

Mike Lambert

Re: GC bugs

2002-03-28 Thread Michel J Lambert

Hm...so I'm guessing my patch is the 'quick hacks' you tried and found
worked?

I agree my solution is a bit hackish. But I'm not sure how else to keep
them. The only other solution I can think of is to make a 'GC linked
list', which we put items on to immediately after construction, and take
them off as soon as we're sure they're referencable by the root set.
(Basically, add them where I set flags, and take them off where I unset
flags.) So flags or linked-list, it's a matter of style and speed, I
think.

Alternately, if we could walk the C stack to find variables, we would be
able to ensure they we caught all referencable 'things' not reachable by
the root set, but this is probably completely unportable.

Ah well, I guess I'll just leave the fate of the patch up to Dan, and see
what he thinks will be a good design decision on this.

Mike Lambert

Peter Gibbs wrote:

> Date: Thu, 28 Mar 2002 11:50:01 +0200
> From: Peter Gibbs <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Subject: GC bugs
>
> I have traced the problems with Josh's deep stack pushp/popp test to two
> problems with the garbage collector:
>
> 1) Strings pointed to by pmc->cache.struct_val are not marked as live
> 2) If a dod run is initiated while a pmc is being created, it will be freed
> as nobody points to it yet
>
> Since both these problems involve design considerations, I am leaving it to
> Dan to fix them.
> Quick hacks to bypass the problems show that the pushp/popp test then works
> fine.
>
> --
> Peter Gibbs
> EmKel Systems
>
>
>

Re: "deep" tests for stacks.t

2002-03-28 Thread Michel J Lambert

Oh, yay. Orange tinderboxen rule, only because I haven't seen that much
orange in quite awhile. :)

Anyways, I looked into the bug. There's actually a few problems...First,
is that perlstrings use the structval to store the buffer, and so it gets
missed by the GC. The patch below fixes perlstring to use the data pointer
like the GC wants. (And the other perl* classes, to use perlstring's data
pointer. What happened to the encapsulation in calling set_number, etc?)

Then I found a small GC bug where it was trying to buffer_lives on a null
pointer. Below patch fixes this as well.

Then there was *still* corruption. I thought this is the 'GC before I've
done anything with it' problem. When I create a new pmc, it calls init()
before the pmc gets put into the reg table. If the vtable init() creates a
new string (PerlString does this), then it's possible a dod run will
occur, and cause the as-yet-unassigned PMC to be incorrectly freed. Below
patch marks it as immortal for the duration of the init() call.

This finally fixed the problem. :)

While I was at it, I also made string_copy set the 'live' flag on the
newly-created string, such that when it calls Parrot_allocate for the
string's contents, a call to Parrot_go_collect will not forget to copy our
new string.

Note, there is one 'big' problem with this patch. It makes it impossible
to make constant pmc's using the init() vtable method, since it unsets the
flag after calling the function. Perhaps we should make a private GC flag,
PMC_private_constant_FLAG, or something, which the PMCs should *never*
use, and is reserved for the GC, in cases like these.

Regardless, this patch does make 'make test' happy again, and should be
safe to apply apply, as long as we don't forget about the afore-mentioned
caveat, which will probably come back to bite us in the future if we don't
take care of it. I wonder how many more GC bugs are lurking, waiting for
good tests cases like these..

Mike Lambert

Josh Wilmes wrote:

> Date: Thu, 28 Mar 2002 02:52:15 -0500
> From: Josh Wilmes <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Subject: Re: "deep" tests for stacks.t
>
>
> Mike pointed out that I was missing "end" opcodes in there, so I added
> them and went ahead and committed this code to CVS.  Currently test #7 is
> failing, but I think it's a legitimate bug- if not, I apologize for
> breaking the tinderbox ;)
>
> --Josh
>
> At 1:15 on 03/28/2002 EST, Josh Wilmes <[EMAIL PROTECTED]> wrote:
>
> >
> > I added some tests which push larger numbers of stack frames- this
> > improves our coverage in register.c.  However, one of the tests is failing
> > for me.  Is this something I did wrong, or did I find a bug?
> >
> > I'm getting weird output for the pushp and popp (deep) test.
> >
> > --Josh
> >
> > Here's the patch:
> >
> > Index: t/op/stacks.t
> > ===
> > RCS file: /cvs/public/parrot/t/op/stacks.t,v
> > retrieving revision 1.12
> > diff -u -r1.12 stacks.t
> > --- t/op/stacks.t   29 Jan 2002 02:32:17 -  1.12
> > +++ t/op/stacks.t   28 Mar 2002 06:12:31 -
> > @@ -1,6 +1,6 @@
> >  #! perl -w
> >
> > -use Parrot::Test tests => 15;
> > +use Parrot::Test tests => 18;
> >  use Test::More;
> >
> >  # Tests for stack operations, currently push*, push_*_c and pop*
> > @@ -87,6 +87,23 @@
> >  3031
> >  OUTPUT
> >
> > +
> > +my ($code, $output);
> > +for (0..1024) {
> > +   $code .= "   set I0, $_\n";
> > +   $code .= "   set I31, " . (1024-$_) . "\n";
> > +   $code .= "   pushi\n";
> > +}
> > +for (0..1024) {
> > +   $code .= "   popi\n";
> > +   $code .= "   print I0\n";
> > +   $code .= "   print I31\n";
> > +   $code .= "   print \"\\n\"\n";
> > +   $output .= (1024-$_) . "$_\n";
> > +}
> > +output_is($code, $output, "pushi & popi (deep)" );
> > +
> > +
> >  output_is(<<"CODE", <<'OUTPUT', 'pushs & pops');
> >  @{[ set_str_regs( sub {$_[0]%2} ) ]}
> > pushs
> > @@ -102,6 +119,23 @@
> >  01010101010101010101010101010101
> >  OUTPUT
> >
> > +
> > +($code, $output) = ();
> > +for (0..1024) {
> > +   $code .= "   set S0, \"$_\"\n";
> > +   $code .= "   set S31, \"" . (1024-$_) . "\"\n";
> > +   $code .= "   pushs\n";
> > +}
> > +for (0..1024) {
> > +   $code .= "   pops\n";
> > +   $code .= "   print S0\n";
> > +   $code .= "   print S31\n";
> > +   $code .= "   print \"\\n\"\n";
> > +   $output .= (1024-$_) . "$_\n";
> > +}
> > +output_is($code, $output, "pushs & pops (deep)" );
> > +
> > +
> >  output_is(<<"CODE", <<'OUTPUT', 'pushn & popn');
> >  @{[ set_num_regs( sub { "1.0".$_ } ) ]}
> > pushn
> > @@ -119,6 +153,7 @@
> >  Seem to have positive Nx after pop
> >  OUTPUT
> >
> > +
> >  output_is(<<"CODE", <<'OUTPUT', 'pushp & popp');
> > new P0, PerlString
> > set P0, "BUTTER IN HELL!\\n"
> > @@ -132,6 +167,25 @@
> >  CODE
> >  THERE'LL BE NO BUTTER IN HELL!
> >  OUTPUT
> > +
> > +
> > +($code, $output) = ();
> > +for (0..1024) {
> > +   $code .= "   new P0, PerlString\n";
> > +   $code

Warnings Cleanup

2002-03-27 Thread Michel J Lambert


Attached patch fixes many of the warnings you see on MSVC level 4.  The
ones listed below, this patch does *not* handle.

core.ops and rx.ops have some warnings about the use of MAKE_KEY, and the
non-use of the variables returned by MAKE_KEY. I believe Steve Fink's
patch fixes these.


io_win32.c has the following problems:

PIO_win32_open's:
int err = GetLastError();
C:\p\parrot8\parrot\io\io_win32.c(157) :
warning C4189: 'err' : local variable is initialized but not referenced

PIO_win32_read's:
return -1;
C:\p\parrot8\parrot\io\io_win32.c(224) : warning C4245: 'return' :
conversion from 'const int ' to 'unsigned int ', signed/unsigned mismatch

PIO_win32_write's:
return -1;
C:\p\parrot8\parrot\io\io_win32.c(237) : warning C4245: 'return' :
conversion from 'const int ' to 'unsigned int ', signed/unsigned mismatch

PIO_win32_read's:
return -1;
C:\p\parrot8\parrot\io\io_win32.c(224) : warning C4702: unreachable code


Finally, the other, but less severe warnings, are:

key.c has an unreferenced function, debug_key.

misc.c has an unreferenced function, Pad_it.

resources.c has an unreferenced function, find_dead_strings. (I thought
Dan said this wasn't needed anymore?)

Finally, we have parrot.c, a completely empty file. Does it have a
purpose, or can it be removed?

Mike Lambert



? parrot/Debug
? parrot/parrot.ilk
? parrot/parrotexe.dsp
? parrot/parrotexe.dsw
? parrot/parrotexe.ncb
? parrot/parrotexe.plg
Index: parrot/core.ops
===
RCS file: /cvs/public/parrot/core.ops,v
retrieving revision 1.113
diff -u -r1.113 core.ops
--- parrot/core.ops 22 Mar 2002 20:24:02 -  1.113
+++ parrot/core.ops 28 Mar 2002 06:30:59 -
@@ -2262,7 +2262,6 @@
 =cut
 
 op entrytype(out INT, in INT) {
-  size_t height;
   Stack_entry *entry;
 
   entry = stack_entry(interpreter, interpreter->user_stack, $2);
Index: parrot/misc.c
===
RCS file: /cvs/public/parrot/misc.c,v
retrieving revision 1.19
diff -u -r1.19 misc.c
--- parrot/misc.c   17 Mar 2002 06:44:41 -  1.19
+++ parrot/misc.c   28 Mar 2002 06:30:59 -
@@ -89,38 +89,40 @@
 void int_to_str(char *, char *, HUGEINTVAL, INTVAL );
 */
 
-void gen_sprintf_call(char *, char *, SpfInfo, int);
+void gen_sprintf_call(char *, char *, SpfInfo, char);
 
 static void
-uint_to_str(char *buf1, char *buf2, UHUGEINTVAL num, INTVAL base)
+uint_to_str(char *buf1, char *buf2, UHUGEINTVAL num, byte base)
 {
-int i = 0, cur;
+int i = 0, cur2;
+   byte cur;
 
 do {
-cur = num % base;
+cur = (byte)(num % base);
 
 if (cur < 10) {
-buf2[i] = '0' + cur;
+buf2[i] = (char)('0' + cur);
 }
 else {
-buf2[i] = 'a' + cur;
+buf2[i] = (char)('a' + cur);
 }
 
 i++;
 } while (num /= base);
 
-cur = i;
+cur2 = i;
 
-for (i = 0; i <= cur; i++) {
-buf1[i] = buf2[cur - i];
+for (i = 0; i <= cur2; i++) {
+buf1[i] = buf2[cur2 - i];
 }
 }
 
 static void
-int_to_str(char *buf1, char *buf2, HUGEINTVAL num, INTVAL base)
+int_to_str(char *buf1, char *buf2, HUGEINTVAL num, char base)
 {
 BOOLVAL neg;
-int i = 0, cur;
+int i = 0, cur2;
+   byte cur;
 
 if (num < 0) {
 neg = 1;
@@ -131,13 +133,13 @@
 }
 
 do {
-cur = num % base;
+cur = (byte)(num % base);
 
 if (cur < 10) {
-buf2[i] = '0' + cur;
+buf2[i] = (char)('0' + cur);
 }
 else {
-buf2[i] = 'a' + cur;
+buf2[i] = (char)('a' + cur);
 }
 
 i++;
@@ -147,10 +149,10 @@
 buf2[i++] = '-';
 }
 
-cur = i;
+cur2 = i;
 
-for (i = 0; i < cur; i++) {
-buf1[i] = buf2[cur - i - 1];
+for (i = 0; i < cur2; i++) {
+buf1[i] = buf2[cur2 - i - 1];
 }
 
 buf1[i] = 0;
@@ -186,7 +188,7 @@
 }
 
 void
-gen_sprintf_call(char *buf, char *buf2, SpfInfo info, int thingy)
+gen_sprintf_call(char *buf, char *buf2, SpfInfo info, char thingy)
 {
 int i = 0;
 buf[i++] = '%';
@@ -251,7 +253,7 @@
 
 for (i++; i < (INTVAL)string_length(pat)
  && info.phase != PHASE_DONE; i++) {
-char ch = string_ord(pat, i);
+INTVAL ch = string_ord(pat, i);
 
 switch (info.phase) {
 /*@fallthrough@ */ case PHASE_FLAGS:
@@ -411,7 +413,7 @@
 
 case 'f':
 dbl = va_arg(*args, double);
-gen_sprintf_call(t1, t2, &info, (char)'f');
+gen_sprintf_call(t1, t2, &info, 'f');
 sprintf(t2, t1, dbl);
 targ = string_c

Re: Computed-goto Patch

2002-03-27 Thread Michel J Lambert


Attached are my revised files. pbc2c.pl uses Parrot::OpTrans::Compiled,
and this patch uses Parrot::OpTrans::CGoto. It also fixed the issues with
the last patch:

- removed inadvertant keyed commenting
- fixed #include name
- fixed pbc2c.pl
- should have unix line endings

Please let me know if there are any other issues which need addressing.

Mike Lambert



? parrot/ops2cgc.pl
? parrot/testcomputedgoto_c.in
? parrot/lib/Parrot/OpTrans/Compiled.pm
Index: parrot/.cvsignore
===
RCS file: /cvs/public/parrot/.cvsignore,v
retrieving revision 1.14
diff -u -r1.14 .cvsignore
--- parrot/.cvsignore   8 Jan 2002 17:24:29 -   1.14
+++ parrot/.cvsignore   28 Mar 2002 04:58:56 -
@@ -1,5 +1,6 @@
 blib
 config.opt
+core_ops_cg.c
 core_ops.c
 core_ops_prederef.c
 Makefile
Index: parrot/Configure.pl
===
RCS file: /cvs/public/parrot/Configure.pl,v
retrieving revision 1.101
diff -u -r1.101 Configure.pl
--- parrot/Configure.pl 19 Mar 2002 22:53:57 -  1.101
+++ parrot/Configure.pl 28 Mar 2002 04:58:57 -
@@ -238,6 +238,17 @@
 cp=> 'cp',
 slash => '/',
 
+cg_h  => '$(INC)/oplib/core_ops_cg.h',
+cg_c  => <<'EOF',
+core_ops_cg$(O): $(GENERAL_H_FILES) core_ops_cg.c
+
+core_ops_cg.c $(INC)/oplib/core_ops_cg.h: $(OPS_FILES) ops2cgc.pl 
+lib/Parrot/OpsFile.pm lib/Parrot/Op.pm
+   $(PERL) ops2cgc.pl CGoto $(OPS_FILES)
+EOF
+cg_o  => 'core_ops_cg$(O)',
+cg_r  => '$(RM_F) $(INC)/oplib/core_ops_cg.h core_ops_cg.c',
+cg_flag   => '-DHAVE_COMPUTED_GOTO',
+
 VERSION   => $parrot_version,
 MAJOR => $parrot_version[0],
 MINOR => $parrot_version[1],
@@ -690,6 +701,32 @@
 unlink("include/parrot/vtable.h");
 }
 
+
+# and now test if we can use computed goto
+print <<"END";
+
+Still everything ok, let's check if we can use computed goto,
+don't worry if you see some errors, it will be all right,
+This could take a bit...
+END
+
+{
+ buildfile("testcomputedgoto_c");
+ my $test = system("$c{cc} $c{ccflags} -o testcomputedgoto$c{exe} 
+testcomputedgoto.c");
+ 
+ if ($test != 0) {
+   $c{"cg_h"}='';
+   $c{"cg_c"}='';
+   $c{"cg_o"}='';
+   $c{"cg_r"}='';
+   $c{"cg_flag"}='';
+ }
+
+ unlink('testcomputedgoto.c', "testcomputedgoto$c{exe}", "testcomputedgoto$c{o}");
+}
+
+# rewrite the Makefile with the updated info
+buildfile("Makefile");
 
 #
 # Rewrite the config file with the updated info
Index: parrot/MANIFEST
===
RCS file: /cvs/public/parrot/MANIFEST,v
retrieving revision 1.133
diff -u -r1.133 MANIFEST
--- parrot/MANIFEST 24 Mar 2002 21:10:49 -  1.133
+++ parrot/MANIFEST 28 Mar 2002 04:58:57 -
@@ -246,6 +246,7 @@
 misc.c
 obscure.ops
 ops2c.pl
+ops2cgc.pl
 ops2pm.pl
 optimizer.pl
 packdump.c
@@ -292,6 +293,7 @@
 test_c.in
 test_gnuc.c
 test_main.c
+testcomputedgoto_c.in
 testparrotfuncptr.c
 testparrotsizes_c.in
 trace.c
Index: parrot/MANIFEST.SKIP
===
RCS file: /cvs/public/parrot/MANIFEST.SKIP,v
retrieving revision 1.2
diff -u -r1.2 MANIFEST.SKIP
--- parrot/MANIFEST.SKIP21 Mar 2002 23:39:49 -  1.2
+++ parrot/MANIFEST.SKIP28 Mar 2002 04:58:57 -
@@ -16,6 +16,7 @@
 ^include/parrot/jit_struct\.h$
 ^include/parrot/oplib/core_ops\.h$
 ^include/parrot/oplib/core_ops_prederef\.h$
+^include/parrot/oplib/core_ops_cg\.h$
 
 ^core_ops\.c$
 ^core_ops_prederef\.c$
Index: parrot/Makefile.in
===
RCS file: /cvs/public/parrot/Makefile.in,v
retrieving revision 1.141
diff -u -r1.141 Makefile.in
--- parrot/Makefile.in  21 Mar 2002 23:47:22 -  1.141
+++ parrot/Makefile.in  28 Mar 2002 04:58:57 -
@@ -65,7 +65,7 @@
 $(INC)/memory.h $(INC)/parrot.h $(INC)/stacks.h $(INC)/packfile.h \
 $(INC)/global_setup.h $(INC)/vtable.h $(INC)/oplib/core_ops.h \
 $(INC)/oplib/core_ops_prederef.h $(INC)/runops_cores.h $(INC)/trace.h \
-$(INC)/pmc.h $(INC)/key.h $(INC)/resources.h $(INC)/platform.h \
+$(INC)/pmc.h $(INC)/key.h $(INC)/resources.h $(INC)/platform.h ${cg_h} \
 $(INC)/interp_guts.h ${jit_h} $(INC)/rx.h $(INC)/rxstacks.h \
 $(INC)/embed.h $(INC)/warnings.h $(INC)/misc.h
 
@@ -87,7 +87,7 @@
 core_ops$(O) core_ops_prederef$(O) memory$(O) packfile$(O) stacks$(O) \
 string$(O) encoding$(O) chartype$(O) runops_cores$(O) trace$(O) pmc$(O) key$(O) \
 platform$(O) ${jit_o} resources$(O) rx$(O) rxstacks$(O) embed$(O) warnings$(O) \
-misc$(O)
+misc$(O) ${cg_o}
 
 O_FILES = $(INTERP_O_FILES) $(IO_O_FILES) $(CLASS_O_FILES) $(ENCODING_O_FILES) 
$(CHARTYPE_O_FILES)
 
@@ -103,7 +103,7 @@
 #
 #

Re: Computed-goto Patch

2002-03-27 Thread Michel J Lambert


> The patch is slightly broken, core_cg_ops.h in interpreter.c versus
> core_ops_cg.h everywhere else.

I tried to do a global change to make it use core_ops_cg.h, following
prederef's example. I must have missed this one. (Still runs, just gives a
warning.)

> It does take a while to build the computed goto ops file. I still have
to test
> it myself.

Yeah. It takes quite a long time to compile core_ops_cg.c. I think because
of this, we should make comp-goto be optional, so as not to force it on
all gcc users. I don't know how to do this, however.

> I can't tell what has actually been changed in pbc2c.pl, due to the large
> amount of indentation fixup included in the patch. Can you say what you changes
> there ?

My methodology in creating this patch was to first apply Daniel's patch,
and then get it working. His patch had significant changes to an older
version of pbc2c.pl, which I tried to apply to the newest version. I then
proceeded to try and integrate them back into the current version of the
code. I guess it simplified enough that the new file was completely
identical to the original, sans whitespace. So I don't believe anything
has changed.

> The existing pbc2c implementation worked without computed goto is this still
> the case ?  The CGoto.pm OpTrans is currently being used by the bytecode to C
> compiler.  The existing CGoto.pm should be cloned with a more appropriate name
> before it gets patched to work with a computed goto core.

Per the previous paragraph, I believe it should still work fine. I'm not
familiar with the existing bytecode to C compiler, and didn't know it used
CGoto.pm. I don't see *why* it does, since C.pm, CPrederef.pm, and
CGoto.pm seem to me like they should be used for the various cores. I'm
not sure how I use this compiler, but I think I've made a decent amount of
changes to CGoto.pm such that it either won't work, or will suddenly have
all it's problems fixed. Can anyone tell me how I'm supposed to use
pbc2c.pl?

Problems with my patch:
+ includes a commenting-out of the keyed function. That probably should
not be in this computed-goto patch.
+ no changes to pbc2c.pl should be required. Those changes should be made
into a seperate 'indenting' patch.
+ misnamed the #include'd file in runops_cores.c

I'll wait on the issue of pbc2c.pl before I submit a revised patch.

Thanks for looking into this,
Mike Lambert

Re: Computed-goto Patch

2002-03-26 Thread Michel J Lambert

Doh, sorry about that.

If it makes you feel any better (or worse), I was in the midst of writing
up the email for fixing the pushp bug you mentioned, when up pops '[PATCH]
Stack fix' in my email inbox, turning my local bug fix into a nice
learning exercise. :)

I think you bring up a good point about helping to maintain who's doing
what as the parrot development group of people grows larger. Isn't this
what bugs6.perl.org was supposed to be for?

Mike Lambert

Melvin Smith wrote:

> Date: Wed, 27 Mar 2002 03:08:18 -0500
> From: Melvin Smith <[EMAIL PROTECTED]>
> To: Michel J Lambert <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
> Subject: Re: Computed-goto Patch
>
> At 02:55 AM 3/27/2002 -0500, Michel J Lambert wrote:
> >Attached is a patch to implement computed-goto on gcc, taken from the
>
> Wow, talk about timing, I was up late working on computed goto core
> just now, and had hacked ops2c.pl and CGoto.pm. I had to remove
> prederef stuff until I could look at it.
>
> I was 90% there, then your patch popped up... ouch :)
>
> I guess I could have asked on the list first.
>
> In any case, Good work!
>
> We DO need a file in the repository that has a status of things that
> are implemented, implemented but reworking, partly implemented
> and on the drawing board, and who might be working on said
> area.
>
> Right now most of us have no clue whos doing what anymore besides Dan
> doing GC, Jeff doing unicode, Simon is on vacation and may or may
> not be doing anything with assembler, keyed aggregate, etc.
>
> -Melvin
>
>

[PATCH] t/pmc/.cvsignore

2002-03-26 Thread Michel J Lambert


The introduction "make quicktest" made this missing .cvsignore more
apparent.

Mike Lambert



*.pasm

*.pbc

*.out

Computed-goto Patch

2002-03-26 Thread Michel J Lambert


Attached is a patch to implement computed-goto on gcc, taken from the
original post by Daniel Grunblatt:
http:[EMAIL PROTECTED]/msg06255.html

Changes since his original patch:
+ works with the current codebase
+ handles all jumps properly, and passes all tests (I'm not sure,
but I don't believe the original did this)
+ updated to reflect the precedent set by prederef in terms of naming
and code conventions

Note, that some of the configure.pl and makefile.in hackery is done so
that this file doesn't get compiled all the time. Not sure if there's a
better way to do this, as I'm not really a configure/makefile expert.

Relevant mops.pasm times on this 1ghz machine:
MSVC normal dispatch: 30.35 Mops/sec
GCC normal dispatch:  14.61 Mops/sec
GCC computed goto:41.74 Mops/sec

Enjoy,
Mike Lambert


? parrot/ops2cgc.pl

? parrot/testcomputedgoto_c.in

Index: parrot/.cvsignore

===

RCS file: /cvs/public/parrot/.cvsignore,v

retrieving revision 1.14

diff -u -r1.14 .cvsignore

--- parrot/.cvsignore   8 Jan 2002 17:24:29 -   1.14

+++ parrot/.cvsignore   27 Mar 2002 07:50:27 -

@@ -1,5 +1,6 @@

 blib

 config.opt

+core_ops_cg.c

 core_ops.c

 core_ops_prederef.c

 Makefile

Index: parrot/Configure.pl

===

RCS file: /cvs/public/parrot/Configure.pl,v

retrieving revision 1.101

diff -u -r1.101 Configure.pl

--- parrot/Configure.pl 19 Mar 2002 22:53:57 -  1.101

+++ parrot/Configure.pl 27 Mar 2002 07:50:27 -

@@ -238,6 +238,17 @@

 cp=> 'cp',

 slash => '/',

 

+cg_h  => '$(INC)/oplib/core_ops_cg.h',

+cg_c  => <<'EOF',

+core_ops_cg$(O): $(GENERAL_H_FILES) core_ops_cg.c

+

+core_ops_cg.c $(INC)/oplib/core_ops_cg.h: $(OPS_FILES) ops2cgc.pl 
+lib/Parrot/OpsFile.pm lib/Parrot/Op.pm

+   $(PERL) ops2cgc.pl CGoto $(OPS_FILES)

+EOF

+cg_o  => 'core_ops_cg$(O)',

+cg_r  => '$(RM_F) $(INC)/oplib/core_ops_cg.h core_ops_cg.c',

+cg_flag   => '-DHAVE_COMPUTED_GOTO',

+

 VERSION   => $parrot_version,

 MAJOR => $parrot_version[0],

 MINOR => $parrot_version[1],

@@ -690,6 +701,32 @@

 unlink("include/parrot/vtable.h");

 }

 

+

+# and now test if we can use computed goto

+print <<"END";

+

+Still everything ok, let's check if we can use computed goto,

+don't worry if you see some errors, it will be all right,

+This could take a bit...

+END

+

+{

+ buildfile("testcomputedgoto_c");

+ my $test = system("$c{cc} $c{ccflags} -o testcomputedgoto$c{exe} 
+testcomputedgoto.c");

+ 

+ if ($test != 0) {

+   $c{"cg_h"}='';

+   $c{"cg_c"}='';

+   $c{"cg_o"}='';

+   $c{"cg_r"}='';

+   $c{"cg_flag"}='';

+ }

+

+ unlink('testcomputedgoto.c', "testcomputedgoto$c{exe}", "testcomputedgoto$c{o}");

+}

+

+# rewrite the Makefile with the updated info

+buildfile("Makefile");

 

 #

 # Rewrite the config file with the updated info

Index: parrot/MANIFEST

===

RCS file: /cvs/public/parrot/MANIFEST,v

retrieving revision 1.133

diff -u -r1.133 MANIFEST

--- parrot/MANIFEST 24 Mar 2002 21:10:49 -  1.133

+++ parrot/MANIFEST 27 Mar 2002 07:50:27 -

@@ -246,6 +246,7 @@

 misc.c

 obscure.ops

 ops2c.pl

+ops2cgc.pl

 ops2pm.pl

 optimizer.pl

 packdump.c

@@ -292,6 +293,7 @@

 test_c.in

 test_gnuc.c

 test_main.c

+testcomputedgoto_c.in

 testparrotfuncptr.c

 testparrotsizes_c.in

 trace.c

Index: parrot/MANIFEST.SKIP

===

RCS file: /cvs/public/parrot/MANIFEST.SKIP,v

retrieving revision 1.2

diff -u -r1.2 MANIFEST.SKIP

--- parrot/MANIFEST.SKIP21 Mar 2002 23:39:49 -  1.2

+++ parrot/MANIFEST.SKIP27 Mar 2002 07:50:27 -

@@ -16,6 +16,7 @@

 ^include/parrot/jit_struct\.h$

 ^include/parrot/oplib/core_ops\.h$

 ^include/parrot/oplib/core_ops_prederef\.h$

+^include/parrot/oplib/core_ops_cg\.h$

 

 ^core_ops\.c$

 ^core_ops_prederef\.c$

Index: parrot/Makefile.in

===

RCS file: /cvs/public/parrot/Makefile.in,v

retrieving revision 1.141

diff -u -r1.141 Makefile.in

--- parrot/Makefile.in  21 Mar 2002 23:47:22 -  1.141

+++ parrot/Makefile.in  27 Mar 2002 07:50:27 -

@@ -65,7 +65,7 @@

 $(INC)/memory.h $(INC)/parrot.h $(INC)/stacks.h $(INC)/packfile.h \

 $(INC)/global_setup.h $(INC)/vtable.h $(INC)/oplib/core_ops.h \

 $(INC)/oplib/core_ops_prederef.h $(INC)/runops_cores.h $(INC)/trace.h \

-$(INC)/pmc.h $(INC)/key.h $(INC)/resources.h $(INC)/platform.h \

+$(INC)/pmc.h $(INC)/key.h $(INC)/resources.h $(INC)/platform.h ${cg_h} \

 $(INC)/interp_guts.h ${jit_h} $(INC)/rx.h $(INC)/rxs

Potential Memory Leaks

2002-03-26 Thread Michel J Lambert


Hey,

After going through and hopefully learning the GC system yesterday, today
I went through looking for problems in the code that uses it (or doesn't
use it, as the case may be). Below are what I believe to be potential
problems in Parrot's memory use. I may very well be mistaken on many of
these, however.

Am I correct in assuming that the stacks stuff leaks memory? Both stacks.c
and rxstacks.c allocate memory via mem_allocate_aligned, but never free
it, relying on the GC for it (code written before the GC existed).

Should these stacks be changed to use a buffers system, or should they be
changed to free their own chunks on certain pops? Following the logic in
register.c, which handles it's own memory of register stack chunks, I
would guess the latter, although there might be a performance gain if it
can reuse these chunks via the GC.

I believe these are known, and the regex system is slated for a rewrite,
but I'll mention these for completeness' sake. rx_allocate_info uses
mem_sys_allocate, as does rx_clone_info. If those used buffers. Is that
all that's needed to make GC and regexes work nice together?

In addition, the regex handles its own allocation and deallocation for
bitmaps, which it does for each 'oneof' match.  Performance would surely
improve if one could put these bitmaps into the pbc file somehow. Constant
PMCs in the file would probably be needed here.  Alternately, one could
make the bitmaps GC-compliant, and just unreference them after the entire
regex match is completed.

Keys also allocate memory using mem_sys_allocate. Should this be changed
over to Buffers at some point, or will this be internally-managed key
memory? I'm guessing this is all WIP and likely to change completely,
however.

Each intqueue leaks some memory, as the new_container allocation is not
freed. What's the policy on supporting these kind of non-core, but
included in the distro, kind of things?

Finally, I'd also argue for a renaming or #defining of mem_realloc to
Parrot_reallocate, to match Parrot_allocate. At first glance, I thought
mem_alloc punted down to the native realloc routines, and it was
reallocing GC-managed memory. :)

Hope this was helpful. If you can tell me which of the above I should
ignore because it's going to get rewritten or is irrelevant, I think I can
write the necessary patches.

Mike Lambert

Re: [PATCH] tabs->spaces fix

2002-03-25 Thread Michel J Lambert


Sorry. It's easier to c+p into Pine than it is to upload+attach. I have a
feeling a few other patches I've submitted might be screwed up as well. If
this happens for anyone, please let me know.

Mike Lambert

Josh Wilmes wrote:

> Date: Mon, 25 Mar 2002 23:22:34 -0500
> From: Josh Wilmes <[EMAIL PROTECTED]>
> To: Michel J Lambert <[EMAIL PROTECTED]>
> Cc: [EMAIL PROTECTED]
> Subject: Re: [PATCH] tabs->spaces fix
>
>
> Mike,
>
> that doesn't want to apply for me- i suspect that the patch got garbled in
> the mail.  Can you MIME-attach or uuencode it?
>
> --Josh
>
> At 23:17 on 03/25/2002 EST, Michel J Lambert <[EMAIL PROTECTED]> wrote:
>
> > Below is a fix for the use of tabs in test_main.c and resources.c.
> >
> > Mike Lambert
> >
> >
> > Index: resources.c
> > ===
> > RCS file: /cvs/public/parrot/resources.c,v
> > retrieving revision 1.33
> > diff -r1.33 resources.c
> > 23,24c23,24
> > <   struct free_pool *pool, void
> > <   *to_add) {
> > ---
> > > struct free_pool *pool, void
> > > *to_add) {
> > 32,34c32,34
> > <pool->pool_buffer.bufstart,
> > <pool->pool_buffer.buflen,
> > <(UINTVAL)(pool->pool_buffer.buflen
>  * 1.2));
> > ---
> > >  pool->pool_buffer.bufstart,
> > >  pool->pool_buffer.buflen,
> > >  (UINTVAL)(pool->pool_buffer.bu
> flen * 1.2));
> > 65,66c65,66
> > <thing, as the headers go into a single
> > <allocation pool */
> > ---
> > >  thing, as the headers go into a single
> > >  allocation pool */
> > 86,87c86,87
> > <   interpreter->arena_base->pmc_pool,
> > <   cur_pmc++);
> > ---
> > > interpreter->arena_base->pmc_pool,
> > > cur_pmc++);
> > 180,181c180,181
> > <thing, as the headers go into a single
> > <allocation pool */
> > ---
> > >  thing, as the headers go into a single
> > >  allocation pool */
> > 201,202c201,202
> > <  interpreter->arena_base->buffer_header_pool,
> > <  cur_buffer++);
> > ---
> > >interpreter->arena_base->buffer_header_pool,
> > >cur_buffer++);
> > 264,265c264,265
> > <  struct free_pool *pool, void
> > <  *to_add) {
> > ---
> > >struct free_pool *pool, void
> > >*to_add) {
> > 273,275c273,275
> > <pool->pool_buffer.bufstart,
> > <pool->pool_buffer.buflen,
> > <(UINTVAL)(pool->pool_buffer.buflen
>  * 1.2));
> > ---
> > >  pool->pool_buffer.bufstart,
> > >  pool->pool_buffer.buflen,
> > >  (UINTVAL)(pool->pool_buffer.bu
> flen * 1.2));
> > 307c307
> > <   pmc_array[i].flags &= ~PMC_live_FLAG;
> > ---
> > > pmc_array[i].flags &= ~PMC_live_FLAG;
> > 329c329
> > <   string_array[i].flags &= ~BUFFER_live_FLAG;
> > ---
> > > string_array[i].flags &= ~BUFFER_live_FLAG;
> > 462,464c462,464
> > <   if(cur_chunk->SReg[j].registers[i]) {
> > < buffer_lives((Buffer *)cur_chunk->SReg[j].registers[i]);
> > <   }
> > ---
> > > if(cur_chunk->SReg[j].registers[i]) {
> > >   buffer_lives((Buffer *)cur_chunk->SReg[j].registers[i]);
> > > }
> > 476c476
> > <   buffer_lives((Buffer *)cur_stack->entry[i].entry.string_val);
> > ---
> > > buffer_lives((Buffer *)cur_stack->entry[i].entry.string

[PATCH] tabs->spaces fix

2002-03-25 Thread Michel J Lambert


Below is a fix for the use of tabs in test_main.c and resources.c.

Mike Lambert


Index: resources.c
===
RCS file: /cvs/public/parrot/resources.c,v
retrieving revision 1.33
diff -r1.33 resources.c
23,24c23,24
<   struct free_pool *pool, void
<   *to_add) {
---
> struct free_pool *pool, void
> *to_add) {
32,34c32,34
pool_buffer.bufstart,
pool_buffer.buflen,
<(UINTVAL)(pool->pool_buffer.buflen * 
1.2));
---
>  pool->pool_buffer.bufstart,
>  pool->pool_buffer.buflen,
>  (UINTVAL)(pool->pool_buffer.buflen * 
>1.2));
65,66c65,66
  thing, as the headers go into a single
>  allocation pool */
86,87c86,87
<   interpreter->arena_base->pmc_pool,
<   cur_pmc++);
---
> interpreter->arena_base->pmc_pool,
> cur_pmc++);
180,181c180,181
  thing, as the headers go into a single
>  allocation pool */
201,202c201,202
<  interpreter->arena_base->buffer_header_pool,
<  cur_buffer++);
---
>interpreter->arena_base->buffer_header_pool,
>cur_buffer++);
264,265c264,265
<  struct free_pool *pool, void
<  *to_add) {
---
>struct free_pool *pool, void
>*to_add) {
273,275c273,275
pool_buffer.bufstart,
pool_buffer.buflen,
<(UINTVAL)(pool->pool_buffer.buflen * 
1.2));
---
>  pool->pool_buffer.bufstart,
>  pool->pool_buffer.buflen,
>  (UINTVAL)(pool->pool_buffer.buflen * 
>1.2));
307c307
<   pmc_array[i].flags &= ~PMC_live_FLAG;
---
> pmc_array[i].flags &= ~PMC_live_FLAG;
329c329
<   string_array[i].flags &= ~BUFFER_live_FLAG;
---
> string_array[i].flags &= ~BUFFER_live_FLAG;
462,464c462,464
<   if(cur_chunk->SReg[j].registers[i]) {
< buffer_lives((Buffer *)cur_chunk->SReg[j].registers[i]);
<   }
---
> if(cur_chunk->SReg[j].registers[i]) {
>   buffer_lives((Buffer *)cur_chunk->SReg[j].registers[i]);
> }
476c476
<   buffer_lives((Buffer *)cur_stack->entry[i].entry.string_val);
---
> buffer_lives((Buffer *)cur_stack->entry[i].entry.string_val);
505c505
<   add_pmc_to_free(interpreter,
---
> add_pmc_to_free(interpreter,
528,531c528,531
arena_base->string_header_pool,
<  &string_array[i]);
---
>  BUFFER_on_free_list_FLAG))) {
> add_header_to_free(interpreter,
>interpreter->arena_base->string_header_pool,
>&string_array[i]);
586,587c586,587
  thing, as the headers go into a single
>  allocation pool */
607,608c607,608
<  interpreter->arena_base->string_header_pool,
<  cur_string++);
---
>interpreter->arena_base->string_header_pool,
>cur_string++);
662c662
prev) {
---
>  NULL != cur_pool; cur_pool = cur_pool->prev) {
673,675c673,675
<   struct Memory_Pool *new_block;  /* A pointer to our working pool */
<   char *cur_spot; /* Where we're currently copying to */
<   UINTVAL cur_size;   /* How big our chunk is going to be */
---
>   struct Memory_Pool *new_block;/* A pointer to our working pool */
>   char *cur_spot;   /* Where we're currently copying to */
>   UINTVAL cur_size; /* How big our chunk is going to be */
693,694c693,694
arena_base->string_header_pool->pool_buffer.bufstart,

Minor GC Patch, and Some Questions

2002-03-25 Thread Michel J Lambert


Below is a cleanup patch for a comment or two, some flags missing, that I
did while reading through it all in an attempt to figure it all out.

I was impressed with how straightforward all the code was, but there were
a few things that confused me. If you answer these, I can provide
documentation patches or cleanups to help clarify things that I found
confusing. (And I'm assuming others would find confusing.)

The biggest thing was the interaction and difference between strings and
buffers in the GC system. First, there are three pools in the headers and
the code. Then I saw that buffers and PMCs get better treatment than
strings with their mark_*_unused, trace_active_*,s and free_unused_*. To
top it off, the buffer versions of those functions deal with the STRING
arenas and registers, which seems a bit counterintuitive.

Finally, trace_active_PMCs seems to be the only function to handle
buffers. Looking at the amount of similarity between that and
trace_active_buffers, I think it might help speed to combine those two
functions together, to help avoid the duplicate loops, and to make it a
bit easier to follow.

I think I understand it all now, but I'd like to check to make sure I'm
right. All strings are, and can be treated as, buffers, since they are
referenced by pointer only. The cleanup routines know that it's a string,
and free each to the appropriate pool. Buffers have no high-level access,
and exist only as part of PMCs, so they can only be freed from there.

If my understanding is correct, then what is the purpose of
find_dead_strings and dead_string_run? They both are not implemented and
not called from anywhere. Are they intended to be implemented at some
point, or should they disappear?

ppd09 talks about custom DOD and GC functions, and support for buffers of
buffers, but I'm unable to find code that supports either. Just checking
those are still tbd.

Where does the 'Not yet implemented' come into play for free_unused_PMCs?
It looks implemented and working to me. (Although I did not debug it to
see if it is being called and working.)

Thanks,
Mike Lambert

Index: resources.c
===
RCS file: /cvs/public/parrot/resources.c,v
retrieving revision 1.33
diff -u -r1.33 resources.c
--- resources.c 22 Mar 2002 20:24:02 -  1.33
+++ resources.c 25 Mar 2002 08:11:19 -
@@ -76,15 +76,15 @@
   /* Note it in our stats */
   interpreter->total_PMCs += PMC_HEADERS_PER_ALLOC;
   /* Yeah, this is skanky. They're not really active, but
- add_header_to_free assumes that it's adding an active header to
+ add_pmc_to_free assumes that it's adding an active PMC to
  the free list */
   interpreter->active_PMCs += PMC_HEADERS_PER_ALLOC;

@@ -110,7 +110,7 @@
   /* We return system memory if we've got no interpreter yet */
   if (NULL == interpreter) {
 return_me = mem_sys_allocate(sizeof(PMC));
-return_me->flags = PMC_live_FLAG;
+return_me->flags = PMC_live_FLAG | BUFFER_sysmem_FLAG;
 return_me->vtable = NULL;
 return_me->data = NULL;
 return return_me;
@@ -212,7 +212,7 @@
  yet */
   if (interpreter == NULL) {
 return_me = mem_sys_allocate(sizeof(Buffer));
-return_me->flags = BUFFER_live_FLAG;
+return_me->flags = BUFFER_live_FLAG | BUFFER_sysmem_FLAG;
 return return_me;
   }

@@ -362,7 +362,7 @@
 struct PRegChunk *cur_chunk;
 /* We have to start somewhere, and the global stash is a good
place */
-last = current = interpreter->perl_stash->stash_hash;;
+last = current = interpreter->perl_stash->stash_hash;
 /* mark it as used and get an updated end of list */
 last = mark_used(current, last);

@@ -617,7 +617,7 @@
  yet */
   if (interpreter == NULL) {
 return_me = mem_sys_allocate(sizeof(STRING));
-return_me->flags = BUFFER_live_FLAG;
+return_me->flags = BUFFER_live_FLAG | BUFFER_sysmem_FLAG;
 return return_me;
   }

Re: Old Warnock's Dilemma for 'make quicktest'

2002-03-24 Thread Michel J Lambert


> Could you also provide a patch (or at least comments in the code) that
> describe the limiations?

This satisfactory? (I didn't mention it in the README because that's
supposed to be simpler instructions sans caveats, imo)

Mike Lambert

Index: docs/running.pod
===
RCS file: /cvs/public/parrot/docs/running.pod,v
retrieving revision 1.6
diff -u -r1.6 running.pod
--- docs/running.pod19 Mar 2002 23:29:17 -  1.6
+++ docs/running.pod25 Mar 2002 00:47:13 -
@@ -84,4 +84,26 @@

 and then use any of the above methods for running tests.

+=item B
+
+C is similar to make test, except that it skips some steps
+that are unnecessary most of the time, allowing for a significant speed gain.
+It's intended to be used as a quick check during development, lowering the
+barrier to checking tests. A C should be performed before
+submitting patches, just in case it is affected.
+
+Instead of recompiling the .pasm files to .pbc files each time
+the test is run, C checks to see if the .pasm file is
+identical to the test, and if so, it uses the previously-generated .pbc file.
+
+C can fail under the following circumstances:
+
+- it is cancelled before the generation of the .pbc file can be completed
+- the .pbc file format changes
+- the numbering of opcodes used in the tests is changed
+- the assembler changes in a manner that would affect its output on tests
+
+If C fails to work properly, C is always
+available as a fallback.
+
 =back

[PATCH] String Constness Warning Removal

2002-03-24 Thread Michel J Lambert


Below is a patch to fix the two warnings about const-ness I get in
string.c. This should also fix the tcc build on tinderbox. Hopefully this
spot-fix to the const-ness problems won't cause further const-ness
problems farther down the line.

Mike Lambert

Index: include/parrot/string.h
===
RCS file: /cvs/public/parrot/include/parrot/string.h,v
retrieving revision 1.34
diff -r1.34 string.h
26,27c26,27
< ENCODING *encoding;
< CHARTYPE *type;
---
> const ENCODING *encoding;
> const CHARTYPE *type;

Re: Old Warnock's Dilemma for 'make quicktest'

2002-03-21 Thread Michel J Lambert


Times are:
make quicktest, after caching output: 23 seconds
make test: 175 seconds

make quicktest could screw up if:
- you ctrl-c it, or make test (although I haven't had problems with that
yet) during the compilation process at just the right time
- you do anything to invalidate existing .pbc files, such as rearranging
ops. not sure what else could trigger this.

I still find it extremely useful, in spite of these limitations on its
usage. The patch is included below.

Mike Lambert



Index: Makefile.in
===
RCS file: /cvs/public/parrot/Makefile.in,v
retrieving revision 1.140
diff -u -r1.140 Makefile.in
--- Makefile.in 16 Mar 2002 14:38:25 -  1.140
+++ Makefile.in 21 Mar 2002 22:48:56 -
@@ -403,6 +403,11 @@
 .test_dummy_j:
$(PERL) t/harness -j

+quicktest: $(TEST_PROG) assemble.pl .quicktest_dummy
+
+.quicktest_dummy:
+   $(PERL) t/harness quick
+
 mopstest: $(TEST_PROG) examples/assembly/mops.pbc
$(TEST_PROG) examples/assembly/mops.pbc

Index: lib/Parrot/Test.pm
===
RCS file: /cvs/public/parrot/lib/Parrot/Test.pm,v
retrieving revision 1.18
diff -u -r1.18 Test.pm
--- lib/Parrot/Test.pm  4 Mar 2002 02:32:54 -   1.18
+++ lib/Parrot/Test.pm  21 Mar 2002 22:49:01 -
@@ -64,12 +64,27 @@
   my $t = $0; $t =~ s/\.t$/$count\.$_/; $t
 } ( qw(pasm pbc out) );

-open ASSEMBLY, "> $as_f" or die "Unable to open '$as_f'";
-binmode ASSEMBLY;
-print ASSEMBLY $assembly;
-close ASSEMBLY;
+my $can_skip_compile = $ENV{PARROT_QUICKTEST};
+if ($can_skip_compile)
+{
+  open INASSEMBLY, "$as_f" or $can_skip_compile = 0;
+  if ($can_skip_compile) {
+local $/ = undef;
+my $inassembly = ;
+close INASSEMBLY;
+$can_skip_compile = 0 if ($assembly ne $inassembly);
+$can_skip_compile = 0 if (not -e $by_f);
+  }
+}

-_run_command( "$PConfig{perl} assemble.pl $as_f --output $by_f" );
+if (!$can_skip_compile) {
+  open ASSEMBLY, "> $as_f" or die "Unable to open '$as_f'";
+  binmode ASSEMBLY;
+  print ASSEMBLY $assembly;
+  close ASSEMBLY;
+
+  _run_command( "$PConfig{perl} assemble.pl $as_f --output $by_f" );
+}
 $TEST_PROG_ARGS = "" unless defined $TEST_PROG_ARGS;
 _run_command( "./$PConfig{test_prog} ${TEST_PROG_ARGS} $by_f", 'STDOUT' => 
$out_f, 'STDERR' => $out_f);

@@ -86,9 +101,7 @@
 my $pass = $Builder->$meth( $prog_output, $output, $desc );

 unless($ENV{POSTMORTEM}) {
-  foreach my $i ( $as_f, $by_f, $out_f ) {
-unlink $i;
-  }
+  unlink $out_f;
 }

 return $pass;
Index: t/harness
===
RCS file: /cvs/public/parrot/t/harness,v
retrieving revision 1.9
diff -u -r1.9 harness
--- t/harness   31 Jan 2002 19:03:34 -  1.9
+++ t/harness   21 Mar 2002 22:49:01 -
@@ -14,6 +14,9 @@
 getopts('jP', \%opts);
 $ENV{TEST_PROG_ARGS} = join(' ', map { "-$_" } keys %opts );

+$ENV{PARROT_QUICKTEST} = grep $_ eq 'quick', @ARGV;
+@ARGV = grep $_ ne 'quick', @ARGV;
+
 # Pass in a list of tests to run on the command line, else run all the tests.
 my @tests = @ARGV ? @ARGV : map { glob( "t/$_/*.t" ) } ( qw(op pmc) );
 runtests(@tests);

Re: MSVC Warnings

2002-03-21 Thread Michel J Lambert


Ooops. I originally did the /**/, but thought that was way too easy to be
C code, and so changed them to //. :)

Below is a proper patch.

Mike Lambert

Index: core.ops
===
RCS file: /cvs/public/parrot/core.ops,v
retrieving revision 1.111
diff -u -r1.111 core.ops
--- core.ops21 Mar 2002 22:01:50 -  1.111
+++ core.ops21 Mar 2002 22:48:58 -
@@ -567,6 +567,7 @@
 =cut

 inline op set_keyed (out PMC, in PMC, in PMC, in PMC)  {
+/*
 KEY_PAIR src_key_p, dest_key_p;
 KEY src_key, dest_key;

@@ -575,6 +576,7 @@

 $1->vtable->set_pmc_keyed(interpreter,
 $1, $2 ? &src_key : NULL, $3, $4 ? &dest_key : NULL);
+*/
 goto NEXT();
 }

Index: classes/intqueue.pmc
===
RCS file: /cvs/public/parrot/classes/intqueue.pmc,v
retrieving revision 1.6
diff -u -r1.6 intqueue.pmc
--- classes/intqueue.pmc10 Mar 2002 21:18:13 -  1.6
+++ classes/intqueue.pmc21 Mar 2002 22:48:59 -
@@ -68,7 +68,7 @@
 }

 static INTVAL queue_length ( CONTAINER* container ) {
-  INTVAL length;
+  INTVAL length = 0;
   if(container->head != NULL) {
 BUCKET* head = container->head;
 while(head != NULL) {

MSVC Warnings

2002-03-21 Thread Michel J Lambert



We take in const params, and set them into non-const members of STRING:
string.c(51) : warning C4090: '=' : different 'const' qualifiers
string.c(55) : warning C4090: '=' : different 'const' qualifiers

Due to MAKE_KEY in set_keyed setting keys[0] on an uninitialized pointer.
But I hear this isn't supposed to work anyway. The patch below comments
the code out.
c:\p\parrot\core.ops(574) : warning C4700: local variable 'src_key' used
without having been initialized
c:\p\parrot\core.ops(575) : warning C4700: local variable 'dest_key' used
without having been initialized

IntQueue can miscalculate the length:
c:\p\parrot\classes\intqueue.c(84) : warning C4700: local variable
'length' used without having been initialized

Patch to remove MAKE_KEY and IntQueue warnings is below.

Mike Lambert



Index: core.ops
===
RCS file: /cvs/public/parrot/core.ops,v
retrieving revision 1.111
diff -u -r1.111 core.ops
--- core.ops21 Mar 2002 22:01:50 -  1.111
+++ core.ops21 Mar 2002 22:30:03 -
@@ -567,14 +567,14 @@
 =cut

 inline op set_keyed (out PMC, in PMC, in PMC, in PMC)  {
-KEY_PAIR src_key_p, dest_key_p;
-KEY src_key, dest_key;
+//KEY_PAIR src_key_p, dest_key_p;
+//KEY src_key, dest_key;

-MAKE_KEY(src_key, src_key_p, $2, enum_key_pmc, pmc_val);
-MAKE_KEY(dest_key, dest_key_p, $4, enum_key_pmc, pmc_val);
+//MAKE_KEY(src_key, src_key_p, $2, enum_key_pmc, pmc_val);
+//MAKE_KEY(dest_key, dest_key_p, $4, enum_key_pmc, pmc_val);

-$1->vtable->set_pmc_keyed(interpreter,
-$1, $2 ? &src_key : NULL, $3, $4 ? &dest_key : NULL);
+//$1->vtable->set_pmc_keyed(interpreter,
+//$1, $2 ? &src_key : NULL, $3, $4 ? &dest_key :
NULL);
 goto NEXT();
 }

Index: classes/intqueue.pmc
===
RCS file: /cvs/public/parrot/classes/intqueue.pmc,v
retrieving revision 1.6
diff -u -r1.6 intqueue.pmc
--- classes/intqueue.pmc10 Mar 2002 21:18:13 -  1.6
+++ classes/intqueue.pmc21 Mar 2002 22:30:04 -
@@ -68,7 +68,7 @@
 }

 static INTVAL queue_length ( CONTAINER* container ) {
-  INTVAL length;
+  INTVAL length = 0;
   if(container->head != NULL) {
 BUCKET* head = container->head;
 while(head != NULL) {

Old Warnock's Dilemma for 'make quicktest'

2002-03-21 Thread Michel J Lambert


Heya,

I submitted a bunch of patches awhile back, all of which were committed
except for one, which dealt with adding a 'make quicktest'. Information on
it can be found here:

http:[EMAIL PROTECTED]/msg07834.html

It did provide for a significant speedup, from 170 seconds using 'make
test' to 24 seconds using 'make quicktest' on this old version of parrot.
Is there any interest in me bringing this patch up to date?

Thanks,
Mike Lambert

Re: Multimethod dispatch for parrot?

2002-03-08 Thread Michel J Lambert


> Yes. We will, for actual method and sub dispatch. Not for the other
> vtable methods, though.

I guess my big 'complaint' was against using vtables for the variety of
operators, due to the inherent asymmetry in them. I knew Perl6 is going to
have MMD, but I didn't know how deep that support is going to be. If it's
just a layered-on language level construct, then it'll be just like Perl5
support for Class::MultiMethods. If we can get it a bit deeper, then we
can use it to support Perl6 operators, such as add, etc, which was the
main reason of my original email. I figured tho, that it might be nice to
implement it at the Parrot level, for a few reasons.

It's in C, so we can get more speed.
If we define, for example, a 'conversion' operator with PythonInt,PerlInt
as the dispatch type, then we can automatically have inter-language
conversion support. I'm sure XS code has all sorts of macros for changing
C strings, to Perl strings, to Unicode strings, and so on. This
multimethod dispatch would make that work at the Parrot level, and just
further the goal of inter-dynamic-language co-operation.

> >In my opinion, this would provide
> >several benefits over the current system. While (IMHO) MMD provides many
> >benefits over the current system in terms of extensibility and ease of
> >maintainence, it is still a superset of single-dispatch systems, and could
> >easily be made to support non-MMD languages like Python, Perl6, etc.
>
> Don't assume perl 6 won't me MMD. It likely will. (I'd actually be
> really surprised if it wasn't, but we haven't gotten there yet)

I just didn't want to assume that Perl6 *will* be completely MMD.. If it
is, great, but I don't know at what level there will be support for it. It
may go against the grain of everyone who's been taught that OOP is the one
true light. But I suppose those people aren't using Perl anyway. :)

> It's not a lack, really. It's a deliberate design decision, and one
> made specifically for speed reasons. It saves an extra, relatively
> expensive, comparison for most of the operations.
>
> The core vtable methods need to be as fast as we can possibly make
> them--we don't want to drop speed if we can possibly help it.

I understand that vtables have an inherent speed advantage over MMD, and I
wasn't arguing that there wasn't. Rather, I was arguing that it would
simplify many things. But that goes into the design versus speed debate to
a degree, and I hear you prefer the speed side of any such argument
involving speed. :)

That being said, the entirety of multimethod dispatch isn't really needed
if the types are known at compiletime. Since we know the types of the two
operators in the bytecode, the programmer is essentially doing a manual
multimethod dispatch. I just wish there could be a greater sharing between
this approach and the higher-level approach involving MMD on PMC's, since
they aren't entirely seperate. But I guess I'm entering the realm of
wishful thinking...

> Why? They're base types, really. I'd like to add in support for
> complex numbers too, but I've not yet gone that far. Might next week,
> you never know.

Say there is now complex support and bignum support in the core Parrot
engine. In order to have the most complete implementation of complex, so
that the user couldn't possibly have a need for anything else (bit of
sarcasm in that..), complex would need to use bignums for both elements,
or at least handle the switch between ints, bigints, and bignums. If
bignum*complex goes to the bignum vtable, and complex*bignum goes to the
other, these two different classes now need to operate on each other, and
include distinct support for the other type, in their own class, in fact
including the same code in both places because of their similarity. MMD's
approach would allow for that, but it would also the same function to be
used for both variants of the operation if you so desired, where it
handled the interaction between the two distinct mathematical types in a
bignumcomplexinteraction file, instead of splitting it between bignum and
complex.

> Complex numbers are an issue. Matrices aren't--they're a form of
> aggregate, and they get handled elsewhere.

Complex and bignum's support most of the mathematical needs. But are these
part of a 'core' system that's Parrot? I could see an argument for using
them as a core Parrot library, but I can't see any logic for how these fit
into core Parrot itself. In fact, by making these types of things
part of the core, you make them ignorant of non-core things, and can
actually make other operations more difficult (see next paragraph).

As an example for matrices, however:
Say we have the matrix PMC, and the bignum PMC. Now I multiply a bignum *
matrix, or matrix * bignum. In these two cases, one goes to the bignum
vtable, and one goes to the matrix vtable. Since bignum is in the core,
it's very unlikely to have knowledge about this user-created matrix PMC.
How then, will the user ever be able to make a bignum

Multimethod dispatch for parrot?

2002-03-07 Thread Michel J Lambert


Heya,

I was curious if anyone has ever considered implementing multimethod
dispatch (MMD) directly into parrot. In my opinion, this would provide
several benefits over the current system. While (IMHO) MMD provides many
benefits over the current system in terms of extensibility and ease of
maintainence, it is still a superset of single-dispatch systems, and could
easily be made to support non-MMD languages like Python, Perl6, etc.

Parrot has been doing some questionable things, in my mind, to get around
the lack of multimethod dispatch. First, there are basic native types such
as num, int, and string, which I'm perfectly fine with. But what bothers
me is the fact that bigint's and bignum's are being given a special place
in the vtable. If we can't get these classes working with the base types
properly without putting them in the core and vtable itself, how can we
expect users to make such things as complex numbers and matrices mesh with
the system?

In MMD, A.add(B) in OOP terms would be dispatched to an add function
specifically registered for A and B, instead of just calling add on A.
This difference is not apparent when dealing with the parrot native types,
since they already provide different functions for add_int and add_num.
But if you're a PMC, as just about every perl variable will be, you get
the generic treatment and are given an add, or at best, an add_same. I'm
proposing to extend that to allow there to be different add functions for
specific PMC types.

Some might argue that this leads to an explosion of type registrations,
since a single class must register itself with every type out there. This
could be solved by allowing them to register a given function for PMC's in
general (the existing way, basically), and allow a few specific functions
for specific types they would like to handle directly.

MMD would also allow users who had code that dealt with Math::BigInt and
Math::BigInteger to write routines to coerce between the two types
*without* requiring participation of the original module writers or
editing of those packages. They could write something for
add(BigInt,BigInteger), coercion functions, and suddenly their two
incompatible pieces of code will be able to work together, hopefully
seamlessly.

With parrot, this could be implemented to allow Perl* types to interact
with Parrot* types, providing automatic conversion and interaction between
the two types of PMCs, by whoever decides to make the interface work
between the two languages. This wouldn't require any additions to the
perl* types or parrot* types, but could be written independantly, and
registered with the parrot system at loadtime.

Others might say that multimethod dispatch is too slow. There are numerous
papers out there on implementing multimethod dispatch efficiently, such
as:
http://citeseer.nj.nec.com/amiel96fast.html
http://citeseer.nj.nec.com/495443.html
http://citeseer.nj.nec.com/ferragina99multimethod.html
http://citeseer.nj.nec.com/330853.html
http://citeseer.nj.nec.com/pang99multimethod.html
and so on and so on... I haven't read them myself, but a quick search
proved to find many available papers out there.

Regardless, MMD will still be slower than the current parrot vtable
dispatch system, and that point is unavoidable. One could argue that
vtable dispatch is slower than a generic 'add' function, too. But the
benefits provided by vtables to avoid spaghetti if/else code more than
outweigh the disadvantages. I'd argue that the same logic would apply to
MMD as well.

Finally, here's a real-world example of code that could be improved. :)
perlint needs to handle addition to strings, nums, ints, perlstrings,
perlnums, perlints, and other pmc's. The first three are covered by their
individual functions, the next two are implemented by if/else's in the
generic add function, and perlints get called via add_same. This variety
of function names and approachs just seems wasteful, and MMD could
potentially simplify a lot of this into one function per type, with a
consistent scheme.

Any thoughts or comments on this? Is this a case of 'patches welcome' for
the current system, but no one's seen a need for it? Or is there an
argument against implementing MMD in the core parrot?

Sorry for the long email, but I wanted to get this off my chest in case
it's possible to change things, before we get too ingrained with the
current method.

Thanks,
Mike Lambert

49 matches

Mail list logo