# New Ticket Created by  Mike Lambert 
# Please include the string:  [perl #16269]
# in the subject line of all future correspondence about this issue. 
# <URL: http://rt.perl.org/rt2/Ticket/Display.html?id=16269 >


I finally was able to get Peter's old COW patch up and running with our
current codebase. Some of the difficulty involved me attempting to
change some of the semantics in an attempt to support the
directions Parrot is going.

However, in the interest of saving someone from updating yet-another COW
patch in the long-distant future because this wasn't applied (similar to
how Peter's patch got left to bit-rot), can we try to reach a closure on
this patch (either apply or don't apply) so that further development may
continue?


Patch is attached. It does a "great" bufstart->strstart renaming.
Currently, I think bufused is irrelevant in favor of strlen, but I did not
remove it, just in case.

Most of the relevant changes occurred in resources.c and string.c
resources.c has the new COW-aware copying code, and string.c includes
support for handling COW properly with the current string functions. There
is more work to be done in the string.c department as far as enabling
additional functions to use COW. This should hopefully give an even
greater speedup in the future.


Changes versus Peter's old COW implementation:

- supports COW of constant strings. This created some problems because
constant headers were never copied, and COW headers were. When they
shared buffers, there was a bit of pool confusion. To alleviate this
problem, I've implemented a BUFFER_selfpoolptr_FLAG, which indicates
whether or not the buffer points to data within the buffer's header
pool's memory pool's contents. In other words, the flag indicates
whether or not the data pointed to by bufstart should be copied or not.

- supports COW on all buffers. This was implemented by making a
BUFFER_strstart_FLAG. If this is set, the GC copying code casts the buffer
to a STRING*, and updates the strstart field to reflect the new memory
location. The old COW code did this just for strings, because there was a
different collection loop for strings "back then". This means that arrays,
et al, can be COW if the support is set up right.

- renamed some of the COW API in string.c to make more sense to myself.
Hopefully that indicates it will make more sense to others. It basically
involves unmake_COW (creates an editable version of our buffer data), and
make_COW_reference (creates a COW string header pointing at the passed
in's data). This covers the majority of the cases.


Benchmarks:
Running the GC benchmarks I always run, gives the following results.
relative is parrot-cow/parrot-cvs. Lower is better.

                                parrot-cvs (s)  parrot-cow (s)  relative
gc_alloc_new.pbc                6.34            6.46            1.02
gc_alloc_reuse.pbc              15.60           15.43           0.99
gc_generations.pbc              10.58           10.80           1.02
gc_header_new.pbc               5.77            5.78            1.00
gc_header_reuse.pbc             8.22            7.12            0.87
gc_waves_headers.pbc            9.88            9.45            0.96
gc_waves_sizeable_data.pbc      20.93           6.66            0.32
gc_waves_sizeable_headers.pbc   5.80            7.24            1.25

Overall, against an 8.0 baseline, the new cow code is a 7.42 (sum of above
relatives). Note the big improvement in the waves_sizeable_data code that
COWs lots of the same string in sizeable_data, and the drop in performance
of gc_waves_sizeable_headers, that's mostly header manipulation.

On life.pbc with 5000 generations, we get:
CVS: 6.98 sec
COW: 6.24 sec

A 10% improvement. (Sorry about claiming a 17% in the previous email.
Looks like 500 generations wasn't enough to get an accurate count. :( )

On hanoi 16, we get:
CVS: 8.74 sec
COW: 9.18 sec
A 5% decrease. Not good. :| I've not looked at the hanoi code to venture
any guesses as to why there is a performance drop.


Thoughts, comments? I'm hoping to see how Peter's code compares against
mine in terms of speed on the various benchmarks. Hopefully his
has better performance on hanoi, and it is reasonably feasible to
merge the two patches taking the best of both. Otherwise we're going to
need to decide if a performance drop on hanoi and small drops on the
various GC benchmarks is enough to warrant this patch not being
applied.

Thanks,
Mike Lambert



-- attachment  1 ------------------------------------------------------
url: http://rt.perl.org/rt2/attach/34137/27884/3ea88f/cow.patch

Index: core.ops

===================================================================

RCS file: /cvs/public/parrot/core.ops,v

retrieving revision 1.198

diff -u -r1.198 core.ops

--- core.ops    14 Aug 2002 01:43:13 -0000      1.198

+++ core.ops    17 Aug 2002 08:20:06 -0000

@@ -166,9 +166,9 @@

   }

 

   $1 = string_make(interpreter, NULL, 65535, NULL, 0, NULL);

-  memset(($1)->bufstart, 0, 65535);

-  fgets(($1)->bufstart, 65534, file);

-  ($1)->strlen = ($1)->bufused = strlen(($1)->bufstart);

+  memset(($1)->strstart, 0, 65535);

+  fgets(($1)->strstart, 65534, file);

+  ($1)->strlen = ($1)->bufused = strlen(($1)->strstart);

   goto NEXT();

 }

 

@@ -354,7 +354,7 @@

   UINTVAL len = $3;

 

   s = string_make(interpreter, NULL, len, NULL, 0, NULL);

-  read($2, s->bufstart, len);

+  read($2, s->strstart, len);

   s->bufused = len;

   $1 = s;

   goto NEXT();

@@ -418,7 +418,7 @@

 op write(in INT, in STR) {

   STRING * s = $2;

   UINTVAL count = string_length(s);

-  write($1, s->bufstart, count);

+  write($1, s->strstart, count);

   goto NEXT();

 }

 

@@ -2256,7 +2256,7 @@

         t = string_make(interpreter, buf, (UINTVAL)(len - s->buflen), NULL, 0, NULL); 


         $1 = string_concat(interpreter, $1, s, 1);

     } else {

-        t = string_make(interpreter, s->bufstart, (UINTVAL)len, NULL, 0, NULL); 

+        t = string_make(interpreter, s->strstart, (UINTVAL)len, NULL, 0, NULL); 

     }

     $1 = string_concat(interpreter, $1, t, 1);

 

@@ -2281,7 +2281,7 @@

     }

 

     /* XXX this is EVIL, use string_replace */

-    n = $1->bufstart;

+    n = $1->strstart;

     t = string_to_cstring(interpreter, s);

     for (i = $4; i < $4 + $2; i++)

         n[i] = t[i - $4]; 

@@ -3891,7 +3891,7 @@

   switch ($3) {

     case STRINGINFO_HEADER:   $1 = PTR2UINTVAL($2);

                               break;

-    case STRINGINFO_BUFSTART: $1 = PTR2UINTVAL($2->bufstart);

+    case STRINGINFO_STRSTART: $1 = PTR2UINTVAL($2->strstart);

                               break;

     case STRINGINFO_BUFLEN:   $1 = $2->buflen;

                               break;

@@ -4163,13 +4163,13 @@

   void (*func)(void);

   string_to_cstring(interpreter, ($2));

   string_to_cstring(interpreter, ($1));

-  p = Parrot_dlopen($1->bufstart);

+  p = Parrot_dlopen($1->strstart);

   if(p == NULL) {

      const char * err = Parrot_dlerror();

      fprintf(stderr, "%s\n", err);

      PANIC("Failed to load native library");

   }

-  func = D2FPTR(Parrot_dlsym(p, $2->bufstart));

+  func = D2FPTR(Parrot_dlsym(p, $2->strstart));

   if (NULL == func) {

     PANIC("Failed to find symbol in native library");

   }

Index: debug.c

===================================================================

RCS file: /cvs/public/parrot/debug.c,v

retrieving revision 1.24

diff -u -r1.24 debug.c

--- debug.c     23 Jul 2002 15:07:10 -0000      1.24

+++ debug.c     17 Aug 2002 08:20:08 -0000

@@ -696,7 +696,7 @@

                         constants[pc[j]]->string->strlen)

                     {

                         escaped = PDB_escape(interpreter->code->const_table->

-                                         constants[pc[j]]->string->bufstart,

+                                         constants[pc[j]]->string->strstart,

                                              interpreter->code->const_table->

                                          constants[pc[j]]->string->strlen);

                         if (escaped)

Index: dod.c

===================================================================

RCS file: /cvs/public/parrot/dod.c,v

retrieving revision 1.13

diff -u -r1.13 dod.c

--- dod.c       17 Aug 2002 01:11:08 -0000      1.13

+++ dod.c       17 Aug 2002 08:20:08 -0000

@@ -291,9 +291,8 @@

         interpreter->arena_base->pmc_pool->total_objects - total_used;

 }

 

-/* Put any free buffers that aren't on the free list on the free list 

- * Free means: not 'live' and not immune 

- * Temporary immunity is also granted to newborns */

+/* Put any buffers that are now unused, on to the free list

+ * Avoid buffers that are immune from collection (ie, constant) */

 static void

 free_unused_buffers(struct Parrot_Interp *interpreter, 

                     struct Small_Object_Pool *pool)

@@ -308,12 +307,12 @@

          cur_arena = cur_arena->prev) {

         Buffer *b = cur_arena->start_objects;

         for (i = 0; i < cur_arena->used; i++) {

-            /* If it's not live or on the free list, put it on the free list */

-            if (!(b->flags & (BUFFER_live_FLAG | BUFFER_on_free_list_FLAG)) &&

-                (!(b->flags & BUFFER_constant_FLAG) || 

-                 (b->flags & BUFFER_COW_FLAG))) 

+            /* If this thing is not live and not dead yet, make it dead now. */

+            if (!(b->flags & ( BUFFER_on_free_list_FLAG

+                             | BUFFER_constant_FLAG

+                             | BUFFER_live_FLAG )))

             {

-                if (pool->mem_pool) {

+                 if (!(b->flags & BUFFER_COW_FLAG) && pool->mem_pool) {

                     ((struct Memory_Pool *)pool->mem_pool)->reclaimable += b->buflen;

                 }

                 add_free_buffer(interpreter, pool, b);

Index: hash.c

===================================================================

RCS file: /cvs/public/parrot/hash.c,v

retrieving revision 1.21

diff -u -r1.21 hash.c

--- hash.c      17 Aug 2002 01:11:08 -0000      1.21

+++ hash.c      17 Aug 2002 08:20:09 -0000

@@ -81,9 +81,9 @@

 static INTVAL

 key_hash(Interp *interpreter, STRING *value)

 {

-    char *buffptr = value->bufstart;

-    UINTVAL len = value->bufused;

-    register UINTVAL hash = 5381;

+    char *buffptr = value->strstart;

+    INTVAL len = value->strlen;

+    register INTVAL hash = 5381;

 

     UNUSED(interpreter);

 

Index: headers.c

===================================================================

RCS file: /cvs/public/parrot/headers.c,v

retrieving revision 1.7

diff -u -r1.7 headers.c

--- headers.c   12 Aug 2002 06:55:02 -0000      1.7

+++ headers.c   17 Aug 2002 08:20:10 -0000

@@ -100,7 +100,7 @@

     

     /* Don't let it point to garbage memory */

     buffer->bufstart = NULL;

-    buffer->flags = 0;

+    buffer->flags = BUFFER_selfpoolptr_FLAG;

     

     return buffer;

 }

@@ -120,8 +120,11 @@

 struct Small_Object_Pool *

 new_pmc_pool(struct Parrot_Interp *interpreter)

 {

+    STRING *name = string_make(NULL, "PMC Pool", 

+                               strlen("PMC Pool"), 0, 0, 0);

     struct Small_Object_Pool *pmc_pool = new_small_object_pool(

-                      interpreter, sizeof(PMC), PMC_HEADERS_PER_ALLOC);

+                      interpreter, sizeof(PMC), PMC_HEADERS_PER_ALLOC, name);

+add_extra_buffer_header(interpreter,name);

     pmc_pool->add_free_object = add_free_pmc;

     pmc_pool->get_free_object = get_free_pmc;

     pmc_pool->alloc_objects = alloc_pmcs;

@@ -136,11 +139,14 @@

 new_bufferlike_pool(struct Parrot_Interp *interpreter,

                         size_t actual_buffer_size)

 {

+    STRING *name = string_make(NULL, "Buffer-like Pool", 

+                               strlen("Buffer-like Pool"), 0, 0, 0);

     size_t buffer_size = 

         (actual_buffer_size + sizeof(void*) - 1) & ~(sizeof(void*) - 1);

     struct Small_Object_Pool *pool = 

         new_small_object_pool(interpreter, buffer_size,

-                          BUFFER_HEADERS_PER_ALLOC);

+                          BUFFER_HEADERS_PER_ALLOC, name);

+add_extra_buffer_header(interpreter,name);

     pool->add_free_object = add_free_buffer;

     pool->get_free_object = get_free_buffer;

     pool->alloc_objects = alloc_buffers;

@@ -155,6 +161,10 @@

 {

     struct Small_Object_Pool *pool = 

         new_bufferlike_pool(interpreter, sizeof(Buffer));

+    STRING *name = string_make(NULL, "Buffer Pool", 

+                               strlen("Buffer Pool"), 0, 0, 0);

+add_extra_buffer_header(interpreter,name);

+    pool->name = name;

     return pool;

 }

 

@@ -163,6 +173,10 @@

 {

     struct Small_Object_Pool *pool = 

         new_bufferlike_pool(interpreter, sizeof(STRING));

+    STRING *name = string_make(NULL, "String Pool", 

+                               strlen("String Pool"), 0, 0, 0);

+add_extra_buffer_header(interpreter,name);

+    pool->name = name;

     pool->objects_per_alloc = STRING_HEADERS_PER_ALLOC;

     pool->align_1 = STRING_ALIGNMENT-1;

     if (constant) {

@@ -233,7 +247,8 @@

                              ? interpreter->arena_base->constant_string_header_pool

                              : interpreter->arena_base->string_header_pool

                              );

-    string->flags |= flags;

+    string->flags |= flags | BUFFER_strstart_FLAG;

+    string->strstart = 0;

     return string;

 }

 

Index: interpreter.c

===================================================================

RCS file: /cvs/public/parrot/interpreter.c,v

retrieving revision 1.96

diff -u -r1.96 interpreter.c

--- interpreter.c       17 Aug 2002 01:11:08 -0000      1.96

+++ interpreter.c       17 Aug 2002 08:20:10 -0000

@@ -43,7 +43,7 @@

         const char *fp_data;

         INTVAL fp_len;

 

-        fp_data = PCONST(0)->string->bufstart;

+        fp_data = PCONST(0)->string->strstart;

         fp_len = PCONST(0)->string->buflen;

 

         if (strncmp(OPCODE_FINGERPRINT, fp_data, fp_len)) {

Index: io.ops

===================================================================

RCS file: /cvs/public/parrot/io.ops,v

retrieving revision 1.10

diff -u -r1.10 io.ops

--- io.ops      9 Jun 2002 16:34:43 -0000       1.10

+++ io.ops      17 Aug 2002 08:20:11 -0000

@@ -110,7 +110,7 @@

   ParrotIO * io;

   io = (ParrotIO*)($1->data);

   if ($2 && io) {

-    PIO_write(interpreter, io, ($2)->bufstart, string_length($2));

+    PIO_write(interpreter, io, ($2)->strstart, string_length($2));

   }

   goto NEXT();

 }

@@ -125,7 +125,7 @@

 

 op printerr(in STR) {

   if ($1) {

-    PIO_write(interpreter, PIO_STDERR(interpreter), ($1)->bufstart,

+    PIO_write(interpreter, PIO_STDERR(interpreter), ($1)->strstart,

                        string_length($1));

   }

   goto NEXT();

@@ -148,7 +148,7 @@

 

 op puts(in STR) {

   if (($1) && string_length($1)) {

-    PIO_write(interpreter, PIO_STDOUT(interpreter), ($1)->bufstart,

+    PIO_write(interpreter, PIO_STDOUT(interpreter), ($1)->strstart,

                        string_length($1));

   }

   goto NEXT();

@@ -157,7 +157,7 @@

 op puts(in INT) {

   STRING * s = string_from_int(interpreter, $1);

   if (string_length(s)) {

-    PIO_write(interpreter, PIO_STDOUT(interpreter), s->bufstart,

+    PIO_write(interpreter, PIO_STDOUT(interpreter), s->strstart,

                        string_length(s));

   }

   goto NEXT();

@@ -166,7 +166,7 @@

 op puts(in NUM) {

   STRING * s = Parrot_sprintf_c(interpreter, "%Vf", $1);

   if (string_length(s)) {

-    PIO_write(interpreter, PIO_STDOUT(interpreter), s->bufstart,

+    PIO_write(interpreter, PIO_STDOUT(interpreter), s->strstart,

                        string_length(s));

   }

   goto NEXT();

@@ -198,8 +198,8 @@

   else

     n = $2; 

   $1 = string_make(interpreter, NULL, n, NULL, 0, NULL);

-  memset(($1)->bufstart, 0, n);

-  nr = PIO_read(interpreter, PIO_STDIN(interpreter), ($1)->bufstart, (size_t)n);

+  memset(($1)->strstart, 0, n);

+  nr = PIO_read(interpreter, PIO_STDIN(interpreter), ($1)->strstart, (size_t)n);

   if(nr > 0)

     ($1)->strlen = ($1)->bufused = nr;

   else

@@ -215,8 +215,8 @@

   else

     n = $3; 

   $1 = string_make(interpreter, NULL, n, NULL, 0, NULL);

-  memset(($1)->bufstart, 0, n);

-  nr = PIO_read(interpreter, (ParrotIO*)($2->data), ($1)->bufstart, (size_t)n);

+  memset(($1)->strstart, 0, n);

+  nr = PIO_read(interpreter, (ParrotIO*)($2->data), ($1)->strstart, (size_t)n);

   if(nr > 0)

     ($1)->strlen = ($1)->bufused = nr;

   else

Index: misc.c

===================================================================

RCS file: /cvs/public/parrot/misc.c,v

retrieving revision 1.22

diff -u -r1.22 misc.c

--- misc.c      11 Jun 2002 20:11:51 -0000      1.22

+++ misc.c      17 Aug 2002 08:20:11 -0000

@@ -495,7 +495,7 @@

     STRING *ret = Parrot_vsprintf_c(interpreter, pat, args);

 /*    string_transcode(interpreter, ret, NULL, NULL, &ret);*/

 

-    memcpy(targ, ret->bufstart, ret->bufused);

+    memcpy(targ, ret->strstart, ret->bufused);

     targ[ret->bufused + 1] = 00;

 }

 

@@ -510,7 +510,7 @@

         len = ret->bufused;

     }

 

-    memcpy(targ, ret->bufstart, len);

+    memcpy(targ, ret->strstart, len);

     targ[len + 1] = 0;

 }

 

Index: packdump.c

===================================================================

RCS file: /cvs/public/parrot/packdump.c,v

retrieving revision 1.2

diff -u -r1.2 packdump.c

--- packdump.c  17 Mar 2002 06:44:41 -0000      1.2

+++ packdump.c  17 Aug 2002 08:20:11 -0000

@@ -95,7 +95,7 @@

         printf("        SIZE     => %ld,\n", (long)self->string->bufused);

         /* TODO: Won't do anything reasonable for most encodings */

         printf("        DATA     => '%.*s'\n",

-               (int)self->string->bufused, (char *)self->string->bufstart);

+               (int)self->string->bufused, (char *)self->string->strstart);

         printf("    } ],\n");

         break;

 

Index: packout.c

===================================================================

RCS file: /cvs/public/parrot/packout.c,v

retrieving revision 1.9

diff -u -r1.9 packout.c

--- packout.c   23 Jul 2002 02:09:27 -0000      1.9

+++ packout.c   17 Aug 2002 08:20:12 -0000

@@ -276,8 +276,8 @@

          * characters to ensure padding.  */

         charcursor = (char *)cursor;

 

-        if (self->string->bufstart) {

-            mem_sys_memcopy(charcursor, self->string->bufstart,

+        if (self->string->strstart) {

+            mem_sys_memcopy(charcursor, self->string->strstart,

                             self->string->bufused);

             charcursor += self->string->bufused;

 

Index: resources.c

===================================================================

RCS file: /cvs/public/parrot/resources.c,v

retrieving revision 1.78

diff -u -r1.78 resources.c

--- resources.c 12 Aug 2002 06:55:02 -0000      1.78

+++ resources.c 17 Aug 2002 08:20:12 -0000

@@ -41,7 +41,7 @@

 

     /* Allocate a new block. Header info's on the front, plus a fudge

      * factor for good measure */

-    new_block = mem_sys_allocate(sizeof(struct Memory_Block) + 

+    new_block = mem_sys_allocate(sizeof(struct Memory_Block) +

                                  alloc_size + 32);

     if (!new_block) {

         return NULL;

@@ -70,20 +70,31 @@

     return new_block;

 }

 

+/* Allocates memory for headers */

 static void *

 mem_allocate(struct Parrot_Interp *interpreter, size_t *req_size,

              struct Memory_Pool *pool, size_t align_1)

 {

     char *return_val;

     size_t size = *req_size;

+

+    /* Ensure that our minimum size requirements are met, 

+     * so that we have room for a forwarding COW pointer */

+    if( size < sizeof(void*) )

+        size = sizeof(void*);

+

+    /* Make sure we have room for the buffer's tail flags,

+     * also used by the COW logic to detect moved buffers */

+    size += sizeof(struct Buffer_Tail);

+

+    /* Round up to requested alignment */

+    size = (size + align_1) & ~align_1;

+

     if (NULL == interpreter) {

         void *mem = mem_sys_allocate(size);

         return mem;

     }

 

-    /* Round up to requested alignment */

-    size = (size + align_1) & ~align_1;

-

     /* If not enough room, try to find some */

     if (pool->top_block == NULL) {

         alloc_new_block(interpreter, size, pool);

@@ -124,7 +135,8 @@

     return_val = pool->top_block->top;

     pool->top_block->top += size;

     pool->top_block->free -= size;

-    *req_size = size;

+    *req_size = size - sizeof(struct Buffer_Tail);

+    ((struct Buffer_Tail*)((char *)return_val + size - 1))->flags = 0;

     return (void *)return_val;

 }

 

@@ -162,6 +174,7 @@

     /* total_size = pool->total_allocated; */

     /* TODO: can reduce this by pool->total_reclaimable if we want to 

      * be precise */

+

     /* Snag a block big enough for everything */

     new_block = alloc_new_block(interpreter, total_size, pool);

   

@@ -184,25 +197,70 @@

             Buffer *b = cur_buffer_arena->start_objects;

             UINTVAL i;

             for (i = 0; i < cur_buffer_arena->used; i++) {

-                if (b->bufstart) {

-                    /* Is the buffer live, and can we move it? */

-                    if (!(b->flags & (BUFFER_on_free_list_FLAG | 

-                                      BUFFER_constant_FLAG | 

-                                      BUFFER_immobile_FLAG))) 

-                    {

-                        memcpy(cur_spot, b->bufstart, b->buflen);

+                if (b->bufstart && 

+                    !(b->flags & ( BUFFER_on_free_list_FLAG

+                                 | BUFFER_constant_FLAG

+                                 | BUFFER_immobile_FLAG

+                                 | BUFFER_external_FLAG

+                                 ))) {

+                    struct

+                                               Buffer_Tail *tail = 

+                        (struct Buffer_Tail *)((char *)b->bufstart + b->buflen);

+                    ptrdiff_t offset = (ptrdiff_t)((STRING*)b)->strstart - 
+(ptrdiff_t)b->bufstart;

+                    /* buffer has already been moved; just change the header */

+                    if (b->flags & BUFFER_COW_FLAG

+                       && tail->flags & TAIL_moved_FLAG) {

+                        /* Find out who else references our data */

+                        Buffer* hdr = *(Buffer**)(b->bufstart);

+                        /* Make sure they know that we own it too */

+                        hdr->flags |= BUFFER_COW_FLAG;

+                        /* Now make sure we point to where the other guy does */

+                        b->bufstart = hdr->bufstart;

+                        /* And if we're a string, update strstart */

+                        /* Somewhat of a hack, but if we get per-pool collections, 

+                         * it should help ease the pain */

+                        if (b->flags & BUFFER_strstart_FLAG) {

+                            ((STRING*)b)->strstart = (char *)b->bufstart + offset;

+                        }

+                    }

+                    else 

+                    if (b->flags & BUFFER_selfpoolptr_FLAG) {

+                        struct Buffer_Tail *new_tail = 

+                               (struct Buffer_Tail *)((char *)cur_spot + b->buflen);

+                        /* Copy our memory to the new pool */

+                        memcpy(cur_spot, b->bufstart, 

+                               b->buflen);

+                        new_tail = 0;

+                        /* If we're COW */

+                        if (b->flags & BUFFER_COW_FLAG) {

+                          /* Let the old buffer know how to find us */

+                          *(Buffer**)(b->bufstart) = b;

+                          /* No guaranatees that our data is still COW, 

+                           * so assume not, and let the above code fix-up */

+                          b->flags &= ~BUFFER_COW_FLAG;

+                          /* Finally, let the tail know that we've moved,

+                           * so that any other references can know to look

+                           * for us and not re-copy */

+                          tail->flags |= TAIL_moved_FLAG;

+                        }

                         b->bufstart = cur_spot;

-                        cur_size = b->buflen;

+                        if (b->flags & BUFFER_strstart_FLAG) {

+                            ((STRING*)b)->strstart = (char *)b->bufstart + offset;

+                        }

+                        cur_size = b->buflen + sizeof(struct Buffer_Tail);

                         cur_size = (cur_size + header_pool->align_1) & 
~header_pool->align_1;

                         cur_spot += cur_size;

                     }

                 }

-                b = (Buffer *)((char *)b + object_size);

+                b = (Buffer *)((char*)b + object_size);

             }

         }

     }

 

     /* Run through all the out-of-band Buffer header pools and copy */

+    /* This code ignores COW, for now. This essentially means that if 

+     * any other buffers COW-reference data with the buffers below, 

+     * that data will get duplicated during this collection run. */

     for (j = 0; j < (INTVAL)( interpreter->arena_base->extra_buffer_headers.buflen / 
sizeof(Buffer*) ); j++) {

         Buffer** buffers = interpreter->arena_base->extra_buffer_headers.bufstart;

         Buffer* b = buffers[j];

@@ -211,11 +269,18 @@

                               BUFFER_constant_FLAG | 

                               BUFFER_immobile_FLAG)))

             {

+                struct Buffer_Tail *new_tail = 

+                       (struct Buffer_Tail *)((char *)cur_spot + b->buflen);

+                UINTVAL offset = (ptrdiff_t)((STRING*)b)->strstart - 
+(ptrdiff_t)b->bufstart;

                 memcpy(cur_spot, b->bufstart, b->buflen);

+                new_tail->flags = 0;

                 b->bufstart = cur_spot;

                 cur_size = b->buflen;

                 cur_size = (cur_size + BUFFER_ALIGNMENT - 1) & ~(BUFFER_ALIGNMENT - 
1);

                 cur_spot += cur_size;

+                if (b->flags & BUFFER_strstart_FLAG) {

+                    ((STRING*)b)->strstart = (char *)b->bufstart + offset;

+                }

             }

         }

     }

@@ -282,8 +347,10 @@

     buffer = from;

     copysize = (buffer->buflen > tosize ? tosize : buffer->buflen);

     if (interpreter) {

-        interpreter->arena_base->memory_pool->reclaimable +=

-            buffer->buflen;

+        if (!(buffer->flags & BUFFER_COW_FLAG)) {

+            interpreter->arena_base->memory_pool->reclaimable +=

+                buffer->buflen;

+        }

         mem = mem_allocate(interpreter, &alloc_size, 

                            interpreter->arena_base->memory_pool, BUFFER_ALIGNMENT-1);

     }

@@ -320,7 +387,9 @@

     pool = (str->flags & BUFFER_constant_FLAG)

          ? interpreter->arena_base->constant_string_pool

          : interpreter->arena_base->memory_pool;

-    pool->reclaimable += str->buflen;

+    if (!(str->flags & BUFFER_COW_FLAG)) {

+        pool->reclaimable += str->buflen;

+    }

 

     mem = mem_allocate(interpreter, &alloc_size, pool, STRING_ALIGNMENT-1);

     if (!mem) {

@@ -330,10 +399,11 @@

      * track down those bugs, this can be removed which would make

      * things cheaper */

     if (copysize) {

-        memcpy(mem, str->bufstart, copysize);

+         memcpy(mem, str->bufstart, copysize);

     }

     str->bufstart = mem;

     str->buflen = alloc_size;

+    str->strstart = str->bufstart;

     return mem;

 }

 

@@ -361,6 +431,7 @@

 

     str->buflen = 0;

     str->bufstart = NULL;

+    str->strstart = NULL;

 

     if (!interpreter) {

         str->bufstart = mem_allocate(NULL, &req_size, NULL, STRING_ALIGNMENT-1);

@@ -372,6 +443,7 @@

         str->bufstart = mem_allocate(interpreter, &req_size, pool, 
STRING_ALIGNMENT-1);

     }

     str->buflen = req_size;

+    str->strstart = str->bufstart;

     return str;

 }

 

Index: smallobject.c

===================================================================

RCS file: /cvs/public/parrot/smallobject.c,v

retrieving revision 1.9

diff -u -r1.9 smallobject.c

--- smallobject.c       12 Aug 2002 06:55:02 -0000      1.9

+++ smallobject.c       17 Aug 2002 08:20:13 -0000

@@ -143,7 +143,7 @@

 

 struct Small_Object_Pool *

 new_small_object_pool(struct Parrot_Interp *interpreter,

-                  size_t object_size, size_t objects_per_alloc)

+                  size_t object_size, size_t objects_per_alloc, STRING *name)

 {

     struct Small_Object_Pool *pool;

 

@@ -159,6 +159,8 @@

     pool->get_free_object = get_free_object;

     pool->alloc_objects = alloc_objects;

     pool->mem_pool = NULL;

+    pool->name = name;

+    pool->name->flags |= BUFFER_constant_FLAG;

     return pool;

 }

 

Index: string.c

===================================================================

RCS file: /cvs/public/parrot/string.c,v

retrieving revision 1.83

diff -u -r1.83 string.c

--- string.c    23 Jul 2002 02:09:27 -0000      1.83

+++ string.c    17 Aug 2002 08:20:14 -0000

@@ -17,6 +17,59 @@

 

 #define EXTRA_SIZE 4

 

+/* String COW support */

+static void

+unmake_COW(struct Parrot_Interp *interpreter, STRING *s)

+{

+#if 0

+    if (s->flags & BUFFER_constant_FLAG) {

+        /* this happens when we call string_to_cstring on 

+         * a constant string in order to print it

+         */

+        internal_exception(INVALID_OPERATION,

+                           "Cannot unmake COW on a constant header");

+    }

+    else

+#endif

+    if (s->flags & (BUFFER_COW_FLAG|BUFFER_constant_FLAG)) {

+        UINTVAL offset = (char *)s->strstart - (char *)s->bufstart;

+        /* Create new pool data for this header to use, 

+         * independant of the original COW data */

+        Parrot_reallocate_string(interpreter, s, s->buflen);

+        s->strstart = (char *)s->bufstart + offset;

+        s->flags &= ~(UINTVAL)(BUFFER_COW_FLAG | BUFFER_constant_FLAG);

+    }

+}

+

+static void

+make_COW(struct Parrot_Interp *interpreter, STRING *s)

+{

+    s->flags &= BUFFER_COW_FLAG;

+}

+

+/* clone a string header without allocating a new buffer

+ * i.e. create a 'copy-on-write' string

+ */

+static STRING *

+make_COW_reference(struct Parrot_Interp *interpreter, STRING *s)

+{

+    STRING *d;

+    if (s->flags & BUFFER_constant_FLAG) {

+        d = new_string_header(interpreter, 

+                              s->flags & ~(UINTVAL)BUFFER_constant_FLAG);

+        memcpy(d, s, sizeof (STRING));

+        d->flags |= BUFFER_COW_FLAG;

+        d->flags &= ~(UINTVAL)(BUFFER_constant_FLAG|BUFFER_selfpoolptr_FLAG);

+    }

+    else {

+        d = new_string_header(interpreter, s->flags);

+        s->flags |= BUFFER_COW_FLAG;

+        memcpy(d, s, sizeof (STRING));

+    }

+    return d;

+}

+

+

 /* Basic string stuff - creation, enlargement, destruction, etc. */

 

 /*=for api string string_init

@@ -62,9 +115,11 @@

             a = string_grow(interpreter, a, ((a->bufused + b->bufused)

                             - a->buflen) + EXTRA_SIZE);

         }

+        unmake_COW(interpreter, a);

+

         /* Tack B on the end of A */

-        mem_sys_memcopy((void *)((ptrcast_t)a->bufstart + a->bufused),

-                        b->bufstart, b->bufused);

+        mem_sys_memcopy((void *)((ptrcast_t)a->strstart + a->bufused),

+                        b->strstart, b->bufused);

         a->bufused += b->bufused;

         a->strlen += b->strlen;

         return a;

@@ -87,7 +142,7 @@

             a->encoding = b->encoding;

             a->type = b->type;

             a->language = b->language;

-            memcpy(a->bufstart, b->bufstart, b->bufused);

+            memcpy(a->strstart, b->strstart, b->bufused);

             return a;

         }

     }

@@ -121,7 +176,7 @@

     s->type = type;

 

     if (buffer) {

-        mem_sys_memcopy(s->bufstart, buffer, buflen);

+        mem_sys_memcopy(s->strstart, buffer, buflen);

         s->bufused = buflen;

         (void)string_compute_strlen(s);

     }

@@ -137,6 +192,7 @@

  */

 STRING *

 string_grow(struct Parrot_Interp * interpreter, STRING * s, INTVAL addlen) {

+    unmake_COW(interpreter, s);

     /* Don't check buflen, if we are here, we already checked. */

     Parrot_reallocate_string(interpreter, s, s->buflen + addlen);

     return s;

@@ -171,7 +227,7 @@

 INTVAL

 string_index(const STRING *s, UINTVAL idx)

 {

-    return s->encoding->decode(s->encoding->skip_forward(s->bufstart, idx));

+    return s->encoding->decode(s->encoding->skip_forward(s->strstart, idx));

 }

 

 /*=for api string string_ord

@@ -219,21 +275,9 @@

  * create a copy of the argument passed in

  */

 STRING *

-string_copy(struct Parrot_Interp *interpreter, const STRING *s)

+string_copy(struct Parrot_Interp *interpreter, STRING *s)

 {

-    STRING *d;

-    d = new_string_header(interpreter, 

-                          s->flags & ~(UINTVAL)BUFFER_constant_FLAG);

-    Parrot_allocate_string(interpreter, d, s->buflen);

-    d->bufused = s->bufused;

-    d->strlen = s->strlen;

-    d->encoding = s->encoding;

-    d->type = s->type;

-    d->language = s->language;

-

-    memcpy(d->bufstart, s->bufstart, s->buflen);

-

-    return d;

+    return make_COW_reference(interpreter, s);

 }

 

 /*=for api string string_transcode

@@ -241,7 +285,7 @@

  */

 STRING *

 string_transcode(struct Parrot_Interp *interpreter,

-                 const STRING *src, const ENCODING *encoding,

+                 STRING *src, const ENCODING *encoding,

                  const CHARTYPE *type, STRING **dest_ptr)

 {

     STRING *dest;

@@ -284,9 +328,9 @@

         }

     }

 

-    srcstart = (void *)src->bufstart;

+    srcstart = (void *)src->strstart;

     srcend = srcstart + src->bufused;

-    deststart = dest->bufstart;

+    deststart = dest->strstart;

     destend = deststart;

 

     while (srcstart < srcend) {

@@ -320,7 +364,7 @@

 INTVAL

 string_compute_strlen(STRING *s)

 {

-    s->strlen = s->encoding->characters(s->bufstart, s->bufused);

+    s->strlen = s->encoding->characters(s->bufstart, s->bufused) - 
+((UINTVAL)s->strstart - (UINTVAL)s->bufstart);

     return s->strlen;

 }

 

@@ -343,9 +387,9 @@

             }

             result = string_make(interpreter, NULL, a->bufused + b->bufused,

                                  a->encoding, 0, a->type);

-            mem_sys_memcopy(result->bufstart, a->bufstart, a->bufused);

-            mem_sys_memcopy((void *)((ptrcast_t)result->bufstart + a->bufused),

-                             b->bufstart, b->bufused);

+            mem_sys_memcopy(result->strstart, a->strstart, a->bufused);

+            mem_sys_memcopy((void *)((ptrcast_t)result->strstart + a->bufused),

+                             b->strstart, b->bufused);

             result->strlen = a->strlen + b->strlen;

             result->bufused = a->bufused + b->bufused;

         }

@@ -389,8 +433,8 @@

 

     /* copy s into dest num times */

     for (i = 0; i < num; i++) {

-        mem_sys_memcopy((void *)((ptrcast_t)dest->bufstart + s->bufused * i),

-                        s->bufstart, s->bufused);

+        mem_sys_memcopy((void *)((ptrcast_t)dest->strstart + s->bufused * i),

+                        s->strstart, s->bufused);

     }

 

     dest->bufused = s->bufused * num;

@@ -407,7 +451,7 @@

  * Allocate memory for d if necessary.

  */

 STRING *

-string_substr(struct Parrot_Interp *interpreter, const STRING *src,

+string_substr(struct Parrot_Interp *interpreter, STRING *src,

               INTVAL offset, INTVAL length, STRING **d)

 {

     STRING *dest;

@@ -425,7 +469,6 @@

         return string_make(interpreter, NULL, 0, src->encoding, 0, src->type);

     }

 

-    true_length = (UINTVAL)length;

     if (offset < 0) {

         true_offset = (UINTVAL)(src->strlen + offset);

     }

@@ -434,30 +477,29 @@

         internal_exception(SUBSTR_OUT_OF_STRING,

                            "Cannot take substr outside string");

     }

+

+    true_length = (UINTVAL)length;

     if (true_length > (src->strlen - true_offset)) {

         true_length = (UINTVAL)(src->strlen - true_offset);

     }

 

-    substart_off = (const char *)src->encoding->skip_forward(src->bufstart,

+    substart_off = (const char *)src->encoding->skip_forward(src->strstart,

                                                        true_offset) -

-        (char *)src->bufstart;

+        (char *)src->strstart;

     subend_off =

-        (const char *)src->encoding->skip_forward((char *)src->bufstart +

+        (const char *)src->encoding->skip_forward((char *)src->strstart +

                                             substart_off,

                                             true_length) -

-        (char *)src->bufstart;

-

-    dest =

-        string_make(interpreter, NULL, true_length * src->encoding->max_bytes,

-                    src->encoding, 0, src->type);

+        (char *)src->strstart;

 

     if (subend_off < substart_off) {

         internal_exception(SUBSTR_OUT_OF_STRING,

                            "subend somehow is less than substart");

     }

 

-    mem_sys_memcopy(dest->bufstart, (char *)src->bufstart + substart_off,

-                    (unsigned)(subend_off - substart_off));

+    /* do in-place if possible */

+    dest = make_COW_reference(interpreter,  src);

+    dest->strstart = (char *)dest->strstart + substart_off;

     dest->bufused = subend_off - substart_off;

     dest->strlen = true_length;

 

@@ -490,6 +532,8 @@

     UINTVAL true_length;

     INTVAL diff;

         

+    unmake_COW(interpreter, src);

+

     true_offset = (UINTVAL)offset;

     true_length = (UINTVAL)length;

 

@@ -516,14 +560,14 @@

     }

 

     /* Save the substring that is replaced for the return value */

-    substart_off = (const char *)src->encoding->skip_forward(src->bufstart,

+    substart_off = (const char *)src->encoding->skip_forward(src->strstart,

                                                        true_offset) -

-        (char *)src->bufstart;

+        (char *)src->strstart;

     subend_off =

-        (const char *)src->encoding->skip_forward((char *)src->bufstart +

+        (const char *)src->encoding->skip_forward((char *)src->strstart +

                                             substart_off,

                                             true_length) -

-        (char *)src->bufstart;

+        (char *)src->strstart;

 

     if (subend_off < substart_off) {

         internal_exception(SUBSTR_OUT_OF_STRING,

@@ -534,7 +578,7 @@

         string_make(interpreter, NULL, true_length * src->encoding->max_bytes,

                     src->encoding, 0, src->type);

 

-    mem_sys_memcopy(dest->bufstart, (char *)src->bufstart + substart_off,

+    mem_sys_memcopy(dest->strstart, (char *)src->strstart + substart_off,

                     (unsigned)(subend_off - substart_off));

     dest->bufused = subend_off - substart_off;

     dest->strlen = true_length;

@@ -555,14 +599,14 @@

         || ((INTVAL)src->bufused - (INTVAL)src->buflen) <= diff) {      

  

         if(diff != 0) {

-            mem_sys_memmove((char*)src->bufstart + substart_off + rep->bufused,

-                                (char*)src->bufstart + subend_off,

+            mem_sys_memmove((char*)src->strstart + substart_off + rep->bufused,

+                                (char*)src->strstart + subend_off,

                                 src->buflen - (subend_off - diff));

             src->bufused -= diff;

         }

 

-        mem_sys_memcopy((char*)src->bufstart + substart_off,

-                                rep->bufstart, rep->bufused);

+        mem_sys_memcopy((char*)src->strstart + substart_off,

+                                rep->strstart, rep->bufused);

         if(diff != 0) 

             (void)string_compute_strlen(src);    

     }

@@ -576,11 +620,11 @@

  

         /* Move the end of old string that isn't replaced to new offset

          * first */

-        mem_sys_memmove((char*)src->bufstart + subend_off + diff,

-                                (char*)src->bufstart + subend_off,

+        mem_sys_memmove((char*)src->strstart + subend_off + diff,

+                                (char*)src->strstart + subend_off,

                                 src->buflen - subend_off);

         /* Copy the replacement in */

-        mem_sys_memcopy((char *)src->bufstart + substart_off, rep->bufstart,

+        mem_sys_memcopy((char *)src->strstart + substart_off, rep->strstart,

                                 rep->bufused);

         src->bufused += diff;

         (void)string_compute_strlen(src);

@@ -596,8 +640,8 @@

 STRING *

 string_chopn(STRING *s, INTVAL n)

 {

-    const char *bufstart = s->bufstart;

-    const char *bufend = bufstart + s->bufused;

+    const char *strstart = s->strstart;

+    const char *bufend = strstart + s->bufused;

     UINTVAL true_n;

 

     true_n = (UINTVAL)n;

@@ -610,7 +654,7 @@

 

     bufend = s->encoding->skip_backward(bufend, true_n);

 

-    s->bufused = bufend - bufstart;

+    s->bufused = bufend - strstart;

     s->strlen = s->strlen - true_n;

 

     return s;

@@ -646,9 +690,9 @@

                               NULL);

     }

 

-    s1start = s1->bufstart;

+    s1start = s1->strstart;

     s1end = s1start + s1->bufused;

-    s2start = s2->bufstart;

+    s2start = s2->strstart;

     s2end = s2start + s2->bufused;

 

     while (cmp == 0 && s1start < s1end && s2start < s2end) {

@@ -686,7 +730,7 @@

 

     if (len == 1) {

 

-        UINTVAL c = s->encoding->decode(s->bufstart);

+        UINTVAL c = s->encoding->decode(s->strstart);

 

         if (s->type->is_digit(c) && s->type->get_digit(c) == 0) {

             return 0;

@@ -715,7 +759,7 @@

     INTVAL i = 0;

 

     if (s) {

-        const char *start = s->bufstart;

+        const char *start = s->strstart;

         const char *end = start + s->bufused;

         int sign = 1;

         INTVAL in_number = 0;

@@ -755,7 +799,7 @@

     FLOATVAL f = 0.0;

 

     if (s) {

-        const char *start = s->bufstart;

+        const char *start = s->strstart;

         const char *end = start + s->bufused;

         int sign = 1;

         INTVAL seen_dot = 0;

@@ -880,10 +924,16 @@

 {

     char *cstring;

 

-    if (s->buflen == s->bufused)

+    if (s->flags & BUFFER_constant_FLAG) {

+        s = make_COW_reference(interpreter,s);

+    }

+    unmake_COW(interpreter, s);

+

+       if (s->buflen == s->bufused) {

         string_grow(interpreter, s, 1);

+    }

 

-    cstring = s->bufstart;

+    cstring = s->strstart;

 

     cstring[s->bufused] = 0;

 

Index: trace.c

===================================================================

RCS file: /cvs/public/parrot/trace.c,v

retrieving revision 1.18

diff -u -r1.18 trace.c

--- trace.c     23 Jul 2002 02:09:27 -0000      1.18

+++ trace.c     17 Aug 2002 08:20:14 -0000

@@ -48,7 +48,7 @@

                 break;

             case PARROT_ARG_SC:

                 escaped = PDB_escape(interpreter->code->const_table->

-                                     constants[*(pc + i)]->string->bufstart,

+                                     constants[*(pc + i)]->string->strstart,

                                      interpreter->code->const_table->

                                      constants[*(pc + i)]->string->strlen);

                 fprintf(stderr, "\"%s\"", escaped);

@@ -73,7 +73,7 @@

             case PARROT_ARG_S:

                 if (interpreter->ctx.string_reg.registers[*(pc + i)]) {

                     escaped = PDB_escape(interpreter->ctx.string_reg.

-                                         registers[*(pc + i)]->bufstart,

+                                         registers[*(pc + i)]->strstart,

                                          interpreter->ctx.string_reg.

                                          registers[*(pc + i)]->strlen);

                     fprintf(stderr, "S%ld=\"%s\"", (long)*(pc + i),

Index: warnings.c

===================================================================

RCS file: /cvs/public/parrot/warnings.c,v

retrieving revision 1.10

diff -u -r1.10 warnings.c

--- warnings.c  23 Jul 2002 02:09:27 -0000      1.10

+++ warnings.c  17 Aug 2002 08:20:14 -0000

@@ -37,7 +37,7 @@

         return -1;

     }

 

-    if (PIO_write(interpreter, PIO_STDERR(interpreter), targ->bufstart,

+    if (PIO_write(interpreter, PIO_STDERR(interpreter), targ->strstart,

          targ->bufused) < 0) {

         return -2;

     }

@@ -77,7 +77,7 @@

         return -1;

     }

 

-    if (PIO_write(interpreter, PIO_STDERR(interpreter), targ->bufstart,

+    if (PIO_write(interpreter, PIO_STDERR(interpreter), targ->strstart,

          targ->bufused) < 0) {

         return -2;

     }

Index: classes/perlstring.pmc

===================================================================

RCS file: /cvs/public/parrot/classes/perlstring.pmc,v

retrieving revision 1.27

diff -u -r1.27 perlstring.pmc

--- classes/perlstring.pmc      8 Aug 2002 20:57:48 -0000       1.27

+++ classes/perlstring.pmc      17 Aug 2002 08:20:15 -0000

@@ -72,7 +72,7 @@

         STRING* s2 = (STRING*)other->data;

         return (INTVAL)( other->vtable == SELF->vtable &&

                           s1->bufused   == s2->bufused  &&

-            (memcmp(s1->bufstart,s2->bufstart,(size_t)s1->bufused)==0));

+            (memcmp(s1->strstart,s2->bufstart,(size_t)s1->bufused)==0));

     }

 

     void set_integer (PMC* value) {

Index: docs/jit.pod

===================================================================

RCS file: /cvs/public/parrot/docs/jit.pod,v

retrieving revision 1.5

diff -u -r1.5 jit.pod

--- docs/jit.pod        1 Aug 2002 19:57:24 -0000       1.5

+++ docs/jit.pod        17 Aug 2002 08:20:17 -0000

@@ -164,9 +164,9 @@

 

 Gets replaced by the C<FLOATVAL> constant specified in the I<n>th argument.

 

-B<STRING_CONST_bufstart[n]>

+B<STRING_CONST_strstart[n]>

 

-Gets replaced by C<bufstart> of the C<STRING> constant specified in the I<n>th 
argument.

+Gets replaced by C<strstart> of the C<STRING> constant specified in the I<n>th 
+argument.

 

 B<STRING_CONST_buflen[n]>

 

@@ -285,7 +285,7 @@

 

  Parrot_print_sc {

     movl $1,&TEMP_INT[1]

-    SYSTEMCALL(WRITE,3, A&TEMP_INT[1] V&STRING_CONST_bufstart[1] 
V*STRING_CONST_strlen[1])

+    SYSTEMCALL(WRITE,3, A&TEMP_INT[1] V&STRING_CONST_strstart[1] 
+V*STRING_CONST_strlen[1])

  }

 

  Parrot_end {

Index: docs/strings.pod

===================================================================

RCS file: /cvs/public/parrot/docs/strings.pod,v

retrieving revision 1.8

diff -u -r1.8 strings.pod

--- docs/strings.pod    10 Jan 2002 23:23:03 -0000      1.8

+++ docs/strings.pod    17 Aug 2002 08:20:18 -0000

@@ -162,15 +162,16 @@

 how the C<STRING> structure works. You can find the definition of this

 structure in F<string.h>:

 

-    struct parrot_string {

-      void *bufstart;

-      INTVAL buflen;

-      INTVAL bufused;

-      INTVAL flags;

-      INTVAL strlen;

-      INTVAL encoding;

-      INTVAL type;

-      INTVAL unused;

+    struct parrot_string_t {

+        void *bufstart;

+        UINTVAL buflen;

+        UINTVAL flags;

+        UINTVAL bufused;

+        void *strstart;

+        UINTVAL strlen;

+        const ENCODING *encoding;

+        const CHARTYPE *type;

+        INTVAL language;

     };

 

 Let's look at each element of this structure in turn.

@@ -187,6 +188,11 @@

 This is used for memory allocation; it tells you the currently allocated

 size of the buffer in bytes.

 

+=head2 C<flags>

+

+This is a general holding area for string flags. The exact flags

+required have not yet been determined.

+

 =head2 C<bufused>

 

 C<bufused> on the other hand, contains the number of bytes out of the

@@ -194,10 +200,11 @@

 C<buflen>, is used by the buffer growing algorithm to determine when and

 by how much to grow the allocation buffer.

 

-=head2 C<flags>

+=head2 C<strstart>

 

-This is a general holding area for string flags. The exact flags

-required have not yet been determined.

+This stores the actual start of the string. In the case of COW strings 

+holding references to portions of a larger string, (for example, in regex 

+match variables), this is a pointer into the start of the string.

 

 =head2 C<strlen>

 

@@ -242,9 +249,9 @@

 

 XXX I don't know what this is for.

 

-=head2 C<unused>

+=head2 C<language>

 

-This field is, as its name suggests, unused; however, it can be used to

+This field is currently unused; however, it can be used to

 hold a pointer to the correct vtable for foreign strings.

 

 =head1 String Vtable Functions

Index: include/parrot/smallobject.h

===================================================================

RCS file: /cvs/public/parrot/include/parrot/smallobject.h,v

retrieving revision 1.5

diff -u -r1.5 smallobject.h

--- include/parrot/smallobject.h        28 Jul 2002 23:24:45 -0000      1.5

+++ include/parrot/smallobject.h        17 Aug 2002 08:20:19 -0000

@@ -36,6 +36,7 @@

     void *mem_pool;

     size_t start_arena_memory;

     size_t end_arena_memory;

+    STRING* name;

 };

 

 INTVAL contained_in_pool(struct Parrot_Interp *,

@@ -59,7 +60,7 @@

 void alloc_objects(struct Parrot_Interp *, struct Small_Object_Pool *);

 

 struct Small_Object_Pool * new_small_object_pool(struct Parrot_Interp *,

-                                                 size_t, size_t);

+                                                 size_t, size_t, STRING *);

 

 struct Small_Object_Pool * get_sized_small_object_pool(struct Parrot_Interp *,

                                                        size_t);

Index: include/parrot/string.h

===================================================================

RCS file: /cvs/public/parrot/include/parrot/string.h,v

retrieving revision 1.43

diff -u -r1.43 string.h

--- include/parrot/string.h     18 Jul 2002 04:30:42 -0000      1.43

+++ include/parrot/string.h     17 Aug 2002 08:20:19 -0000

@@ -26,6 +26,7 @@

     UINTVAL buflen;

     UINTVAL flags;

     UINTVAL bufused;

+    void *strstart;

     UINTVAL strlen;

     const ENCODING *encoding;

     const CHARTYPE *type;

@@ -45,6 +46,15 @@

 

 typedef struct parrot_string_t String;

 

+

+/* Tail added to end of string buffers; used for COW GC */

+struct Buffer_Tail {

+    unsigned char flags;

+};

+typedef enum TAIL_flag {

+    TAIL_moved_FLAG = 1 << 0,

+} TAIL_flags;

+

 /* Buffer flags */

 typedef enum BUFFER_flag {

     /* bits the GC can keep its dirty mitts off of */

@@ -76,12 +86,18 @@

     /* For debugging, report when this buffer gets moved around */

     BUFFER_report_FLAG = 1 << 16,

     /* Generation in the GC pools */

-    BUFFER_generation_FLAG = 1 << 17 | 1 << 18

+    BUFFER_generation_FLAG = 1 << 17 | 1 << 18,

+    /* Buffer header has a strstart which needs to be updated with bufstart */

+    BUFFER_strstart_FLAG = 1 << 19,

+    /* Buffer's memory data is in this header's header pool's memory pool */

+    /* for now, this is true in constant headers for constant buffer data,and 

+     * true for non-constant headers pointing at non-constant buffer data */

+    BUFFER_selfpoolptr_FLAG = 1 << 20,

 } BUFFER_flags;

 

 /* stringinfo parameters */

 #define STRINGINFO_HEADER   1

-#define STRINGINFO_BUFSTART 2

+#define STRINGINFO_STRSTART 2

 #define STRINGINFO_BUFLEN   3

 #define STRINGINFO_FLAGS    4

 #define STRINGINFO_BUFUSED  5

Index: include/parrot/string_funcs.h

===================================================================

RCS file: /cvs/public/parrot/include/parrot/string_funcs.h,v

retrieving revision 1.13

diff -u -r1.13 string_funcs.h

--- include/parrot/string_funcs.h       24 Jun 2002 16:41:28 -0000      1.13

+++ include/parrot/string_funcs.h       17 Aug 2002 08:20:19 -0000

@@ -23,7 +23,7 @@

 STRING *string_repeat(struct Parrot_Interp *, const STRING *, UINTVAL,

                       STRING **);

 STRING *string_chopn(STRING *, INTVAL);

-STRING *string_substr(struct Parrot_Interp *, const STRING *, INTVAL,

+STRING *string_substr(struct Parrot_Interp *, STRING *, INTVAL,

                       INTVAL, STRING **);

 STRING *string_replace(struct Parrot_Interp *, STRING *, INTVAL, INTVAL,

                        STRING *, STRING **);

@@ -43,8 +43,8 @@

 STRING *string_make(struct Parrot_Interp *, const void *buffer,

                     UINTVAL buflen, const ENCODING *, UINTVAL flags,

                     const CHARTYPE *);

-STRING *string_copy(struct Parrot_Interp *, const STRING *);

-STRING *string_transcode(struct Parrot_Interp *, const STRING *src,

+STRING *string_copy(struct Parrot_Interp *, STRING *);

+STRING *string_transcode(struct Parrot_Interp *, STRING *src,

                          const ENCODING *, const CHARTYPE *,

                          STRING **dest_ptr);

 void string_init(void);

Index: t/op/gc.t

===================================================================

RCS file: /cvs/public/parrot/t/op/gc.t,v

retrieving revision 1.2

diff -u -r1.2 gc.t

--- t/op/gc.t   3 Aug 2002 07:30:09 -0000       1.2

+++ t/op/gc.t   17 Aug 2002 08:20:22 -0000

@@ -85,3 +85,4 @@

 starting

 ending

 OUTPUT

+

Index: t/op/string.t

===================================================================

RCS file: /cvs/public/parrot/t/op/string.t,v

retrieving revision 1.29

diff -u -r1.29 string.t

--- t/op/string.t       27 Jul 2002 20:18:12 -0000      1.29

+++ t/op/string.t       17 Aug 2002 08:20:23 -0000

@@ -1,6 +1,6 @@

 #! perl -w

 

-use Parrot::Test tests => 87;

+use Parrot::Test tests => 90;

 use Test::More;

 

 output_is( <<'CODE', <<OUTPUT, "set_s_s|sc" );

@@ -1289,6 +1289,56 @@

 0

 OUTPUT

 

+

+output_is( <<'CODE', <<OUTPUT, "concat/substr (COW)" );

+       set S0, "<JA"

+       set S1, "PH>"

+       set S2, ""

+       concat S2, S2, S0

+       concat S2, S2, S1

+       print S2

+       print "\n"

+       substr S0, S2, 1, 4

+       print S0

+       print "\n"

+       end

+CODE

+<JAPH>

+JAPH

+OUTPUT

+

+output_is( <<'CODE', <<OUTPUT, "constant to cstring" );

+  stringinfo I0, "\n", 2

+  stringinfo I1, "\n", 2

+  eq I1, I0, ok1

+  print "N"

+ok1:

+  print "OK"

+  print "\n"

+  stringinfo I2, "\n", 2

+  eq I2, I0, ok2

+  print "N"

+ok2:

+  print "OK\n"

+  end

+CODE

+OK

+OK

+OUTPUT

+

+output_is( <<'CODE', <<OUTPUT, "COW with chopn leaving original untouched" );

+  set S0, "ABCD"

+  clone S1, S0

+  chopn S0, 1

+  print S0

+  print "\n"

+  print S1

+  print "\n"

+  end

+CODE

+ABC

+ABCD

+OUTPUT

 

 # Set all string registers to values given by &$_[0](reg num)

 sub set_str_regs {

Reply via email to