Re: Of PMCs Buffers and memory management

Mike Lambert Fri, 27 Sep 2002 14:02:37 -0700

> First and foremost, is there any compelling reason, to have totally
> different structures for PMCs and Buffers?
> - Both have a ->data aka ->bufstart
> - Both have ->flags, that have vastly the same meaning.


As Jason said in another message, Dan has changed his mind from
yesteryear, and decided that buffers and pmcs should be the same
structure. There are a few ideas of my own that would be better
implemented if we unified the two, but unfortunately, I haven't had the
motivation to unify them. I made a few passes at it, and the task is just
monumental, in terms of lots of search and replaces, and lots of debugging
to track down why the semantics have broken. :)

I could attempt a piecemeal conversion, submitting patches that get us a
bit closer, except that each patch would not be acceptable on its own due
to the confusion introduced, i.e., having PMCs use BUFFER_*_FLAGs, or having
worse memory usage/dod-speeds because of the larger size of buffers/pmcs
after they are unified, etc.

> This separation means two different routines for marking in DOD,
> separated allocation and so on - a lot of code duplication, which
> is IMHO not necessary.
> Finally they are both SmallObjects and handled as these in the deep
> inyards of arenas.

Currently, there are two different marking routines, but with good reason.
PMCs can recursively reference PMCs, and thus use their next_for_GC to
avoid recursion in DOD (see below). Buffers, on the other hand, are for raw data
(although pmcs referencing buffers can do magic with the buffer's data
itself). The fact that both use the smallobject allocator is a more recent
introduction, and it wasn't that way in their original design.

> DOD considerations
>
> We have currently:
> - buffer_lives -> BUFFER_live_FLAG
> - mark_used -> PMC_live_flag + next_for_GC
>
> If PMC and Buffers are unified, it should be possible to mark them in
> one recursive process:
>
> mark(buffer) {
>   if (life_flag)      // already done
>     return
>   set life_flag
>   if (buffer_is_buffer_ptr)
>     mark(buffer->data)
>   else if(buffer_is_array_of_buffers)
>     mark(...) for (buffer->data[..])
>   else if(has_custom_mark)
>     ((PMC*)buffer)->vtable->mark()
> }

Yes, that works. But there is a reason for next_for_GC. The next_for_GC
creates a linked list. This linked list is then iterated over in a for
loop. There is no recursion, no chance of blowing the C stack, no worries
about the overheads of recursive calls, etc.

So while it may seem more memory efficient to not use next_for_GC, it
actually isn't. A linked list of 500 elements would cause 500 recursive
calls and use more memory than would a next_for_GC solution.
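
To make the linked-list idea concrete, here is a minimal sketch of
iterative marking. The Obj struct, the fixed two-slot children array, and
the trace_from() name are assumptions made up for illustration; this is
not the actual dod.c code. Only next_for_GC and the idea of a live flag
come from the discussion above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified header; NOT Parrot's real PMC struct. */
typedef struct Obj {
    int         live;         /* stands in for the live flag */
    struct Obj *next_for_GC;  /* threads the object onto the worklist */
    struct Obj *children[2];  /* objects this one references, or NULL */
} Obj;

/* Append obj to the worklist unless it is NULL or already marked.
 * Returns the (possibly new) tail of the worklist. */
static Obj *mark_used(Obj *obj, Obj *tail)
{
    if (obj == NULL || obj->live)
        return tail;
    obj->live = 1;
    obj->next_for_GC = NULL;
    tail->next_for_GC = obj;
    return obj;
}

/* Mark everything reachable from root with a plain loop: no recursion,
 * so C stack depth stays constant however deep the object graph is.
 * Returns the number of objects visited. */
static size_t trace_from(Obj *root)
{
    size_t count = 0;
    Obj *tail = root;

    assert(root != NULL);
    root->live = 1;
    root->next_for_GC = NULL;

    for (Obj *cur = root; cur != NULL; cur = cur->next_for_GC) {
        count++;
        for (size_t i = 0; i < 2; i++)
            tail = mark_used(cur->children[i], tail);
    }
    return count;
}
```

The worklist costs one pointer per header, whereas the recursive version
costs a full stack frame per nesting level while the mark is in progress.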

> 2)
> What is PMCs member:
>      SYNC *synchronize; /* undocumented + unused */

This is for multi-threaded access, where you need to synchronize on
something as a way to control access to the PMC. Of course, this is
entirely placeholder, as we don't have multi-threading or multiple
interpreters. :)

> 3)
> What are: arena_base->extra_buffer_headers;

This is an array of pointers to buffer headers. For example, the
interpreter has some buffers "inlined" into the actual interpreter struct.
These headers aren't part of any header pools, but the data they reference
should be retained when pools are copied. This could be called a hack, and
maybe we should force all headers to come from header pools. But there is
no compelling reason to do so, at this point in time. (I have some ideas
that would require it, tho)

> 4)
> Is there any deeper reason that the sized_small_object pool allocates
> unused slots for intermediate object sizes?

This currently isn't used, although my plan was to use it for KEY/HASH
structs before their designs were changed such that this wasn't necessary.
:)

The sized_small_object pool is intended for pools where you only care
about allocating objects of a given size. It creates an array of
sized-pool pointers. Then it indexes into this array by
sizeof-this-smallobject / sizeof(void*). If there are no objects of a
given size, then there is just a null pointer in the array. Are you seeing
something else?
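
Roughly, the lookup described above could be sketched like this. The
struct names, get_sized_pool(), and the lazy growth of the array are
hypothetical, and rounding the size up to a whole number of pointers is my
assumption here; the real allocator may differ on all of these points.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-size pool. */
typedef struct SizedPool {
    size_t object_size;  /* size of every object served by this pool */
} SizedPool;

/* Array of pool pointers, indexed by object size in pointer units. */
typedef struct PoolTable {
    SizedPool **pools;
    size_t      num_pools;
} PoolTable;

/* Return the pool for object_size, creating it on first use.
 * Slots for sizes that were never requested stay NULL.
 * Returns NULL if memory runs out. */
static SizedPool *get_sized_pool(PoolTable *t, size_t object_size)
{
    size_t idx;

    assert(t != NULL && object_size > 0);
    /* Assumption: round up to a whole number of pointers. */
    idx = (object_size + sizeof(void *) - 1) / sizeof(void *);

    if (idx >= t->num_pools) {
        SizedPool **grown = realloc(t->pools, (idx + 1) * sizeof *grown);
        if (grown == NULL)
            return NULL;
        for (size_t i = t->num_pools; i <= idx; i++)
            grown[i] = NULL;  /* intermediate sizes: no pool yet */
        t->pools = grown;
        t->num_pools = idx + 1;
    }
    if (t->pools[idx] == NULL) {
        SizedPool *p = malloc(sizeof *p);
        if (p == NULL)
            return NULL;
        p->object_size = idx * sizeof(void *);
        t->pools[idx] = p;
    }
    return t->pools[idx];
}
```

In this sketch an unused intermediate size costs one NULL pointer in the
array and nothing more.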

> Finally, if we ever have multiple interpreters, which can be built
> dynamically, _all_ structures including the interpreter itself and
> it's internal data structures should have to be derived from a Buffer
> object (or have to manage there own destroy method). If not, these
> interpreters will leak memory like the current one, and this is more
> then a bunch of sieves.

Perhaps. But buffers are for storing data. The "proper" way to make it a
"buffer" would be a sized buffer with lots of fields attached to it, and
maybe some data in bufstart (not sure). Then one'd need to wrap it in a
PMC in order to give it a custom mark() method, so that fields of the
sized buffer interpreter header could be marked() and buffer_lives()
themselves. (Currently, this is done in dod.c).

If they were unified, the PMC would be an interpreter referencing a sized
buffer header. Or if we had sized PMCs, the fields could be part of
it, avoiding the need for a buffer.

However, as far as leaking memory, there is no reason that interpreters
have to be PMC/buffers. Just as we have a make_interpreter to create an
interpreter, we can have an unmake_interpreter that destroys the
interpreter. I don't think we want interpreters appearing and
disappearing with references...they should be explicitly created and
destroyed. But that's a discussion for another thread. My point is that
all things don't need to be traced, and some stuff can be handled
manually, as long as the perl programmer doesn't see it directly.
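
As a rough illustration of the explicit create/destroy pairing, assuming
a made-up DemoInterp struct and demo_* function names (these are not
Parrot's real interpreter struct or API):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical interpreter with one manually managed internal array. */
typedef struct DemoInterp {
    void  **registers;
    size_t  num_registers;
} DemoInterp;

/* Allocate the interpreter and everything it owns.
 * Returns NULL if any allocation fails. */
static DemoInterp *demo_make_interpreter(size_t num_registers)
{
    DemoInterp *in;

    assert(num_registers > 0);
    in = malloc(sizeof *in);
    if (in == NULL)
        return NULL;
    in->registers = calloc(num_registers, sizeof *in->registers);
    if (in->registers == NULL) {
        free(in);
        return NULL;
    }
    in->num_registers = num_registers;
    return in;
}

/* Release everything demo_make_interpreter allocated; no tracing needed. */
static void demo_unmake_interpreter(DemoInterp *in)
{
    if (in == NULL)
        return;
    free(in->registers);
    free(in);
}
```

The destroy function simply mirrors the create function, so nothing the
interpreter owns has to be visible to the collector.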

Hope this helps answer your questions,
Mike Lambert
