> First and foremost, is there any compelling reason, to have totally
> different structures for PMCs and Buffers?
> - Both have a ->data aka ->bufstart
> - Both have ->flags, that have vastly the same meaning.

As Jason said in another message, Dan has changed his mind from yesteryear and decided that buffers and PMCs should be the same structure. There are a few ideas of my own that would be better implemented if we unified the two, but unfortunately I haven't had the motivation to unify them. I made a few passes at it, and the task is just monumental: lots of search-and-replace, and lots of debugging to track down why the semantics have broken. :)

I could attempt a piecemeal conversion, submitting patches that get us a bit closer, except that each patch would not be acceptable on its own due to the confusion introduced, i.e., having PMCs use BUFFER_*_FLAGs, or having worse memory usage/DOD speeds because of the larger size of buffers/PMCs after they are unified, etc.

> This separation means two different routines for marking in DOD,
> separated allocation and so on - a lot of code duplication, which
> is IMHO not necessary.
> Finally they are both SmallObjects and handled as these in the deep
> inyards of arenas.

Currently, there are two different marking routines, but with good reason. PMCs can recursively reference PMCs, and thus use their next_for_GC field to avoid recursion in DOD (see below). Buffers, on the other hand, are for raw data (although PMCs referencing buffers can do magic with the buffer's data itself). The fact that both use the smallobject allocator is a more recent introduction; it wasn't that way in their original design.

> DOD considerations
>
> We have currently:
> - buffer_lives -> BUFFER_live_FLAG
> - mark_used -> PMC_live_flag + next_for_GC
>
> If PMC and Buffers are unified, it should be possible to mark them in
> one recursive process:
>
> mark(buffer) {
>     if (life_flag) // already done
>         return
>     set life_flag
>     if (buffer_is_buffer_ptr)
>         mark(buffer->data)
>     else if(buffer_is_array_of_buffers)
>         mark(...) for (buffer->data[..])
>     else if(has_custom_mark)
>         ((PMC*)buffer)->vtable->mark()
> }

Yes, that works.
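For what it's worth, here is a rough C sketch of that recursive scheme next to the next_for_GC list walk described in the following paragraph. The Obj struct, its single child pointer, and the helper names are made up for illustration and are not the real Parrot headers; ending the list with NULL is also just my assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical unified header; NOT the real Parrot PMC/Buffer layout. */
typedef struct Obj {
    int live;                 /* stands in for the live flag */
    struct Obj *child;        /* one outgoing reference, for brevity */
    struct Obj *next_for_GC;  /* link used only while marking */
} Obj;

static int max_depth = 0;     /* deepest recursion seen by mark_recursive */

/* Link objs[0] -> objs[1] -> ... -> objs[n-1] and clear all mark state. */
static void build_chain(Obj *objs, int n) {
    int i;
    assert(n > 0);
    for (i = 0; i < n; i++) {
        objs[i].live = 0;
        objs[i].child = (i + 1 < n) ? &objs[i + 1] : NULL;
        objs[i].next_for_GC = NULL;
    }
    max_depth = 0;
}

/* Recursive marking, in the spirit of the quoted pseudocode:
 * one C stack frame per object along a reference chain. */
static void mark_recursive(Obj *o, int depth) {
    if (o == NULL || o->live)
        return;               /* already done */
    o->live = 1;
    if (depth > max_depth)
        max_depth = depth;
    mark_recursive(o->child, depth + 1);
}

/* Iterative marking: newly found objects are appended to a list
 * threaded through next_for_GC, and the list is walked in a loop.
 * Returns the number of objects marked by this call. */
static int mark_iterative(Obj *root) {
    Obj *cur, *tail;
    int marked = 0;
    if (root == NULL || root->live)
        return 0;
    root->live = 1;
    root->next_for_GC = NULL;
    tail = root;
    for (cur = root; cur != NULL; cur = cur->next_for_GC) {
        Obj *c = cur->child;
        marked++;
        if (c != NULL && !c->live) {
            c->live = 1;
            c->next_for_GC = NULL;
            tail->next_for_GC = c;  /* append to the work list */
            tail = c;
        }
    }
    return marked;
}
```

With a chain of 500 objects, mark_recursive(&objs[0], 1) leaves max_depth at 500, while mark_iterative(&objs[0]) marks the same 500 objects from a single stack frame.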
But there is a reason for next_for_GC. The next_for_GC field creates a linked list, which is then iterated over in a for loop. There is no recursion, no chance of blowing the C stack, no worries about the overhead of recursive calls, etc. So while it may seem more memory efficient not to use next_for_GC, it actually isn't. A linked list of 500 elements would cause 500 nested recursive calls and use more memory than a next_for_GC solution would.

> 2)
> What is PMCs member:
> SYNC *synchronize; /* undocumented + unused */

This is for multi-threaded access, where you need to synchronize on something as a way to control access to the PMC. Of course, this is entirely a placeholder, as we don't have multi-threading or multiple interpreters. :)

> 3)
> What are: arena_base->extra_buffer_headers;

This is an array of pointers to buffer headers. For example, the interpreter has some buffers "inlined" into the actual interpreter struct. These headers aren't part of any header pool, but the data they reference should be retained when pools are copied. This could be called a hack, and maybe we should force all headers to come from header pools. But there is no compelling reason to do so at this point in time. (I have some ideas that would require it, though.)

> 4)
> Is there any deeper reason that the sized_small_object pool allocates
> unused slots for intermediate object sizes?

This currently isn't used, although my plan was to use it for KEY/HASH structs before their designs were changed such that this wasn't necessary. :) The sized_small_object pool is intended for cases where you only care about allocating objects of a given size. It creates an array of sized-pool pointers, and then indexes into this array by sizeof(this smallobject) / sizeof(void*). If there are no objects of a given size, then there is just a null pointer in that slot of the array. Are you seeing something else?
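To illustrate the sized-pool indexing just described, a minimal sketch. SizedPool, MAX_SLOTS, and both function names are hypothetical, and creating the pool lazily on first use and rounding odd sizes up to a whole pointer word are my assumptions for the sketch, not a description of the actual code.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SLOTS 16  /* hypothetical cap on object size, in pointer words */

/* Hypothetical stand-in for a per-size pool; not the real Parrot struct. */
typedef struct SizedPool {
    size_t object_size;       /* size of every object this pool hands out */
} SizedPool;

/* One slot per object size; NULL means no pool of that size exists. */
static SizedPool *sized_pools[MAX_SLOTS];
static SizedPool pool_storage[MAX_SLOTS];  /* backing store for the sketch */

/* Slot index = object size in pointer-sized words (rounded up). */
static size_t slot_for_size(size_t size) {
    return (size + sizeof(void *) - 1) / sizeof(void *);
}

/* Look up the pool for a size, creating it on first request.
 * Slots for sizes nobody asked for stay NULL. */
static SizedPool *get_sized_pool(size_t size) {
    size_t slot = slot_for_size(size);
    assert(slot < MAX_SLOTS);
    if (sized_pools[slot] == NULL) {
        pool_storage[slot].object_size = slot * sizeof(void *);
        sized_pools[slot] = &pool_storage[slot];
    }
    return sized_pools[slot];
}
```

In this sketch, requesting a pool for objects two pointers wide fills slot 2 only; the neighbouring slots cost one null pointer each and nothing more.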

> Finally, if we ever have multiple interpreters, which can be built
> dynamically, _all_ structures including the interpreter itself and
> it's internal data structures should have to be derived from a Buffer
> object (or have to manage there own destroy method). If not, these
> interpreters will leak memory like the current one, and this is more
> then a bunch of sieves.

Perhaps. But buffers are for storing data. The "proper" way to make the interpreter a "buffer" would be a sized buffer with lots of fields attached to it, and maybe some data in bufstart (not sure). Then one would need to wrap it in a PMC in order to give it a custom mark() method, so that the fields of the sized-buffer interpreter header could themselves be marked with mark() and buffer_lives(). (Currently, this is done in dod.c.) If they were unified, the PMC would be an interpreter referencing a sized buffer header. Or if we had sized PMCs, the fields could be part of the PMC itself, avoiding the need for a buffer.

However, as far as leaking memory goes, there is no reason that interpreters have to be PMCs/buffers. Just as we have a make_interpreter to create an interpreter, we can have an unmake_interpreter that destroys it. I don't think we want interpreters appearing and disappearing with references; they should be explicitly created and destroyed. But that's a discussion for another thread. My point is that not everything needs to be traced, and some stuff can be handled manually, as long as the Perl programmer doesn't see it directly.

Hope this helps answer your questions,
Mike Lambert