Hi,

At the moment, under Win32, virtually all of the Perl 6 test suite fails unless run with -G (disabling garbage collection). The problem for at least one of them is that the free list appears to be corrupted.

First, a few notes on the free list. Parrot allocates large chunks of memory, called pools, and then allocates objects out of these pools itself. All garbage collectible objects start out with the same two things:

typedef struct pobj_t {
   UnionVal u;
   Parrot_UInt flags;
} pobj_t;

The UnionVal contains a range of things, but is at least the size of two integers or two pointers. Normally this is used to store some of the data for the object itself. However, after the object is freed, the first pointer-sized chunk of the UnionVal is used for another thing: to store a linked list of free objects. When we want a new object, unless the free list is empty (and thus NULL), we take the object on the front and set the free list to whatever that referred to as the next thing on the free list, like this:

   if (!pool->free_list)
       (*pool->more_objects)(interp, pool);
   ptr = pool->free_list;
   pool->free_list = *(void **)ptr;
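
To make the push and pop halves of this scheme concrete, here is a minimal, self-contained sketch of an intrusive free list. It is not Parrot's code: demo_obj, demo_pool, demo_free, demo_alloc and demo_lifo_check are hypothetical names made up for illustration, and the sketch returns NULL on an empty list instead of asking the pool for more objects.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a pooled object (not Parrot's pobj_t). */
typedef union demo_obj {
    void *next_free;   /* meaningful only while the object is free */
    long  payload[2];  /* meaningful only while the object is in use */
} demo_obj;

/* Hypothetical stand-in for a pool: just the head of the free list. */
typedef struct demo_pool {
    void *free_list;   /* first free object, or NULL when empty */
} demo_pool;

/* Push: the freed object's first word becomes the link to the old head. */
void demo_free(demo_pool *pool, demo_obj *obj) {
    assert(obj != NULL);
    obj->next_free = pool->free_list;
    pool->free_list = obj;
}

/* Pop: take the head and advance the list to whatever it linked to. */
demo_obj *demo_alloc(demo_pool *pool) {
    demo_obj *obj = (demo_obj *)pool->free_list;
    if (obj == NULL)
        return NULL;   /* Parrot would call more_objects here instead */
    pool->free_list = obj->next_free;
    return obj;
}

/* Returns 1 if three freed objects come back in LIFO order, then NULL. */
int demo_lifo_check(void) {
    demo_obj arena[3];
    demo_pool pool = { NULL };
    int i;
    for (i = 0; i < 3; i++)
        demo_free(&pool, &arena[i]);
    return demo_alloc(&pool) == &arena[2]
        && demo_alloc(&pool) == &arena[1]
        && demo_alloc(&pool) == &arena[0]
        && demo_alloc(&pool) == NULL;
}
```

The point to notice is that the link lives inside the object itself, so anything that writes to that first word while the object is still on the list silently changes where the list goes next.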

The segfault was occurring on the final statement of Parrot's allocation code, pool->free_list = *(void **)ptr, because ptr was coming back as 0xFFFFFFFF. After some messing around, I realized that if I changed the code to read:

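   if (!pool->free_list)
       (*pool->more_objects)(interp, pool);
   ptr = pool->free_list;
   /* debugging aid: catch the bogus link before it becomes the list head */
   if (*(void **)ptr == (void *)0xFFFFFFFF) {
       PMC *check = (PMC*)ptr;   /* inspect this in the debugger */
       return NULL;
   }
   pool->free_list = *(void **)ptr;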

and set a breakpoint on the "return NULL;", then I'd get a better idea of what ptr actually was. It turns out it is a PMC, a Key PMC in fact. And if you look in the Key PMC, you see a comment like:

PMC_int_val(-1) means end of iteration.

PMC_int_val is the same memory location as the free list pointer would be, -1 is 0xFFFFFFFF and...well, you can see where this is going. So somehow this Key PMC is not getting marked live, when it is still being used, right?
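
The aliasing can be shown with a small sketch. This is an illustration only, not Parrot's UnionVal: demo_val and demo_alias_bits are hypothetical names, and the sketch assumes the integer field is pointer-sized (intptr_t), which is an assumption about the layout rather than something taken from the Parrot headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-member union: both members share the same storage. */
typedef union demo_val {
    void    *next_free;  /* free-list link while the object is free */
    intptr_t int_val;    /* integer payload while the object is live */
} demo_val;

/* Store -1 through the integer member, then read the same storage
 * back through the pointer member and return its value as an integer. */
uintptr_t demo_alias_bits(void) {
    demo_val v;
    assert(sizeof(void *) == sizeof(intptr_t));  /* assumption of this sketch */
    v.int_val = -1;
    return (uintptr_t)v.next_free;
}
```

On common flat-address platforms the result is a pointer with every bit set, which on a 32-bit build is 0xFFFFFFFF. So a live object that stores -1 in its integer slot looks, to the allocator, like a free object whose next link is 0xFFFFFFFF.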

Well, maybe. Next I looked at the flags of this PMC.

00000100 00010000 00000110 00000011

We'd expect that:

b_PObj_on_free_list_FLAG = 1 << 19,

Would be set, but it ain't. So the PMC is on the free list, but hasn't got the "I'm on the free list" flag. Thus it was, in theory, never actually placed onto the free list. Changing the test condition from earlier to:

   if (!PObj_on_free_list_TEST((PObj*)ptr)) {
       PMC *check = (PMC*)ptr;
       return NULL;
   }
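
The flag test in that condition boils down to a single bit check. As a sanity check on reading the flag word quoted earlier, here is a minimal sketch that decodes it and tests bit 19. The names demo_parse_bits, demo_on_free_list and DEMO_ON_FREE_LIST_FLAG are hypothetical, and the sketch assumes the leftmost digit of the dump is the most significant bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same bit position as b_PObj_on_free_list_FLAG (1 << 19) in the post. */
#define DEMO_ON_FREE_LIST_FLAG (UINT32_C(1) << 19)

/* Turn a string of '0'/'1' digits into a 32-bit word, skipping any
 * other characters (such as the spaces between bytes). The leftmost
 * digit is treated as the most significant bit. */
uint32_t demo_parse_bits(const char *s) {
    uint32_t word = 0;
    assert(s != NULL);
    for (; *s; s++) {
        if (*s == '0' || *s == '1')
            word = (word << 1) | (uint32_t)(*s - '0');
    }
    return word;
}

/* Non-zero if the "on free list" bit is set in the flag word. */
int demo_on_free_list(uint32_t flags) {
    return (flags & DEMO_ON_FREE_LIST_FLAG) != 0;
}
```

Under that bit-order assumption the dump decodes to 0x04100603. Bit 20 is set in that word, but bit 19 (0x00080000) is not, which matches the observation that the flag is missing.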

Setting the breakpoint with this condition gave check as the Key PMC, just like before. From this I infer that perhaps it's not a Key PMC that is going unmarked, but something else that keeps a Key PMC referenced from the first pointer in the UnionVal. Perhaps an iterator; from Iterator.pmc's mark routine:

       /* the KEY */
       if (PMC_struct_val(SELF))
            pobject_lives(INTERP, (PObj *) PMC_struct_val(SELF));

Or perhaps not; that's all I have time for today. But this post is for those of you who wonder how one goes about tracing GC problems - I have been asked before and just wanted to make things a little less mysterious. Hope this helps. Or that it spurs someone else to continue the hunt for this bug, which would be rather nice to nail.

Happy hacking,

Jonathan
