Mark
I seem to gather that (normal) Hugs has GC problems on Sparc
with optimisation on. David W also reported this recently.
I got the impression that you think this happens because the
register window mechanism magically obscures some registers
which could hold roots, at the point when GC occurs.
I would like to offer a different explanation, plus an easy
fix. When I was working for the UFO project at Manchester,
we did a collector which took roots from the C stack. It too
had problems on sparcs, and the following observation saved
the day. I think it might equally apply to Hugs.
In the code below, complexFnReturningAPair is obvious,
and ... compute_and_alloc ... refers to some code which
may cause a garbage collection.
Pair complexFnReturningAPair ( ... ) { .... }
Pair p;
p = complexFnReturningAPair ( ... );
for ( ... ) {
... fst(p) ... snd(p) ...
... compute_and_alloc ...
}
So: p becomes a pair. We then enter a loop which refers to
fst(p) and/or snd(p), and in which GC could happen. After cppisation,
the code looks like
p = complexFnReturningAPair ( ... );
for ( ... ) {
... heapTopFst[p] ... heapTopSnd[p] ...
... compute_and_alloc ...
}
Expanding the array address calculations gives
p = complexFnReturningAPair ( ... );
for ( ... ) {
... heapTopFst + 4*p ... heapTopSnd + 4*p ...
... compute_and_alloc ...
}
Now the critical step, done by gcc when optimising: lift
4*p out of the loop since it's loop-invariant, and
(incidentally) CSE the two 4*p's together:
tmp = 4 * complexFnReturningAPair ( ... );
for ( ... ) {
... heapTopFst + tmp ... heapTopSnd + tmp ...
... compute_and_alloc ...
}
Blargh! Now, inside the loop, there is no mention of the
original p returned by complexFnReturningAPair, neither in
regs nor on the stack. So the pair won't be retained by
GC even though it's still live in the loop.
The moral of the story is this: when using an array-index-addressed
heap, as Hugs does, we need to treat as a root not only valid
array indexes into the heap, but also intermediates created by
the array address calculations. That is:
p sat -MAX_HEAP < p <= 0
p*sizeof(int) sat -MAX_HEAP < p <= 0
heapTopFst+ p*sizeof(int) sat -MAX_HEAP < p <= 0
heapTopSnd+ p*sizeof(int) sat -MAX_HEAP < p <= 0
Looking in machdep.c:gcCStack(), I only see the first of those four
possibilities handled. Perhaps the following would be better:
#define Blargh markWithoutMove(*ptr); \
markWithoutMove((*ptr)/sizeof(Cell)); \
markWithoutMove((
(void*)(*ptr)-(void*)heapTopFst)/sizeof(Cell)); \
markWithoutMove((
(void*)(*ptr)-(void*)heapTopSnd)/sizeof(Cell))
#define StackGrowsDown { while (ptr<=CStackBase) { Blargh; ptr++; }; }
#define StackGrowsUp { while (ptr>=CStackBase) { Blargh; ptr--; }; }
#define GuessDirection if (ptr>CStackBase) StackGrowsUp else StackGrowsDown
I tried this on x86 and it made no difference at all, since there
isn't a problem for x86 anyway. I'd be interested to hear if it makes
any difference on Sparcs. (note -- the code might not be exactly right).
Check if you try it ...
Finally, I think I know why the problem doesn't strike x86s. That is
because the x86 has addressing modes of the form
(reg1 + k*reg2) for k as 1, 2, 4 or 8
So a reference to heapTopFst[p] becomes
; reg1 holds p
; reg2 holds heapTopFst
movl (reg1 + 4*reg2), reg3
and there is never any point precomputing 4*p, since the architecture
can do it for free.
The entire diatribe applies not only to sparcs but other riscs which
have lots of registers but only simple addressing modes. Have you
had problem reports for (eg) MIPS too?
J
Observation re Hugs GC problem on sparcs w/gcc -O2
Julian Seward (Intl Vendor) Thu, 22 Apr 1999 11:39:53 +0200 (MET DST)
- FW: Observation re Hugs GC problem on sparcs w... Julian Seward (Intl Vendor)
- FW: Observation re Hugs GC problem on spa... Julian Seward (Intl Vendor)
