On Wed, Mar 04, 2009 at 08:49:56PM -0500, Jonathan S. Shapiro wrote:
> 2. You're doing *constrained* sharing. For such applications the GC,
> the compilation, and the runtime need to be specialized.

Yeah, those are the kinds of contexts I'm concerned about. Architectures
like Larrabee are apparently designed with the idea that message passing
between cores will still be done by the hardware, as an effect of
cache-line manipulation (with additional hint instructions). Larrabee is
graphics-oriented (though that's no reason not to want to program it
with BitC), but there is speculation that this is the direction more
general-purpose hardware will follow. Video game people would presumably
use Larrabee for non-graphics programming anyway, like AI. Pal-Kristian
likely has more insight on these matters (Naughty Dog is a major video
game studio owned by Sony).

A GC (or several cooperating instances of a GC) can rely on this:
ownership of memory can be handed off to another core along with the
cache line, reclaimed by the GC of that core, and so on. But with 32
cores, as many L2 caches, and several separate ring networks connecting
them, you really want to make sure a running GC is not going to trash
the caches of the wrong cores or overload the networks in unreasonable
ways. Maintaining that information could be expensive. I am not aware of
any GC designed for this kind of hardware (Niagara also comes to
mind -- has Sun done any work on the JVM for their Niagara arch?).

Because that type of chip will have disproportionately fast local
interconnects and on-die memories (caches) relative to external RAM, you
would want to make sure memory is known to be reclaimable *before* it
"leaves" the CPU (i.e. before the only copy left is in external RAM),
whenever possible. You don't want the GC loading that memory back onto
the chip just for the sake of deciding it can be freed.

Ways for the programmer to tell the runtime "I know the lifetime of
these allocations; let me tell you about it and help me guarantee that I
don't screw it up" would be useful. Other people have talked about ways
of doing that: pools, Erlang-style processes, ...

As you said in another post, GCs are very fast at allocation. Often the
challenge is making them fast at reclaiming memory, at all times (no
unexpected millisecond latency, a possible killer for a video game).
That's not necessarily an impossible proposition, as long as scans of
large parts of memory are minimized.

I am not quite saying a GC is not an option in these contexts, but I
would hope BitC makes it possible to work *with* the GC: share lifetime
information, force a reclaim pass on an area of memory (e.g. all the
allocations that were done below the current stack frame -- most are
likely reclaimable), etc. Some of this is library stuff; some could
benefit from language support.

For instance, OCaml only offers weak references as a way of
communicating a performance hint to the GC. Hopefully, more can be done.



> > I guess you mean "you can't afford not to write in assembler"? If so, I
> > disagree. I've written embedded applications using a few dozen KB of RAM
> > in C, and that's just fine. And a lot easier that in assembly. You have
> > to be careful about inlining and stack size but that's far from
> > unreasonable constraints.
> 
> Like I said. You can only write those in assembler. Using a clever
> macro package (to wit: C) doesn't really alter the point I was making.

Right, sorry. I didn't get the joke about C.


> The 32KB application space no longer really exists, even in embedded
> systems. You literally cannot buy ROMS that small anymore.

I guess I'm biased: soft cores in FPGAs sometimes have to work with very
small memories (possibly with an external DDR, but far enough away that
you don't want to access it in uncontrolled ways), and that's a real use
case for me.

Thanks.
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
