On Wed, Mar 04, 2009 at 08:49:56PM -0500, Jonathan S. Shapiro wrote:
> 2. You're doing *constrained* sharing. For such applications the GC,
> the compilation, and the runtime need to be specialized.
Yeah, that's exactly the kind of context I'm concerned about.
Architectures like Larrabee are apparently designed with the idea that
message passing between cores will still be done by the hardware, as an
effect of cache-line manipulation (with additional hint instructions).
Larrabee is graphics-oriented (though that's no reason not to want to
program it in BitC), but there is speculation that this is the direction
more general-purpose hardware will follow. Video game people would
presumably use Larrabee for non-graphics programming anyway, such as AI.
Pal-Kristian likely has more insight on these matters (Naughty Dog is a
major video game studio owned by Sony).

A GC (or several cooperating instances of a GC) can rely on that:
ownership of memory can be handed off to another core along with the
cache line, to be reclaimed by the GC of that core, and so on. But with
32 cores, as many L2 caches, and several separate ring networks
connecting them, you really want to make sure a running GC is not going
to trash the caches of the wrong cores and overload the networks in
unreasonable ways. Maintaining that information could be expensive. I am
not aware of any GC designed for this kind of hardware (Niagara also
comes to mind -- has Sun done any work on the JVM for their Niagara
architecture?).

Because that type of chip will have disproportionately fast local
interconnects and on-die memories (caches) relative to external RAM, you
want to make sure memory is known to be reclaimable *before* it "leaves"
the CPU (i.e. before the only copy left is in external RAM), whenever
possible. You don't want the GC loading that memory back onto the chip
just for the sake of deciding it can be freed.

Ways for the programmer to tell the runtime "I know the lifetime of
these allocations; let me tell you about it, and help me guarantee that
I don't screw it up" would be useful. Other people have talked about
ways of doing that: pools, Erlang processes, and so on (see the rough
sketch at the end of this message).

As you said in another post, GCs are very fast to allocate. Often the
challenge is to make them fast to reclaim memory, at all times (no
unexpected millisecond latency -- a possible killer for a video game).
That's not necessarily an impossible proposition, as long as scans of
large parts of memory are minimized.

I am not quite saying a GC is not an option in these contexts, but I
would hope BitC makes it possible to work *with* the GC: share lifetime
information, force a reclaim pass on an area of memory (e.g. all the
allocations that were done below the current stack frame -- most are
likely reclaimable), etc. Some of this is library stuff; some could
benefit from language support. For instance, OCaml only offers weak
references as a way of communicating a performance hint to the GC.
Hopefully, more can be done.

> > I guess you mean "you can't afford not to write in assembler"? If so, I
> > disagree. I've written embedded applications using a few dozen KB of RAM
> > in C, and that's just fine. And a lot easier than in assembly. You have
> > to be careful about inlining and stack size, but those are far from
> > unreasonable constraints.
>
> Like I said. You can only write those in assembler. Using a clever
> macro package (to wit: C) doesn't really alter the point I was making.

Right, sorry. I didn't get the joke about C.

> The 32KB application space no longer really exists, even in embedded
> systems. You literally cannot buy ROMs that small anymore.
I guess I'm biased: soft cores in FPGAs sometimes have to work with very
little memory (possibly with external DDR, but far enough away that you
don't want to access it in uncontrolled ways), and that's a real case
for me.

Thanks.
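P.S. To make the "tell the runtime about lifetimes" point slightly more
concrete, here is a toy arena ("pool") in portable C. This is
emphatically not a proposal for the BitC API, and every name in it is
made up; malloc/free stand in for whatever the real runtime would do.
The point is only the shape of the contract: allocation is a pointer
bump, there is nothing inside the pool for a collector to trace, and
reclaiming is one O(1) step that never touches the dead objects -- so
they never have to be paged back on-chip just to be freed.

    /* Toy pool allocator -- illustrative only, names invented. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <stddef.h>

    #define POOL_ALIGN ((size_t)16)  /* conservative alignment */

    typedef struct {
        unsigned char *base;  /* start of the pool's memory      */
        size_t         used;  /* bump pointer                    */
        size_t         cap;   /* total capacity declared up front */
    } pool;

    static pool *pool_create(size_t cap)
    {
        pool *p = malloc(sizeof *p);
        if (!p) return NULL;
        p->base = malloc(cap);
        if (!p->base) { free(p); return NULL; }
        p->used = 0;
        p->cap  = cap;
        return p;
    }

    /* Bump allocation: no headers, no free lists, nothing for a
     * collector to scan. */
    static void *pool_alloc(pool *p, size_t n)
    {
        size_t off = (p->used + POOL_ALIGN - 1) & ~(POOL_ALIGN - 1);
        if (off + n > p->cap) return NULL;
        p->used = off + n;
        return p->base + off;
    }

    /* Wholesale reclaim: O(1), and it never reads the individual
     * objects, evicted or not. */
    static void pool_release(pool *p)
    {
        free(p->base);
        free(p);
    }

    int main(void)
    {
        pool *scratch = pool_create(64 * 1024);
        double *coeffs = pool_alloc(scratch, 256 * sizeof *coeffs);
        int i;
        for (i = 0; i < 256; i++) coeffs[i] = i * 0.5;
        printf("coeffs[255] = %g\n", coeffs[255]);
        pool_release(scratch);  /* everything above dies here, at once */
        return 0;
    }

A real runtime version would additionally want to verify, at release
time, that no live references escape the pool -- which is exactly the
kind of guarantee I'd hope the language could help enforce.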
