On 11/9/06, Etienne Gagnon <[EMAIL PROTECTED]> wrote:
Salikh Zakirov wrote:
> Technically, it should not be too difficult to add an additional field to the VTable
> structure, and require GC to write 1 there during object scanning.
> However, if the VTable mark is located in the same cache line as gcmap,
> it may severely hit parallel GC performance on a multiprocessor due to false sharing,
> as writing VTable mark will invalidate the gcmap pointers loaded to caches of other
> processors.
>
>   object             VTable                       gcmap
> +--------+        +-----------+            +------------------+
> | VT ptr |------->| gcmap ptr |----------->| offset of ref #1 |
> |  ...   |        |    ...    |            | offset of ref #2 |
> +--------+        +-----------+            |       ...        |
>                                            |        0         |
>                                            +------------------+

If you go that far for every scanned object (!), then you could simply place
the class unloading bit in the gc map (at index -1) to minimize disruption of
current code...

  object             VTable                       gcmap
                                           +------------------+
+--------+        +-----------+            | cl.un. mark bit  |
| VT ptr |------->| gcmap ptr |----------->| offset of ref #1 |
|  ...   |        |    ...    |            | offset of ref #2 |
+--------+        +-----------+            |       ...        |
                                           |        0         |
                                           +------------------+

This gets rid of the cache line hazard...

We will get rid of false sharing, that's true. But it will still be quite
expensive to write those '1' values, because the cache line holding the mark
will ping-pong between processors.

I see only one solution to this: use a separate mark bit in the vtable per GC
thread. Each mark should reside in its own cache line, and none of them should
share a line with the word containing the gcmap pointer.

--
Ivan
Intel Enterprise Solutions Software Division
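
The per-thread mark idea above could be sketched in C roughly as follows. This is only an illustration under stated assumptions, not the actual VM code: the 64-byte cache line size, the fixed MAX_GC_THREADS bound, and the names MarkSlot, vtable_mark and vtable_is_live are all hypothetical, and a real VTable has many more fields.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdalign.h> /* alignas (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

#define CACHE_LINE_SIZE 64 /* assumption: typical x86 cache line size */
#define MAX_GC_THREADS   4 /* assumption: fixed upper bound on GC threads */

/* One mark per GC thread, padded so each slot fills a whole cache line. */
typedef struct {
    alignas(CACHE_LINE_SIZE) uint8_t marked;
} MarkSlot;

/* Hypothetical, heavily simplified VTable layout. */
typedef struct VTable {
    const uint32_t *gcmap;          /* only read during scanning */
    MarkSlot marks[MAX_GC_THREADS]; /* marks[i] is written only by GC thread i */
} VTable;

static_assert(sizeof(MarkSlot) == CACHE_LINE_SIZE,
              "each mark slot occupies exactly one cache line");
static_assert(offsetof(VTable, marks) % CACHE_LINE_SIZE == 0,
              "marks never share a cache line with the gcmap pointer");

/* Called by GC thread `gc_thread` when it scans an object of this class.
 * The read-before-write keeps the line clean once the mark is already set. */
static inline void vtable_mark(VTable *vt, unsigned gc_thread)
{
    if (!vt->marks[gc_thread].marked)
        vt->marks[gc_thread].marked = 1;
}

/* Called after marking has finished: the class has live objects
 * if any GC thread set its mark. */
static inline int vtable_is_live(const VTable *vt)
{
    for (unsigned i = 0; i < MAX_GC_THREADS; i++)
        if (vt->marks[i].marked)
            return 1;
    return 0;
}
```

In this layout no two GC threads ever write to the same cache line, and none of them writes to the line holding the gcmap pointer. The price is one cache line of memory per GC thread in every VTable, plus a pass to clear the marks before each collection that checks for class unloading.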
