Re: [DRLVM][GC] high-level design proposal for GCV5

2006-09-10 Thread Xiao-Feng Li

Weldon, I am writing it. Will submit an initial trace-forward copying
collector for the nursery soon, and a mark-compaction collector for
the mature later.

Thanks,
xiaofeng

On 9/11/06, Weldon Washburn <[EMAIL PROTECTED]> wrote:

Anyone building the below GC?  Can you give us an update?
  - Weldon


On 8/22/06, Xiao-Feng Li <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> Going on what's in the email called, "[DRLVM][GC] Goals for
> 2006/2007", I put together a top-level design of GC.  The intention is
> to use this design to guide the implementation of Harmony GCV5.
> Briefly the goals are to build a Generational Mark-Compaction (GenMC)
> garbage collector, initially it will be two generations: Nursery and
> Mature. Your comments are welcome.
>
> 1. Design principles
>
> - The source code should have parallel allocation from the start.
> Also, the collector should be able to take advantage of multiprocessor
> HW from the start. In other words when a single threaded Java app runs
> out of memory on a 4-way box, all 4 CPUs should be involved in GC.
>
> - Collection policy should be separated from the issue of object age.
> One space has one collection policy, while multiple spaces can be of
> same age.
>
> - There should be a clear distinction between collection policy and
> allocation policy.
>
> - Where it is not too time consuming, we should develop our own core
> data structures such as queues and locks.  The intention is to reduce
> memory/performance/functional dependencies on platform-specific
> libraries.
>
> 2. Generations
>
> - The nursery should support linear scan and flexible copying order.
> The size should be variable at runtime with min and max boundaries.
> For expedient initial implementation, the nursery can use depth-first
> trace-forwarding algorithm for the collection.
>
> - The mature can be arranged in blocks and collected with parallel
> mark-compaction algorithm. (c.f. Compressor). The blocks are in the
> range of 64KB in size.
>
> - Large Object Space can start with a simple treadmill collector.
>
> - Collection triggering should be abstracted from collection itself.
> The intention is to allow experimentation with different collection
> trigger heuristics without actually modifying the collector.
>
> - More than two generations should be considered in the design.
>
> 3. Write barrier
>
> - The initial implementation should be a "slot remembering" barrier.
> Object remembering and card marking can be implemented later for
> experiments or performance evaluation. Substituting write barrier may
> be implemented if initial performance data suggests it is worthwhile.
> (Substituting write barrier is a kind of barrier design that does both
> the write and the barrier operations. It is an optimization for
> performance and flexibility.)
>
> - putfield/aastore/putstatic will need a write barrier, so do some
> native functions.  It would be a good idea to evaluate if it is better
> to enumerate statics as root references or use a write barrier. The VM
> itself will need manual write barriers at places appropriate.
>
> - The initial write barrier implementation should be an SSB.  Its OK
> to use explicit buffer overflow checks at the beginning.
>
> 4. Parallelism
>
> - Parallel allocation: Each mutator thread should grab a Nursery chunk
> for thread local allocation. Also, each collector thread should grab a
> Mature chunk for promoting live objects into Mature space. LOS
> allocation does not have to be parallel.
>
> - Parallel collection: The new GC should be designed with parallel
> trace-forwarding for the nursery and parallel mark-compaction for
> mature space.
>
> - Data structures and algorithms should be thread-safe in design from
> the beginning. The parallelism should be controllable, e.g., the
> number of parallel collection threads should be controllable at the
> command line.
>
> - For debug purposes, it should be possible to force the GC into
> single threaded collection.
>
> Comments?
>
> Thanks,
> xiaofeng
>
> -
> Terms of use : http://incubator.apache.org/harmony/mailing.html
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>


--
Weldon Washburn
Intel Middleware Products Division




-
Terms of use : http://incubator.apache.org/harmony/mailing.html
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [DRLVM][GC] high-level design proposal for GCV5

2006-09-10 Thread Weldon Washburn

Anyone building the below GC?  Can you give us an update?
 - Weldon


On 8/22/06, Xiao-Feng Li <[EMAIL PROTECTED]> wrote:


Hi,

Going on what's in the email called, "[DRLVM][GC] Goals for
2006/2007", I put together a top-level design of GC.  The intention is
to use this design to guide the implementation of Harmony GCV5.
Briefly the goals are to build a Generational Mark-Compaction (GenMC)
garbage collector, initially it will be two generations: Nursery and
Mature. Your comments are welcome.

1. Design principles

- The source code should have parallel allocation from the start.
Also, the collector should be able to take advantage of multiprocessor
HW from the start. In other words when a single threaded Java app runs
out of memory on a 4-way box, all 4 CPUs should be involved in GC.

- Collection policy should be separated from the issue of object age.
One space has one collection policy, while multiple spaces can be of
same age.

- There should be a clear distinction between collection policy and
allocation policy.

- Where it is not too time consuming, we should develop our own core
data structures such as queues and locks.  The intention is to reduce
memory/performance/functional dependencies on platform-specific
libraries.

2. Generations

- The nursery should support linear scan and flexible copying order.
The size should be variable at runtime with min and max boundaries.
For expedient initial implementation, the nursery can use depth-first
trace-forwarding algorithm for the collection.

- The mature can be arranged in blocks and collected with parallel
mark-compaction algorithm. (c.f. Compressor). The blocks are in the
range of 64KB in size.

- Large Object Space can start with a simple treadmill collector.

- Collection triggering should be abstracted from collection itself.
The intention is to allow experimentation with different collection
trigger heuristics without actually modifying the collector.

- More than two generations should be considered in the design.

3. Write barrier

- The initial implementation should be a "slot remembering" barrier.
Object remembering and card marking can be implemented later for
experiments or performance evaluation. Substituting write barrier may
be implemented if initial performance data suggests it is worthwhile.
(Substituting write barrier is a kind of barrier design that does both
the write and the barrier operations. It is an optimization for
performance and flexibility.)

- putfield/aastore/putstatic will need a write barrier, so do some
native functions.  It would be a good idea to evaluate if it is better
to enumerate statics as root references or use a write barrier. The VM
itself will need manual write barriers at places appropriate.

- The initial write barrier implementation should be an SSB.  Its OK
to use explicit buffer overflow checks at the beginning.

4. Parallelism

- Parallel allocation: Each mutator thread should grab a Nursery chunk
for thread local allocation. Also, each collector thread should grab a
Mature chunk for promoting live objects into Mature space. LOS
allocation does not have to be parallel.

- Parallel collection: The new GC should be designed with parallel
trace-forwarding for the nursery and parallel mark-compaction for
mature space.

- Data structures and algorithms should be thread-safe in design from
the beginning. The parallelism should be controllable, e.g., the
number of parallel collection threads should be controllable at the
command line.

- For debug purposes, it should be possible to force the GC into
single threaded collection.

Comments?

Thanks,
xiaofeng

-
Terms of use : http://incubator.apache.org/harmony/mailing.html
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]





--
Weldon Washburn
Intel Middleware Products Division


Re: [DRLVM][GC] high-level design proposal for GCV5

2006-08-26 Thread Xiao-Feng Li

I'd actually used it in my local experiments before you told it has
write barrier verification. It  worked almost straightforwardly
without much intergration efforts. :-)

Thanks,
xiaofeng

On 8/25/06, Ivan Volosyuk <[EMAIL PROTECTED]> wrote:

The HARMONY-881 patch was submitted quite some time ago. I'm also
looking forward to see it integrated.
--
Regards,
Ivan

On 8/25/06, Xiao-Feng Li <[EMAIL PROTECTED]> wrote:
> Ivan, Salikh, good idea and thanks for the info. We'd want to apply
> the work into Harmony GC.
>
> Thanks,
> xiaofeng
>
>
> On 8/25/06, Salikh Zakirov <[EMAIL PROTECTED]> wrote:
> > Ivan Volosyuk wrote:
> > > Yes, I have experimented with per-slot verification of write barrier.
> > >
> > > The idea was the following: each word in java heap has external mirror
> > > record. After garbage collection all mirror records are synchronized
> > > with the heap. Each write barrier updates mirror record with
> > > corresponding data. Before next garbage collection there is a trace
> > > for all reachable objects in heap which validates that each slot
> > > contains the same data as the mirror.
> > >
> > > The idea is quite simple. Even that, it helped my find number of
> > > places in VM code which have updated slots in heap without call to
> > > write barrier. The results of the work are in HARMONY-504.
> > >
> > > The scheme has one impotant limitation. When several threads write to
> > > single slot, this verifier may report false-positive result of missing
> > > write barrier. But, I didn't see such situtations in any workloads I
> > > have tested it with.
> >
> > The infrastructure based on the same idea, but independent of GC 
implementation
> > is submitted as HARMONY-881. It provides patch for the VM, which 
incorporates
> > the heap mirroring framework for write barrier verification into the VM 
core,
> > and can work with any garbage collector.
> >
> >
> > -
> > Terms of use : http://incubator.apache.org/harmony/mailing.html
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
> >
> >
>
> -
> Terms of use : http://incubator.apache.org/harmony/mailing.html
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>


--
Ivan
Intel Middleware Products Division

-
Terms of use : http://incubator.apache.org/harmony/mailing.html
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




-
Terms of use : http://incubator.apache.org/harmony/mailing.html
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [DRLVM][GC] high-level design proposal for GCV5

2006-08-25 Thread Ivan Volosyuk

The HARMONY-881 patch was submitted quite some time ago. I'm also
looking forward to see it integrated.
--
Regards,
Ivan

On 8/25/06, Xiao-Feng Li <[EMAIL PROTECTED]> wrote:

Ivan, Salikh, good idea and thanks for the info. We'd want to apply
the work into Harmony GC.

Thanks,
xiaofeng


On 8/25/06, Salikh Zakirov <[EMAIL PROTECTED]> wrote:
> Ivan Volosyuk wrote:
> > Yes, I have experimented with per-slot verification of write barrier.
> >
> > The idea was the following: each word in java heap has external mirror
> > record. After garbage collection all mirror records are synchronized
> > with the heap. Each write barrier updates mirror record with
> > corresponding data. Before next garbage collection there is a trace
> > for all reachable objects in heap which validates that each slot
> > contains the same data as the mirror.
> >
> > The idea is quite simple. Even that, it helped my find number of
> > places in VM code which have updated slots in heap without call to
> > write barrier. The results of the work are in HARMONY-504.
> >
> > The scheme has one impotant limitation. When several threads write to
> > single slot, this verifier may report false-positive result of missing
> > write barrier. But, I didn't see such situtations in any workloads I
> > have tested it with.
>
> The infrastructure based on the same idea, but independent of GC 
implementation
> is submitted as HARMONY-881. It provides patch for the VM, which incorporates
> the heap mirroring framework for write barrier verification into the VM core,
> and can work with any garbage collector.
>
>
> -
> Terms of use : http://incubator.apache.org/harmony/mailing.html
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>

-
Terms of use : http://incubator.apache.org/harmony/mailing.html
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]





--
Ivan
Intel Middleware Products Division

-
Terms of use : http://incubator.apache.org/harmony/mailing.html
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [DRLVM][GC] high-level design proposal for GCV5

2006-08-25 Thread Xiao-Feng Li

Ivan, Salikh, good idea and thanks for the info. We'd want to apply
the work into Harmony GC.

Thanks,
xiaofeng


On 8/25/06, Salikh Zakirov <[EMAIL PROTECTED]> wrote:

Ivan Volosyuk wrote:
> Yes, I have experimented with per-slot verification of write barrier.
>
> The idea was the following: each word in java heap has external mirror
> record. After garbage collection all mirror records are synchronized
> with the heap. Each write barrier updates mirror record with
> corresponding data. Before next garbage collection there is a trace
> for all reachable objects in heap which validates that each slot
> contains the same data as the mirror.
>
> The idea is quite simple. Even that, it helped my find number of
> places in VM code which have updated slots in heap without call to
> write barrier. The results of the work are in HARMONY-504.
>
> The scheme has one impotant limitation. When several threads write to
> single slot, this verifier may report false-positive result of missing
> write barrier. But, I didn't see such situtations in any workloads I
> have tested it with.

The infrastructure based on the same idea, but independent of GC implementation
is submitted as HARMONY-881. It provides patch for the VM, which incorporates
the heap mirroring framework for write barrier verification into the VM core,
and can work with any garbage collector.


-
Terms of use : http://incubator.apache.org/harmony/mailing.html
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




-
Terms of use : http://incubator.apache.org/harmony/mailing.html
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [DRLVM][GC] high-level design proposal for GCV5

2006-08-25 Thread Salikh Zakirov
Ivan Volosyuk wrote:
> Yes, I have experimented with per-slot verification of write barrier.
> 
> The idea was the following: each word in java heap has external mirror
> record. After garbage collection all mirror records are synchronized
> with the heap. Each write barrier updates mirror record with
> corresponding data. Before next garbage collection there is a trace
> for all reachable objects in heap which validates that each slot
> contains the same data as the mirror.
> 
> The idea is quite simple. Even that, it helped my find number of
> places in VM code which have updated slots in heap without call to
> write barrier. The results of the work are in HARMONY-504.
> 
> The scheme has one impotant limitation. When several threads write to
> single slot, this verifier may report false-positive result of missing
> write barrier. But, I didn't see such situtations in any workloads I
> have tested it with.

The infrastructure based on the same idea, but independent of GC implementation
is submitted as HARMONY-881. It provides patch for the VM, which incorporates
the heap mirroring framework for write barrier verification into the VM core, 
and can work with any garbage collector.


-
Terms of use : http://incubator.apache.org/harmony/mailing.html
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [DRLVM][GC] high-level design proposal for GCV5

2006-08-25 Thread Ivan Volosyuk

Yes, I have experimented with per-slot verification of write barrier.

The idea was the following: each word in java heap has external mirror
record. After garbage collection all mirror records are synchronized
with the heap. Each write barrier updates mirror record with
corresponding data. Before next garbage collection there is a trace
for all reachable objects in heap which validates that each slot
contains the same data as the mirror.

The idea is quite simple. Even that, it helped my find number of
places in VM code which have updated slots in heap without call to
write barrier. The results of the work are in HARMONY-504.

The scheme has one impotant limitation. When several threads write to
single slot, this verifier may report false-positive result of missing
write barrier. But, I didn't see such situtations in any workloads I
have tested it with.

--
Regards,
Ivan

On 8/25/06, Xiao-Feng Li <[EMAIL PROTECTED]> wrote:

Ivan, there is no disagreement on the importance of those
verifications (or validations) for Harmony GC. :-) What I was saying
is we need more thinking on the design than the rough idea, such as
when, how, what, etc.

Have you any good idea on the write barrier verfication design? Thanks,

-xiaofeng

On 8/24/06, Ivan Volosyuk <[EMAIL PROTECTED]> wrote:
> > > It might also make sense to design in a write barrier verifier.  The
> > > concept is to verify that all the old-to-young pointers are properly
> > > handled.  One way of doing this is to force a full heap mark.  Then
> > > compare the full heap mark's old-to-young pointers to what the write
> > > barrier mechanism derived.
> >
> > This needs more thinking. The old-to-yound pointers got in a full-heap
> > marking have only live ones. But I think the idea to have some write
> > barrier verifier is interesting, e..g, the rememebered set has to be a
> > superset of live old-to-yound pointers.
>
> IMHO, write barrier verifier is a _must_ to have feature for us. The
> same applies to full heap tracing-validation. As VM and JIT code are
> subject of changes, it should be the way to isolate bugs introduced by
> GC and the ones coming from VM/JIT.
> --
> Ivan
>
> >
> > Thanks,
> > xiaofeng
>
> -
> Terms of use : http://incubator.apache.org/harmony/mailing.html
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>

-
Terms of use : http://incubator.apache.org/harmony/mailing.html
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]





--
Ivan
Intel Middleware Products Division

-
Terms of use : http://incubator.apache.org/harmony/mailing.html
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [DRLVM][GC] high-level design proposal for GCV5

2006-08-24 Thread Xiao-Feng Li

Ivan, there is no disagreement on the importance of those
verifications (or validations) for Harmony GC. :-) What I was saying
is we need more thinking on the design than the rough idea, such as
when, how, what, etc.

Have you any good idea on the write barrier verfication design? Thanks,

-xiaofeng

On 8/24/06, Ivan Volosyuk <[EMAIL PROTECTED]> wrote:

> > It might also make sense to design in a write barrier verifier.  The
> > concept is to verify that all the old-to-young pointers are properly
> > handled.  One way of doing this is to force a full heap mark.  Then
> > compare the full heap mark's old-to-young pointers to what the write
> > barrier mechanism derived.
>
> This needs more thinking. The old-to-yound pointers got in a full-heap
> marking have only live ones. But I think the idea to have some write
> barrier verifier is interesting, e..g, the rememebered set has to be a
> superset of live old-to-yound pointers.

IMHO, write barrier verifier is a _must_ to have feature for us. The
same applies to full heap tracing-validation. As VM and JIT code are
subject of changes, it should be the way to isolate bugs introduced by
GC and the ones coming from VM/JIT.
--
Ivan

>
> Thanks,
> xiaofeng

-
Terms of use : http://incubator.apache.org/harmony/mailing.html
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




-
Terms of use : http://incubator.apache.org/harmony/mailing.html
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [DRLVM][GC] high-level design proposal for GCV5

2006-08-24 Thread Ivan Volosyuk

> It might also make sense to design in a write barrier verifier.  The
> concept is to verify that all the old-to-young pointers are properly
> handled.  One way of doing this is to force a full heap mark.  Then
> compare the full heap mark's old-to-young pointers to what the write
> barrier mechanism derived.

This needs more thinking. The old-to-yound pointers got in a full-heap
marking have only live ones. But I think the idea to have some write
barrier verifier is interesting, e..g, the rememebered set has to be a
superset of live old-to-yound pointers.


IMHO, write barrier verifier is a _must_ to have feature for us. The
same applies to full heap tracing-validation. As VM and JIT code are
subject of changes, it should be the way to isolate bugs introduced by
GC and the ones coming from VM/JIT.
--
Ivan



Thanks,
xiaofeng


-
Terms of use : http://incubator.apache.org/harmony/mailing.html
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [DRLVM][GC] high-level design proposal for GCV5

2006-08-23 Thread Xiao-Feng Li

On 8/24/06, Weldon Washburn <[EMAIL PROTECTED]> wrote:

I like it.  Some comments.  It would be really nice to avoid dealing
with write barriers during initial bringup.  It would be good if the
GCV5 collector is designed so that it is easy to configure to always
do a full heap collection.  The idea is to eliminate the dependency on
the write barrier during initial development.


This is easy to configure the GC to do always major collection. I
would first develop the GC to have only minor collector with large
enough mature space that can finish typical workloads. This would make
the development more controllable.


It might also make sense to design in a write barrier verifier.  The
concept is to verify that all the old-to-young pointers are properly
handled.  One way of doing this is to force a full heap mark.  Then
compare the full heap mark's old-to-young pointers to what the write
barrier mechanism derived.


This needs more thinking. The old-to-yound pointers got in a full-heap
marking have only live ones. But I think the idea to have some write
barrier verifier is interesting, e..g, the rememebered set has to be a
superset of live old-to-yound pointers.

Thanks,
xiaofeng


On 8/22/06, Xiao-Feng Li <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Going on what's in the email called, "[DRLVM][GC] Goals for
> 2006/2007", I put together a top-level design of GC.  The intention is
> to use this design to guide the implementation of Harmony GCV5.
> Briefly the goals are to build a Generational Mark-Compaction (GenMC)
> garbage collector, initially it will be two generations: Nursery and
> Mature. Your comments are welcome.
>
> 1. Design principles
>
> - The source code should have parallel allocation from the start.
> Also, the collector should be able to take advantage of multiprocessor
> HW from the start. In other words when a single threaded Java app runs
> out of memory on a 4-way box, all 4 CPUs should be involved in GC.
>
> - Collection policy should be separated from the issue of object age.
> One space has one collection policy, while multiple spaces can be of
> same age.
>
> - There should be a clear distinction between collection policy and
> allocation policy.
>
> - Where it is not too time consuming, we should develop our own core
> data structures such as queues and locks.  The intention is to reduce
> memory/performance/functional dependencies on platform-specific
> libraries.
>
> 2. Generations
>
> - The nursery should support linear scan and flexible copying order.
> The size should be variable at runtime with min and max boundaries.
> For expedient initial implementation, the nursery can use depth-first
> trace-forwarding algorithm for the collection.
>
> - The mature can be arranged in blocks and collected with parallel
> mark-compaction algorithm. (c.f. Compressor). The blocks are in the
> range of 64KB in size.
>
> - Large Object Space can start with a simple treadmill collector.
>
> - Collection triggering should be abstracted from collection itself.
> The intention is to allow experimentation with different collection
> trigger heuristics without actually modifying the collector.
>
> - More than two generations should be considered in the design.
>
> 3. Write barrier
>
> - The initial implementation should be a "slot remembering" barrier.
> Object remembering and card marking can be implemented later for
> experiments or performance evaluation. Substituting write barrier may
> be implemented if initial performance data suggests it is worthwhile.
> (Substituting write barrier is a kind of barrier design that does both
> the write and the barrier operations. It is an optimization for
> performance and flexibility.)
>
> - putfield/aastore/putstatic will need a write barrier, so do some
> native functions.  It would be a good idea to evaluate if it is better
> to enumerate statics as root references or use a write barrier. The VM
> itself will need manual write barriers at places appropriate.
>
> - The initial write barrier implementation should be an SSB.  Its OK
> to use explicit buffer overflow checks at the beginning.
>
> 4. Parallelism
>
> - Parallel allocation: Each mutator thread should grab a Nursery chunk
> for thread local allocation. Also, each collector thread should grab a
> Mature chunk for promoting live objects into Mature space. LOS
> allocation does not have to be parallel.
>
> - Parallel collection: The new GC should be designed with parallel
> trace-forwarding for the nursery and parallel mark-compaction for
> mature space.
>
> - Data structures and algorithms should be thread-safe in design from
> the beginning. The parallelism should be controllable, e.g., the
> number of parallel collection threads should be controllable at the
> command line.
>
> - For debug purposes, it should be possible to force the GC into
> single threaded collection.
>
> Comments?
>
> Thanks,
> xiaofeng
>
> -
> Terms of use : http://in

Re: [DRLVM][GC] high-level design proposal for GCV5

2006-08-23 Thread Weldon Washburn

I like it.  Some comments.  It would be really nice to avoid dealing
with write barriers during initial bringup.  It would be good if the
GCV5 collector is designed so that it is easy to configure to always
do a full heap collection.  The idea is to eliminate the dependency on
the write barrier during initial development.

It might also make sense to design in a write barrier verifier.  The
concept is to verify that all the old-to-young pointers are properly
handled.  One way of doing this is to force a full heap mark.  Then
compare the full heap mark's old-to-young pointers to what the write
barrier mechanism derived.

On 8/22/06, Xiao-Feng Li <[EMAIL PROTECTED]> wrote:

Hi,

Going on what's in the email called, "[DRLVM][GC] Goals for
2006/2007", I put together a top-level design of GC.  The intention is
to use this design to guide the implementation of Harmony GCV5.
Briefly the goals are to build a Generational Mark-Compaction (GenMC)
garbage collector, initially it will be two generations: Nursery and
Mature. Your comments are welcome.

1. Design principles

- The source code should have parallel allocation from the start.
Also, the collector should be able to take advantage of multiprocessor
HW from the start. In other words when a single threaded Java app runs
out of memory on a 4-way box, all 4 CPUs should be involved in GC.

- Collection policy should be separated from the issue of object age.
One space has one collection policy, while multiple spaces can be of
same age.

- There should be a clear distinction between collection policy and
allocation policy.

- Where it is not too time consuming, we should develop our own core
data structures such as queues and locks.  The intention is to reduce
memory/performance/functional dependencies on platform-specific
libraries.

2. Generations

- The nursery should support linear scan and flexible copying order.
The size should be variable at runtime with min and max boundaries.
For expedient initial implementation, the nursery can use depth-first
trace-forwarding algorithm for the collection.

- The mature can be arranged in blocks and collected with parallel
mark-compaction algorithm. (c.f. Compressor). The blocks are in the
range of 64KB in size.

- Large Object Space can start with a simple treadmill collector.

- Collection triggering should be abstracted from collection itself.
The intention is to allow experimentation with different collection
trigger heuristics without actually modifying the collector.

- More than two generations should be considered in the design.

3. Write barrier

- The initial implementation should be a "slot remembering" barrier.
Object remembering and card marking can be implemented later for
experiments or performance evaluation. Substituting write barrier may
be implemented if initial performance data suggests it is worthwhile.
(Substituting write barrier is a kind of barrier design that does both
the write and the barrier operations. It is an optimization for
performance and flexibility.)

- putfield/aastore/putstatic will need a write barrier, so do some
native functions.  It would be a good idea to evaluate if it is better
to enumerate statics as root references or use a write barrier. The VM
itself will need manual write barriers at places appropriate.

- The initial write barrier implementation should be an SSB.  Its OK
to use explicit buffer overflow checks at the beginning.

4. Parallelism

- Parallel allocation: Each mutator thread should grab a Nursery chunk
for thread local allocation. Also, each collector thread should grab a
Mature chunk for promoting live objects into Mature space. LOS
allocation does not have to be parallel.

- Parallel collection: The new GC should be designed with parallel
trace-forwarding for the nursery and parallel mark-compaction for
mature space.

- Data structures and algorithms should be thread-safe in design from
the beginning. The parallelism should be controllable, e.g., the
number of parallel collection threads should be controllable at the
command line.

- For debug purposes, it should be possible to force the GC into
single threaded collection.

Comments?

Thanks,
xiaofeng

-
Terms of use : http://incubator.apache.org/harmony/mailing.html
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]





--
Weldon Washburn
Intel Middleware Products Division

-
Terms of use : http://incubator.apache.org/harmony/mailing.html
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



[DRLVM][GC] high-level design proposal for GCV5

2006-08-22 Thread Xiao-Feng Li

Hi,

Going on what's in the email called, "[DRLVM][GC] Goals for
2006/2007", I put together a top-level design of GC.  The intention is
to use this design to guide the implementation of Harmony GCV5.
Briefly the goals are to build a Generational Mark-Compaction (GenMC)
garbage collector, initially it will be two generations: Nursery and
Mature. Your comments are welcome.

1. Design principles

- The source code should have parallel allocation from the start.
Also, the collector should be able to take advantage of multiprocessor
HW from the start. In other words when a single threaded Java app runs
out of memory on a 4-way box, all 4 CPUs should be involved in GC.

- Collection policy should be separated from the issue of object age.
One space has one collection policy, while multiple spaces can be of
same age.

- There should be a clear distinction between collection policy and
allocation policy.

- Where it is not too time consuming, we should develop our own core
data structures such as queues and locks.  The intention is to reduce
memory/performance/functional dependencies on platform-specific
libraries.

2. Generations

- The nursery should support linear scan and flexible copying order.
The size should be variable at runtime with min and max boundaries.
For expedient initial implementation, the nursery can use depth-first
trace-forwarding algorithm for the collection.

- The mature can be arranged in blocks and collected with parallel
mark-compaction algorithm. (c.f. Compressor). The blocks are in the
range of 64KB in size.

- Large Object Space can start with a simple treadmill collector.

- Collection triggering should be abstracted from collection itself.
The intention is to allow experimentation with different collection
trigger heuristics without actually modifying the collector.

- More than two generations should be considered in the design.

3. Write barrier

- The initial implementation should be a "slot remembering" barrier.
Object remembering and card marking can be implemented later for
experiments or performance evaluation. Substituting write barrier may
be implemented if initial performance data suggests it is worthwhile.
(Substituting write barrier is a kind of barrier design that does both
the write and the barrier operations. It is an optimization for
performance and flexibility.)

- putfield/aastore/putstatic will need a write barrier, so do some
native functions.  It would be a good idea to evaluate if it is better
to enumerate statics as root references or use a write barrier. The VM
itself will need manual write barriers at places appropriate.

- The initial write barrier implementation should be an SSB.  Its OK
to use explicit buffer overflow checks at the beginning.

4. Parallelism

- Parallel allocation: Each mutator thread should grab a Nursery chunk
for thread local allocation. Also, each collector thread should grab a
Mature chunk for promoting live objects into Mature space. LOS
allocation does not have to be parallel.

- Parallel collection: The new GC should be designed with parallel
trace-forwarding for the nursery and parallel mark-compaction for
mature space.

- Data structures and algorithms should be thread-safe in design from
the beginning. The parallelism should be controllable, e.g., the
number of parallel collection threads should be controllable at the
command line.

- For debug purposes, it should be possible to force the GC into
single threaded collection.

Comments?

Thanks,
xiaofeng

-
Terms of use : http://incubator.apache.org/harmony/mailing.html
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]