Re: Confusion regarding 'mark-sweep-compact' naming

Kirk Pepperdine Sun, 13 Aug 2017 23:46:07 -0700

Hi,

> On Aug 14, 2017, at 7:47 AM, Peter Veentjer <alarmnum...@gmail.com> wrote:
> 
> I have been improving my gc knowledge and there is some confusion on my side 
> regarding the 'mark-sweep-compact' algorithm I frequently see mentioned in 
> posts, articles and some not too formal books on the topic. E.g.
> 
> AFAIK mark-sweep-compact is not correct. It is either mark-sweep or 
> mark-compact where as part of the compacting process the dead objects are 
> removed. When I check more formal literature like "The Garbage Collection 
> Handbook: the Art of Automatic Memory Management", then there is no 
> mentioning of mark-sweep-compact.
> 
> So I guess that when 'mark-sweep-compact' is mentioned, they are actually 
> referring to 'mark-compact'. Is this assumption correct?

The terminology around this is very loose so the very short answer is…. sort 
of. So, a very terse but hopefully useful summary of collectors.

When we talk about mark-sweep we are generally referring to a class of garbage 
collectors known as tracing collectors. The actual details of how the mark and 
sweep for each of the different tracing collectors is implemented depend upon a 
couple of factors. One of the most important factors is how memory pool(s) 
(data structure in heap space) is organized. I tend to label a collector as 
either “in-place” or “evacuation”. The most trivial examples for both are; if 
you only have a single memory pool then you will be performing an “in-place” 
collection as the data remains in the that memory pool at the end of the 
collection. If you have two spaces you can set this up as what is known has 
hemispheric collections. In this case you mark the data in the currently active 
memory pool and then “evacuate” it to the inactive pool. When complete active 
becomes inactive (thus empty) and inactive becomes active. Wash, rinse, repeat….

In-place collections will need a free-list and need to some how deal with 
fragmentation (of the free-list). Thus they compact. When and how often is an 
implementation detail. Evacuating collectors use a top of heap pointer (and 
other more complex tricks) that act as a free list. Since they evacuate to 
another pool, the data will naturally be compacted. Thus no compaction will be 
necessary. As you can imagine, the time/space cost models for these two general 
approaches are quite different.

In OpenJDK we have both generational and regional collectors. The generational 
collector divides memory into 4 memory pools named Eden, Survivor 0, Survivor 
1, and Tenured space. The first three are grouped into young and Tenured into 
old space. In young we can use mark-evacuation where as in old we are forced to 
use mark-sweep-compact. I’m going to claim (I’m sure Gil will want to chime in 
here) that G1, C4, Balance and Shenandoah are regional collectors in that they 
divide heap into a large number of memory pools (regions). G1 aims for 2048 
memory pools. In all 4 cases, the collectors are very very different variants 
of mark-evacuation each with their own quirks. The advantage of using regions 
is that you only need an evacuating collector as you don’t end up with a 
terminal pool (tenured) where you are forced to use an in-place collector. If 
you are careful in how you select regions to be recovered, you can realize some 
nice wins over generational collectors. However, these collectors tend to be 
far more complex and harder on your electron budget than generational 
collectors are.

-- 
You received this message because you are subscribed to the Google Groups 
"mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to mechanical-sympathy+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: Confusion regarding 'mark-sweep-compact' naming

Reply via email to