On Aug 13, 2019, at 11:34 PM, Peter Veentjer <pe...@hazelcast.com> wrote:

Thanks for your answer Aleksey,

comments inline.

On Tuesday, August 6, 2019 at 12:41:28 PM UTC+3, Aleksey Shipilev wrote:
On 8/6/19 7:38 AM, Peter Veentjer wrote:
> There is still some confusion on my side. This time it is regarding the 
> algorithmic complexity of
> mark-sweep-compact.
>
> The complexity of the mark phase is linear to the live set.
> The complexity of the sweep phase is linear to the heap size.
> The complexity of the compact phase is linear to the live set.

Well...

The multiplicative factors in either case might make one linearity perform 
better than the other on
wildly different heap sizes and live sets. The adversarial example of this is N 
large byte arrays of
size M. Compact might have to copy N*M bytes (especially when doing sliding 
compaction), and sweep
might only need to visit N locations to check on object metadata.

There are second-order heap size effects for both mark and compact. Take two 
heaps with the same
live set, one very dense and one very sparse, and the mere fact that you have to
walk memory far away
adds a secondary heap-size effect. It would not linger for compacting
collectors, though, as the
first few compaction cycles would collapse the heap into the denser variant.

That is all to say that algorithmic complexity is a seriously bad way to reason 
about (GC) performance.

I understand. I created a bunch of flashcards to get a better conceptual 
understanding of the different GC algorithms;
I'll add some additional flashcards to refine my understanding.


> Why would you ever want to include a sweep phase? The complexity
> of mark-sweep-compact is
> now linear to the heap size; a lot worse compared to 'linear to the live
> set'. Get rid of the sweep
> and the complexity goes down to 'linear to the live set'. What is the added
> value of such an inefficient
> GC implementation?

I think m-s-c nomenclature is confusing, as the actual implementations are 
doing either m-s or m-c
during their individual phases.

Exactly. And this was the source of my confusion, caused by an abuse of 
names.

There are very real and popular collectors that do Mark, Sweep, *and* Compact, 
e.g. the OpenJDK oldgen collectors in both ParallelGC and SerialGC. There are 
very real and popular collectors that do only Mark and Sweep, e.g. the 
non-compacting oldgen of the CMS collector (as long as it does not fall back to 
the STW compacting fullGC). And there are very real and popular collectors that 
do only Mark and Compact (no sweep), e.g. C4, Pauseless, Compressor, and 
recently ZGC.

For an easy to follow code example of Mark, Sweep, and Compact, look at the 
(ahem...) psMarkSweep.cpp code for the compacting OpenJDK ParallelGC oldgen 
implementation, which does 4 separate passes (
http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/7576bbd5a03c/src/share/vm/gc_implementation/parallelScavenge/psMarkSweep.cpp#l201
 ):

1. A tracing, pointer-chasing pass that marks the live set.
    Line 514: // Recursively traverse all live objects and mark them

2. A sweeping pass that computes the new object addresses
    Line 579: // Now all live objects are marked, compute the new object 
addresses.
    (new addresses are computed to “shift everything to one side”,
     which, when actually done in pass 4, will compact the heap in place)

3. A sweeping pass to adjust references to point to the new object addresses:
    Line 598: // Adjust the pointers to reflect the new locations
    (this is sometimes referred to as a fixup pass)

4. A compact pass that moves the objects to their previously chosen target 
locations.
    Line 645: // All pointers are now adjusted, move objects accordingly

This is a classic Mark, Sweep, and Compact collector. Pass 1 only visits live 
objects, while passes 2-4 linearly traverse the entire heap (live or dead), 
going forward one object at a time, and skipping over the dead objects one by 
one. For at least the first sweeping pass, when you arrive at a dead object’s 
header, you still need to follow its type indicator to determine the object 
size in order to know how far to skip ahead to the next object. So you visit 
and follow pointer information in each and every object header in the heap, 
dead or alive.
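
To make the "skip over dead objects one by one" point concrete, here is a
minimal sketch of such a heap-linear pass. It is a toy model of my own, not
HotSpot code: it assumes a hypothetical heap laid out as an array of words,
with each object's header word encoding a mark bit and the object size,
whereas HotSpot would instead follow the header's klass pointer to compute
the size.

final class HeapLinearPassSketch {
    // Toy layout assumption: each object starts with a header word whose low bit
    // is the mark bit and whose remaining bits hold the object size in words.
    static void heapLinearPass(int[] heap, int heapTop) {
        int cur = 0;
        while (cur < heapTop) {
            int header = heap[cur];          // touch the header of EVERY object, live or dead
            boolean live = (header & 1) != 0;
            int sizeInWords = header >>> 1;  // needed even for dead objects, just to skip them
            if (live) {
                visitLiveObject(heap, cur, sizeInWords);
            }
            cur += sizeInWords;              // go forward one object at a time
        }
    }

    static void visitLiveObject(int[] heap, int addr, int sizeInWords) {
        // pass 2: compute the new address; pass 3: adjust references;
        // pass 4: copy the body to its target location
    }
}

The cost that matters here is the header load for every object, dead or alive,
which is exactly the heap-size-linear component being discussed.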

In contrast, a pure mark-compact can be built to never do a “move from one 
object to another, including the dead ones” pass. E.g. an evacuating mark 
compact can use the mark pass to produce an efficient way to move from one live 
object to the next when evacuating and doing pointer fixup.

There are various schemes for doing this efficiently without ever visiting a 
dead object. E.g., in C4 we keep live marks in a bit vector with one bit per 
64-bit word of heap (we mark in this bit vector, not in the object headers). When 
it comes time to evacuate e.g. a 2MB region, we just scan through the 
associated 32KB of memory that contains the live bits for that 2MB region, 
using a find-first-bit operation (which on many processors can be done 
efficiently with direct bit-scan, leading-zeros, or trailing-zeros 
instructions), stopping only on set live bits, and visiting only the live 
objects they correspond to. The evacuating compact never looks at a dead object 
header or follows its klass indicator to figure out its size.

An evacuating compactor will typically fix object references (to point to their 
correct location) after copying all the object bodies. Since all references in 
the heap may potentially need fixing, all reference fields in all live objects 
that could possibly point to any relocated objects will need to be visited at 
some point to do this, and each non-null reference visited will need to 
potentially follow a forwarding pointer somewhere to find the object’s actual 
target address.
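
A minimal sketch of that fixup step, assuming a toy object model (each live
object's reference fields held in an array) and a hypothetical side table of
forwarding pointers; real collectors keep forwarding information in headers or
per-region side structures rather than a Java map:

import java.util.Map;

final class ReferenceFixupSketch {
    // liveObjects: the reference fields of every live object that might point into
    // relocated space. forwarding: maps an object's old identity to its new copy
    // (an identity-based map in practice).
    static void fixupReferences(Iterable<Object[]> liveObjects, Map<Object, Object> forwarding) {
        for (Object[] fields : liveObjects) {             // visit every live object ...
            for (int i = 0; i < fields.length; i++) {     // ... and every reference field in it
                Object ref = fields[i];
                if (ref == null) continue;
                Object moved = forwarding.get(ref);       // follow the forwarding pointer, if any
                if (moved != null) {
                    fields[i] = moved;                    // rewrite the field to the new location
                }
            }
        }
    }
}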

People often point to this fixup pass as a “sweep” of sorts, as it is often 
performed linearly. In collectors where this pass is performed on a compacted 
set of objects, such a “sweep” will only visit live objects, so it can still be 
dramatically less work than visiting all dead object headers and following 
their Klass pointers.

However, with the invention of “Quick Release” (first published in the 
Pauseless and C4 papers, but also used since by Compressor and ZGC, see 
explanation in section 2.5 of the C4 paper [1]), at least some concurrent 
evacuating regional collectors are able to avoid having a separate fixup pass 
altogether. Storing forwarding pointers outside of evacuated regions (as 
opposed to the traditional way of storing them in or near evacuated object 
headers) enables the recycling of an evacuated region’s memory for new 
allocations and for hand-over-hand compaction without stomping the forwarding 
pointers. This not only allows the compaction of the entire heap with no 
“target” memory space needed, it also enables lazy fixup, since fixup is no 
longer required to occur before memory can be completely recycled.

With quick release and lazy fixup (supported by a loaded reference value 
barrier like LVB), fixup can be delayed as far as the next GC cycle, which 
means fixup work can be performed in the same pointer-chasing tracing pass that 
would perform the marking for the next GC cycle, with some pretty big savings 
(eliminating an entire pass over all objects in the heap).
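
A rough sketch of what merging those two kinds of work into one tracing pass
can look like, reusing the toy object model above and a hypothetical
forwarding table for regions evacuated in the previous cycle. This is my own
illustration of the idea, not the actual C4/ZGC code:

import java.util.ArrayDeque;
import java.util.Map;
import java.util.Set;

final class MarkPlusLazyFixupSketch {
    // Every object is modeled as an Object[] of reference fields. While marking for
    // the CURRENT cycle, any reference that still points at an object evacuated in
    // the PREVIOUS cycle is healed using the off-region forwarding table, so no
    // separate fixup pass is needed.
    static void markAndFixup(Iterable<Object[]> roots,
                             Map<Object, Object> forwardingFromPreviousCycle,
                             Set<Object> marked) {
        ArrayDeque<Object[]> workList = new ArrayDeque<>();
        for (Object[] root : roots) {
            workList.push(root);
        }
        while (!workList.isEmpty()) {
            Object[] fields = workList.pop();
            for (int i = 0; i < fields.length; i++) {
                Object ref = fields[i];
                if (ref == null) continue;
                Object moved = forwardingFromPreviousCycle.get(ref);
                if (moved != null) {
                    fields[i] = moved;                    // lazy fixup of a stale reference
                    ref = moved;
                }
                if (marked.add(ref) && ref instanceof Object[]) {
                    workList.push((Object[]) ref);        // newly marked: keep tracing
                }
            }
        }
    }
}

The loaded reference barrier mentioned above performs the same remap-on-load
healing for the mutator while this pass runs concurrently.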

This means that a modern regional collector that supports concurrent evacuation 
can do a pure mark-compact (no sweep), and can achieve that with exactly two 
passes: a single mark+fixup pass (mark for this GC cycle, fixup for the 
previous one), and a single evacuating relocate (but no fixup) pass. Pauseless, 
C4, Compressor, and now ZGC all do exactly that.

Note that the collector *has* to support concurrent compaction (or at the very 
least concurrent fixup) to get this dramatic improvement in efficiency, as 
without it the application could not be allowed to run between the compacting 
evacuation and the eventual fixup. This is a practical example of how a 
concurrent compacting collector can actually be built to be more CPU efficient 
than a stop-the-world compacting collector, and a busting of the myth that 
concurrency always comes at a cost. Because of Quick Release, a concurrent 
collector can fundamentally beat a stop-the-world collector in efficiency.

[1] C4: The Continuously Concurrent Compacting Collector. ISMM’11 
https://www.azul.com/files/c4_paper_acm1.pdf



The benefit of doing either is well covered.

The hard naming question is how to properly label the collector where different 
phases employ
different algos. For example, in CMS the young collection is copying, old 
collection is mark-sweep,
fall-back full GC is mark-compact. How would you label CMS then?

It would be easier/simpler to label the young and old collectors independently, 
but I understand your point.


I believe marking it m-s-c is
borderline tolerable, until users get confused thinking both sweep and compact 
happen in the same phase.

Also with implementations that manage backing storage in fine-grained chunks 
(for example, free
lists), "sweep" might mean dealing with them on per-dead-object basis. For 
example, this might mean
 preparing the storage for subsequent compaction by "sweeping" out old objects 
before compaction
kicks in, which requires free space. Practical implementations I know opt to 
instead reconstruct the
backing storage metadata after massive compaction moves, though.

My $0.02c, anyway.

-Aleksey

