Giving it a bit more thought:

The parallel and serial collectors are categorized as 'mark-sweep-compact' 
collectors:

https://plumbr.io/handbook/garbage-collection-algorithms/removing-unused-objects/compact
https://www.azul.com/files/Understanding_Java_Garbage_Collection_v4.pdf
https://www.oracle.com/webfolder/technetwork/tutorials/obe/java/gc01/index.html

But in both cases the young collector is an evacuating (copying) collector, 
while the old-generation collector uses a mark-compact algorithm.
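
To make that concrete for myself I sketched a toy Cheney-style copying 
collection in Java. ToyObject, CopyingCollector etc. are names I invented 
for the sketch; a real young collector works on raw memory with bump-pointer 
allocation, card tables and so on:

    import java.util.ArrayList;
    import java.util.IdentityHashMap;
    import java.util.List;
    import java.util.Map;

    // Toy model of an evacuating (copying) collection: live objects are
    // traced from the roots and copied into an empty to-space; dead objects
    // are never visited, so the cost is linear in the live set only.
    class ToyObject {
        final List<ToyObject> refs = new ArrayList<>();
        byte[] payload; // stands in for the object's fields
    }

    class CopyingCollector {
        // Maps a from-space object to its to-space copy (the "forwarding pointer").
        private final Map<ToyObject, ToyObject> forwarded = new IdentityHashMap<>();
        final List<ToyObject> toSpace = new ArrayList<>();

        void collect(List<ToyObject> roots) {
            for (int i = 0; i < roots.size(); i++) {
                roots.set(i, evacuate(roots.get(i)));
            }
            // Cheney scan: toSpace doubles as the work queue; it grows while we scan.
            for (int scan = 0; scan < toSpace.size(); scan++) {
                ToyObject obj = toSpace.get(scan);
                for (int i = 0; i < obj.refs.size(); i++) {
                    obj.refs.set(i, evacuate(obj.refs.get(i)));
                }
            }
            // From-space is now garbage wholesale; there is no sweep at all.
        }

        private ToyObject evacuate(ToyObject from) {
            ToyObject to = forwarded.get(from);
            if (to == null) {
                to = new ToyObject();
                to.payload = from.payload;
                to.refs.addAll(from.refs); // still from-space refs; fixed by the scan
                forwarded.put(from, to);
                toSpace.add(to);
            }
            return to;
        }
    }

Running collect on the roots touches only reachable objects; a heap full of 
garbage costs nothing, which is exactly why the young generation, where most 
objects die, wants evacuation rather than a sweep.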

Which brings up the question: how does the sweep enter the picture?

CMS could be categorized as mark-sweep-compact: it does a mark-sweep of the 
old generation, and in case of a concurrent mode failure it will fall back 
on a stop-the-world full GC which does a mark-compact. In that light I could 
understand why it would be called mark-sweep-compact; and yet CMS is never 
called mark-sweep-compact.



On Wednesday, August 14, 2019 at 9:34:15 AM UTC+3, Peter Veentjer wrote:
>
> Thanks for your answer Aleksey, 
>
> comments inline.
>
> On Tuesday, August 6, 2019 at 12:41:28 PM UTC+3, Aleksey Shipilev wrote:
>>
>> On 8/6/19 7:38 AM, Peter Veentjer wrote: 
>> > There is still some confusion on my side. This time it is regarding the 
>> algorithmic complexity of 
>> > mark-sweep-compact. 
>> > 
>> > The complexity of the mark phase is linear to the live set. 
>> > The complexity of the sweep phase is linear to the heap size. 
>> > The complexity of the compact phase is linear to the live set. 
>>
>> Well... 
>>
>> The multiplicative factors in either case might make one linearity 
>> perform better than the other on 
>> wildly different heap sizes and live sets. The adversarial example of 
>> this is N large byte arrays of 
>> size M. Compact might have to copy N*M bytes (especially when doing 
>> sliding compaction), and sweep 
>> might only need to visit N locations to check on object metadata. 
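
To put rough numbers on this example for my flashcards (my own arithmetic, 
assuming plain sliding compaction and, say, N = 1,000 arrays of M = 1 MB):

    compact: moves up to N * M = 1,000 * 1 MB ~= 1 GB of payload
    sweep:   visits ~N = 1,000 headers, using each object's size to skip ahead

Same 'linear' label, wildly different constants.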
>>
>> There are second-order heap size effects for both mark and compact. Take 
>> two heaps with the same 
>> live set, one very dense and one very sparse, and the mere fact you have 
>> to walk memory far away 
>> adds up the secondary effect of heap size. It would not linger for 
>> compacting collectors, though, as 
>> first few compaction cycles would collapse the heap to denser variant. 
>>
>> That is all to say that algorithmic complexity is a seriously bad way to 
>> reason about (GC) performance. 
>>
>
> I understand. I created a bunch of flashcards to get a better conceptual 
> understanding of the different GC algorithms; 
> I'll add some additional flashcards to refine my understanding.
>  
>
>>
>> > Why would you ever want to include a sweep phase because the 
>> complexity of mark-sweep-compact is 
>> > now linear to the heap size; a lot worse compared to 'linear to the 
>> live set'. Get rid of the sweep 
>> > and complexity goes down to 'linear to the live set'. What is the added 
>> value of such an inefficient 
>> > GC implementation? 
>>
>> I think m-s-c nomenclature is confusing, as the actual implementations 
>> are doing either m-s or m-c 
>> during their individual phases. 
>
>
> Exactly. And this was the source of my confusion, caused by an abuse of 
> names.
>
>  
>
>> The benefit of doing either is well covered. 
>>
>> The hard naming question is how to properly label the collector where 
>> different phases employ 
>> different algos. For example, in CMS the young collection is copying, old 
>> collection is mark-sweep, 
>> fall-back full GC is mark-compact. How would you label CMS then? 
>
>
> It would be easier/simpler to label the young and old collectors 
> independently; but I understand your point.
>
>  
>
>> I believe marking it m-s-c is 
>> borderline tolerable, until users get confused thinking both sweep and 
>> compact happen in the same phase. 
>>
>
>> Also with implementations that manage backing storage in fine-grained 
>> chunks (for example, free 
>> lists), "sweep" might mean dealing with them on per-dead-object basis. 
>> For example, this might mean 
>>  preparing the storage for subsequent compaction by "sweeping" out old 
>> objects before compaction 
>> kicks in, which requires free space. Practical implementations I know opt 
>> to instead reconstruct the 
>> backing storage metadata after massive compaction moves, though. 
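
That helps. For my own notes, here is a toy sketch of the kind of sweep that 
feeds a free list; Block and Sweeper are invented names, and a real 
implementation would also coalesce adjacent dead blocks:

    // Sweep walks EVERY block in the heap -- linear in heap size, not in
    // the live set -- and threads the dead ones onto a free list.
    class Block {
        boolean marked;  // set by the preceding mark phase
        int size;        // in a real heap, used to step to the next block
        Block nextFree;  // free-list link, only meaningful once dead
    }

    class Sweeper {
        Block freeList;

        void sweep(Block[] heap) {
            for (Block b : heap) {      // live and dead blocks alike
                if (b.marked) {
                    b.marked = false;   // reset the mark for the next cycle
                } else {
                    b.nextFree = freeList;
                    freeList = b;
                }
            }
        }
    }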
>>
>> My $0.02c, anyway. 
>>
>> -Aleksey 
>>
>>
