Re: Confusion regarding 'mark-sweep-compact' naming

Peter Veentjer Tue, 13 Aug 2019 23:35:06 -0700

Thanks for your answer Aleksey, 

comments inline.


On Tuesday, August 6, 2019 at 12:41:28 PM UTC+3, Aleksey Shipilev wrote:
>
> On 8/6/19 7:38 AM, Peter Veentjer wrote: 
> > There is still some confusion on my side. This time it is regarding the 
> algorithmic complexity of 
> > mark-sweep-compact. 
> > 
> > The complexity of the mark phase is linear to the live set. 
> > The complexity of the sweep phase is linear to the heap size. 
> > The complexity of the compact phase is linear to the live set. 
>
> Well... 
>
> The multiplicative factors in either case might make one linearity perform 
> better than the other on 
> wildly different heap sizes and live sets. The adversarial example of this 
> is N large byte arrays of 
> size M. Compact might have to copy N*M bytes (especially when doing 
> sliding compaction), and sweep 
> might only need to visit N locations to check on object metadata. 
>
> There are second-order heap size effects for both mark and compact. Take 
> two heaps with the same 
> live set, one very dense and one very sparse, and the mere fact you have 
> to walk memory far away 
> adds up the secondary effect of heap size. It would not linger for 
> compacting collectors, though, as 
> first few compaction cycles would collapse the heap to denser variant. 
>
> That is all to say that algorithmic complexity is a seriously bad way to 
> reason about (GC) performance. 
>

I understand. I created a bunch of flashcard to get a better conceptual 
understanding of the different gc algorithms;
I'll add some additional flashcard to refine my understanding.
 

>
> > Why would you ever want to include the a sweep phase because the 
> complexity of mark-sweep-compact is 
> > now linear to the heap size; a lot worse compared to 'linear to the live 
> set'. Get rid of the sweep 
> > and complexity goes down to 'linear to the live set'. What is the added 
> value of such an inefficient 
> > GC implementation? 
>
> I think m-s-c nomenclature is confusing, as the actual implementations are 
> doing either m-s or m-c 
> during their individual phases. 


Exactly. And this was the source of my confusion and caused by an abuse of 
names.

 

> The benefit of doing either is well covered. 
>
> The hard naming question is how to properly label the collector where 
> different phases employ 
> different algos. For example, in CMS the young collection is copying, old 
> collection is mark-sweep, 
> fall-back full GC is mark-compact. How would you label CMS then? 


It would be easier/simpler to label the young and old collector 
independently; but I understand your point.

 

> I believe marking it m-s-c is 
> borderline tolerable, until users get confused thinking both sweep and 
> compact happen in the same phase. 
>

> Also with implementations that manage backing storage in fine-grained 
> chunks (for example, free 
> lists), "sweep" might mean dealing with them on per-dead-object basis. For 
> example, this might mean 
>  preparing the storage for subsequent compaction by "sweeping" out old 
> objects before compaction 
> kicks in, which requires free space. Practical implementations I know opt 
> to instead reconstruct the 
> backing storage metadata after massive compaction moves, though. 
>
> My $0.02c, anyway. 
>
> -Aleksey 
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to mechanical-sympathy+unsubscr...@googlegroups.com.
To view this discussion on the web, visit 
https://groups.google.com/d/msgid/mechanical-sympathy/2a29baf3-2027-4eb6-a7f1-4737397f9f0f%40googlegroups.com.

Re: Confusion regarding 'mark-sweep-compact' naming

Reply via email to