Re: [go-nuts] [generics] Data structures and the garbage collector

Markus Heukelom Fri, 10 Jul 2020 09:48:46 -0700


> Many data structures are implemented using pointers. How does this perform 
>> wrt garbage collecting? For example take any data structure that uses some 
>> tree of nodes, or even simpler just a singly linked list: there is one 
>> pointer from the left node to the next. Wouldn't this incur huge GC 
>> (unavoidable) cycles for large linked lists that will look at all pointers? 
>> I can imagine that a large AVL tree would then cause potentially large 
>> hick-ups in the GC cycle? 
>>
>>
> Is your worry latency in the form of pause times while doing GC, or 
> bandwidth in the form of CPU cycles when doing GC? It isn't entirely clear.
>
>
My "worry" is about CPU cycles that would not be (necessarily) needed in 
let's say C++ or C. The fact that Go has a concurrent GC resulting in no 
"stop the world"-hick-ups is really awesome, but was not my main point.


(btw I don't really "worry" about it for any real applications; it was more 
a question out of interest)

The reason I asked is because I use a custom tree data-structure that 
manages parent-child, and leaf node prev-next, relationships using 
pointers. I use the tree for multiple value types, so the payload type is 
interface{} now. It would be a great place for me to use generics; this 
would make my code more clear and safer. 

In any case, you are right that these things require attention in a garbage 
> collector, both of them. But Go has addressed this for many years now.
>
> As an example, imagine you have a binary AVL or Red/Black tree taking up 
> 128 Gigabyte of memory space. You now delete the root node of said tree.
>
> Example 1: in C++ we have reference counted the tree. Deleting the root 
> node sets its childrens refcounts to 0, so they get deleted, and so on 
> recursively down the tree. It might take some time to delete 128 Gigabyte 
> of data this way. So C++ isn't exempt from memory reclamation pauses either.
>
>
In C/C++ you could better use an arena-allocator and not refcounts. 
Clearing the tree would then use only (nodes / (nodes per region)) 
deallocation calls. As the region is typically large, this would probably 
be orders of magnitude quicker. I believe compilers use arena allocators 
for this reason. (Deleting individual nodes would be cheaper maybe as 
well.) 

Refcounting I think is best reserved for situations where the lifetime of 
an object is not determined by, or escapes, the package (class internals in 
C++), or for reasons  of simplicity of implementation of course.

I was under the (false) impression that one of the reasons of having a 
built-in array, slice and map type in Go (besides it just being really 
handy) is that for example a map[string]int would relieve the GC needing to 
trace the keys and values (in contrast to for example map[int]*string). But 
apparently (see Ian's answer) this is not the case, and the GC always needs 
to trace all elements, keys and values. In that case there is (luckily) no 
reason to choose build-in map over a custom map written with generics.
 

> Example 2: Imagine a language using Cheney's copying collection as the 
> main garbage collection scheme. It will use 256 gigabyte of memory for its 
> two copying spaces, but collection time will be 0, since the tree's data 
> isn't live. You are paying with memory for faster collection in this case.
>
> Example 3: Concurrent collectors, like the one Go uses nowadays, will 
> collect the tree while the program is running and doing productive work. 
> While doing so incurs a bandwidth loss, latency pause times do not suffer. 
> If you have spare CPU cores, they'll be used to handle the collection while 
> the remaining cores run the program (more or less, this is a gross 
> simplification).
>
> As for generics:
>
> When you use a generic, you substitute in a type to generate a 
> special-case of a function for that type. If the implementation uses 
> monomorphization, it will generate a new piece of code specifically for 
> that type invocation. This might mean other optimizations trigger for the 
> code path, notably stripping pointers away.
>
> As for general worry:
>
> In some cases it is beneficial to construct data structures such that they 
> pack data well and don't use excessive amounts of pointers. One reason is 
> that collecting pointers takes time in any language. Another reason is that 
> pointer indirections cost memory loads, which can be expensive. A third 
> reason is that packed data has cache locality, so access to data close 
> together is several orders of magnitude faster than following yet another 
> pointer indirection. And finally, it might avoid memory pressure in copying 
> data around.
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/db652e6a-3cb2-480b-a50d-92387c298adbn%40googlegroups.com.

Re: [go-nuts] [generics] Data structures and the garbage collector

Reply via email to