Re: Potential GSoC project - GC improvements

Adam Wilson via Digitalmars-d Sat, 12 Mar 2016 13:26:28 -0800

Jeremy DeHaan wrote:

On Saturday, 12 March 2016 at 08:50:06 UTC, Adam Wilson wrote:

If I may make a suggestion. The lock free work is unlikely to require
the entirety of GSoC. And the precise GC is the next most important
thing on your list and will have the biggest impact on GC performance.


Rainer has two different precise GC's in pull requests right now and
both are slower than the current one unless there are false pointers.  I
would expect anything I come up with to largely act the same. The reason
I keep pushing for a generational garbage collector is because I think
it would be where we would see the most benefit in terms of general
performance.

I would contend that it's not quite that simple. Of course the preciseGC is slower, it's scanning more stuff. The current conservative GC bydefinition can't figure out large chunks of memory so it won't evenbother scanning them. What it does to those things it can't scan. Andwhat it can't scan it has to pin, which means the memory can't be moved.You can't build a fully generational GC until you can move everything.They talk about this problem in the books. However, once you have aprecise GC, a fully moving GC becomes possible and only then do theperformance benefits of a generational GC become realized. I stronglysuspect that a partially moving GC will perform worse than the preciseGC. And I know for a fact that it will suffer enormously fromfragmentation and Out-of-Memory issues. Precision is the basis fromwhich all higher order GC functions flow.

Once the GC is fully precise we can implement a fully compacting GC,
which improves the usefulness of generational collection.
Additionally, precision allows a significant amount of work to be done
in improving the performance of the GC in multi-threaded scenarios. It
should be quite possible to avoid needing fork() or anything like it
altogether. I know that the .NET GC doesn't need to use anything like it.


A compacting GC could be built on top of a generational GC without much
difficulty I would think, if we wanted to go that route. The compaction
would just happen as part of a collection cycle when things are moved
into the next generation. I have concerns about doing any compaction
though, mostly because D can have both references and pointers to
objects, and I don't know how we would want to go about updating
pointers. Especially since pointers to D objects can exists in C and C++
code.

Erm, a generational GC is, in practice, a moving GC as it must move theblobs around from the various heaps. The commonly used method isStop-and-Copy.

As for the C/C++ code problem various solutions have been proposed here.For one, D knows the difference between the two, so pinning is possiblethere. Or a separate heap space can be used. AFIAK each OS allows you tospecify multiple heaps, so you could specify an unmanaged heap to goalong with the managed heaps. IIRC, C++/CLI does something similar onWindows. This would be the preferred solution IMO. (Look into HeapCreateon Windows for example.)

Another reason I want to work on a generational GC is because this can
also lead into a concurrent GC without trying to emulate fork() on
windows. The .Net GC has 3 generations with the last one having its
collections running concurrently because it is unlikely to affect
anything else going on. They don't bother running the other generations
concurrently because their collections are really short. We could do
something similar.

A concurrent GC without precision is mostly useless because the GC rootsare primarily derived from the stack which the thread was spawned andthe stack roots are not scanned in a conservative GC. Additionally, theAsync/Await pattern is not practical without a precise GC.

Perhaps someone more intimate with GC's than I am can speak up, but I
think that a generational GC would be the best use of time in relation
to performance gains. Other things can then be implemented on top of it
later.

The problem is that performance is the most obvious problem with the GC,but not the most important problem, or most useful. Precision enables afully moving, fully generational, fully concurrent, GC. It also enablesother major features that have nothing to do with GC.

Also, I would strongly recommend getting this book and reading it
cover to cover before starting:
http://www.amazon.com/gp/product/1420082795/ref=pd_lpo_sbs_dp_ss_1?pf_rd_p=1944687562&pf_rd_s=lpo-top-stripe-1&pf_rd_t=201&pf_rd_i=0471941484&pf_rd_m=ATVPDKIKX0DER&pf_rd_r=0QD9X3E5QATSBCBT6BMM


Thank you for the link to the book. I was planning on this one
http://www.amazon.com/gp/product/0471941484/ , but maybe I will buy them
both.

Obviously you can do whatever you want. :) And I'll admit that precisionisn't sexy, and it won't improve performance right away, but it opensthe door way to further work that will provide those things. It's hardand thankless work, but you'll be better setup for GSoC 2017 for addinggenerational and concurrency support.

My opinion is that we need to focus on improving the core of the GC sothat it can better perform the high-order GC functions. Yes, it willmake the GC slower, but it's already bad so I doubt people will noticemuch.

We've discussed (and by discussed I mean thermonuclear flame-wars) theGC before numerous times before at great length on this forum, so I amglad to see that this thread hasn't turned into that, yet.


--
// Adam Wilson
// quiet.dlang.dev

Re: Potential GSoC project - GC improvements

Reply via email to