Chris Wright wrote:
On Sun, 13 Mar 2016 12:43:37 -0700, Adam Wilson wrote:
A "partially moving" GC does not exist, as far as I know.


Yep, it's a Bad Idea.

It's not a standard term. Google's only seeing about four references to
the term, none of them authoritative or definitive. Since it's non-
standard, it would behoove you to define it.

I think you mean a GC in which some objects are pinned (due to
limitations in the GC rather than the user explicitly requesting that
they be pinned).

Regardless, you ought to define the term and explain why it's a bad idea
instead of providing a bare assertion. It doesn't make any difference to
the operation of the GC *why* something's pinned; what matters is how
much is pinned.

So, for instance, you could point to a conservative moving GC, select one
circumstance in which it can't move objects, tell us how much memory that
ends up pinning on average, and then you'll have an argument.


Is there an implementation of a conservative moving (compacting) GC out there? I'm not aware of one, but there are a lot of GC's out there. Boehm isn't.

In D, we have a tiny bit of type information surfaced to the GC to
reduce the incidence of false pointers. We could additionally take
advantage of the fact that numbers tend to be small and people tend to
use 4 to 8 byte numbers: we could arrange our address space to avoid
ranges where false pointers are likely. But that's probably
unnecessary.


We are not limited by what we currently have. Most of Rainer's work
involved surfacing more data to the GC.

Sure. My point is that we don't *need* a lot of type information in the GC
to get reasonable advantages. A little type information goes a long way,
and a little caution in placing objects in the heap can help too.

In D, thanks to unions, we can never create a fully precise collector,
but we can create a moving collector. Objects pointed to from a union
can't be moved, but few objects will be pointed to from a union.

Actually, we probably could with modifications to the compiler.

No, we couldn't. I've been assuming that we can make whatever
modifications we want to the compiler for free.

The compiler has static type information. This explicitly contains
ambiguities. When it's based solely on data available at compile time,
the compiler can resolve the ambiguities -- and you could have written
the code without using a union.

But in the majority of cases, people use a union because they must. That
means which values are set depends on runtime data, and the compiler
can't tell you which fields will be set.

We could provide a runtime function that is invoked when you set a value
inside a union. That won't work very well. Consider:

   struct A { long a; long b; }
   struct B { double a; void* p; }
   union AB { A a; B b; }

   extern void foo(ref A a);

   AB ab;
   ab.b.p = GC.malloc(512);
   foo(ab.a);

Was that pointer overwritten? No clue! The compiler can't tell. When
compiling the definition of foo(), it didn't have any idea whether the
struct was a union member. Now we have to intercept *every* indirect
write to *any* value with a runtime call, even if you never use a union
in your program --

Oh look, now Python's beating us by an order of magnitude in benchmarks.

What do we get for it? Well, the 1% of programs that use this pattern
will allow the GC to move an additional 0.1% of their objects.

Or we leave the GC conservative in this one case. We can detect this case
pretty easily and treat apparent pointers as pinned. Everything will
work, and the GC will be better at releasing memory than it currently is.


Ok, so yes, it needs conservative scanning in this case. Because even in C# I can pin things, so obviously it has valid use cases. I'm not absolutely against pinning and unions aren't going away, and if the GC would have to be go conservative then so be it. And given that unions are a pretty niche use case, I don't think we should drop the idea of precision just because it can't be 100% precise, so we end up with a few conservatively-scanned edge cases, big deal, the benefits of precision to the whole ecosystem would be enormous in the general case.

Is this a debate about precise vs. non-precise GC or are we just bikeshedding about terminology and technical details? ;) I moved recently and lost track of the GC book for awhile so I've been refreshing my terminology. :P

--
// Adam Wilson
// quiet.dlang.dev

Reply via email to