Something I don't quite get: you reference the Dingle paper on Gel, and your proposal seems to be essentially the same thing, but the Gel paper states explicitly that its approach doesn't work with multithreading. The authors propose a couple of potential solutions, but neither looks easy, and I didn't see either one addressed in your writeup. Perhaps I missed it, but since you seem concerned about the current GC in part on account of multithreading, how does your proposal fix that, make it easier, and/or get past the concerns of Gel's authors?
(Sorry if this is a really dumb question; as I said, I may have missed it.)