On Friday, 23 May 2014 at 15:41:39 UTC, John Colvin wrote:
On Friday, 23 May 2014 at 13:43:53 UTC, Chris wrote:
On Friday, 23 May 2014 at 06:17:43 UTC, Rainer Schuetze wrote:


On 22.05.2014 21:04, Etienne wrote:
On 2014-05-22 2:12 PM, Rainer Schuetze wrote:

"NO_INTERIOR" is currently only used for the hash array used by associative arrays. It is a bit dangerous to use as any pointer,slice or register still operating on the array is ignored, so collecting it might
corrupt your memory.
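
For reference, a minimal sketch of how NO_INTERIOR is requested through the public core.memory API (the allocation call is real; the comments just restate the danger described above, namely that only a pointer to the block's base address keeps it alive):

import core.memory;

void main()
{
    // NO_INTERIOR: the GC keeps this block alive only while a pointer
    // to its *base* address exists; interior pointers, slices, and
    // registers pointing into the middle of it are ignored.
    ubyte* base = cast(ubyte*) GC.malloc(4096, GC.BlkAttr.NO_INTERIOR);

    // Dangerous: if 'base' is overwritten and only this interior
    // pointer survives, the block can be collected anyway and this
    // becomes a dangling pointer.
    ubyte* interior = base + 128;
}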

That's quite a relief; I was afraid of having to do it ;)

I'm currently exploring the possibility of sampling the pointers during marking to check if they're gone, and using Bayesian probabilities to decide whether or not to skip the pool.

I explained it all here:
https://github.com/D-Programming-Language/druntime/pull/797#issuecomment-43896016


-- paste --
Basically, when marking, you take 1 in X of the references and record them in a dedicated array for the pool they refer to. Then, the next time you're about to collect, you test those samples individually, and if they're mostly still there you skip marking/freeing for that particular pool during the collection. You can force collection on certain pools every 1 in X collections to even out the average lifetime of the references.

You're going to want a lower tolerance of failure for big allocations, but basically you're using probabilities to avoid putting a lot of useless load on the processor, especially when you're in a part of an application that's just allocating a lot (sampling will show that the software is not in a phase of releasing data).

http://en.wikipedia.org/wiki/Bayes_factor

-- end paste --

The Bayes factor is merely there to choose the model that best fits the program's behaviour. Bayesian inference would then take care of deciding whether a pool should end up being marked. In other words, machine learning.
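
To make the idea concrete, here is a rough sketch of the bookkeeping. Everything here is hypothetical: the names and the decision rule are made up for illustration and are not the code in the pull request, and the 90% threshold is a crude stand-in for a real Bayes factor rather than actual Bayesian inference. Each pool keeps a reservoir of sampled references; before a collection, the samples are probed and the test decides whether the pool is worth marking:

import std.random;

struct PoolStats
{
    void*[] samples;     // the 1-in-X references recorded while marking
    size_t sinceForced;  // collections since this pool was last scanned
}

// Record roughly one in sampleRate of the references seen during marking.
void maybeSample(ref PoolStats stats, void* p, uint sampleRate, ref Random rng)
{
    if (uniform(0, sampleRate, rng) == 0)
        stats.samples ~= p;
}

// Decide whether to skip marking/freeing this pool. 'stillLive' stands in
// for a probe that checks whether a sampled reference still exists.
bool skipPool(ref PoolStats stats, bool delegate(void*) stillLive, uint forceEvery)
{
    // Periodically force a real scan so a skipped pool is eventually
    // collected no matter what the samples say.
    if (++stats.sinceForced >= forceEvery)
    {
        stats.sinceForced = 0;
        return false;
    }
    size_t live;
    foreach (p; stats.samples)
        if (stillLive(p))
            ++live;
    // Crude stand-in for the Bayes factor: treat >= 90% surviving samples
    // as strong evidence the pool is in an "allocating, not releasing" phase.
    return stats.samples.length > 0 &&
           live * 10 >= stats.samples.length * 9;
}

As described above, a real decision rule would also weigh block size, demanding a stricter threshold before skipping pools that hold big allocations.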

Do you think it'd be a good optimization opportunity?

Hmm, I guess I don't get the idea. You cannot skip a pool based on some statistics: it might contain references to anything, so as a result you cannot safely collect anything.
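
A contrived illustration of why a single missed reference makes the skip unsound (a hypothetical scenario under the proposed scheme, not the behaviour of any real GC):

class Node { int value; }

Node keepAlive;  // an unsampled root: the 1-in-X sampling never saw it

void demo()
{
    keepAlive = new Node;  // the only remaining reference into its pool
    // Suppose every reference the sampler *did* record for that pool has
    // died. The samples say "this pool's objects are gone", so marking is
    // skipped and the pool's blocks are freed, including keepAlive's.
    keepAlive.value = 42;  // use-after-free: memory corruption
}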

I'm not a fan of machine learning, especially not in cases you _can_ control, like memory allocation/deallocation. Guessing is not a good strategy if you can have control over something. Machine learning is only good for vast and unpredictable data (voice recognition, for example); there it makes sense to apply probability. But if you can actually control what you are doing, why would you want to rely on a stupid and blind machine that decides things for you based on a probability of n%? Things can go wrong and you won't even know why. Mind you, we should rule the machines, not the other way around.

Bear in mind here that most code goes through a whole bunch of machine learning algorithms in the CPU itself. Like it or not, this has proved extremely successful.
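
One classic example of what is meant here is the CPU's branch predictor. Below is an illustrative sketch of a two-bit saturating counter, the textbook form of such a predictor, not how any particular CPU implements it:

// States 0-1 predict "not taken", 2-3 predict "taken". Each outcome
// nudges the counter, so one anomalous branch doesn't flip the prediction.
struct TwoBitPredictor
{
    ubyte state = 1;  // start weakly not-taken

    bool predict() const { return state >= 2; }

    void update(bool taken)
    {
        if (taken && state < 3) ++state;
        else if (!taken && state > 0) --state;
    }
}

unittest
{
    TwoBitPredictor p;
    foreach (i; 0 .. 4) p.update(true);  // train on taken branches
    assert(p.predict());                 // now predicts taken
    p.update(false);                     // one misprediction doesn't flip it
    assert(p.predict());
}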

What I'm saying is that in cases where you do have control, you should not transfer it to the machine. Either you free memory yourself with free(), or the GC mechanism is exact and does not "assume" things; assumptions could cause inexplicable random bugs. I remember that the manual for the GC introduced in Objective-C said something like: "Some objects may never be collected." I'm not an expert on GCs, far from it, but I didn't like the sound of it.

I know that CPUs do a good bit of guessing. But that's not the same thing. If they err, they make up for it ("Oops, it's not in the cache! I'll get it from main memory, just a few nanosecs!"). If the GC errs, how do you make up for it? Please educate me.
