[reordering]
Allen wrote:
We can understand the value of providing a clear method without talking about 
GC at all.
I don't doubt there are cases where a data structure needs to be cleared, but that need can be met with clear-less weakmaps. What I'm trying to find is a differentiating factor. I agree that:
* clearable and clear-less weakmaps both have their uses. Which one is dominant for developers has yet to be determined; only tastes and feelings have been offered so far (including by myself).
* clearable and clear-less weakmaps can be implemented on top of one another symmetrically and at close to no cost (see the sketch below).
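
As a rough illustration of that symmetry, here is a minimal sketch of a clearable weakmap built on top of the clear-less WeakMap (the name WeakMapWithClear and its exact shape are just for illustration):

  // Minimal sketch, for illustration only: a clearable weakmap built on top
  // of the clear-less WeakMap. clear() just drops the underlying map; once
  // nothing else references the old map, its entries become unreachable too.
  function WeakMapWithClear() {
    var wm = new WeakMap();
    return {
      get: function (key) { return wm.get(key); },
      set: function (key, value) { wm.set(key, value); },
      has: function (key) { return wm.has(key); },
      delete: function (key) { return wm.delete(key); },
      clear: function () { wm = new WeakMap(); }
    };
  }

The other direction is trivial: a clearable weakmap on which clear() is never called behaves exactly like a clear-less one, which is why I consider the two symmetric.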

Until evidence (from other languages?) shows that one case matters more than the other, I personally call this a tie. That's where my thinking currently stands.

I think a major remaining point is performance. If clear-less weakmaps impose an irreducible, significant GC cost, then that is a valid justification for a native .clear. Implementors will have to deal with programs where some long-lived weakmaps are never manually cleared anyway, so the interesting question is: how far can they go in reducing that GC cost (without requiring a major breakthrough in GC research, of course ;-) )? If the cost can be brought down to a marginal difference compared with manual .clear, I call the performance argument a tie too, which leaves the debate as one of taste and feeling.


On 23/01/2013 00:36, Allen Wirfs-Brock wrote:
On Jan 22, 2013, at 2:35 PM, David Bruant wrote:
So, to find out whether a weakmap is dead, the information has to come from a source other than the mark-and-sweep algorithm (since it has lost its precision)...
Given the additional, potentially prohibitive cost weakmaps seem to impose on the GC, maybe techniques that would otherwise be considered too costly could make sense when applied specifically to WeakMaps. For instance, would the cost of reference-counting only the weakmaps be worth the benefit of knowing early that a weakmap is dead? (I have no idea how much each costs, so it's hard for me to compare them.)
For WeakMapWithClear, reference counting would declare the weakmap dead as soon as the new weakmap is assigned to the private property, so that's good. It wouldn't work if some weakmaps are part of a cycle, of course... but maybe that is such an edge case that it's acceptable to ask users doing it to break their weakmap cycles manually if they don't want the GC to be too mad at them.

You know, as much as Jason and I enjoy talking about garbage collectors, this 
probably isn't the place to revisit the last 40 years of a highly developed 
area of specialized CS technology.
Even if there is a .clear method, that doesn't mean people will use it, so the costs weakmaps impose on the GC will have to be dealt with even when people don't manually clear their weakmaps [forking the thread for this reason]. JS engine implementors will have to solve this problem regardless of whether a .clear method is introduced. Since JS engines are starting to ship both generational GC and WeakMaps, I feel that here and now might be a very good place and time to revisit those 40 years. Each implementor will have to do that revisiting anyway. If anything, this thread may become a good resource for developers to understand why some of their programs using WeakMaps have circumstantially or inherently bad GC characteristics.

Of all the points in this thread, the one that got stuck in my head is Jason saying: "In our current implementation, creating a new WeakMap and dropping the old one is very nearly equivalent in performance to clear()." What this means is that something is lost when moving to a naive generational GC as far as WeakMaps are concerned. The loss is the knowledge of exactly when a weakmap is dead, and this loss has a cost in weakmap-related GC work. Although Mark showed a linear algorithm, one can still wonder whether in practice this algorithm induces a significant cost (worst-case complexity doesn't say much about the most-frequent-case cost of an algorithm).
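
For context on where that cost comes from, here is a toy sketch of the usual iterative, fixed-point style of ephemeron marking (this is the naive version, not necessarily the linear algorithm Mark described, and the object-graph model is made up purely for illustration):

  // Toy model, for illustration only (not engine code): marking with
  // WeakMap/ephemeron semantics. Nodes are plain objects with a `refs`
  // array of strongly-referenced nodes. A weakmap is modeled as
  // { node, entries: [{ key, value }, ...] }: an entry's value is marked
  // only once BOTH the weakmap object and the key are marked.
  function markWithEphemerons(roots, weakmaps) {
    var marked = new Set();
    var stack = [];
    function markNode(node) {
      if (!marked.has(node)) {
        marked.add(node);
        stack.push(node);
      }
    }
    roots.forEach(markNode);
    var changed = true;
    while (changed) {
      // Ordinary transitive marking from whatever is on the stack.
      while (stack.length > 0) {
        var node = stack.pop();
        (node.refs || []).forEach(markNode);
      }
      // Ephemeron pass: marking a value may make further keys reachable,
      // so this naive version rescans entries until a fixed point.
      changed = false;
      weakmaps.forEach(function (wm) {
        if (!marked.has(wm.node)) return;
        wm.entries.forEach(function (entry) {
          if (marked.has(entry.key) && !marked.has(entry.value)) {
            markNode(entry.value);
            changed = true;
          }
        });
      });
    }
    return marked;
  }

The rescanning in the ephemeron pass is where this naive version can blow up; the point of a linear algorithm is extra bookkeeping that avoids rescanning, and my question is what that bookkeeping costs in the frequent case.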

What I'm trying to find out is whether there is a low-cost, weakmap-specific tracking system that could tell the GC that a weakmap is dead as soon as possible. First and foremost, what did research find on this specific question during those 40 years? Did it prove that any tracking system doing what I describe would cost more than it could ever save? If so, I'll be happy to read the paper(s) and give up on the topic. For the rest of this message I'll assume that isn't the case.
Ideally, the tracking system would have the following properties:
* it costs nothing (or a small startup constant) if there are no weakmaps
* the overall cost of the tracking system in normal cases is significantly less than the cost of a weakmap being falsely assumed alive. I say "in normal cases" because that is what modern GCs are already in the business of optimizing for. Generational GC is an optimistic optimization based on the *observation* that in most real-life programs, most objects are short-lived. It's possible to craft a pathological program where a generational GC performs worse than a non-generational one: keep all objects alive and the generational GC will spend most of its time marking and moving objects from generation to generation (see the sketch right below).
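
For concreteness, a trivial example of such a pathological program (the numbers are picked arbitrarily):

  // Pathological for a generational GC: every object survives, so nothing
  // dies young and nursery collections mostly pay to mark and promote.
  var everything = [];
  for (var i = 0; i < 1e7; i++) {
    everything.push({ index: i, payload: [i, i, i, i] });
  }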

I've suggested weakmap-only reference counting; maybe it fulfills these properties, maybe not (the cycle caveat I mentioned above, sketched below, is one reason it might not). Other ideas are welcome, whether from 40 years of prior research or from creative people reading this message.
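
To make that cycle caveat concrete, here is a hypothetical example; the assumption is a scheme that counts every strong reference to a weakmap, including references held as values inside other weakmaps:

  // Hypothetical illustration: two weakmaps kept alive only through each
  // other's values. An ephemeron-aware tracing GC can collect both once
  // wm1 and wm2 are dropped, but per-weakmap reference counting would see
  // each map's count stay at 1 (the reference from the other map's entry)
  // and never report either map as dead.
  var wm1 = new WeakMap();
  var wm2 = new WeakMap();
  var k1 = {};
  var k2 = {};
  wm1.set(k1, wm2); // wm1 holds wm2 as a value
  wm2.set(k2, wm1); // wm2 holds wm1 as a value
  wm1 = null;
  wm2 = null;       // only the cross-references remain

This is the case where, as I said above, it may be acceptable to ask users to break the cycle manually.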

David
