[reordering]
Allen wrote:
We can understand the value of providing a clear method without talking about 
GC at all.
I don't doubt there are cases where a data structure needs to be cleared, but that need can be met with clear-less weakmaps. What I'm trying to find is a differentiating factor. I agree that:
* clearable and clear-less weakmaps both have their uses. Which one is dominant for developers has yet to be determined; only tastes and feelings have been offered so far (including by myself).
* clearable and clear-less weakmaps can be implemented on top of one another symmetrically and at close to no cost (see the sketch below).
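
As a rough illustration of that symmetry, here is a minimal sketch of a clearable weakmap built on top of the clear-less WeakMap (the name WeakMapWithClear and its exact shape are just for illustration):

  // Minimal sketch, for illustration only: a clearable weakmap built on top
  // of the clear-less WeakMap. clear() just drops the underlying map; once
  // nothing else references the old map, its entries become unreachable too.
  function WeakMapWithClear() {
    var wm = new WeakMap();
    return {
      get: function (key) { return wm.get(key); },
      set: function (key, value) { wm.set(key, value); },
      has: function (key) { return wm.has(key); },
      delete: function (key) { return wm.delete(key); },
      clear: function () { wm = new WeakMap(); }
    };
  }

The other direction is trivial: a clearable weakmap on which clear() is never called behaves exactly like a clear-less one, which is why I consider the two symmetric.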

Until evidence (from other languages?) shows that one case matters more than the other, I personally call this a tie. That's where my thinking currently stands.

I think a major remaining point is performance. If clear-less weakmaps impose an irreducible, significant GC cost, then that is a valid justification for a native .clear. Implementors will have to deal with programs where some long-lived weakmaps are never manually cleared anyway, so the interesting question is: how far can they go in reducing that GC cost (without requiring a major breakthrough in GC research, of course ;-) )? If the cost can be brought down to a marginal difference compared with manual .clear, I call the performance argument a tie too, which leaves the debate as one of taste and feeling.


On 23/01/2013 00:36, Allen Wirfs-Brock wrote:
On Jan 22, 2013, at 2:35 PM, David Bruant wrote:
So, to find out whether a weakmap is dead, the information has to come from a source other than the mark-and-sweep algorithm (since it has lost its precision)...
Given the additional, potentially prohibitive cost weakmaps seem to impose on the GC, maybe techniques that would otherwise be considered too costly could make sense when applied specifically to WeakMaps. For instance, would the cost of reference-counting only the weakmaps be worth the benefit of knowing early that a weakmap is dead? (I have no idea how much each costs, so it's hard for me to compare them.)
For WeakMapWithClear, reference counting would declare the weakmap dead as soon as the new weakmap is assigned to the private property, so that's good. It wouldn't work if some weakmaps are part of a cycle, of course... but maybe that is such an edge case that it's acceptable to ask users doing it to break their weakmap cycles manually if they don't want the GC to be too mad at them.

You know, as much as Jason and I enjoy talking about garbage collectors, this 
probably isn't the place to revisit the last 40 years of a highly developed 
area of specialized CS technology.
Even if there is a .clear method, that doesn't mean people will use it, so the costs weakmaps impose on the GC will have to be dealt with even when people don't manually clear their weakmaps [forking the thread for this reason]. JS engine implementors will have to solve this problem regardless of whether a .clear method is introduced. Since JS engines are starting to ship both generational GC and WeakMaps, I feel that here and now might be a very good place and time to revisit those 40 years. Each implementor will have to do that revisiting anyway. If anything, this thread may become a good resource for developers to understand why some of their programs using WeakMaps have circumstantially or inherently bad GC characteristics.

Of all the points in this thread, the one that got stuck in my head is Jason saying: "In our current implementation, creating a new WeakMap and dropping the old one is very nearly equivalent in performance to clear()." What this means is that something is lost when moving to a naive generational GC as far as WeakMaps are concerned. The loss is the knowledge of exactly when a weakmap is dead, and this loss has a cost in weakmap-related GC work. Although Mark showed a linear algorithm, one can still wonder whether in practice this algorithm induces a significant cost (worst-case complexity doesn't say much about the most-frequent-case cost of an algorithm).
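
For context on where that cost comes from, here is a toy sketch of the usual iterative, fixed-point style of ephemeron marking (this is the naive version, not necessarily the linear algorithm Mark described, and the object-graph model is made up purely for illustration):

  // Toy model, for illustration only (not engine code): marking with
  // WeakMap/ephemeron semantics. Nodes are plain objects with a `refs`
  // array of strongly-referenced nodes. A weakmap is modeled as
  // { node, entries: [{ key, value }, ...] }: an entry's value is marked
  // only once BOTH the weakmap object and the key are marked.
  function markWithEphemerons(roots, weakmaps) {
    var marked = new Set();
    var stack = [];
    function markNode(node) {
      if (!marked.has(node)) {
        marked.add(node);
        stack.push(node);
      }
    }
    roots.forEach(markNode);
    var changed = true;
    while (changed) {
      // Ordinary transitive marking from whatever is on the stack.
      while (stack.length > 0) {
        var node = stack.pop();
        (node.refs || []).forEach(markNode);
      }
      // Ephemeron pass: marking a value may make further keys reachable,
      // so this naive version rescans entries until a fixed point.
      changed = false;
      weakmaps.forEach(function (wm) {
        if (!marked.has(wm.node)) return;
        wm.entries.forEach(function (entry) {
          if (marked.has(entry.key) && !marked.has(entry.value)) {
            markNode(entry.value);
            changed = true;
          }
        });
      });
    }
    return marked;
  }

The rescanning in the ephemeron pass is where this naive version can blow up; the point of a linear algorithm is extra bookkeeping that avoids rescanning, and my question is what that bookkeeping costs in the frequent case.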

What I'm trying to find out is whether there is a low-cost, weakmap-specific tracking system that could tell the GC that a weakmap is dead as soon as possible. First and foremost, what did research find on this specific question during those 40 years? Did it prove that any tracking system doing what I describe would cost more than it could ever save? If so, I'll be happy to read the paper(s) and give up on the topic. For the rest of this message I'll assume that isn't the case.
Ideally, the tracking system would have the following properties:
* it costs nothing (or a small startup constant) if there are no weakmaps
* the overall cost of the tracking system in normal cases is significantly less than the cost of a weakmap being falsely assumed alive. I say "in normal cases" because that is what modern GCs are already in the business of optimizing for. Generational GC is an optimistic optimization based on the *observation* that in most real-life programs, most objects are short-lived. It's possible to craft a pathological program where a generational GC performs worse than a non-generational one: keep all objects alive and the generational GC will spend most of its time marking and moving objects from generation to generation (see the sketch right below).
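
For concreteness, a trivial example of such a pathological program (the numbers are picked arbitrarily):

  // Pathological for a generational GC: every object survives, so nothing
  // dies young and nursery collections mostly pay to mark and promote.
  var everything = [];
  for (var i = 0; i < 1e7; i++) {
    everything.push({ index: i, payload: [i, i, i, i] });
  }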

I've suggested weakmap-only reference counting; maybe it fulfills these properties, maybe not (the cycle caveat I mentioned above, sketched below, is one reason it might not). Other ideas are welcome, whether from 40 years of prior research or from creative people reading this message.
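
To make that cycle caveat concrete, here is a hypothetical example; the assumption is a scheme that counts every strong reference to a weakmap, including references held as values inside other weakmaps:

  // Hypothetical illustration: two weakmaps kept alive only through each
  // other's values. An ephemeron-aware tracing GC can collect both once
  // wm1 and wm2 are dropped, but per-weakmap reference counting would see
  // each map's count stay at 1 (the reference from the other map's entry)
  // and never report either map as dead.
  var wm1 = new WeakMap();
  var wm2 = new WeakMap();
  var k1 = {};
  var k2 = {};
  wm1.set(k1, wm2); // wm1 holds wm2 as a value
  wm2.set(k2, wm1); // wm2 holds wm1 as a value
  wm1 = null;
  wm2 = null;       // only the cross-references remain

This is the case where, as I said above, it may be acceptable to ask users to break the cycle manually.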

David
