[jira] [Commented] (TOREE-374) Variables declared on the Notebook are not garbage collected
[ https://issues.apache.org/jira/browse/TOREE-374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856266#comment-15856266 ]

David Taieb commented on TOREE-374:
-----------------------------------

[~lbustelo] I totally understand the technical limitation, but from the user's perspective it looks like a bug. At the very least, we should document a best-practice workaround.

Also, looking at the output of // show, I see this:

    import $line22$read.$iw.$iw.$iw.$iw.$iw.$iw.x;
    class $iw extends Serializable {
      def <init>() = { super.<init>; () };
      val res4 = println(x)
    };

I wonder how $line22$read.$iw.$iw.$iw.$iw.$iw.$iw.x is created, and whether we have an opportunity to clean it up within a pre_run_cell event?

> Variables declared on the Notebook are not garbage collected
>
>                 Key: TOREE-374
>                 URL: https://issues.apache.org/jira/browse/TOREE-374
>             Project: TOREE
>          Issue Type: Bug
>    Affects Versions: 0.1.0
>            Reporter: David Taieb
>
> I'm not sure if it's a bug or a limitation of the underlying scala REPL.
> As part of supporting PixieDust (https://github.com/ibm-cds-labs/pixiedust)
> auto-visualization feature within Scala gateway, I have implemented a weak
> hashmap that tracks objects declared on the Scala REPL. However, I have found
> that objects are not correctly gc'ed when the object is declared in a cell
> with a val or var keyword and then the cell is run again. One would expect
> that the original object has no more references and should be gc'ed but it's
> not.
> However, when the object is declared with var keyword and then set to null in
> another cell, then it is correctly gc'ed.
> I'm concerned that users who run the same cell multiple times would
> unwittingly have memory leaks which can eventually lead to OOM errors.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
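For readers unfamiliar with the generated names in the // show output, the following is a simplified model of how such wrappers might fit together. It is only a sketch: Line22Wrapper, Line22Read, Line23Wrapper and INSTANCE are hypothetical stand-ins chosen for illustration, and the real classes are generated by the REPL and vary between Scala and Toree versions. The assumption being illustrated is that a cell-level val becomes a field on a per-line wrapper, and that later cells reach it through an import on a long-lived instance.

```scala
// Hypothetical stand-in for the wrapper generated for the cell that declared x.
class Line22Wrapper extends Serializable {
  val x: AnyRef = new Object() // a cell-level `val x = ...` modelled as a field
}

// Assumption: one instance per executed line is kept alive by the interpreter.
object Line22Read {
  val INSTANCE = new Line22Wrapper
}

// Hypothetical stand-in for the wrapper of a later cell that uses x.
class Line23Wrapper extends Serializable {
  import Line22Read.INSTANCE.x // mirrors `import $line22$read.$iw...x`
  val res4: Unit = println(x)  // mirrors `val res4 = println(x)`
  def seen: AnyRef = x         // exposes the imported value for inspection
}
```

Under this model, x stays strongly reachable for as long as Line22Read.INSTANCE does, regardless of whether a later cell declares another x. Whether a pre_run_cell hook could drop such an instance is exactly the open question raised above.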
[jira] [Commented] (TOREE-374) Variables declared on the Notebook are not garbage collected
[ https://issues.apache.org/jira/browse/TOREE-374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15855262#comment-15855262 ]

David Taieb commented on TOREE-374:
-----------------------------------

[~jodersky] Simple steps to reproduce.

In cell 1, create a ReferenceQueue with the following code. Run cell 1 only once:
```
import scala.ref.WeakReference
import scala.ref.ReferenceQueue
val queue: ReferenceQueue[AnyRef] = new ReferenceQueue
```

In cell 2, create an object and a WeakReference to it:
```
var obj = new Object()
val weakRef = new WeakReference(obj, queue)
```

Run cell 2 twice. The expected behaviour is that the first instance of obj is marked for gc and placed in the ReferenceQueue.

In cell 3, poll the ReferenceQueue:
```
System.gc()
println(queue.poll)
```

Run cell 3 and observe that it outputs None. No object has been marked for deletion.

Now the positive test: add obj = null to cell 3 as shown below (note: that's why I used var in cell 2; it also means that an object declared with val can never be gc'ed, since you can't clear the reference):
```
obj = null
System.gc()
println(queue.poll)
```

Output is:
Some(scala.ref.WeakReferenceWithWrapper@25ac93fd)
which is expected.
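The three cells above can also be approximated outside a notebook. The sketch below is not how Toree or the Scala REPL is implemented; it assumes only that each run of cell 2 produces a wrapper instance that stays reachable, and models that with a buffer. Cell2Run, RetentionSketch, enqueuedAfterGc and run are hypothetical names introduced for illustration.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.ref.{ReferenceQueue, WeakReference}

// Hypothetical stand-in for the wrapper produced by one execution of cell 2.
final class Cell2Run(queue: ReferenceQueue[AnyRef]) {
  var obj: AnyRef = new Object()
  val weakRef = new WeakReference[AnyRef](obj, queue)
}

object RetentionSketch {
  // Assumption: wrappers from earlier runs are never released by the interpreter.
  private val retained = ArrayBuffer.empty[Cell2Run]

  // System.gc() is only a request, so retry a few times before giving up.
  def enqueuedAfterGc(queue: ReferenceQueue[AnyRef], attempts: Int = 5): Boolean =
    (1 to attempts).exists { _ =>
      System.gc()
      queue.remove(100L).isDefined // waits up to 100 ms for a cleared reference
    }

  // Returns (enqueued while wrappers are retained, enqueued after obj = null).
  def run(): (Boolean, Boolean) = {
    retained.clear()
    val queue = new ReferenceQueue[AnyRef] // cell 1, run once
    retained += new Cell2Run(queue)        // cell 2, first run
    retained += new Cell2Run(queue)        // cell 2, second run
    val whileRetained = enqueuedAfterGc(queue)
    retained.head.obj = null               // the `obj = null` step from cell 3
    val afterNull = enqueuedAfterGc(queue)
    (whileRetained, afterNull)
  }
}
```

The first element of the result is always false, because both objects are still strongly reachable through the retained wrappers. The second element is typically true on a JVM that honours System.gc(), but that part depends on the collector and is not guaranteed.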
[jira] [Commented] (TOREE-374) Variables declared on the Notebook are not garbage collected
[ https://issues.apache.org/jira/browse/TOREE-374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15854852#comment-15854852 ]

David Taieb commented on TOREE-374:
-----------------------------------

[~mariusvniekerk] If there were a way to manually force the generated class to be gc'ed, we could use the pre_run_cell event to listen for a cell being executed and force its variables to be de-referenced and gc'ed. Thoughts?
[jira] [Created] (TOREE-374) Variables declared on the Notebook are not garbage collected
David Taieb created TOREE-374:
---------------------------------

             Summary: Variables declared on the Notebook are not garbage collected
                 Key: TOREE-374
                 URL: https://issues.apache.org/jira/browse/TOREE-374
             Project: TOREE
          Issue Type: Bug
    Affects Versions: 0.1.0
            Reporter: David Taieb


I'm not sure if it's a bug or a limitation of the underlying Scala REPL.

As part of supporting the PixieDust (https://github.com/ibm-cds-labs/pixiedust) auto-visualization feature within the Scala gateway, I have implemented a weak hashmap that tracks objects declared on the Scala REPL. However, I have found that objects are not correctly gc'ed when the object is declared in a cell with a val or var keyword and then the cell is run again. One would expect that the original object has no more references and should be gc'ed, but it's not.

However, when the object is declared with the var keyword and then set to null in another cell, it is correctly gc'ed.

I'm concerned that users who run the same cell multiple times would unwittingly have memory leaks, which can eventually lead to OOM errors.
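To make the "weak hashmap that tracks objects" idea concrete, here is a minimal sketch of what such a tracker could look like. It is not PixieDust's actual code; ObjectTracker, track, labelOf and size are hypothetical names, and the String label is just a placeholder for whatever metadata would be stored per object.

```scala
import scala.collection.mutable

// Hypothetical tracker, for illustration only.
object ObjectTracker {
  // Keys are weakly referenced: an entry can disappear once its key is no
  // longer strongly reachable from anywhere else. Lookups use equals/hashCode.
  private val tracked = mutable.WeakHashMap.empty[AnyRef, String]

  def track(obj: AnyRef, label: String): Unit = synchronized {
    tracked.put(obj, label)
    ()
  }

  def labelOf(obj: AnyRef): Option[String] = synchronized { tracked.get(obj) }

  def size: Int = synchronized { tracked.size }
}
```

With a tracker like this, an entry that never disappears after the declaring cell has been re-run and a gc has been requested would suggest that something still holds a strong reference to the old object, which is the symptom described in this issue. The stored value must not itself reference the key, otherwise the entry would never be released.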