[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark JavaWr...

2016-12-01 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15843 LGTM too Thanks a lot! Merging with master, branch-2.1, branch-2.0 Has anyone heard of complaints of this in current use cases of earlier branches? If not, I won't backport it

[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark JavaWr...

2016-12-01 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/15843 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the

[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark JavaWr...

2016-12-01 Thread techaddict
Github user techaddict commented on the issue: https://github.com/apache/spark/pull/15843 @jkbradley @holdenk @viirya PR updated --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark JavaWr...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15843 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69476/ Test PASSed. ---

[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark JavaWr...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15843 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark JavaWr...

2016-12-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15843 **[Test build #69476 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69476/consoleFull)** for PR 15843 at commit

[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark JavaWr...

2016-12-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15843 **[Test build #69476 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69476/consoleFull)** for PR 15843 at commit

[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark JavaWr...

2016-11-30 Thread techaddict
Github user techaddict commented on the issue: https://github.com/apache/spark/pull/15843 @jkbradley @holdenk will update the PR with changes today. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does

[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark JavaWr...

2016-11-29 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/15843 I agree, for a follow up (so we don't lose track of it) - I've created SPARK-18630 but option 1 for now is a strict improvement over the current situation. Thanks for all of your work on this

[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark JavaWr...

2016-11-28 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15843 Sorry for the slow response on this. Given the time pressure for 2.1, let's go with option 1 for now with a follow-up task to implement option 2. It would be great to include this fix in 2.1.

[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark JavaWr...

2016-11-21 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/15843 @jkbradley @holdenk @techaddict Do we want to implement copy() in JavaWrapper in this PR too? Or separate it to follow ones with the required copy methods on JVM side? --- If your project is set

[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark JavaWr...

2016-11-15 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/15843 From the Py4J documentation it seems like we could be leaking memory with the first option, although perhaps not a lot of memory, but if it was being used in an iterative Python algorithm for

[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark JavaWr...

2016-11-14 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/15843 I'd prefer option2 for safety since the model summaries should be an issue for GC. And looks like Java model summaries don't have copy method. --- If your project is set up for it, you can reply to

[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark JavaWr...

2016-11-14 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15843 I don't see a need to do deep copies of model summaries, but I agree I don't like how JavaWrapper is ambiguous about whether it does shallow or deep copies. I'd say the confusion comes

[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark JavaWr...

2016-11-12 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/15843 Oh. I thought JavaWrapper is only used on JavaParams. But there are also others like LogisticRegressionSummary which directly inherits JavaWrapper. --- If your project is set up for it, you can

[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark JavaWr...

2016-11-12 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/15843 I'm not so sure about that, we still would want to cleanup the underlying Java reference object on delete if it isn't needed anymore. I think the question is do we want to support shallow copy of

[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark JavaWr...

2016-11-11 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/15843 @jkbradley Sounds making sense more. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark JavaWr...

2016-11-11 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15843 I'm now wondering if ```__del___``` should be in JavaParams instead of JavaWrapper since JavaWrapper does not override ```copy```. Do yall agree it will be safer if it's moved there? --- If