[GitHub] spark pull request: Clear local copies of accumulators as soon as ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3570#issuecomment-65844299 [Test build #24189 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24189/consoleFull) for PR 3570 at commit [`537baad`](https://github.com/apache/spark/commit/537baad0379644537f21385f0cc1150b4af0b237). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: Clear local copies of accumulators as soon as ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3570#issuecomment-65855809 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24189/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: Clear local copies of accumulators as soon as ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3570#issuecomment-65855799 [Test build #24189 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24189/consoleFull) for PR 3570 at commit [`537baad`](https://github.com/apache/spark/commit/537baad0379644537f21385f0cc1150b4af0b237). * This patch **fails MiMa tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: Clear local copies of accumulators as soon as ...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3570#issuecomment-65733162 Hi @nkronenfeld, Thanks for this PR. These sorts of resource leakage issues can be tricky to debug, so thanks for spotting this. It would be great to file a dedicated JIRA for the memory-leak reported here. I find the current `Accumulators` object code to be difficult to understand, so I'd be open to a larger refactoring / reorganization of that code. To handle cleanup after thread death, I think we can make `localAccums` into a thread-local `Map`, since thread-locals should get GC'd when threads die. We still need to worry about threads staying in thread pools and being re-used, though, so we should ensure that the thread-local is cleared after each task. I think we should be able to do this by moving the `Accumulators.clear()` in `Executor.scala` from the start of the task into the `finally` block that handles task cleanup. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: Clear local copies of accumulators as soon as ...
GitHub user nkronenfeld opened a pull request: https://github.com/apache/spark/pull/3570 Clear local copies of accumulators as soon as we're done with them I suspect this PR is not really quite ready for merging, but this seemed the best way of getting a few comments on it. A little explanation: I'm working with some rather large accumulators. I've been asked why not just use a reducer, and the answer is, that when this works, it works about 10x as fast. And so far, when it doesn't work, it seems improvable. This is my first attempt at such improvement. As far as I can tell, on each worker, when each time a task is run, the following happens: 1. The global accumulator object is cleared on that thread 2. The current individual accumulators are registered on that thread (as copies) 3. When the task completes, the accumulator values are collected and returned. Note that when this is done, nothing is cleared. This means, if one is using large accumulators, those accumulators just use up memory uselessly until the next task is run on that thread. And if a thread somehow dies, that memory is used up and can't be retrieved. And, as far as I can tell, that local thread value will never be used again. All I've done so far is to clear local values for that thread as it returns them. I would like to do 2 more things: 1. Reduce this to a two-step process: register, and return, both clearing values 2. Put in something to make sure values are cleared if a task dies (or a thread too, if that is possible) This seems so simple so far, though, I was hoping that I could get some confirmation that I understood things correctly before going on. I don't have a PR or JIRA issue on this so far - the issues I see are a bit hard to pin down. When I run my application without this change, it works, but is very fragile - some tasks take 6 seconds, some 4 minutes. When I implement this, everything seems to run smoothly, and I stop getting random, large task times. This is somewhat related to JIRA issue SPARK-664 in that they both deal with accumulator performance, but it doesn't really address that directly. You can merge this pull request into a Git repository by running: $ git pull https://github.com/nkronenfeld/spark-1 Accumulator-Improvements Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3570.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3570 commit 39a82f2b3ccf68a9a4d1abb0561b5680278a2610 Author: Nathan Kronenfeld nkronenf...@oculusinfo.com Date: 2014-12-03T04:11:11Z Clear local copies of accumulators as soon as we're done with them --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: Clear local copies of accumulators as soon as ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3570#issuecomment-65354014 [Test build #24074 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24074/consoleFull) for PR 3570 at commit [`39a82f2`](https://github.com/apache/spark/commit/39a82f2b3ccf68a9a4d1abb0561b5680278a2610). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: Clear local copies of accumulators as soon as ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3570#issuecomment-65359286 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24074/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: Clear local copies of accumulators as soon as ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3570#issuecomment-65359283 [Test build #24074 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24074/consoleFull) for PR 3570 at commit [`39a82f2`](https://github.com/apache/spark/commit/39a82f2b3ccf68a9a4d1abb0561b5680278a2610). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org