[GitHub] spark pull request: [SPARK-4847][SQL]Fix extraStrategies cannot t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3698#issuecomment-66960987 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24452/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4847][SQL]Fix extraStrategies cannot t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3698#issuecomment-66960986 [Test build #24452 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24452/consoleFull) for PR 3698 at commit [`4741130`](https://github.com/apache/spark/commit/4741130819ca02ad6a426a3aeb0f6ef1f972f36e). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [Minor][Core] fix comments in MapOutputTracker
GitHub user scwf opened a pull request: https://github.com/apache/spark/pull/3700 [Minor][Core] fix comments in MapOutputTracker Using driver and executor in the comments of ```MapOutputTracker``` is clearer. You can merge this pull request into a Git repository by running: $ git pull https://github.com/scwf/spark commentFix Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3700.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3700 commit aa68524eb4e31c86873ff7877f00380c3a33a8c9 Author: wangfei wangf...@huawei.com Date: 2014-12-15T07:58:52Z master and worker should be driver and executor
[GitHub] spark pull request: [Minor][Core] fix comments in MapOutputTracker
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3700#issuecomment-66961393 [Test build #24454 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24454/consoleFull) for PR 3700 at commit [`aa68524`](https://github.com/apache/spark/commit/aa68524eb4e31c86873ff7877f00380c3a33a8c9). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4848] Stand-alone cluster: Allow differ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3699#issuecomment-66961638 [Test build #24453 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24453/consoleFull) for PR 3699 at commit [`479c31c`](https://github.com/apache/spark/commit/479c31c9d3e580879d76146e2a687b5235c87b33). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4848] Stand-alone cluster: Allow differ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3699#issuecomment-66961646 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24453/ Test FAILed.
[GitHub] spark pull request: [Minor][Core] fix comments in MapOutputTracker
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3700#issuecomment-66967398 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24454/ Test FAILed.
[GitHub] spark pull request: [Minor][Core] fix comments in MapOutputTracker
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3700#issuecomment-66967388 [Test build #24454 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24454/consoleFull) for PR 3700 at commit [`aa68524`](https://github.com/apache/spark/commit/aa68524eb4e31c86873ff7877f00380c3a33a8c9). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [Minor][Core] fix comments in MapOutputTracker
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3700#issuecomment-66979896 [Test build #24455 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24455/consoleFull) for PR 3700 at commit [`61b4e61`](https://github.com/apache/spark/commit/61b4e61938c6ff2bea3bbebdcbb5074b0f7d766d). * This patch merges cleanly.
[GitHub] spark pull request: [Minor][Core] fix comments in MapOutputTracker
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3700#issuecomment-66983891 [Test build #24456 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24456/consoleFull) for PR 3700 at commit [`b4e3f95`](https://github.com/apache/spark/commit/b4e3f95e5f900b2a2e5e3c20c4f8310a5d820439). * This patch merges cleanly.
[GitHub] spark pull request: [Minor][Core] fix comments in MapOutputTracker
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3700#issuecomment-66984839 [Test build #24457 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24457/consoleFull) for PR 3700 at commit [`d8a857a`](https://github.com/apache/spark/commit/d8a857af14c5ccaa5a283b84bb0e210f3b71c41e). * This patch merges cleanly.
[GitHub] spark pull request: [Minor][Core] fix comments in MapOutputTracker
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3700#issuecomment-66986162 [Test build #24455 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24455/consoleFull) for PR 3700 at commit [`61b4e61`](https://github.com/apache/spark/commit/61b4e61938c6ff2bea3bbebdcbb5074b0f7d766d). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [Minor][Core] fix comments in MapOutputTracker
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3700#issuecomment-66986170 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24455/ Test FAILed.
[GitHub] spark pull request: [Hot Fix] Fix WriteAheadLogBackedBlockRDDSuite
GitHub user scwf opened a pull request: https://github.com/apache/spark/pull/3701 [Hot Fix] Fix WriteAheadLogBackedBlockRDDSuite WriteAheadLogBackedBlockRDDSuite failed because of the randomly generated file name: ``` /tmp/1418645532799-0/?? [info] - Read data available only in write ahead log, and test storing in block manager *** FAILED *** (2 milliseconds) ``` This patch uses another way to produce the random file name. You can merge this pull request into a Git repository by running: $ git pull https://github.com/scwf/spark patch-10 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3701.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3701 commit b7f89551280e9db315d6dcd8ab13f4bc364032fe Author: wangfei wangf...@huawei.com Date: 2014-12-15T12:27:32Z fix WriteAheadLogBackedBlockRDDSuite
[GitHub] spark pull request: [Hot Fix] Fix WriteAheadLogBackedBlockRDDSuite
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/3701#issuecomment-66987767 link to https://github.com/apache/spark/pull/3687
[GitHub] spark pull request: [Minor][Core] fix comments in MapOutputTracker
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3700#issuecomment-66990253 [Test build #24456 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24456/consoleFull) for PR 3700 at commit [`b4e3f95`](https://github.com/apache/spark/commit/b4e3f95e5f900b2a2e5e3c20c4f8310a5d820439). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [Minor][Core] fix comments in MapOutputTracker
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3700#issuecomment-66990258 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24456/ Test FAILed.
[GitHub] spark pull request: [Minor][Core] fix comments in MapOutputTracker
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3700#issuecomment-66993288 [Test build #24457 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24457/consoleFull) for PR 3700 at commit [`d8a857a`](https://github.com/apache/spark/commit/d8a857af14c5ccaa5a283b84bb0e210f3b71c41e). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [Minor][Core] fix comments in MapOutputTracker
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3700#issuecomment-66993298 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24457/ Test PASSed.
[GitHub] spark pull request: [Minor][Core] fix comments in MapOutputTracker
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/3700#issuecomment-66995231 The test failure is fixed in #3701, so I have reverted the debug changes.
[GitHub] spark pull request: [Hot Fix] Fix WriteAheadLogBackedBlockRDDSuite
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/3701#issuecomment-66995238 This is for SPARK-4826, right? Can you explain the fix? Is it that reserved characters like `/` or non-ASCII chars were part of the file name?
[GitHub] spark pull request: [Minor][Core] fix comments in MapOutputTracker
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3700#issuecomment-66995744 [Test build #24459 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24459/consoleFull) for PR 3700 at commit [`aa68524`](https://github.com/apache/spark/commit/aa68524eb4e31c86873ff7877f00380c3a33a8c9). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4826][Streaming] - Create unique file n...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/3695#issuecomment-66995767 The fix in https://github.com/apache/spark/pull/3701 looks simpler, but it's fixing a different supposed cause. Is it really that the random method produces the same string many times, because of a race condition or something? It seems like synchronization would fix that, then. The alternate method in the other PR may do the same.
[GitHub] spark pull request: SPARK-4843 [YARN] Squash ExecutorRunnableUtil ...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/3696#issuecomment-66995976 @ksakellis This test has been failing for lots of recent builds and is not related to this PR. After it's resolved you can ask Jenkins to test again.
[GitHub] spark pull request: [Hot Fix] Fix WriteAheadLogBackedBlockRDDSuite
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/3701#issuecomment-66996256 Yes, it is. I had not noticed that PR. Based on my test, this failure is caused by ```Random.nextString(10)```: for an unknown reason, it produces the same string each time during the test (a debug link on Jenkins: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24456/consoleFull).
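The symptom discussed in this thread (a generated file name that shows up as `??` in the log) can be illustrated with a small sketch. It is written in Python rather than Scala, and it rests on an assumption about the cause that the thread does not confirm: names drawn from a wide character range may collapse to the same string once they pass through an ASCII-only encoding. The helpers `full_range_name`, `as_ascii`, and `ascii_name` are hypothetical names for this illustration, and the `uuid`-based alternative is not necessarily the change made in #3701.

```python
import random
import uuid


def full_range_name(rng: random.Random, length: int = 10) -> str:
    """Random string drawn from a non-ASCII range (illustrative only)."""
    # Code points 0x4E00..0x9FFF are CJK ideographs: valid, but non-ASCII.
    return "".join(chr(rng.randint(0x4E00, 0x9FFF)) for _ in range(length))


def as_ascii(name: str) -> str:
    """Simulate a lossy ASCII-only encoding: unmappable chars become '?'."""
    return name.encode("ascii", errors="replace").decode("ascii")


def ascii_name() -> str:
    """Hypothetical alternative: a name made only of hex digits."""
    return uuid.uuid4().hex


rng = random.Random(0)
a, b = full_range_name(rng), full_range_name(rng)

# Two distinct names become indistinguishable after the lossy encoding.
collapsed_a, collapsed_b = as_ascii(a), as_ascii(b)

# An ASCII-only name survives the same encoding unchanged.
safe = ascii_name()
```

Under this assumption the generator itself would not be repeating values; the distinct names would only look identical after encoding. Whether that is what happened on the Jenkins workers is not established here.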
[GitHub] spark pull request: [Hot Fix][Streaming] Fix WriteAheadLogBackedBl...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3701#issuecomment-66996814 [Test build #24458 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24458/consoleFull) for PR 3701 at commit [`b7f8955`](https://github.com/apache/spark/commit/b7f89551280e9db315d6dcd8ab13f4bc364032fe). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [Hot Fix][Streaming] Fix WriteAheadLogBackedBl...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3701#issuecomment-66996823 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24458/ Test PASSed.
[GitHub] spark pull request: [SPARK-4848] Stand-alone cluster: Allow differ...
Github user nkronenfeld commented on the pull request: https://github.com/apache/spark/pull/3699#issuecomment-66998011 Jenkins, test this please
[GitHub] spark pull request: [SPARK-4848] Stand-alone cluster: Allow differ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3699#issuecomment-66998218 [Test build #24460 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24460/consoleFull) for PR 3699 at commit [`479c31c`](https://github.com/apache/spark/commit/479c31c9d3e580879d76146e2a687b5235c87b33). * This patch merges cleanly.
[GitHub] spark pull request: [MLLIB] SPARK-4846: When the vocabulary size i...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/3697#issuecomment-66999069 Sorry, I didn't quite explain my thought. I don't know if this prevents the OOM. I don't see how making them lazy prevents the serialization. I could be wrong, especially if you've confirmed this helps, but maybe it deserves some explanation.
[GitHub] spark pull request: [WIP][SPARK-2883][SQL]initial support ORC in s...
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/2576#issuecomment-67003139 @marmbrus, I am fixing the test failure and refactoring the code based on the datasource API. One question: should I keep the sink part (write interface) here, or just provide the ability to read ORC files based on the datasource API?
[GitHub] spark pull request: [SPARK-1507][YARN]specify num of cores for AM
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/3686#issuecomment-67003632 @sryza I understand what you are saying, but I don't see anywhere in this pull request that the yarn-client AM is referred to as the driver; the conf in the current code is spark.yarn.am.cores. Am I missing something? I didn't have time to do a full review last week, but I was leaning towards what @vanzin mentioned: reusing the driver-cores option to specify cores in yarn-cluster mode, which is why I mentioned that option in my original post also. That way it matches other things like driver-memory, etc. Sorry for any confusion on my comment; it wasn't intended to be a full review, just an answer to the question from @scwf. The current conf specified (spark.yarn.am.cores) would then work in client mode. Thoughts or objections to that?
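The proposal in the comment above can be sketched as a small lookup. This is a Python illustration of the idea only, not code from the pull request: the function name `am_cores`, the fallback of 1 core, and the mapping of the driver-cores option to the `spark.driver.cores` key are all assumptions made for the example.

```python
def am_cores(deploy_mode: str, conf: dict, default: int = 1) -> int:
    """Pick the number of cores for the YARN ApplicationMaster.

    Assumed mapping (hypothetical, for illustration):
      - yarn-cluster mode: the AM hosts the driver, so reuse the driver setting.
      - yarn-client mode: the AM is separate, so use the AM-specific setting.
    """
    if deploy_mode == "cluster":
        key = "spark.driver.cores"
    elif deploy_mode == "client":
        key = "spark.yarn.am.cores"
    else:
        raise ValueError(f"unknown deploy mode: {deploy_mode!r}")
    # Fall back to an assumed default when the key is not set.
    return int(conf.get(key, default))


conf = {"spark.driver.cores": "4", "spark.yarn.am.cores": "2"}
cluster_cores = am_cores("cluster", conf)
client_cores = am_cores("client", conf)
unset_cores = am_cores("client", {})
```

Keeping the two keys separate mirrors the point made in the comment: in cluster mode the setting lines up with driver-memory and similar driver options, while client mode keeps its own AM-specific conf.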
[GitHub] spark pull request: [Minor][Core] fix comments in MapOutputTracker
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3700#issuecomment-67004340 [Test build #24459 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24459/consoleFull) for PR 3700 at commit [`aa68524`](https://github.com/apache/spark/commit/aa68524eb4e31c86873ff7877f00380c3a33a8c9). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [Minor][Core] fix comments in MapOutputTracker
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3700#issuecomment-67004351 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24459/ Test FAILed.
[GitHub] spark pull request: [SPARK-4848] Stand-alone cluster: Allow differ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3699#issuecomment-67007668 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24460/ Test FAILed.
[GitHub] spark pull request: [SPARK-4848] Stand-alone cluster: Allow differ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3699#issuecomment-67007651 [Test build #24460 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24460/consoleFull) for PR 3699 at commit [`479c31c`](https://github.com/apache/spark/commit/479c31c9d3e580879d76146e2a687b5235c87b33). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [MLLIB] SPARK-4846: When the vocabulary size i...
Github user jinntrance commented on the pull request: https://github.com/apache/spark/pull/3697#issuecomment-67009217 I've commented here https://issues.apache.org/jira/browse/SPARK-4846 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-4547 [MLLIB] [WIP] OOM when making bins ...
GitHub user srowen opened a pull request: https://github.com/apache/spark/pull/3702 SPARK-4547 [MLLIB] [WIP] OOM when making bins in BinaryClassificationMetrics Now that I've implemented the basics here, I'm somehow less convinced there is a need for this change. Callers can downsample before or after. Really the OOM is not in the ROC curve code, but in code that might `collect()` it for local analysis. Still, it might be useful to downsample, since the ROC curve probably never needs millions of points. This is a first pass. Since the `(score, label)` pairs are already grouped and sorted, I think it's sufficient to just take every Nth such pair in order to downsample by a factor of N? This is just like retaining every Nth point on the curve, which I think is the goal. All of the data is still used to build the curve, of course. What do you think about the API, and usefulness? You can merge this pull request into a Git repository by running: $ git pull https://github.com/srowen/spark SPARK-4547 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3702.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3702 commit a1c3ba3b87bb779149febc1146d51c4b90b55011 Author: Sean Owen so...@cloudera.com Date: 2014-12-15T16:14:59Z Add downsamplingFactor to BinaryClassificationMetrics
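The "take every Nth pair" idea described above can be sketched on a plain local collection. This is only an illustration, not the code in the PR: the object and method names are made up here, and the real change would operate on an RDD inside `BinaryClassificationMetrics` rather than on a `Seq`.

```scala
object DownsampleSketch {
  // Keep the elements at indices 0, n, 2n, ... of an already sorted sequence.
  // `everyNth` is a hypothetical helper name, not part of MLlib.
  def everyNth[A](sorted: Seq[A], n: Int): Seq[A] = {
    require(n >= 1, "downsampling factor must be at least 1")
    sorted.zipWithIndex.collect { case (x, i) if i % n == 0 => x }
  }

  def main(args: Array[String]): Unit = {
    // (score, label) pairs, assumed to be sorted by descending score already.
    val scoreAndLabels =
      Seq((0.9, 1.0), (0.8, 1.0), (0.7, 0.0), (0.6, 1.0), (0.4, 0.0), (0.1, 0.0))
    // Downsample by a factor of 3.
    println(everyNth(scoreAndLabels, 3))
  }
}
```

A factor of 1 leaves the sequence unchanged, which would be a natural default if downsampling is meant to be opt-in.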
[GitHub] spark pull request: SPARK-4547 [MLLIB] [WIP] OOM when making bins ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3702#issuecomment-67019824 [Test build #24461 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24461/consoleFull) for PR 3702 at commit [`a1c3ba3`](https://github.com/apache/spark/commit/a1c3ba3b87bb779149febc1146d51c4b90b55011). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4826][Streaming] - Create unique file n...
Github user harishreedharan commented on the pull request: https://github.com/apache/spark/pull/3695#issuecomment-67028163 I can't be sure which one is causing the issue, but this one should take care of both. Since the file names are being updated atomically, each test will get a unique file without random causing issues again.
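One way to get unique file names from an atomic update, as described above, is a shared counter. The sketch below is an assumption about the general approach, not the code in PR 3695; `UniqueFileNames` and `next` are hypothetical names.

```scala
import java.util.concurrent.atomic.AtomicInteger

object UniqueFileNames {
  // A single shared counter; getAndIncrement is atomic, so two callers never
  // observe the same value, even when tests run in parallel.
  private val counter = new AtomicInteger(0)

  // Hypothetical helper: builds a name such as "log-0", "log-1", ...
  def next(prefix: String): String = s"$prefix-${counter.getAndIncrement()}"
}
```

Because the name no longer depends on a random number generator, it stays unique even if the generator is reseeded between tests.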
[GitHub] spark pull request: SPARK-4547 [MLLIB] [WIP] OOM when making bins ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3702#issuecomment-67030795 [Test build #24461 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24461/consoleFull) for PR 3702 at commit [`a1c3ba3`](https://github.com/apache/spark/commit/a1c3ba3b87bb779149febc1146d51c4b90b55011). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class BinaryClassificationMetrics(` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-4547 [MLLIB] [WIP] OOM when making bins ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3702#issuecomment-67030807 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24461/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: Add a Note on jsonFile having separate JSON ob...
Github user petervandenabeele commented on the pull request: https://github.com/apache/spark/pull/3517#issuecomment-67032682 I committed a revert that limits the squashed diff to a small addition of a Note for the 3 tabs of Scala, Java and Python. If anything more needs to happen, I'm glad to look into it. There is no rebase required, is there? I could do it in a separate PR if useful.
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user zhzhan commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-67035255 retest this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-1442][SQL][WIP] Initial window function...
GitHub user liancheng opened a pull request: https://github.com/apache/spark/pull/3703 [SPARK-1442][SQL][WIP] Initial window function implementation (refactored from #2953) This WIP PR is refactored from PR #2953. Please refer to the original PR description for features implemented and not implemented in this PR. The original PR was a huge one, and commenting on each issue could be very time consuming. After offline discussions with @guowei2, I decided to work on a refactoring branch to fix most minor issues first and then start discussion based on this refactored version. Major issues left in this PR are: 1. Window spec is added to aggregation functions with a `var`, which breaks query plan immutability. 2. When used with window specs, common aggregation functions like `COUNT`, `SUM`, `AVG` etc. are translated into Hive aggregation functions rather than Spark SQL builtin implementations. 3. Execution code (`execution.WindowFunction`) can be further simplified. You can merge this pull request into a Git repository by running: $ git pull https://github.com/liancheng/spark window-refact Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3703.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3703 commit 9897413564ff27f0a311cc2cef6322422f3807ab Author: guowei2 guow...@asiainfo.com Date: 2014-10-24T09:55:47Z window function commit 7d7a703d5e7bf37e00d074cb8c04e2150f8fbeb4 Author: guowei2 guow...@asiainfo.com Date: 2014-10-27T05:29:35Z window function commit 1999e07a23c18808738e4e3b14b64c1db108eda2 Author: guowei2 guow...@asiainfo.com Date: 2014-10-27T06:03:17Z window function commit 76bfd4b8b1137426b0dbcde5d56cefb0c98cfab5 Author: guowei2 guow...@asiainfo.com Date: 2014-10-27T14:16:22Z window function commit 88c5789d9f6989d0fedcbdd129de097152e2d8eb Author: guowei2 guow...@asiainfo.com Date: 2014-10-28T04:00:57Z window function
commit 828199a48c619d03b4ec524dbdfe9c043baa5e14 Author: guowei2 guow...@asiainfo.com Date: 2014-10-29T07:49:01Z fix problems after rebase commit 03bd77d5533f76484d7589e0296283b58f2d0688 Author: guowei2 guow...@asiainfo.com Date: 2014-10-30T10:12:42Z change test suite and golden files commit d06baeba2dc859f860c8fd43c292275837b3e0e6 Author: guowei2 guow...@asiainfo.com Date: 2014-11-05T03:01:33Z add constant objectinspector support for udafs, such as last_value(col, false) commit 173016c08770fd2aa6ee15c3f194c2282bd46e68 Author: guowei2 guow...@asiainfo.com Date: 2014-11-26T06:58:47Z fix window function to support multi-different window partitions commit ab21933e64b3ee7afdcbb622bec935a34fe0785c Author: guowei2 guow...@asiainfo.com Date: 2014-11-27T08:40:26Z fix DecimalType bug after rebase commit 66ef7a6d449f6ec1e644d2e73118d8be1cb56cde Author: guowei2 guow...@asiainfo.com Date: 2014-11-28T09:51:16Z fix bug about attribute reference commit dc87d8d08c33644e61f6355ed07baf720b0e9ef9 Author: Cheng Lian l...@databricks.com Date: 2014-12-04T06:34:32Z WIP: refactoring window functions support commit 2da61753590fe00ecf46219f387d70d48c6dd32a Author: Cheng Lian l...@databricks.com Date: 2014-12-15T06:42:57Z Removed trailing spaces from query string in HiveWindowFunctionSuite commit 922a8b9bfe0278577378c3cd9fc13cb9998b6e0f Author: Cheng Lian l...@databricks.com Date: 2014-12-15T17:18:42Z Fixes COUNT with window spec --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-1442][SQL][WIP] Initial window function...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3703#issuecomment-67035848 [Test build #24462 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24462/consoleFull) for PR 3703 at commit [`922a8b9`](https://github.com/apache/spark/commit/922a8b9bfe0278577378c3cd9fc13cb9998b6e0f). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-67035890 [Test build #24463 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24463/consoleFull) for PR 3409 at commit [`daec3d0`](https://github.com/apache/spark/commit/daec3d01c937d80961b0f9eec4e0ad96539bd421). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-1442][SQL][WIP] Initial window function...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/3703#issuecomment-67035927 Comments from the [review on Reviewable.io](https://reviewable.io:443/reviews/apache/spark/3703) --- Note that instead of whitelisting window function test cases in `HiveCompatibilitySuite`, a new `HiveWindowFunctionSuite` was added. This is because the current Spark SQL HiveQl parser doesn't handle comments, and the window function test input files that come with Hive contain comment lines.
[GitHub] spark pull request: [SPARK-4826][Streaming] - Create unique file n...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3695#issuecomment-67036493 [...] conflicts reported in SPARK-4826 was likely caused by multiple tests running at the same time and using the same filename. Here's what I find confusing, though: the Jenkins master builds (and branch-1.2 builds) don't have any parallelism within a single build: only one thread of control should be executing in a given instance of `WriteAheadLogBackedBlockRDDSuite`. There might be multiple instances of `WriteAheadLogBackedBlockRDDSuite` executing in different Jenkins builds, but I would expect that the different Jenkins builds would have different values of `Files.createTempDir()` and hence would write to different output locations. Are you saying that within the same Jenkins build / instance of `WriteAheadLogBackedBlockRDDSuite`, we have two separate tests that choose to write to the same subdirectory / file inside of `Files.createTempDir()`? Maybe something is implicitly calling `Random.setSeed()` between test cases, causing the tests to pick the same subdirectory. If that's the case, we might be able to fix things by moving the directory deletion call to an `afterEach()` method or by using `Files.createTempFile` within that temp directory in order to pick the filename.
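The second suggestion above, letting the file system pick the name inside the temp directory, could look roughly like the following. This is a sketch that assumes the JDK's `java.nio.file.Files` rather than the Guava `Files` class the test suite appears to use; the object name, the `freshLogFile` helper and the `wal-` / `.log` affixes are illustrative only.

```scala
import java.nio.file.{Files, Path}

object TempFileSketch {
  // Ask the JDK to create a new, uniquely named file inside `dir`.
  // createTempFile never returns a path that already exists, so the result
  // does not depend on the state of scala.util.Random.
  def freshLogFile(dir: Path): Path = Files.createTempFile(dir, "wal-", ".log")

  def main(args: Array[String]): Unit = {
    val dir = Files.createTempDirectory("wal-suite-")
    val first = freshLogFile(dir)
    val second = freshLogFile(dir)
    println(s"$first and $second are distinct: ${first != second}")
  }
}
```

Unlike a name derived from a seeded random generator, the file is created atomically as part of the call, so two tests cannot end up with the same path.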
[GitHub] spark pull request: SPARK-4843 [YARN] Squash ExecutorRunnableUtil ...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/3696#discussion_r21844424 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ExecutorRunnable.scala --- @@ -50,11 +51,13 @@ class ExecutorRunnable( executorCores: Int, appId: String, securityMgr: SecurityManager) - extends Runnable with ExecutorRunnableUtil with Logging { + extends Runnable with Logging { + + lazy val env = prepareEnvironment var rpc: YarnRPC = YarnRPC.create(conf) var nmClient: NMClient = _ - val sparkConf = spConf + val sparkConf: SparkConf = spConf --- End diff -- Adding the type here is not really necessary (although it makes things more consistent). But I'd suggest just removing this variable altogether and renaming the constructor parameter instead.
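The suggestion above, dropping the alias field and renaming the constructor parameter, can be illustrated with a stripped-down sketch. `Conf`, `RunnableBefore` and `RunnableAfter` are hypothetical stand-ins and not the real `SparkConf` or `ExecutorRunnable`.

```scala
// Hypothetical stand-in for SparkConf, only here to keep the sketch self-contained.
final class Conf(val settings: Map[String, String])

// Before: the parameter is named spConf and copied into a field.
class RunnableBefore(spConf: Conf) {
  val sparkConf: Conf = spConf
  def setting(key: String): Option[String] = sparkConf.settings.get(key)
}

// After: the parameter itself is named sparkConf, so no alias field is needed.
class RunnableAfter(sparkConf: Conf) {
  def setting(key: String): Option[String] = sparkConf.settings.get(key)
}
```

One difference to keep in mind: a plain constructor parameter is private to the class, whereas the `val` in the first version is a public member. If other code reads `sparkConf` from outside, the parameter would need to be declared as `val sparkConf: Conf`.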
[GitHub] spark pull request: SPARK-4843 [YARN] Squash ExecutorRunnableUtil ...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/3696#discussion_r21844488 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ExecutorRunnable.scala --- @@ -110,4 +113,165 @@ class ExecutorRunnable( nmClient.startContainer(container, ctx) } + private def prepareCommand( + masterAddress: String, + slaveId: String, + hostname: String, + executorMemory: Int, + executorCores: Int, + appId: String, + localResources: HashMap[String, LocalResource]): List[String] = { +// Extra options for the JVM +val javaOpts = ListBuffer[String]() + +// Set the environment variable through a command prefix +// to append to the existing value of the variable +var prefixEnv: Option[String] = None + +// Set the JVM memory +val executorMemoryString = executorMemory + m +javaOpts += -Xms + executorMemoryString + -Xmx + executorMemoryString + + +// Set extra Java options for the executor, if defined +sys.props.get(spark.executor.extraJavaOptions).foreach { opts = + javaOpts += opts +} +sys.env.get(SPARK_JAVA_OPTS).foreach { opts = + javaOpts += opts +} +sys.props.get(spark.executor.extraLibraryPath).foreach { p = + prefixEnv = Some(Utils.libraryPathEnvPrefix(Seq(p))) +} + +javaOpts += -Djava.io.tmpdir= + + new Path(Environment.PWD.$(), YarnConfiguration.DEFAULT_CONTAINER_TEMP_DIR) + +// Certain configs need to be passed here because they are needed before the Executor +// registers with the Scheduler and transfers the spark configs. Since the Executor backend +// uses Akka to connect to the scheduler, the akka settings are needed as well as the +// authentication settings. +sparkConf.getAll. + filter { case (k, v) = k.startsWith(spark.auth) || k.startsWith(spark.akka) }. + foreach { case (k, v) = javaOpts += YarnSparkHadoopUtil.escapeForShell(s-D$k=$v) } + +sparkConf.getAkkaConf. 
+ foreach { case (k, v) = javaOpts += YarnSparkHadoopUtil.escapeForShell(s-D$k=$v) } + +// Commenting it out for now - so that people can refer to the properties if required. Remove +// it once cpuset version is pushed out. +// The context is, default gc for server class machines end up using all cores to do gc - hence +// if there are multiple containers in same node, spark gc effects all other containers +// performance (which can also be other spark containers) +// Instead of using this, rely on cpusets by YARN to enforce spark behaves 'properly' in +// multi-tenant environments. Not sure how default java gc behaves if it is limited to subset +// of cores on a node. +/* +else { + // If no java_opts specified, default to using -XX:+CMSIncrementalMode + // It might be possible that other modes/config is being done in + // spark.executor.extraJavaOptions, so we dont want to mess with it. + // In our expts, using (default) throughput collector has severe perf ramnifications in + // multi-tennent machines + // The options are based on + // http://www.oracle.com/technetwork/java/gc-tuning-5-138395.html#0.0.0.%20When%20to%20Use + // %20the%20Concurrent%20Low%20Pause%20Collector|outline + javaOpts += -XX:+UseConcMarkSweepGC + javaOpts += -XX:+CMSIncrementalMode + javaOpts += -XX:+CMSIncrementalPacing + javaOpts += -XX:CMSIncrementalDutyCycleMin=0 + javaOpts += -XX:CMSIncrementalDutyCycle=10 +} +*/ + +// For log4j configuration to reference +javaOpts += (-Dspark.yarn.app.container.log.dir= + ApplicationConstants.LOG_DIR_EXPANSION_VAR) + +val commands = prefixEnv ++ Seq(Environment.JAVA_HOME.$() + /bin/java, + -server, + // Kill if OOM is raised - leverage yarn's failure handling to cause rescheduling. + // Not killing the task leaves various aspects of the executor and (to some extent) the jvm in + // an inconsistent state. + // TODO: If the OOM is not recoverable by rescheduling it on different node, then do + // 'something' to fail job ... 
akin to blacklisting trackers in mapred ? + -XX:OnOutOfMemoryError='kill %p') ++ + javaOpts ++ + Seq(org.apache.spark.executor.CoarseGrainedExecutorBackend, +masterAddress.toString, +slaveId.toString, +hostname.toString, +executorCores.toString, +appId, +1, ApplicationConstants.LOG_DIR_EXPANSION_VAR + /stdout, +
[GitHub] spark pull request: SPARK-4843 [YARN] Squash ExecutorRunnableUtil ...
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/3696#issuecomment-67040004 A couple of minor nits, and I assume this is mostly code motion, so it LGTM. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-1442][SQL][WIP] Initial window function...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/3703#issuecomment-67041844 Comments from the [review on Reviewable.io](https://reviewable.io:443/reviews/apache/spark/3703) --- **[sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregates.scala, line 30 \[r1\]](https://reviewable.io:443/reviews/apache/spark/3703#-JdE5SkiLKM30za1keNC)** ([raw file](https://github.com/apache/spark/blob/922a8b9bfe0278577378c3cd9fc13cb9998b6e0f/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregates.scala#L30)): This can be problematic. Ideally every aggregation function that can be used with a window should have a `windowSpec: Option[WindowSpec]` field which defaults to `None`, and a `withWindowSpec` method that returns a new instance of the aggregation function object itself with a window spec. --- **[sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala, line 874 \[r1\]](https://reviewable.io:443/reviews/apache/spark/3703#-JdE660e137A3Dk7Uqp2)** ([raw file](https://github.com/apache/spark/blob/922a8b9bfe0278577378c3cd9fc13cb9998b6e0f/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala#L874)): The thread-local `windowDefs` map is used to store window definitions (`w1`, `w2` and `w3`) in queries like this: ```sql SELECT p_mfgr, p_name, p_size, SUM(p_size) OVER w1 AS s1, SUM(p_size) OVER w2 AS s2, SUM(p_size) OVER (w3 ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING) AS s3 FROM part WINDOW w1 AS (DISTRIBUTE BY p_mfgr SORT BY p_size RANGE BETWEEN 2 PRECEDING AND 2 FOLLOWING), w2 AS w3, w3 AS (DISTRIBUTE BY p_mfgr SORT BY p_size RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) ``` This map is cleared and refilled in `collectWindowDefs` below, so it doesn't grow indefinitely.
--- **[sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala, line 1060 \[r1\]](https://reviewable.io:443/reviews/apache/spark/3703#-JdEA7B6QXd6N0aDAEMh)** ([raw file](https://github.com/apache/spark/blob/922a8b9bfe0278577378c3cd9fc13cb9998b6e0f/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala#L1060)): All builtin aggregation functions need a similar `case` clause to handle their windowed version. Otherwise they all fall back to Hive UDAF implementations. `COUNT` is picked here because its Hive version `GenericUDAFCount` implements `GenericUDAFResolver2` rather than `AbstractGenericUDAFResolver`, and is not handled by `HiveFunctionRegistry.lookupFunction`.
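The immutable alternative described in the first review comment above (a `windowSpec: Option[WindowSpec]` field defaulting to `None`, plus a `withWindowSpec` method that returns a new instance) might be sketched as follows. The types here are simplified placeholders rather than the Catalyst classes from the PR: `WindowSpec` holds plain column names and `Sum` takes a column name instead of an `Expression`.

```scala
object WindowSpecSketch {
  // Simplified placeholder; the WindowSpec in the PR wraps a WindowPartition
  // and an optional WindowFrame instead.
  final case class WindowSpec(partitionBy: Seq[String], sortBy: Seq[String])

  // Simplified placeholder for an aggregation expression.
  final case class Sum(child: String, windowSpec: Option[WindowSpec] = None) {
    // Returns a modified copy and leaves the receiver untouched, so a plan
    // node that already references `this` never observes a change.
    def withWindowSpec(spec: WindowSpec): Sum = copy(windowSpec = Some(spec))
  }
}
```

Compared with assigning the spec through a `var`, the copy keeps existing plan nodes unchanged, which is the immutability property the PR description lists as currently broken.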
[GitHub] spark pull request: [SPARK-3607] ConnectionManager threads.max con...
Github user ilganeli commented on the pull request: https://github.com/apache/spark/pull/3664#issuecomment-67043577 Hi @andrewor14 is this ready to be merged now that it's passed the tests? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-1037] The name of findTaskFromList fi...
Github user ilganeli commented on the pull request: https://github.com/apache/spark/pull/3665#issuecomment-67043632 Hi @andrewor14 is this ready to be merged now that it's passed the tests? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-67046289 [Test build #24463 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24463/consoleFull) for PR 3409 at commit [`daec3d0`](https://github.com/apache/spark/commit/daec3d01c937d80961b0f9eec4e0ad96539bd421). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-67046304 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24463/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-1442][SQL][WIP] Initial window function...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3703#issuecomment-67047485 [Test build #24462 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24462/consoleFull) for PR 3703 at commit [`922a8b9`](https://github.com/apache/spark/commit/922a8b9bfe0278577378c3cd9fc13cb9998b6e0f). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class WindowSpec(windowPartition: WindowPartition, windowFrame: Option[WindowFrame])` * `case class WindowPartition(partitionBy: Seq[Expression], sortBy: Seq[SortOrder])` * `case class WindowFrame(frameType: FrameType, preceding: Int, following: Int)` * `abstract class AggregateExpression extends Expression with Serializable ` * `case class WindowFunction(` * `case class WindowFunction(` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-1442][SQL][WIP] Initial window function...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3703#issuecomment-67047490 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24462/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4826][Streaming] - Create unique file n...
Github user harishreedharan commented on the pull request: https://github.com/apache/spark/pull/3695#issuecomment-67048449 So, it could also be what #3701 mentions, where the Random class returns the same string on each call, in which case every test after the first one fails. This PR should fix that too, since each test will get a unique file name. Basically, I am trying to fix both cases: (1) Random returns the same string, and (2) tests execute in parallel.
[GitHub] spark pull request: [SPARK-4736][mllib] [random forest] functions ...
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/3583#issuecomment-67049092 @dikejiang Apologies--I think I was not clear. I was recommending that you change this PR to implement predictRaw(), rather than predictWithWeight(). Does that sound reasonable? Since predictRaw gives more info than predictWithWeight, it seems best to only include predictRaw. Thanks!
[GitHub] spark pull request: [SPARK-4826] Fix generation of temp file names...
GitHub user JoshRosen opened a pull request: https://github.com/apache/spark/pull/3704 [SPARK-4826] Fix generation of temp file names in WAL tests This PR is another approach for fixing SPARK-4826, an issue where a bug in how we generate temp. file names was causing spurious test failures in the write ahead log suites. Closes #3695. Closes #3701. You can merge this pull request into a Git repository by running: $ git pull https://github.com/JoshRosen/spark SPARK-4826 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3704.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3704 commit 86c1944fdaf16dc2e3a758bbd6f5a84c2d925535 Author: Josh Rosen joshro...@databricks.com Date: 2014-12-15T18:59:40Z Revert HOTFIX: Disabling failing block manager test This reverts commit 4c0673879b5c504797dafb11607d14b04c1bf47d. commit 93629194b4756229a75914d8d80c10b138ce7500 Author: Josh Rosen joshro...@databricks.com Date: 2014-12-15T19:23:27Z [SPARK-4826] Fix bug in generation of temp file names. in WAL suites.
[GitHub] spark pull request: [SPARK-4826] Fix generation of temp file names...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3704#issuecomment-67050050 [Test build #24464 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24464/consoleFull) for PR 3704 at commit [`9362919`](https://github.com/apache/spark/commit/93629194b4756229a75914d8d80c10b138ce7500). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4826] Fix generation of temp file names...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/3704#discussion_r21849567 --- Diff: streaming/src/test/scala/org/apache/spark/streaming/rdd/WriteAheadLogBackedBlockRDDSuite.scala --- @@ -137,7 +137,12 @@ class WriteAheadLogBackedBlockRDDSuite extends FunSuite with BeforeAndAfterAll { blockIds: Seq[BlockId] ): Seq[WriteAheadLogFileSegment] = { require(blockData.size === blockIds.size) -val writer = new WriteAheadLogWriter(new File(dir, Random.nextString(10)).toString, hadoopConf) +val logFilePath = { + val f = File.createTempFile(wal, null, dir) + assert(f.delete()) --- End diff -- This might look race-prone (deleting a file and hoping that no one else comes along and writes to that path in the meantime), but it should be safe because: 1. Different Jenkins builds will have different temp directories (`dir`). 2. Within a JVM, multiple calls to `createTempFile` will never return the same pathname ([see Javadoc](http://docs.oracle.com/javase/7/docs/api/java/io/File.html#createTempFile(java.lang.String,%20java.lang.String,%20java.io.File))).
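The create-then-delete idea described in the comment above can be sketched on its own as follows. This is a minimal illustration and not the code from the PR: the object name `UniqueLogPath` and the helper `freshPath` are hypothetical, and the `"wal"` prefix is assumed here only because `File.createTempFile` requires a string prefix of at least three characters.

```scala
import java.io.File
import java.nio.file.Files

object UniqueLogPath {
  // Hypothetical helper: returns a path inside `dir` that does not exist
  // when this method returns.
  def freshPath(dir: File): String = {
    // createTempFile creates an empty placeholder file with a name that is
    // unique for this JVM invocation (per its Javadoc).
    val f = File.createTempFile("wal", null, dir)
    // Remove the placeholder so a writer can create the file itself.
    require(f.delete(), s"could not delete placeholder $f")
    f.toString
  }

  def main(args: Array[String]): Unit = {
    val dir = Files.createTempDirectory("wal-test").toFile
    val a = freshPath(dir)
    val b = freshPath(dir)
    println(s"first:  $a")
    println(s"second: $b")
    dir.delete() // the directory is empty again, so a plain delete is enough
  }
}
```

The sketch relies on the same two assumptions listed in the comment: nothing else writes into `dir`, and all calls happen within one JVM.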
[GitHub] spark pull request: [SPARK-4826][Streaming] - Create unique file n...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3695#issuecomment-67050346 I've opened a new PR with what I think is a more direct approach to fixing this issue: #3704. Please take a look and let me know what you think.
[GitHub] spark pull request: [Hot Fix][Streaming] Fix WriteAheadLogBackedBl...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3701#issuecomment-67050288 I've opened a new PR with what I think is a more direct approach to fixing this issue: #3704. Please take a look and let me know what you think.
[GitHub] spark pull request: [SPARK-4826] Fix generation of temp file names...
Github user harishreedharan commented on the pull request: https://github.com/apache/spark/pull/3704#issuecomment-67050710 This looks good to me, though the code does not make it obvious why this approach was chosen (of course you can figure it out in this context, but imagine reading this code a year later). I think the other two PRs take slightly simpler approaches by ensuring unique names on file creation.
[GitHub] spark pull request: [SPARK-4826] Fix generation of temp file names...
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/3704#issuecomment-67051178 I buy your explanation, given the javadoc, so this LGTM. But I think a cleaner approach that doesn't require reasoning like that to convince people would be to just use a different temp dir per test (i.e. use `BeforeAndAfter` and create the dir in the `before` block).
[GitHub] spark pull request: [SPARK-4826][Streaming] - Create unique file n...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3695#issuecomment-67051651 [Test build #24465 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24465/consoleFull) for PR 3695 at commit [`7d0044b`](https://github.com/apache/spark/commit/7d0044bf81649f84b29bb3cf56df07b5f2062561). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4826] Fix generation of temp file names...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3704#issuecomment-67052062 The root issue is that this code is trying to return a unique file system path that meets two conditions: - The path does not correspond to an existing file. - Files / directories created at that path will be cleaned up by the test suite. I think this was the intent expressed by the original `Random.nextString(10)` code. One approach would be to have the _directory_ be random and per-test and to just use a fixed filename within that directory. This precludes parallel execution of the tests within a single copy of the write-ahead log suites, but I don't think we're pursuing that type of parallelism right now so I don't think that will be a huge deal.
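The "random per-test directory, fixed filename" approach from the comment above could look roughly like the sketch below. It deliberately avoids ScalaTest so that it runs standalone: the hypothetical `withFreshDir` helper stands in for a `beforeEach`/`afterEach` pair, and the `"wal-suite"` prefix and `"testFile"` name are illustrative assumptions rather than values taken from the PR.

```scala
import java.io.File
import java.nio.file.Files

object PerTestDir {
  // Hypothetical stand-in for beforeEach/afterEach: creates a fresh
  // directory, runs the test body, then removes the fixed-name file and
  // the directory itself.
  def withFreshDir[A](body: File => A): A = {
    val dir = Files.createTempDirectory("wal-suite").toFile
    try body(dir)
    finally {
      new File(dir, "testFile").delete() // no-op if the test never created it
      dir.delete()
    }
  }

  def main(args: Array[String]): Unit = {
    // Two "tests" use the same fixed filename without colliding, because
    // each one runs in its own directory.
    val first = withFreshDir { dir =>
      val log = new File(dir, "testFile")
      log.createNewFile()
      log.getAbsolutePath
    }
    val second = withFreshDir { dir =>
      new File(dir, "testFile").getAbsolutePath
    }
    println(first)
    println(second)
  }
}
```

Both conditions named in the comment hold by construction here: the fixed path cannot exist when a test starts because its directory was just created, and cleanup is tied to the directory rather than to a randomly named file.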
[GitHub] spark pull request: [SPARK-4826] Fix generation of temp file names...
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/3704#issuecomment-67052541 "This precludes parallel execution of the tests within a single copy of the write-ahead log suites": I assume that frameworks handle that automatically (e.g. by having multiple instances of the test class), otherwise you could never parallelize tests that use `before` initializers. I'm pretty sure that works as intended at least with JUnit, but I'm not super familiar with scalatest internals.
[GitHub] spark pull request: [SPARK-4826] Fix generation of temp file names...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3704#issuecomment-67054165 Alright, I've simplified things to move the temp. dir creation to `beforeEach` instead of `beforeAll` and to use a fixed filename within that directory.
[GitHub] spark pull request: [SPARK-4826] Fix generation of temp file names...
Github user harishreedharan commented on the pull request: https://github.com/apache/spark/pull/3704#issuecomment-67054336 +1. The latest changes look good.
[GitHub] spark pull request: [SPARK-4826][Streaming] - Create unique file n...
Github user harishreedharan commented on the pull request: https://github.com/apache/spark/pull/3695#issuecomment-67054435 Closing this as #3704 takes care of this issue.
[GitHub] spark pull request: [SPARK-4826][Streaming] - Create unique file n...
Github user harishreedharan closed the pull request at: https://github.com/apache/spark/pull/3695
[GitHub] spark pull request: [SPARK-4826] Fix generation of temp file names...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/3704#discussion_r21851601 --- Diff: streaming/src/test/scala/org/apache/spark/streaming/rdd/WriteAheadLogBackedBlockRDDSuite.scala --- @@ -38,36 +38,42 @@ class WriteAheadLogBackedBlockRDDSuite extends FunSuite with BeforeAndAfterAll { var blockManager: BlockManager = null var dir: File = null + override def beforeEach(): Unit = { +dir = Files.createTempDir() + } + + override def afterEach(): Unit = { +dir.delete() --- End diff -- This doesn't work if the dir is not empty. You could use `Utils.createTempDir()` and, optionally, `Utils.deleteRecursively()` (since `createTempDir` already takes care of that for you).
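The point about `dir.delete()` can be checked with a small standalone sketch. The `deleteRecursively` helper below is a hypothetical plain-JDK stand-in written for illustration; it is not Spark's `Utils.deleteRecursively`, and it makes no attempt to treat symbolic links specially.

```scala
import java.io.File
import java.nio.file.Files

object RecursiveDelete {
  // Hypothetical helper: deletes the children first, then the entry itself.
  def deleteRecursively(f: File): Boolean = {
    if (f.isDirectory) {
      // listFiles returns null on an I/O error, so guard against that.
      Option(f.listFiles).getOrElse(Array.empty[File]).foreach(deleteRecursively)
    }
    f.delete()
  }

  def main(args: Array[String]): Unit = {
    val dir = Files.createTempDirectory("wal-cleanup").toFile
    new File(dir, "testFile").createNewFile()

    // A plain delete() on a directory that still contains a file is refused.
    println(s"plain delete succeeded: ${dir.delete()}")
    // Removing the contents first lets the directory go away.
    println(s"recursive delete succeeded: ${deleteRecursively(dir)}")
    println(s"directory still exists: ${dir.exists}")
  }
}
```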
[GitHub] spark pull request: [SPARK-4826] Fix generation of temp file names...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/3704#discussion_r21851620 --- Diff: streaming/src/test/scala/org/apache/spark/streaming/util/WriteAheadLogSuite.scala --- @@ -44,7 +43,7 @@ class WriteAheadLogSuite extends FunSuite with BeforeAndAfter { before { tempDir = Files.createTempDir() testDir = tempDir.toString -testFile = new File(tempDir, Random.nextString(10)).toString +testFile = new File(tempDir, testFile).toString --- End diff -- This is kind of an unrelated change, but I wanted to remove this `Random.nextString()` call since it seemed confusing and didn't seem to serve any obvious purpose, since the `tempDir` is re-created before each test anyways.
[GitHub] spark pull request: [SPARK-4826] Fix generation of temp file names...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3704#issuecomment-67054841 [Test build #24466 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24466/consoleFull) for PR 3704 at commit [`a693ddb`](https://github.com/apache/spark/commit/a693ddb1f8fb796337f1aee3c81d3fb7537888a1). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4826] Fix generation of temp file names...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/3704#discussion_r21852089 --- Diff: streaming/src/test/scala/org/apache/spark/streaming/rdd/WriteAheadLogBackedBlockRDDSuite.scala --- @@ -38,36 +38,42 @@ class WriteAheadLogBackedBlockRDDSuite extends FunSuite with BeforeAndAfterAll { var blockManager: BlockManager = null var dir: File = null + override def beforeEach(): Unit = { +dir = Files.createTempDir() + } + + override def afterEach(): Unit = { +dir.delete() --- End diff -- Good catch; I've fixed this.
[GitHub] spark pull request: [SPARK-4826] Fix generation of temp file names...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3704#issuecomment-67055659 [Test build #24467 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24467/consoleFull) for PR 3704 at commit [`f2307f5`](https://github.com/apache/spark/commit/f2307f55134cb14beac42c84c330304926e8d5d6). * This patch merges cleanly.
[GitHub] spark pull request: [WIP][SPARK-2883][SQL]initial support ORC in s...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2576#issuecomment-67059719 We are adding support for writing data in the next version of the API, so it is probably better to wait until that is available.
[GitHub] spark pull request: Support for Mesos DockerInfo
Github user tnachen commented on the pull request: https://github.com/apache/spark/pull/3074#issuecomment-67060666 @hellertime, do you think you can address the style and also add a test?
[GitHub] spark pull request: Support for Mesos DockerInfo
Github user hellertime commented on the pull request: https://github.com/apache/spark/pull/3074#issuecomment-67060975 Just pushed a style fix. It addresses the two points @ash211 raised. I'll look into designing a test for this. I'm thinking I'll test that the protobuf has the expected values in its fields after a call to `withDockerInfo`.
[GitHub] spark pull request: [SPARK-4688] Have a single shared network time...
Github user varunsaxena commented on the pull request: https://github.com/apache/spark/pull/3562#issuecomment-67061747 @rxin, any conclusion on this?
[GitHub] spark pull request: [SPARK-4826] Fix generation of temp file names...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3704#issuecomment-67063097 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24464/ Test PASSed.
[GitHub] spark pull request: [SPARK-4826] Fix generation of temp file names...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3704#issuecomment-67063087 [Test build #24464 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24464/consoleFull) for PR 3704 at commit [`9362919`](https://github.com/apache/spark/commit/93629194b4756229a75914d8d80c10b138ce7500). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4826][Streaming] - Create unique file n...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3695#issuecomment-67064878 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24465/ Test PASSed.
[GitHub] spark pull request: [SPARK-4826][Streaming] - Create unique file n...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3695#issuecomment-67064868 [Test build #24465 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24465/consoleFull) for PR 3695 at commit [`7d0044b`](https://github.com/apache/spark/commit/7d0044bf81649f84b29bb3cf56df07b5f2062561). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4826] Fix generation of temp file names...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3704#issuecomment-67067803 [Test build #24466 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24466/consoleFull) for PR 3704 at commit [`a693ddb`](https://github.com/apache/spark/commit/a693ddb1f8fb796337f1aee3c81d3fb7537888a1). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class WriteAheadLogBackedBlockRDDSuite extends FunSuite with BeforeAndAfterAll with BeforeAndAfterEach `
[GitHub] spark pull request: [SPARK-4826] Fix generation of temp file names...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3704#issuecomment-67067815 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24466/ Test PASSed.
[GitHub] spark pull request: [SPARK-4826] Fix generation of temp file names...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3704#issuecomment-67068688 [Test build #24467 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24467/consoleFull) for PR 3704 at commit [`f2307f5`](https://github.com/apache/spark/commit/f2307f55134cb14beac42c84c330304926e8d5d6). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class WriteAheadLogBackedBlockRDDSuite extends FunSuite with BeforeAndAfterAll with BeforeAndAfterEach `
[GitHub] spark pull request: [SPARK-4826] Fix generation of temp file names...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3704#issuecomment-67068703 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24467/ Test PASSed.
[GitHub] spark pull request: [SPARK-3382] GradientDescent convergence toler...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3636#issuecomment-67069722 [Test build #544 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/544/consoleFull) for PR 3636 at commit [`f867eea`](https://github.com/apache/spark/commit/f867eea1735c39ccb939604f10817d8a6eb48b55). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4826] Fix generation of temp file names...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/3704#issuecomment-67069732 Looks like a robust approach to me.
[GitHub] spark pull request: SPARK-4814 [CORE] Enable assertions in SBT, Ma...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/3692#issuecomment-67069819 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-3382] GradientDescent convergence toler...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3636#issuecomment-67069849 [Test build #544 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/544/consoleFull) for PR 3636 at commit [`f867eea`](https://github.com/apache/spark/commit/f867eea1735c39ccb939604f10817d8a6eb48b55). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: SPARK-785 [CORE] ClosureCleaner not invoked on...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/3690#issuecomment-67069825 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-4298][Core] - The spark-submit cannot r...
Github user brennonyork commented on the pull request: https://github.com/apache/spark/pull/3561#issuecomment-67070428 Bump, @andrewor14 and @JoshRosen, any updates / issues with the modified code?
[GitHub] spark pull request: SPARK-785 [CORE] ClosureCleaner not invoked on...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3690#issuecomment-67070449 [Test build #24469 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24469/consoleFull) for PR 3690 at commit [`8df68fe`](https://github.com/apache/spark/commit/8df68fed84cbbcc328d9ccce5df930f1c76c6b07). * This patch merges cleanly.
[GitHub] spark pull request: SPARK-4814 [CORE] Enable assertions in SBT, Ma...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3692#issuecomment-67070446 [Test build #24468 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24468/consoleFull) for PR 3692 at commit [`caca704`](https://github.com/apache/spark/commit/caca7047a6bcd672ae5e9657f4b2d5a61ba97cb7). * This patch merges cleanly.