[GitHub] spark pull request: [SPARK-1776] Have Spark's SBT build read depen...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/772#issuecomment-47063421 Merged build finished. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] spark pull request: [SPARK-1776] Have Spark's SBT build read depen...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/772#issuecomment-47063423 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16105/
[GitHub] spark pull request: [SQL]Extract the joinkeys from join condition
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1190#issuecomment-47063650 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16104/
[GitHub] spark pull request: [SQL]Extract the joinkeys from join condition
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1190#issuecomment-47063649 Merged build finished. All automated tests passed.
[GitHub] spark pull request: SPARK-2150: Provide direct link to finished ap...
Github user rahulsinghaliitd commented on the pull request: https://github.com/apache/spark/pull/1094#issuecomment-47064230 @vanzin I was only referring to how the UI URL is passed around. I took the longer route of passing it via command-line arguments, whereas the other change uses the Spark conf and simply sets it as another property.
[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/1056#issuecomment-47065070 Uploaded a new patch that adds a general executor-driver heartbeat. With the patch, observed jobs running fine on a pseudo-distributed yarn cluster.
[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1056#issuecomment-47065134 Merged build started.
[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1056#issuecomment-47065129 Merged build triggered.
[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1056#issuecomment-47065240 Merged build finished.
[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1056#issuecomment-47065241 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16106/
[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1056#issuecomment-47065707 Merged build triggered.
[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1056#issuecomment-47065717 Merged build started.
[GitHub] spark pull request: SPARK-1470: Use the scala-logging wrapper inst...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1208#issuecomment-47067625 To be honest, my scar from the incident (deprecating Scala 2.10 support before Scala 2.11 was even released) hasn't fully recovered. Do we gain anything by moving onto this logging API? There is a tiny teeny performance boost, but I don't think we log any hot code path in spark-core.
[GitHub] spark pull request: [SPARK-2263][SQL] Support inserting MAP<K, V> ...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1205#issuecomment-47067733 Thanks. Merging this in master and branch-1.0.
[GitHub] spark pull request: [SPARK-2263][SQL] Support inserting MAP<K, V> ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1205
[GitHub] spark pull request: [WIP][SPARK-2097][SQL] UDF Support
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1063#issuecomment-47067767 Merged build triggered.
[GitHub] spark pull request: [SPARK-2267] Log exception when TaskResultGett...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1202#issuecomment-47067798 Ok merging this. Thanks.
[GitHub] spark pull request: [WIP][SPARK-2097][SQL] UDF Support
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1063#issuecomment-47067774 Merged build started.
[GitHub] spark pull request: [WIP][SPARK-2097][SQL] UDF Support
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1063#issuecomment-47067885 Merged build finished.
[GitHub] spark pull request: [WIP][SPARK-2097][SQL] UDF Support
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1063#issuecomment-47067886 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16108/
[GitHub] spark pull request: [BUGFIX][SQL] Should match java.math.BigDecima...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1199#issuecomment-47067891 I'm going to merge this first since the test is most likely a different problem.
[GitHub] spark pull request: [BUGFIX][SQL] Should match java.math.BigDecima...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1199
[GitHub] spark pull request: [SPARK-2244] Fix hang introduced by SPARK-1466
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1197#issuecomment-47067995 @andrewor14 @mattf Did you guys figure out which PR is the better way to solve this problem (this one or #1178)?
[GitHub] spark pull request: [SPARK-2267] Log exception when TaskResultGett...
Github user rxin closed the pull request at: https://github.com/apache/spark/pull/1202
[GitHub] spark pull request: SPARK-2038: rename conf parameters in the sa...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1137#issuecomment-47068276 Looks good. Merging in master. Thanks.
[GitHub] spark pull request: SPARK-2038: rename conf parameters in the sa...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1137
[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1056#issuecomment-47068360 Merged build finished.
[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1056#issuecomment-47068361 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16107/
[GitHub] spark pull request: [SPARK-1776] Have Spark's SBT build read depen...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/772#issuecomment-47069158 Merged build started.
[GitHub] spark pull request: [SPARK-1776] Have Spark's SBT build read depen...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/772#issuecomment-47069146 Merged build triggered.
[GitHub] spark pull request: [SPARK-2755] More general Storage Interface fo...
GitHub user colorant opened a pull request: https://github.com/apache/spark/pull/1209 [SPARK-2755] More general Storage Interface for Shuffle / Spill etc Hi, this is for https://issues.apache.org/jira/browse/SPARK-2275 The code here is not intended to be merged in its current state; I am just trying to show what I think this change could look like, so I put up this PR as a quick way to verify the idea and see how much needs to be modified. It definitely needs to be improved or even restructured. This addresses problem 1 in the JIRA ticket, since problem 2 relies on problem 1, so I want to use this PR to present my general ideas and find out what you think about the whole thing. Thanks. You can merge this pull request into a Git repository by running: $ git pull https://github.com/colorant/spark shufflebm Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/1209.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1209 commit 76319779b9d1bdef3ebc8a8cdc12d73bb3a7c13e Author: Raymond Liu raymond@intel.com Date: 2014-06-18T08:12:36Z initial commit for redesign shuffle/spill blockmanager interface
[GitHub] spark pull request: [SPARK-2755] More general Storage Interface fo...
Github user colorant commented on the pull request: https://github.com/apache/spark/pull/1209#issuecomment-47070166 Hi @andrewor14, it seems to me that you are working on some big changes related to BlockManager; could you take a look at this one?
[GitHub] spark pull request: [SPARK-2755] More general Storage Interface fo...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1209#issuecomment-47070209 Merged build triggered.
[GitHub] spark pull request: [SPARK-1776] Have Spark's SBT build read depen...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/772#issuecomment-47070227 Merged build finished.
[GitHub] spark pull request: [SPARK-1776] Have Spark's SBT build read depen...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/772#issuecomment-47070228 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16109/
[GitHub] spark pull request: [SPARK-2755] More general Storage Interface fo...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1209#issuecomment-47070218 Merged build started.
[GitHub] spark pull request: [SPARK-2755] More general Storage Interface fo...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1209#issuecomment-47070335 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16110/
[GitHub] spark pull request: [SPARK-2755] More general Storage Interface fo...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1209#issuecomment-47070332 Merged build finished.
[GitHub] spark pull request: Pluggable Diskstore for BlockManager
Github user colorant commented on the pull request: https://github.com/apache/spark/pull/907#issuecomment-47070411 Hi @andrewor14, besides #1209, this one is also related to BM; could you also take a look at the general idea? I know the code needs a rebase onto the latest code, but I am seeking general feedback about the ideas ;)
[GitHub] spark pull request: [SPARK-2755] More general Storage Interface fo...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1209#issuecomment-47071142 Merged build started.
[GitHub] spark pull request: [SPARK-2755] More general Storage Interface fo...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1209#issuecomment-47071126 Merged build triggered.
[GitHub] spark pull request: Use the Executor's ClassLoader in sc.objectFil...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/181#issuecomment-47072331 @darabos do you mind picking this up now that the test util has been merged?
[GitHub] spark pull request: fix compile error of streaming project
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/153#issuecomment-47072491 @gzm55 can you explain the compilation error? Otherwise we should close the pull request.
[GitHub] spark pull request: Minor optimizations. Use safer take, tail meth...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/473#issuecomment-47072778 @izendejas do you mind updating the pull request to address my comment? Everything else looks good.
[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1056#issuecomment-47073091 Merged build started.
[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1056#issuecomment-47073080 Merged build triggered.
[GitHub] spark pull request: Use the Executor's ClassLoader in sc.objectFil...
Github user darabos commented on the pull request: https://github.com/apache/spark/pull/181#issuecomment-47074387 Sorry for leaving this hanging. I'll take a look at the test.
[GitHub] spark pull request: [SPARK-2125] Add sort flag and move sort into ...
GitHub user jerryshao opened a pull request: https://github.com/apache/spark/pull/1210 [SPARK-2125] Add sort flag and move sort into shuffle implementations This patch adds a sort flag to ShuffleDependency and moves the sort into the hash shuffle implementation. Moving the sort into the shuffle implementation leaves room for other shuffle implementations (like sort-based shuffle) to better optimize sorting through the shuffle. You can merge this pull request into a Git repository by running: $ git pull https://github.com/jerryshao/apache-spark SPARK-2125 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/1210.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1210 commit 0b3b9b7e6665092f054c665c87221a698da5 Author: jerryshao saisai.s...@intel.com Date: 2014-06-16T01:48:25Z Move sort into shuffle implementations commit 6e402de45b54134150dfd34370fe6a17c5acfc03 Author: jerryshao saisai.s...@intel.com Date: 2014-06-24T09:45:17Z Minor changes about naming and order commit 0c675efca688a0e03869f9aea0332073bf672bf6 Author: jerryshao saisai.s...@intel.com Date: 2014-06-25T05:39:46Z Fix issues related to unit test commit 9ad9aaaf1f06a4b88d57d6415b5f639c018226e6 Author: jerryshao saisai.s...@intel.com Date: 2014-06-25T08:26:32Z Change sort flag into Option
[GitHub] spark pull request: [SPARK-2125] Add sort flag and move sort into ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1210#issuecomment-47075231 Merged build started.
[GitHub] spark pull request: [SPARK-2125] Add sort flag and move sort into ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1210#issuecomment-47075220 Merged build triggered.
[GitHub] spark pull request: [SPARK-2125] Add sort flag and move sort into ...
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/1210#discussion_r14174587 --- Diff: core/src/main/scala/org/apache/spark/shuffle/hash/HashShuffleReader.scala --- @@ -49,6 +49,17 @@ class HashShuffleReader[K, C]( } else { iter } + +val sortedIter = for (asc <- dep.ascending; ordering <- dep.keyOrdering) yield { + val buf = aggregatedIter.toArray --- End diff -- Doesn't this take up a lot of memory?
[GitHub] spark pull request: [SPARK-2125] Add sort flag and move sort into ...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/1210#discussion_r14174925
--- Diff: core/src/main/scala/org/apache/spark/shuffle/hash/HashShuffleReader.scala ---
@@ -49,6 +49,17 @@ class HashShuffleReader[K, C](
     } else {
       iter
     }
+
+    val sortedIter = for (asc <- dep.ascending; ordering <- dep.keyOrdering) yield {
+      val buf = aggregatedIter.toArray
--- End diff --
Yes, it's true. But I will not change the original implementation, since [PR931](https://github.com/apache/spark/pull/931) will solve this issue.
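Editorial sketch: the memory concern raised here is that `toArray` materializes every record before sorting. PR 931's title mentions external sort, so the general technique is illustrated below in Python. This is not Spark's implementation; `external_sort`, `_spill`, `_read_back` and the `chunk_size` parameter are hypothetical names chosen for this example. It sorts bounded chunks, spills each sorted run to a temporary file, and merges the runs lazily.

```python
import heapq
import pickle
import tempfile
from itertools import islice


def _spill(sorted_chunk):
    """Write one sorted run to a temporary file and rewind it for reading."""
    f = tempfile.TemporaryFile()
    for item in sorted_chunk:
        pickle.dump(item, f)
    f.seek(0)
    return f


def _read_back(f):
    """Lazily yield the records of one spilled run, closing the file at the end."""
    try:
        while True:
            yield pickle.load(f)
    except EOFError:
        f.close()


def external_sort(records, chunk_size=1000, key=None):
    """Sort an iterator while holding at most chunk_size records in memory per run.

    Hypothetical helper for illustration only; not taken from any Spark PR.
    """
    it = iter(records)
    runs = []
    while True:
        # Only one chunk is materialized and sorted at a time.
        chunk = sorted(islice(it, chunk_size), key=key)
        if not chunk:
            break
        runs.append(_read_back(_spill(chunk)))
    # heapq.merge keeps one record per run in memory while merging.
    return heapq.merge(*runs, key=key)
```

For example, `list(external_sort(pairs, chunk_size=3, key=lambda kv: kv[0]))` yields the pairs ordered by key while buffering at most three records per run.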
[GitHub] spark pull request: [SPARK-1776] Have Spark's SBT build read depen...
Github user ScrapCodes commented on the pull request: https://github.com/apache/spark/pull/772#issuecomment-47076927 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-1776] Have Spark's SBT build read depen...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/772#issuecomment-47076966 Merged build triggered.
[GitHub] spark pull request: [SPARK-1776] Have Spark's SBT build read depen...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/772#issuecomment-47076974 Merged build started.
[GitHub] spark pull request: [SPARK-1776] Have Spark's SBT build read depen...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/772#issuecomment-47077133 Merged build finished.
[GitHub] spark pull request: [SPARK-1776] Have Spark's SBT build read depen...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/772#issuecomment-47077134 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16114/
[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1056#issuecomment-47081823 Merged build finished.
[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1056#issuecomment-47081827 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16112/
[GitHub] spark pull request: [SPARK-2755] More general Storage Interface fo...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1209#issuecomment-47081824 Merged build finished.
[GitHub] spark pull request: [SPARK-2755] More general Storage Interface fo...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1209#issuecomment-47081826 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16111/
[GitHub] spark pull request: [SPARK-2125] Add sort flag and move sort into ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1210#issuecomment-47086129 Merged build finished.
[GitHub] spark pull request: [SPARK-2125] Add sort flag and move sort into ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1210#issuecomment-47086130 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16113/
[GitHub] spark pull request: [SPARK-2204] Launch tasks on the proper execut...
Github user sebastienrainville commented on the pull request: https://github.com/apache/spark/pull/1140#issuecomment-47088399 Yes, I tested it on our cluster and it seems to work properly. Thanks for creating the JIRA to clean up the code!
[GitHub] spark pull request: [SPARK-1776] Have Spark's SBT build read depen...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/772#issuecomment-47089536 Merged build started.
[GitHub] spark pull request: [SPARK-1776] Have Spark's SBT build read depen...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/772#issuecomment-47089530 Merged build triggered.
[GitHub] spark pull request: [SPARK-1776] Have Spark's SBT build read depen...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/772#issuecomment-47098184 Merged build finished.
[GitHub] spark pull request: SPARK-2186: Spark SQL DSL support for simple a...
GitHub user edrevo opened a pull request: https://github.com/apache/spark/pull/1211
SPARK-2186: Spark SQL DSL support for simple aggregations such as SUM and AVG
**Description** This patch enables using the `.select()` function in SchemaRDD with functions such as `Sum`, `Count` and others.
**Testing** Unit tests added.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/edrevo/spark add-expression-support-in-select
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/1211.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1211
commit e1d344a18a92ca8dd05b094b5079fdca3b629551
Author: Ximo Guanter Gonzalbez x...@tid.es
Date: 2014-06-25T13:09:35Z
SPARK-2186: Spark SQL DSL support for simple aggregations such as SUM and AVG
[GitHub] spark pull request: SPARK-2186: Spark SQL DSL support for simple a...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1211#issuecomment-47101091 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-1946] Submit tasks after (configured ra...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/900#issuecomment-47104091 Thanks @li-zhihui. I was actually referring to modifying the user docs to add the new configs; look in docs/configuration.md. It makes sense to move it down and get as much initialization as possible out of the way before waiting. To me, exactly which class it goes in depends on how we see it fitting and potentially being used in the future. You could, for instance, move it down into submitMissingTasks before the call to submitTasks and leave it in DAGScheduler instead. I think for this PR, where we are just checking initially (at job submission) that we have enough executors, it doesn't matter too much. But in the future, if we want to check between stages or potentially when adding tasks, then it matters where it goes. Perhaps @kayousterhout has an opinion on where it fits better?
[GitHub] spark pull request: SPARK-2277: make TaskScheduler track hosts on ...
GitHub user lirui-intel opened a pull request: https://github.com/apache/spark/pull/1212
SPARK-2277: make TaskScheduler track hosts on rack
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/lirui-intel/spark trackHostOnRack
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/1212.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1212
commit 79ac750154eb37e36fcb733559a35d66f043e31d
Author: Rui Li rui...@intel.com
Date: 2014-06-25T14:33:22Z
SPARK-2277: make TaskScheduler track hosts on rack
commit 5e4ef62b7a31ff2c3207a53959079b1acfe3d6fb
Author: Rui Li rui...@intel.com
Date: 2014-06-25T14:39:43Z
SPARK-2277: remove unnecessary import
[GitHub] spark pull request: SPARK-2277: make TaskScheduler track hosts on ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1212#issuecomment-47111959 Can one of the admins verify this patch?
[GitHub] spark pull request: SPARK-1470: Use the scala-logging wrapper inst...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1208#issuecomment-47118382 The main benefit is a unified logging interface. Currently the code uses `scala-logging-slf4j` and `slf4j-api` at the same time.
[GitHub] spark pull request: [SPARK-1516]Throw exception in yarn client ins...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/1099#issuecomment-47125253 @mengxr any further comments on this?
[GitHub] spark pull request: SPARK-2150: Provide direct link to finished ap...
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/1094#issuecomment-47125879 @rahulsinghaliitd ah, good point. Passing as a SparkConf property should work now that I fixed some things in the yarn-cluster backend.
[GitHub] spark pull request: SPARK-2150: Provide direct link to finished ap...
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/1094#issuecomment-47126414 Latest patch LGTM.
[GitHub] spark pull request: SPARK-2150: Provide direct link to finished ap...
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/1094#issuecomment-47126454 (Aside from rebasing to fix the merge conflicts.)
[GitHub] spark pull request: Minor optimizations. Use safer take, tail meth...
Github user izendejas commented on the pull request: https://github.com/apache/spark/pull/473#issuecomment-47125703 Will do later today. Thanks.
On Wed, Jun 25, 2014 at 1:16 AM, Reynold Xin notificati...@github.com wrote:
@izendejas https://github.com/izendejas do you mind updating the pull request to address my comment? Everything else looks good.
Reply to this email directly or view it on GitHub https://github.com/apache/spark/pull/473#issuecomment-47072778.
[GitHub] spark pull request: Add Shortest-path computations to graphx.lib w...
Github user andy327 closed the pull request at: https://github.com/apache/spark/pull/10
[GitHub] spark pull request: Fix JIRA-983 and support exteranl sort for sor...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/931#issuecomment-47127748 @xiajunluan are you going to be able to address these soon? We'd like to get this merged quickly if possible.
[GitHub] spark pull request: [SPARK-2242] HOTFIX: pyspark shell hangs on si...
Github user mattf commented on the pull request: https://github.com/apache/spark/pull/1178#issuecomment-47130532 @andrewor14 what's the reproducer for the "hangs when an exception is thrown" case?
[GitHub] spark pull request: [SPARK-2244] Fix hang introduced by SPARK-1466
Github user mattf commented on the pull request: https://github.com/apache/spark/pull/1197#issuecomment-47131121 @rxin not yet - my current position is that the hang should be resolved independently of other changes (i.e. not in conjunction w/ a masked output change - keep the change simple and single purpose). for that reason i still prefer the simple close() solution. however, there is a case that @andrewor14 has mentioned that close() does not cover. i'd like to reproduce that case as well before making a final recommendation on approach.
[GitHub] spark pull request: [SPARK-2204] Launch tasks on the proper execut...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1140#issuecomment-47131216 Jenkins, test this please. LGTM pending tests.
[GitHub] spark pull request: [SPARK-2204] Launch tasks on the proper execut...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1140#issuecomment-47131451 Merged build triggered.
[GitHub] spark pull request: [SPARK-2204] Launch tasks on the proper execut...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1140#issuecomment-47131468 Merged build started.
[GitHub] spark pull request: [SPARK-2242] HOTFIX: pyspark shell hangs on si...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/1178#issuecomment-47132857 @mattf try adding the following lines to `bin/spark-class` (anywhere near the lines with `SPARK_MEM` is fine):
```
echo Hello. This goes to stdout...
echo and interferes with pyspark reading the py4j port as an int
```
What pyspark tries to do is to read the string "Hello. This goes to stdout..." as an int, and it throws an exception. I think whether it hangs depends on the environment, but on mine I ran into the deadlock the python docs warned against.
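Editorial sketch: the failure described here can be reproduced in isolation. The snippet below is an illustration under stated assumptions, not pyspark's launcher code: `CHILD` is a hypothetical stand-in for `bin/spark-class`, the port value 45678 is made up, and `read_port_naively` is a name invented for this example. It takes the first line of the child's stdout and parses it as an int.

```python
import subprocess
import sys

# Hypothetical child standing in for bin/spark-class: it prints a stray
# line before the port number (the port value is illustrative).
CHILD = 'print("Hello. This goes to stdout..."); print(45678)'


def read_port_naively():
    """Parse the first line of the child's stdout as the port number."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD], stdout=subprocess.PIPE)
    out, _ = proc.communicate()  # wait for the child and collect its stdout
    first_line = out.decode().splitlines()[0]
    return int(first_line)  # raises ValueError when the first line is not a number
```

Calling `read_port_naively()` raises `ValueError`, because the first line is the stray message rather than the port.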
[GitHub] spark pull request: [SPARK-2244] Fix hang introduced by SPARK-1466
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/1197#issuecomment-47133138 @mattf, whether or not close() works out in the end, we still need to redirect all of Spark's logging to the console output. As long as we pass in `stderr=PIPE` in subprocess it will swallow all of this. Part of my PR is to fix that.
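Editorial sketch: the effect of `stderr=PIPE` can be seen with the standard `subprocess` module alone. In the snippet below, `CHILD` is a hypothetical stand-in for the Spark launcher, and the warning text and port value are made up. With `capture_stderr=True` the child's warning ends up in a bytes buffer that the user never sees unless the parent prints it; with `capture_stderr=False` the child inherits the parent's stderr and the warning reaches the console.

```python
import subprocess
import sys

# Hypothetical child that logs a warning to stderr and prints a port to stdout.
CHILD = 'import sys; sys.stderr.write("WARN: something went wrong\\n"); print(45678)'


def launch(capture_stderr):
    """Run the child; return (stdout text, captured stderr bytes or None)."""
    # stderr=None means the child inherits the parent's stderr (the console).
    stderr = subprocess.PIPE if capture_stderr else None
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=stderr,
    )
    out, err = proc.communicate()
    return out.decode().strip(), err
```

`launch(True)` returns the warning as captured bytes, while `launch(False)` returns `None` for stderr and lets the warning appear on the console.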
[GitHub] spark pull request: [SPARK-2244] Fix hang introduced by SPARK-1466
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/1197#issuecomment-47133424 My PR is intended to be a hot fix anyway. The whole issue with reading the py4j port through `stdout` is hacky and prone to interference from output of other scripts. If you would like to, you are welcome to submit a patch for the longer term solution.
[GitHub] spark pull request: SPARK-2186: Spark SQL DSL support for simple a...
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1211#issuecomment-47134060 Jenkins, test this please.
[GitHub] spark pull request: [SPARK-2242] HOTFIX: pyspark shell hangs on si...
Github user mattf commented on the pull request: https://github.com/apache/spark/pull/1178#issuecomment-47134086 @andrewor14 thanks, i've been able to reproduce a hang when spark-class outputs something other than the port #
[GitHub] spark pull request: SPARK-2186: Spark SQL DSL support for simple a...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1211#issuecomment-47134233 Merged build started.
[GitHub] spark pull request: SPARK-2186: Spark SQL DSL support for simple a...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1211#issuecomment-47134219 Merged build triggered.
[GitHub] spark pull request: [SPARK-2242] HOTFIX: pyspark shell hangs on si...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1178#issuecomment-47134903 This looks good to me. I'm going to merge it since pyspark is broken without this patch.
[GitHub] spark pull request: [SPARK-2244] Fix hang introduced by SPARK-1466
Github user mattf commented on the pull request: https://github.com/apache/spark/pull/1197#issuecomment-47135124 @rxin @andrewor14 from what i can tell there are three issues here -
a. hang on simple job; reported as SPARK-2244 and SPARK-2242; root cause is stderr buffer deadlock
b. masked output from shell subprocess; introduced by SPARK-1466; root cause is lack of pass through for stderr
c. fragile port passing between child and parent in pyspark
all should be addressed in isolation (andrewor14, the fact that your patch tries to address multiple concerns at the same time is why i'd prefer an alternative). i recommend -
. first, fix (a) w/ close() and resolve both SPARK-2242 and SPARK-2244
. second, file a bug for (b) and address it w/ enhanced exception handling based on the current SPARK-2242 patch
. third, file a new bug for (c) with a solution that is yet to be determined
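Editorial sketch: the stderr buffer deadlock named in (a) arises when the child fills the stderr pipe while the parent is blocked reading stdout. The snippet below shows one way to avoid it using `Popen.communicate()`, which the Python `subprocess` documentation recommends for this situation. It is not the `close()` fix proposed in this thread; `CHILD` is a hypothetical noisy child and the port value is made up.

```python
import subprocess
import sys

# Hypothetical noisy child: writes 1 MiB to stderr, then prints a port.
# That is more than a typical OS pipe buffer holds.
CHILD = 'import sys; sys.stderr.write("x" * (1 << 20)); print(45678)'


def read_port_without_deadlock():
    """Return (port, number of stderr bytes) while draining both pipes."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # communicate() reads stdout and stderr concurrently. Calling
    # proc.stdout.readline() here instead could block forever: the child
    # stalls once the stderr pipe is full, so it never reaches the print.
    out, err = proc.communicate()
    return int(out.decode().strip()), len(err)
```

Leaving `stderr` unset, so that the child inherits the parent's stderr, avoids the problem as well, since no pipe is left undrained.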
[GitHub] spark pull request: [SPARK-2242] HOTFIX: pyspark shell hangs on si...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1178
[GitHub] spark pull request: [SPARK-2258 / 2266] Fix a few worker UI bugs
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1203#discussion_r14201947
--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala ---
@@ -333,18 +327,20 @@ private[spark] class Worker(
       finishedDrivers(driverId) = driver
       memoryUsed -= driver.driverDesc.mem
       coresUsed -= driver.driverDesc.cores
-    }
-    case x: DisassociatedEvent if x.remoteAddress == masterAddress =>
-      logInfo(s"$x Disassociated !")
-      masterDisconnected()
+    case d @ DisassociatedEvent(localAddress, remoteAddress, inbound) =>
+      if (remoteAddress == masterAddress) {
+        logInfo(s"$d Disassociated!")
+        masterDisconnected()
+      } else {
+        logWarning(s"Received unknown dissociation event: $d")
--- End diff --
I don't think this warning is a good idea. The worker can also become disassociated from the executor or driver actors; in those cases I'm not sure we want to log a warning.
[GitHub] spark pull request: [SPARK-2244] Fix hang introduced by SPARK-1466
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/1197#issuecomment-47137043 The thing is pyspark is still broken even if we fix (a) but not (b). For example, if your driver cannot communicate with the master somehow, it normally prints a warning message such as "Cannot connect to master" or something. If Spark logging is masked, then running `sc.parallelize` in this case still hangs without any output. This is actually the case I personally ran into in the first place. Since issues (a) and (b) are related and have a common simple fix, I think it makes sense to fix them both at once. I agree that (c) should be a new issue and is outside the scope of this issue. For now, I just want to make sure pyspark is not broken on master.
[GitHub] spark pull request: [SPARK-2258 / 2266] Fix a few worker UI bugs
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/1203#issuecomment-47137130

    @andrewor14 do you mind submitting a version of this without the code formatting changes that I can easily merge and backport into branch-1.0? I think there are only four lines here that relate to fixing those bugs.
[GitHub] spark pull request: [SPARK-2258 / 2266] Fix a few worker UI bugs
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1203#discussion_r14202037

    --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala ---
    @@ -333,18 +327,20 @@ private[spark] class Worker(
           finishedDrivers(driverId) = driver
           memoryUsed -= driver.driverDesc.mem
           coresUsed -= driver.driverDesc.cores
    -}
    -case x: DisassociatedEvent if x.remoteAddress == masterAddress =>
    -  logInfo(s"$x Disassociated !")
    -  masterDisconnected()
    +case d @ DisassociatedEvent(localAddress, remoteAddress, inbound) =>
    +  if (remoteAddress == masterAddress) {
    +    logInfo(s"$d Disassociated!")
    +    masterDisconnected()
    +  } else {
    +    logWarning(s"Received unknown dissociation event: $d")
    --- End diff --

    Ah I see, I guess that's the reason why it wasn't there in the existing code.
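For illustration only, here is a rough sketch of the behaviour the review settles on: react to a disassociation from the master and stay quiet for any other peer. It is not the code in the patch. `DisassociatedEvent` below is a hypothetical stand-in case class (addresses are plain strings rather than Akka addresses), and `WorkerSketch`, `receive`, and `masterConnected` are made-up names.

```scala
// Hypothetical stand-in for the remoting event; it models only the three
// fields that appear in the diff, with addresses simplified to strings.
final case class DisassociatedEvent(localAddress: String, remoteAddress: String, inbound: Boolean)

// Hypothetical, heavily simplified worker: it tracks only the master connection.
final class WorkerSketch(masterAddress: String) {
  var masterConnected: Boolean = true

  def receive(msg: Any): Unit = msg match {
    // Guarded pattern: only a disassociation from the master is acted on.
    case d: DisassociatedEvent if d.remoteAddress == masterAddress =>
      println(s"$d Disassociated!")
      masterConnected = false
    // Disassociation from any other peer (for example an executor or driver)
    // is expected during normal operation, so it is ignored without a warning.
    case _: DisassociatedEvent => ()
    case other => println(s"unhandled message: $other")
  }
}
```

With `new WorkerSketch("master:7077")`, an event whose `remoteAddress` is some executor leaves `masterConnected` unchanged and logs nothing, while an event from `"master:7077"` flips it to `false`.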
[GitHub] spark pull request: [SPARK-1946] Submit tasks after (configured ra...
Github user kayousterhout commented on a diff in the pull request:

    https://github.com/apache/spark/pull/900#discussion_r14202320

    --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala ---
    @@ -46,9 +46,17 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A
     {
       // Use an atomic variable to track total number of cores in the cluster for simplicity and speed
       var totalCoreCount = new AtomicInteger(0)
    +  var totalExecutors = new AtomicInteger(0)
       val conf = scheduler.sc.conf
       private val timeout = AkkaUtils.askTimeout(conf)
       private val akkaFrameSize = AkkaUtils.maxFrameSizeBytes(conf)
    +  // Submit tasks only after (registered executors / total executors) arrived the ratio.
    --- End diff --

    "arrived the ratio" --> "is equal to at least this value"
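To make the suggested wording concrete, the check the comment describes might look roughly like the sketch below: tasks are held back until registered executors divided by expected executors is at least a configured ratio. This is an assumption-laden illustration, not the code in the pull request. `RegistrationGate`, `expectedExecutors`, `minRegisteredRatio`, `executorRegistered`, and `isReady` are all hypothetical names.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for the backend's bookkeeping; not Spark's actual class.
final class RegistrationGate(expectedExecutors: Int, minRegisteredRatio: Double) {
  require(expectedExecutors > 0, "expectedExecutors must be positive")
  require(minRegisteredRatio >= 0.0 && minRegisteredRatio <= 1.0, "ratio must be in [0, 1]")

  // Atomic counter, mirroring the AtomicInteger fields shown in the diff.
  private val registered = new AtomicInteger(0)

  /** Records one executor registration and returns the new count. */
  def executorRegistered(): Int = registered.incrementAndGet()

  /** True once (registered / expected) is equal to at least the configured ratio. */
  def isReady: Boolean =
    registered.get().toDouble / expectedExecutors >= minRegisteredRatio
}

object RegistrationGateDemo {
  def main(args: Array[String]): Unit = {
    val gate = new RegistrationGate(expectedExecutors = 4, minRegisteredRatio = 0.5)
    gate.executorRegistered()
    println(gate.isReady) // false: 1 / 4 = 0.25 is below 0.5
    gate.executorRegistered()
    println(gate.isReady) // true: 2 / 4 = 0.5 meets the ratio
  }
}
```

The comparison is `>=` rather than `==`, which is the point of the rewording: the gate opens as soon as the ratio is met and stays open as more executors register.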