[GitHub] spark pull request: [SPARK-1776] Have Spark's SBT build read depen...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/772#issuecomment-47063421
  
Merged build finished. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] spark pull request: [SPARK-1776] Have Spark's SBT build read depen...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/772#issuecomment-47063423
  

Refer to this link for build results: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16105/




[GitHub] spark pull request: [SQL]Extract the joinkeys from join condition

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1190#issuecomment-47063650
  
All automated tests passed.
Refer to this link for build results: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16104/




[GitHub] spark pull request: [SQL]Extract the joinkeys from join condition

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1190#issuecomment-47063649
  
Merged build finished. All automated tests passed.




[GitHub] spark pull request: SPARK-2150: Provide direct link to finished ap...

2014-06-25 Thread rahulsinghaliitd
Github user rahulsinghaliitd commented on the pull request:

https://github.com/apache/spark/pull/1094#issuecomment-47064230
  
@vanzin I was only referring to how the UI URL is passed around. I took the 
longer route of passing it through command-line arguments, whereas the other 
change uses the Spark conf and simply sets it as another property.
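
For readers comparing the two approaches mentioned above, here is a minimal sketch of the difference. It is only an illustration: the flag name `--ui-url`, the property key `example.ui.url`, and the object `UiUrlPassing` are hypothetical and do not come from either pull request, and a plain `Map` stands in for the Spark conf.

```scala
// Hypothetical sketch: none of these names are taken from the PRs under discussion.
object UiUrlPassing {
  // Approach 1: thread the URL through command-line arguments.
  // Every process that forwards the value has to parse and re-emit the flag.
  def fromArgs(args: Array[String]): Option[String] =
    args.sliding(2).collectFirst { case Array("--ui-url", url) => url }

  // Approach 2: store the URL as one more key/value property.
  // A Map stands in for the configuration object here.
  def fromConf(conf: Map[String, String]): Option[String] =
    conf.get("example.ui.url")
}
```

The second form needs no extra parsing code, which is presumably why it reads as the shorter route in the comment above.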




[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...

2014-06-25 Thread sryza
Github user sryza commented on the pull request:

https://github.com/apache/spark/pull/1056#issuecomment-47065070
  
Uploaded a new patch that adds a general executor-driver heartbeat. With 
the patch, I observed jobs running fine on a pseudo-distributed YARN cluster.




[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1056#issuecomment-47065134
  
Merged build started. 




[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1056#issuecomment-47065129
  
 Merged build triggered. 




[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1056#issuecomment-47065240
  
Merged build finished. 




[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1056#issuecomment-47065241
  

Refer to this link for build results: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16106/




[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1056#issuecomment-47065707
  
 Merged build triggered. 




[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1056#issuecomment-47065717
  
Merged build started. 




[GitHub] spark pull request: SPARK-1470: Use the scala-logging wrapper inst...

2014-06-25 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/1208#issuecomment-47067625
  
To be honest, my scar from the incident (deprecating Scala 2.10 support 
before Scala 2.11 was even released) hasn't fully healed. Do we gain 
anything by moving to this logging API? There is a tiny performance 
boost, but I don't think we log in any hot code path in spark-core.
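
As background on the performance point, here is a minimal sketch of where such a boost could come from. It rests on an assumption not stated in this thread: that the existing logging helpers take the message as a by-name parameter, while a macro-based wrapper rewrites each call into an explicit level check. The `Logger` class and all names below are hypothetical stand-ins, not the API of either library.

```scala
// Hypothetical sketch; this Logger is a stand-in, not a real logging library.
object LoggingSketch {
  final class Logger(val debugEnabled: Boolean) {
    // By-name parameter: `msg` is evaluated only when debug logging is on,
    // but every call site still passes a small closure for it.
    def debug(msg: => String): Unit =
      if (debugEnabled) println(msg)
  }

  // Counts how often the expensive message is built while debug is off.
  def evaluationsWhenDisabled(): Int = {
    var evaluations = 0
    def expensive(): String = { evaluations += 1; "expensive state dump" }

    val quiet = new Logger(debugEnabled = false)
    quiet.debug(expensive()) // by-name style: message is never built

    // Roughly what a macro-based wrapper is assumed to expand the call into:
    // the check happens at the call site, so no closure is needed.
    if (quiet.debugEnabled) quiet.debug(expensive())

    evaluations
  }
}
```

In both styles the expensive message is skipped when the level is disabled, so the remaining difference is only the per-call closure, which matters mainly on hot code paths.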




[GitHub] spark pull request: [SPARK-2263][SQL] Support inserting MAP<K, V> ...

2014-06-25 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/1205#issuecomment-47067733
  
Thanks. Merging this in master and branch-1.0.




[GitHub] spark pull request: [SPARK-2263][SQL] Support inserting MAP<K, V> ...

2014-06-25 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/1205




[GitHub] spark pull request: [WIP][SPARK-2097][SQL] UDF Support

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1063#issuecomment-47067767
  
 Merged build triggered. 




[GitHub] spark pull request: [SPARK-2267] Log exception when TaskResultGett...

2014-06-25 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/1202#issuecomment-47067798
  
Ok merging this. Thanks.




[GitHub] spark pull request: [WIP][SPARK-2097][SQL] UDF Support

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1063#issuecomment-47067774
  
Merged build started. 




[GitHub] spark pull request: [WIP][SPARK-2097][SQL] UDF Support

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1063#issuecomment-47067885
  
Merged build finished. 




[GitHub] spark pull request: [WIP][SPARK-2097][SQL] UDF Support

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1063#issuecomment-47067886
  

Refer to this link for build results: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16108/




[GitHub] spark pull request: [BUGFIX][SQL] Should match java.math.BigDecima...

2014-06-25 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/1199#issuecomment-47067891
  
I'm going to merge this first, since the test failure is most likely a 
different problem.




[GitHub] spark pull request: [BUGFIX][SQL] Should match java.math.BigDecima...

2014-06-25 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/1199




[GitHub] spark pull request: [SPARK-2244] Fix hang introduced by SPARK-1466

2014-06-25 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/1197#issuecomment-47067995
  
@andrewor14 @mattf Did you guys figure out which PR is the better way to 
solve this problem (this one or #1178)?




[GitHub] spark pull request: [SPARK-2267] Log exception when TaskResultGett...

2014-06-25 Thread rxin
Github user rxin closed the pull request at:

https://github.com/apache/spark/pull/1202




[GitHub] spark pull request: SPARK-2038: rename conf parameters in the sa...

2014-06-25 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/1137#issuecomment-47068276
  
Looks good. Merging in master. Thanks.




[GitHub] spark pull request: SPARK-2038: rename conf parameters in the sa...

2014-06-25 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/1137




[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1056#issuecomment-47068360
  
Merged build finished. 




[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1056#issuecomment-47068361
  

Refer to this link for build results: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16107/




[GitHub] spark pull request: [SPARK-1776] Have Spark's SBT build read depen...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/772#issuecomment-47069158
  
Merged build started. 




[GitHub] spark pull request: [SPARK-1776] Have Spark's SBT build read depen...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/772#issuecomment-47069146
  
 Merged build triggered. 




[GitHub] spark pull request: [SPARK-2755] More general Storage Interface fo...

2014-06-25 Thread colorant
GitHub user colorant opened a pull request:

https://github.com/apache/spark/pull/1209

[SPARK-2755] More general Storage Interface for Shuffle / Spill etc

Hi, this is for https://issues.apache.org/jira/browse/SPARK-2275

The code here is not intended to be merged in its current state; instead, I 
am just trying to show what I think this change could look like.

So I put up this PR as a quick way to verify the idea and see how much 
needs to be modified. It definitely needs to be improved or even 
restructured. 

This PR addresses problem 1 in the JIRA ticket. Since problem 2 relies on 
problem 1, I want to use this PR to present my general ideas and to 
find out what you think about the whole thing. Thanks.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/colorant/spark shufflebm

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1209.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1209


commit 76319779b9d1bdef3ebc8a8cdc12d73bb3a7c13e
Author: Raymond Liu <raymond@intel.com>
Date:   2014-06-18T08:12:36Z

initial commit for redesign shuffle/spill blockmanager interface






[GitHub] spark pull request: [SPARK-2755] More general Storage Interface fo...

2014-06-25 Thread colorant
Github user colorant commented on the pull request:

https://github.com/apache/spark/pull/1209#issuecomment-47070166
  
Hi @andrewor14, it seems to me that you are working on some big changes 
related to BlockManager. Could you take a look at this one?




[GitHub] spark pull request: [SPARK-2755] More general Storage Interface fo...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1209#issuecomment-47070209
  
 Merged build triggered. 




[GitHub] spark pull request: [SPARK-1776] Have Spark's SBT build read depen...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/772#issuecomment-47070227
  
Merged build finished. 




[GitHub] spark pull request: [SPARK-1776] Have Spark's SBT build read depen...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/772#issuecomment-47070228
  

Refer to this link for build results: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16109/




[GitHub] spark pull request: [SPARK-2755] More general Storage Interface fo...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1209#issuecomment-47070218
  
Merged build started. 




[GitHub] spark pull request: [SPARK-2755] More general Storage Interface fo...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1209#issuecomment-47070335
  

Refer to this link for build results: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16110/




[GitHub] spark pull request: [SPARK-2755] More general Storage Interface fo...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1209#issuecomment-47070332
  
Merged build finished. 




[GitHub] spark pull request: Pluggable Diskstore for BlockManager

2014-06-25 Thread colorant
Github user colorant commented on the pull request:

https://github.com/apache/spark/pull/907#issuecomment-47070411
  
Hi @andrewor14, besides #1209, this one is also related to the BlockManager. 
Could you take a look at the general idea as well? I know the code needs a 
rebase onto the latest code, but I am looking for general feedback on the ideas ;)




[GitHub] spark pull request: [SPARK-2755] More general Storage Interface fo...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1209#issuecomment-47071142
  
Merged build started. 




[GitHub] spark pull request: [SPARK-2755] More general Storage Interface fo...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1209#issuecomment-47071126
  
 Merged build triggered. 




[GitHub] spark pull request: Use the Executor's ClassLoader in sc.objectFil...

2014-06-25 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/181#issuecomment-47072331
  
@darabos do you mind picking this up now that the test util has been merged?




[GitHub] spark pull request: fix compile error of streaming project

2014-06-25 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/153#issuecomment-47072491
  
@gzm55 can you explain the compilation error? Otherwise we should close the 
pull request.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] spark pull request: Minor optimizations. Use safer take, tail meth...

2014-06-25 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/473#issuecomment-47072778
  
@izendejas do you mind updating the pull request to address my comment? 
Everything else looks good.




[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1056#issuecomment-47073091
  
Merged build started. 




[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1056#issuecomment-47073080
  
 Merged build triggered. 




[GitHub] spark pull request: Use the Executor's ClassLoader in sc.objectFil...

2014-06-25 Thread darabos
Github user darabos commented on the pull request:

https://github.com/apache/spark/pull/181#issuecomment-47074387
  
Sorry for leaving this hanging. I'll take a look at the test.




[GitHub] spark pull request: [SPARK-2125] Add sort flag and move sort into ...

2014-06-25 Thread jerryshao
GitHub user jerryshao opened a pull request:

https://github.com/apache/spark/pull/1210

[SPARK-2125] Add sort flag and move sort into shuffle implementations

This patch adds a sort flag to ShuffleDependency and moves the sort into the 
hash shuffle implementation.

Moving the sort into the shuffle implementation leaves room for other shuffle 
implementations (like sort-based shuffle) to better optimize sorting as part 
of the shuffle. 
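
The idea described above can be sketched roughly as follows. This is not the patch itself: `SortSpec` and `read` are hypothetical names, and the only details taken from this thread are that the dependency exposes an optional `ascending` flag and an optional `keyOrdering`, which the reader combines in a for-comprehension.

```scala
// Hypothetical sketch; SortSpec and read are illustrative names, not the PR's API.
object SortFlagSketch {
  // Stand-in for the sort-related fields carried by the shuffle dependency.
  final case class SortSpec[K](
      ascending: Option[Boolean],
      keyOrdering: Option[Ordering[K]])

  // The reader sorts only when both the flag and an ordering are present;
  // otherwise it passes the records through untouched.
  def read[K, V](records: Iterator[(K, V)], spec: SortSpec[K]): Iterator[(K, V)] = {
    val sorted = for (asc <- spec.ascending; ord <- spec.keyOrdering) yield {
      val direction = if (asc) ord else ord.reverse
      // Sorting buffers the whole input in memory before emitting anything.
      records.toVector.sortBy(_._1)(direction).iterator
    }
    sorted.getOrElse(records)
  }
}
```

Because the reader owns the sort, a different shuffle implementation could swap in its own strategy (for example, one that spills to disk) without changing the callers.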

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jerryshao/apache-spark SPARK-2125

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1210.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1210


commit 0b3b9b7e6665092f054c665c87221a698da5
Author: jerryshao <saisai.s...@intel.com>
Date:   2014-06-16T01:48:25Z

Move sort into shuffle implementations

commit 6e402de45b54134150dfd34370fe6a17c5acfc03
Author: jerryshao <saisai.s...@intel.com>
Date:   2014-06-24T09:45:17Z

Minor changes about naming and order

commit 0c675efca688a0e03869f9aea0332073bf672bf6
Author: jerryshao <saisai.s...@intel.com>
Date:   2014-06-25T05:39:46Z

Fix issues related to unit test

commit 9ad9aaaf1f06a4b88d57d6415b5f639c018226e6
Author: jerryshao <saisai.s...@intel.com>
Date:   2014-06-25T08:26:32Z

Change sort flag into Option






[GitHub] spark pull request: [SPARK-2125] Add sort flag and move sort into ...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1210#issuecomment-47075231
  
Merged build started. 




[GitHub] spark pull request: [SPARK-2125] Add sort flag and move sort into ...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1210#issuecomment-47075220
  
 Merged build triggered. 




[GitHub] spark pull request: [SPARK-2125] Add sort flag and move sort into ...

2014-06-25 Thread witgo
Github user witgo commented on a diff in the pull request:

https://github.com/apache/spark/pull/1210#discussion_r14174587
  
--- Diff: 
core/src/main/scala/org/apache/spark/shuffle/hash/HashShuffleReader.scala ---
@@ -49,6 +49,17 @@ class HashShuffleReader[K, C](
 } else {
   iter
 }
+
+val sortedIter = for (asc <- dep.ascending; ordering <- dep.keyOrdering) yield {
+  val buf = aggregatedIter.toArray
--- End diff --

Doesn't this take up a lot of memory?




[GitHub] spark pull request: [SPARK-2125] Add sort flag and move sort into ...

2014-06-25 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/1210#discussion_r14174925
  
--- Diff: 
core/src/main/scala/org/apache/spark/shuffle/hash/HashShuffleReader.scala ---
@@ -49,6 +49,17 @@ class HashShuffleReader[K, C](
 } else {
   iter
 }
+
+val sortedIter = for (asc <- dep.ascending; ordering <- dep.keyOrdering) yield {
+  val buf = aggregatedIter.toArray
--- End diff --

Yes, it's true. But I will not change the original implementation, since 
[PR931](https://github.com/apache/spark/pull/931) will solve this issue.
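
For readers following the memory concern, here is a rough Python analogue of the pattern in the diff (the Scala code itself is only partially quoted above). The function name `maybe_sorted` and its parameters are hypothetical, not Spark APIs; the sketch only assumes that sorting happens when both the flag and the ordering are present, and that the whole iterator is buffered before sorting.

```python
from typing import Any, Callable, Iterable, Iterator, Optional, Tuple


def maybe_sorted(
    records: Iterable[Tuple[Any, Any]],
    ascending: Optional[bool],
    key_ordering: Optional[Callable[[Any], Any]],
) -> Iterator[Tuple[Any, Any]]:
    """Sort (key, value) records only when both options are set.

    Hypothetical stand-in for the Option-based for-comprehension in the diff.
    """
    if ascending is None or key_ordering is None:
        # No sort requested: pass the records through lazily.
        return iter(records)
    # Analogue of `aggregatedIter.toArray`: every record is held in memory
    # at once, which is the cost the review comment asks about.
    buf = list(records)
    buf.sort(key=lambda kv: key_ordering(kv[0]), reverse=not ascending)
    return iter(buf)
```

In this sketch the unsorted path stays lazy, while the sorted path needs memory proportional to the number of records, which is why an external sort is the usual remedy for large partitions.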




[GitHub] spark pull request: [SPARK-1776] Have Spark's SBT build read depen...

2014-06-25 Thread ScrapCodes
Github user ScrapCodes commented on the pull request:

https://github.com/apache/spark/pull/772#issuecomment-47076927
  
Jenkins, retest this please.




[GitHub] spark pull request: [SPARK-1776] Have Spark's SBT build read depen...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/772#issuecomment-47076966
  
 Merged build triggered. 




[GitHub] spark pull request: [SPARK-1776] Have Spark's SBT build read depen...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/772#issuecomment-47076974
  
Merged build started. 




[GitHub] spark pull request: [SPARK-1776] Have Spark's SBT build read depen...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/772#issuecomment-47077133
  
Merged build finished. 




[GitHub] spark pull request: [SPARK-1776] Have Spark's SBT build read depen...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/772#issuecomment-47077134
  

Refer to this link for build results: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16114/




[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1056#issuecomment-47081823
  
Merged build finished. 




[GitHub] spark pull request: SPARK-2099. Report progress while task is runn...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1056#issuecomment-47081827
  

Refer to this link for build results: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16112/




[GitHub] spark pull request: [SPARK-2755] More general Storage Interface fo...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1209#issuecomment-47081824
  
Merged build finished. 




[GitHub] spark pull request: [SPARK-2755] More general Storage Interface fo...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1209#issuecomment-47081826
  

Refer to this link for build results: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16111/




[GitHub] spark pull request: [SPARK-2125] Add sort flag and move sort into ...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1210#issuecomment-47086129
  
Merged build finished. 




[GitHub] spark pull request: [SPARK-2125] Add sort flag and move sort into ...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1210#issuecomment-47086130
  

Refer to this link for build results: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16113/




[GitHub] spark pull request: [SPARK-2204] Launch tasks on the proper execut...

2014-06-25 Thread sebastienrainville
Github user sebastienrainville commented on the pull request:

https://github.com/apache/spark/pull/1140#issuecomment-47088399
  
Yes, I tested it on our cluster and it seems to work properly. Thanks for 
creating the JIRA to clean up the code!




[GitHub] spark pull request: [SPARK-1776] Have Spark's SBT build read depen...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/772#issuecomment-47089536
  
Merged build started. 




[GitHub] spark pull request: [SPARK-1776] Have Spark's SBT build read depen...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/772#issuecomment-47089530
  
 Merged build triggered. 




[GitHub] spark pull request: [SPARK-1776] Have Spark's SBT build read depen...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/772#issuecomment-47098184
  
Merged build finished. 




[GitHub] spark pull request: SPARK-2186: Spark SQL DSL support for simple a...

2014-06-25 Thread edrevo
GitHub user edrevo opened a pull request:

https://github.com/apache/spark/pull/1211

SPARK-2186: Spark SQL DSL support for simple aggregations such as SUM and 
AVG

**Description** This patch enables using the `.select()` function in 
SchemaRDD with functions such as `Sum`, `Count` and others.
**Testing** Unit tests added.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/edrevo/spark add-expression-support-in-select

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1211.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1211


commit e1d344a18a92ca8dd05b094b5079fdca3b629551
Author: Ximo Guanter Gonzalbez x...@tid.es
Date:   2014-06-25T13:09:35Z

SPARK-2186: Spark SQL DSL support for simple aggregations such as SUM and 
AVG






[GitHub] spark pull request: SPARK-2186: Spark SQL DSL support for simple a...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1211#issuecomment-47101091
  
Can one of the admins verify this patch?




[GitHub] spark pull request: [SPARK-1946] Submit tasks after (configured ra...

2014-06-25 Thread tgravescs
Github user tgravescs commented on the pull request:

https://github.com/apache/spark/pull/900#issuecomment-47104091
  
thanks @li-zhihui. I was actually referring to modifying the user docs to 
add the new configs.  look in docs/configuration.md.   

It makes sense to move it down and get as much initialization stuff out of 
the way before waiting.  To me exactly which class it goes in depends on how we 
see it fitting and potentially being used in the future.  You could for 
instance move it down into submitMissingTasks before the call to submitTasks 
and leave it in DAGScheduler instead.

I think for this pr where we are just checking initially (job submission) 
that we have enough executors it doesn't matter too much.  But in the future if 
we would want to check between stages or potentially when adding tasks then it 
matters where it goes.  

perhaps @kayousterhout has an opinion on where it better fits?   





[GitHub] spark pull request: SPARK-2277: make TaskScheduler track hosts on ...

2014-06-25 Thread lirui-intel
GitHub user lirui-intel opened a pull request:

https://github.com/apache/spark/pull/1212

SPARK-2277: make TaskScheduler track hosts on rack



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lirui-intel/spark trackHostOnRack

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1212.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1212


commit 79ac750154eb37e36fcb733559a35d66f043e31d
Author: Rui Li rui...@intel.com
Date:   2014-06-25T14:33:22Z

SPARK-2277: make TaskScheduler track hosts on rack

commit 5e4ef62b7a31ff2c3207a53959079b1acfe3d6fb
Author: Rui Li rui...@intel.com
Date:   2014-06-25T14:39:43Z

SPARK-2277: remove unnecessary import






[GitHub] spark pull request: SPARK-2277: make TaskScheduler track hosts on ...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1212#issuecomment-47111959
  
Can one of the admins verify this patch?




[GitHub] spark pull request: SPARK-1470: Use the scala-logging wrapper inst...

2014-06-25 Thread witgo
Github user witgo commented on the pull request:

https://github.com/apache/spark/pull/1208#issuecomment-47118382
  
The main benefit is a unified logging interface. Currently the code uses 
`scala-logging-slf4j` and `slf4j-api` at the same time.




[GitHub] spark pull request: [SPARK-1516]Throw exception in yarn client ins...

2014-06-25 Thread tgravescs
Github user tgravescs commented on the pull request:

https://github.com/apache/spark/pull/1099#issuecomment-47125253
  
@mengxr  any further comments on this?




[GitHub] spark pull request: SPARK-2150: Provide direct link to finished ap...

2014-06-25 Thread vanzin
Github user vanzin commented on the pull request:

https://github.com/apache/spark/pull/1094#issuecomment-47125879
  
@rahulsinghaliitd ah, good point. Passing as a SparkConf property should 
work now that I fixed some things in the yarn-cluster backend.




[GitHub] spark pull request: SPARK-2150: Provide direct link to finished ap...

2014-06-25 Thread vanzin
Github user vanzin commented on the pull request:

https://github.com/apache/spark/pull/1094#issuecomment-47126414
  
Latest patch LGTM.




[GitHub] spark pull request: SPARK-2150: Provide direct link to finished ap...

2014-06-25 Thread vanzin
Github user vanzin commented on the pull request:

https://github.com/apache/spark/pull/1094#issuecomment-47126454
  
(Aside from rebasing to fix the merge conflicts.)




[GitHub] spark pull request: Minor optimizations. Use safer take, tail meth...

2014-06-25 Thread izendejas
Github user izendejas commented on the pull request:

https://github.com/apache/spark/pull/473#issuecomment-47125703
  
Will do later today. Thanks.


On Wed, Jun 25, 2014 at 1:16 AM, Reynold Xin notificati...@github.com
wrote:

 @izendejas https://github.com/izendejas do you mind updating the pull
 request to address my comment? Everything else looks good.

 —
 Reply to this email directly or view it on GitHub
 https://github.com/apache/spark/pull/473#issuecomment-47072778.





[GitHub] spark pull request: Add Shortest-path computations to graphx.lib w...

2014-06-25 Thread andy327
Github user andy327 closed the pull request at:

https://github.com/apache/spark/pull/10




[GitHub] spark pull request: Fix JIRA-983 and support exteranl sort for sor...

2014-06-25 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/931#issuecomment-47127748
  
@xiajunluan are you going to be able to address these soon? We'd like to 
get this merged quickly if possible.




[GitHub] spark pull request: [SPARK-2242] HOTFIX: pyspark shell hangs on si...

2014-06-25 Thread mattf
Github user mattf commented on the pull request:

https://github.com/apache/spark/pull/1178#issuecomment-47130532
  
@andrewor14 what's the reproducer for the "hangs when an exception is 
thrown" case?




[GitHub] spark pull request: [SPARK-2244] Fix hang introduced by SPARK-1466

2014-06-25 Thread mattf
Github user mattf commented on the pull request:

https://github.com/apache/spark/pull/1197#issuecomment-47131121
  
@rxin not yet -

my current position is that the hang should be resolved independently of 
other changes (i.e. not in conjunction w/ a masked output change - keep the 
changed simple and single purpose). for that reason i still prefer the simple 
close() solution.

however, there is a case that @andrewor14 has mentioned that close() does 
not cover. i'd like to reproduce that case as well before making a final 
recommendation on approach.




[GitHub] spark pull request: [SPARK-2204] Launch tasks on the proper execut...

2014-06-25 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/1140#issuecomment-47131216
  
Jenkins, test this please. LGTM pending tests.




[GitHub] spark pull request: [SPARK-2204] Launch tasks on the proper execut...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1140#issuecomment-47131451
  
 Merged build triggered. 




[GitHub] spark pull request: [SPARK-2204] Launch tasks on the proper execut...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1140#issuecomment-47131468
  
Merged build started. 




[GitHub] spark pull request: [SPARK-2242] HOTFIX: pyspark shell hangs on si...

2014-06-25 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/1178#issuecomment-47132857
  
@mattf try adding the following lines to `bin/spark-class` (anywhere near 
the lines with `SPARK_MEM` is fine):

```
echo Hello. This goes to stdout...
echo and interferes with pyspark reading the py4j port as an int
```

What pyspark tries to do is read the string "Hello. This goes to 
stdout..." as an int, which throws an exception. I think whether it hangs 
depends on the environment, but on mine I ran into the deadlock the python 
docs warned against.
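
A minimal sketch of the failure described above, not pyspark's actual launch code: a child process stands in for `bin/spark-class`, the parent parses the first line of its stdout as the port, and any extra output ahead of the port breaks the parse. The helper name `read_port`, the child snippets, and the port number 25333 are all made up for illustration.

```python
import subprocess
import sys

# Stand-ins for the launcher script: one prints only a port, the other
# prints a stray message first (as in the reproducer above).
CLEAN_CHILD = "print(25333)"
NOISY_CHILD = "print('Hello. This goes to stdout...'); print(25333)"


def read_port(child_code: str) -> int:
    """Run a child and interpret the first line of its stdout as a port."""
    proc = subprocess.Popen(
        [sys.executable, "-c", child_code],
        stdout=subprocess.PIPE,
        text=True,
    )
    # communicate() drains the pipe and waits for the child to exit.
    out, _ = proc.communicate()
    # Raises ValueError if the first line is not an integer.
    return int(out.splitlines()[0])
```

With the clean child the port parses normally; with the noisy child `int()` raises `ValueError`. This sketch only shows the parse failure; whether the real shell then hangs depends on the environment, as noted above.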




[GitHub] spark pull request: [SPARK-2244] Fix hang introduced by SPARK-1466

2014-06-25 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/1197#issuecomment-47133138
  
@mattf, whether or not close() works out in the end, we still need to 
redirect all of Spark's logging to the console output. As long as we pass in 
`stderr=PIPE` in subprocess it will swallow all of this. Part of my PR is to 
fix that.
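
To illustrate the point about `stderr=PIPE`, here is a small self-contained sketch (again not the pyspark code; `run_child` and the child snippet are hypothetical). When stderr is piped, the child's log output is captured by the parent and never reaches the console unless the parent forwards it; when stderr is left unset, the child inherits the parent's stderr and its output passes straight through.

```python
import subprocess
import sys

# Hypothetical child that writes one log line to stderr.
CHILD = "import sys; sys.stderr.write('log line from child\\n')"


def run_child(capture_stderr: bool):
    """Run the child and return whatever the parent captured from stderr."""
    stderr = subprocess.PIPE if capture_stderr else None
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stderr=stderr,
        text=True,
    )
    # communicate() reads the pipe until EOF, so the child cannot block on
    # a full pipe buffer; reading nothing from a piped stream is the
    # deadlock risk the subprocess documentation warns about.
    _, err = proc.communicate()
    return err
```

`run_child(True)` returns the log line without showing it on the console, while `run_child(False)` returns `None` and the line appears on the parent's stderr.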




[GitHub] spark pull request: [SPARK-2244] Fix hang introduced by SPARK-1466

2014-06-25 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/1197#issuecomment-47133424
  
My PR is intended to be a hot fix anyway. The whole issue with reading the 
py4j port through `stdout` is hacky and prone to interference from output of 
other scripts. If you would like to, you are welcome to submit a patch for the 
longer term solution.




[GitHub] spark pull request: SPARK-2186: Spark SQL DSL support for simple a...

2014-06-25 Thread concretevitamin
Github user concretevitamin commented on the pull request:

https://github.com/apache/spark/pull/1211#issuecomment-47134060
  
Jenkins, test this please.




[GitHub] spark pull request: [SPARK-2242] HOTFIX: pyspark shell hangs on si...

2014-06-25 Thread mattf
Github user mattf commented on the pull request:

https://github.com/apache/spark/pull/1178#issuecomment-47134086
  
@andrewor14 thanks, i've been able to reproduce a hang when spark-class 
outputs something other than the port #




[GitHub] spark pull request: SPARK-2186: Spark SQL DSL support for simple a...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1211#issuecomment-47134233
  
Merged build started. 




[GitHub] spark pull request: SPARK-2186: Spark SQL DSL support for simple a...

2014-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1211#issuecomment-47134219
  
 Merged build triggered. 




[GitHub] spark pull request: [SPARK-2242] HOTFIX: pyspark shell hangs on si...

2014-06-25 Thread mengxr
Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/1178#issuecomment-47134903
  
This looks good to me. I'm going to merge it since pyspark is broken without 
this patch.




[GitHub] spark pull request: [SPARK-2244] Fix hang introduced by SPARK-1466

2014-06-25 Thread mattf
Github user mattf commented on the pull request:

https://github.com/apache/spark/pull/1197#issuecomment-47135124
  
@rxin  @andrewor14

from what i can tell there are three issues here -

a. hang on simple job; reported as SPARK-2244 and SPARK-2242; root cause is 
stderr buffer deadlock
b. masked output from shell subprocess; introduced by SPARK-1466; root 
cause is lack of pass through for stderr
c. fragile port passing between child and parent in pyspark

all should be addressed in isolation (andrewor14, the fact that your patch 
tries to address multiple concerns at the same time is why i'd prefer an 
alternative).

i recommend -
 . first, fix (a) w/ close() and resolve both SPARK-2242 and SPARK-2244
 . second, file a bug for (b) and address it w/ enhanced exception handling 
based on the current SPARK-2242 patch
 . third, file a new bug for (c) with a solution that is yet to be 
determined
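
The stderr buffer deadlock in (a) can be reproduced outside Spark. The sketch below is not the SPARK-2242 or SPARK-2244 patch; it only illustrates the general failure mode and one common remedy, under these assumptions: `CHILD_SOURCE`, `drain` and `launch_and_drain` are hypothetical names, and the child is a stand-in that prints a port on stdout and then logs heavily to stderr. If the parent pipes stderr but never reads it, the child blocks once the OS pipe buffer fills; draining the pipe on a background thread keeps it moving.

```python
import subprocess
import sys
import threading

# Hypothetical child: reports a "port" on stdout, then writes far more
# to stderr than an OS pipe buffer can hold.
CHILD_SOURCE = r"""
import sys
print(12345)
sys.stdout.flush()
for i in range(20000):
    sys.stderr.write("log line %d\n" % i)
"""


def drain(stream, sink):
    # Read until EOF so the child never blocks on a full pipe.
    for line in iter(stream.readline, b""):
        sink.append(line)
    stream.close()


def launch_and_drain():
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    captured = []
    reader = threading.Thread(target=drain, args=(proc.stderr, captured))
    reader.daemon = True
    reader.start()

    # The parent only needs the first stdout line (the port).
    port = int(proc.stdout.readline())

    # Without the reader thread, this wait could hang: the child would be
    # stuck writing to a stderr pipe that nobody empties.
    returncode = proc.wait()
    reader.join()
    return port, returncode, len(captured)
```

Calling `launch_and_drain()` returns the port, the child's exit code and the number of stderr lines captured. Dropping the reader thread while keeping `stderr=subprocess.PIPE` is the configuration that can hang.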




[GitHub] spark pull request: [SPARK-2242] HOTFIX: pyspark shell hangs on si...

2014-06-25 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/1178




[GitHub] spark pull request: [SPARK-2258 / 2266] Fix a few worker UI bugs

2014-06-25 Thread pwendell
Github user pwendell commented on a diff in the pull request:

https://github.com/apache/spark/pull/1203#discussion_r14201947
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala 
---
@@ -333,18 +327,20 @@ private[spark] class Worker(
   finishedDrivers(driverId) = driver
   memoryUsed -= driver.driverDesc.mem
   coresUsed -= driver.driverDesc.cores
-}
 
-case x: DisassociatedEvent if x.remoteAddress == masterAddress =>
-  logInfo(s"$x Disassociated !")
-  masterDisconnected()
+case d @ DisassociatedEvent(localAddress, remoteAddress, inbound) =>
+  if (remoteAddress == masterAddress) {
+    logInfo(s"$d Disassociated!")
+    masterDisconnected()
+  } else {
+    logWarning(s"Received unknown dissociation event: $d")
--- End diff --

I don't think this warning is a good idea. The worker can also become 
disassociated from the executor or driver actors; in those cases I'm not sure 
we want to log a warning.




[GitHub] spark pull request: [SPARK-2244] Fix hang introduced by SPARK-1466

2014-06-25 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/1197#issuecomment-47137043
  
The thing is pyspark is still broken even if we fix (a) but not (b). For 
example, if your driver cannot communicate with the master somehow, it normally 
prints the warning message "Cannot connect to master" or something. If Spark 
logging is masked, then running `sc.parallelize` in this case still hangs 
without any output. This is actually the case I personally ran into in the 
first place.

Since issues (a) and (b) are related and have a common simple fix, I think 
it makes sense to fix them both at once. I agree that (c) should be a new issue 
and is outside of the scope of this issue. For now, I just want to make sure 
pyspark is not broken on master.




[GitHub] spark pull request: [SPARK-2258 / 2266] Fix a few worker UI bugs

2014-06-25 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/1203#issuecomment-47137130
  
@andrewor14 do you mind submitting a version of this without the code 
formatting changes that I can easily merge and backport into branch-1.0? I 
think there are only four lines here that relate to fixing those bugs.




[GitHub] spark pull request: [SPARK-2258 / 2266] Fix a few worker UI bugs

2014-06-25 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/1203#discussion_r14202037
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala 
---
@@ -333,18 +327,20 @@ private[spark] class Worker(
   finishedDrivers(driverId) = driver
   memoryUsed -= driver.driverDesc.mem
   coresUsed -= driver.driverDesc.cores
-}
 
-case x: DisassociatedEvent if x.remoteAddress == masterAddress =>
-  logInfo(s"$x Disassociated !")
-  masterDisconnected()
+case d @ DisassociatedEvent(localAddress, remoteAddress, inbound) =>
+  if (remoteAddress == masterAddress) {
+    logInfo(s"$d Disassociated!")
+    masterDisconnected()
+  } else {
+    logWarning(s"Received unknown dissociation event: $d")
--- End diff --

Ah I see, I guess that's the reason why it wasn't there in the existing 
code.




[GitHub] spark pull request: [SPARK-1946] Submit tasks after (configured ra...

2014-06-25 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request:

https://github.com/apache/spark/pull/900#discussion_r14202320
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -46,9 +46,17 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
 {
   // Use an atomic variable to track total number of cores in the cluster 
for simplicity and speed
   var totalCoreCount = new AtomicInteger(0)
+  var totalExecutors = new AtomicInteger(0)
   val conf = scheduler.sc.conf
   private val timeout = AkkaUtils.askTimeout(conf)
   private val akkaFrameSize = AkkaUtils.maxFrameSizeBytes(conf)
+  // Submit tasks only after (registered executors / total executors) 
arrived the ratio.
--- End diff --

"arrived the ratio" -> "is equal to at least this value"
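
The condition that the suggested wording describes can be written as a small predicate. This is only a sketch of the check as worded in the comment, not the code in the pull request: `ready_to_submit`, `registered_executors`, `total_executors` and `min_ratio` are hypothetical names, and treating a non-positive total as "nothing to wait for" is an assumption.

```python
def ready_to_submit(registered_executors, total_executors, min_ratio):
    """Return True once registered / total is at least min_ratio."""
    if total_executors <= 0:
        # Assumption: with no expected executors there is nothing to wait for.
        return True
    # float() keeps the division exact-ish on Python 2 as well as Python 3.
    return registered_executors / float(total_executors) >= min_ratio
```

For example, `ready_to_submit(8, 10, 0.8)` holds because the ratio equals the threshold, while `ready_to_submit(7, 10, 0.8)` does not.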



