date:20150330

[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...

2015-03-30 Thread uncleGen

Github user uncleGen closed the pull request at:

https://github.com/apache/spark/pull/4135


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...

2015-03-30 Thread uncleGen

Github user uncleGen commented on the pull request:

https://github.com/apache/spark/pull/4135#issuecomment-87967820
  
@JoshRosen Your comments are reasonable, and I have improved the related code 
as you pointed out. For the test suite, I only check whether the states of 
`Receiver` and `ReceiverSupervisor` are correct. With the previous approach, 
killing the `task` could not stop `Receiver` and `ReceiverSupervisor` properly. 
I think that is enough. @tdas, what is your opinion?



[GitHub] spark pull request: [SPARK-6492][CORE] SparkContext.stop() can dea...

2015-03-30 Thread ilganeli

Github user ilganeli commented on the pull request:

https://github.com/apache/spark/pull/5277#issuecomment-87967758
  
Certainly - just trying to understand what it's complaining about. 



[GitHub] spark pull request: [SPARK-5203][SQL] fix union with different dec...

2015-03-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4004#issuecomment-87967719
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29451/
Test PASSed.



[GitHub] spark pull request: [SPARK-6492][CORE] SparkContext.stop() can dea...

2015-03-30 Thread zsxwing

Github user zsxwing commented on the pull request:

https://github.com/apache/spark/pull/5277#issuecomment-87967707
  
@ilganeli Could you add 
```
ProblemFilters.exclude[MissingMethodProblem]("org.apache.spark.SparkContext.org$apache$spark$SparkContext$$SPARK_CONTEXT_CONSTRUCTOR_LOCK")
```
to `MimaExcludes.scala`



[GitHub] spark pull request: [SPARK-5203][SQL] fix union with different dec...

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4004#issuecomment-87967713
  
  [Test build #29451 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29451/consoleFull) for PR 4004 at commit [`ba93753`](https://github.com/apache/spark/commit/ba93753a16d8168d3119a102f4b171a812635938).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.



[GitHub] spark pull request: [SPARK-6575] [SQL] Adds configuration to disab...

2015-03-30 Thread liancheng

Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/5231#issuecomment-87967599
  
cc @marmbrus



[GitHub] spark pull request: [SPARK-6623][SQL] Alias DataFrame.na.drop and ...

2015-03-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5284#issuecomment-87967586
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29450/
Test FAILed.



[GitHub] spark pull request: [SPARK-6492][CORE] SparkContext.stop() can dea...

2015-03-30 Thread ilganeli

Github user ilganeli commented on the pull request:

https://github.com/apache/spark/pull/5277#issuecomment-87967589
  
Any notion of what's going on with this MiMa failure? 



[GitHub] spark pull request: [SPARK-6623][SQL] Alias DataFrame.na.drop and ...

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5284#issuecomment-87967572
  
  [Test build #29450 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29450/consoleFull) for PR 5284 at commit [`19f46b7`](https://github.com/apache/spark/commit/19f46b77b790d22eae78985405b8878b975da74f).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch **adds the following new dependencies:**
   * `activation-1.1.jar`
   * `aopalliance-1.0.jar`
   * `avro-1.7.7.jar`
   * `breeze-macros_2.10-0.11.2.jar`
   * `breeze_2.10-0.11.2.jar`
   * `commons-cli-1.2.jar`
   * `commons-codec-1.10.jar`
   * `commons-compress-1.4.1.jar`
   * `commons-io-2.1.jar`
   * `commons-lang-2.5.jar`
   * `gmbal-api-only-3.0.0-b023.jar`
   * `grizzly-framework-2.1.2.jar`
   * `grizzly-http-2.1.2.jar`
   * `grizzly-http-server-2.1.2.jar`
   * `grizzly-http-servlet-2.1.2.jar`
   * `grizzly-rcm-2.1.2.jar`
   * `guice-3.0.jar`
   * `hadoop-annotations-2.2.0.jar`
   * `hadoop-auth-2.2.0.jar`
   * `hadoop-client-2.2.0.jar`
   * `hadoop-common-2.2.0.jar`
   * `hadoop-hdfs-2.2.0.jar`
   * `hadoop-mapreduce-client-app-2.2.0.jar`
   * `hadoop-mapreduce-client-common-2.2.0.jar`
   * `hadoop-mapreduce-client-core-2.2.0.jar`
   * `hadoop-mapreduce-client-jobclient-2.2.0.jar`
   * `hadoop-mapreduce-client-shuffle-2.2.0.jar`
   * `hadoop-yarn-api-2.2.0.jar`
   * `hadoop-yarn-client-2.2.0.jar`
   * `hadoop-yarn-common-2.2.0.jar`
   * `hadoop-yarn-server-common-2.2.0.jar`
   * `jackson-annotations-2.4.0.jar`
   * `jackson-core-2.4.4.jar`
   * `jackson-databind-2.4.4.jar`
   * `jackson-jaxrs-1.8.8.jar`
   * `jackson-module-scala_2.10-2.4.4.jar`
   * `jackson-xc-1.8.8.jar`
   * `javax.inject-1.jar`
   * `javax.servlet-3.1.jar`
   * `javax.servlet-api-3.0.1.jar`
   * `jaxb-api-2.2.2.jar`
   * `jaxb-impl-2.2.3-1.jar`
   * `jersey-client-1.9.jar`
   * `jersey-core-1.9.jar`
   * `jersey-grizzly2-1.9.jar`
   * `jersey-guice-1.9.jar`
   * `jersey-json-1.9.jar`
   * `jersey-server-1.9.jar`
   * `jersey-test-framework-core-1.9.jar`
   * `jersey-test-framework-grizzly2-1.9.jar`
   * `jettison-1.1.jar`
   * `jetty-util-6.1.26.jar`
   * `management-api-3.0.0-b012.jar`
   * `protobuf-java-2.4.1.jar`
   * `spark-bagel_2.10-1.4.0-SNAPSHOT.jar`
   * `spark-catalyst_2.10-1.4.0-SNAPSHOT.jar`
   * `spark-core_2.10-1.4.0-SNAPSHOT.jar`
   * `spark-graphx_2.10-1.4.0-SNAPSHOT.jar`
   * `spark-launcher_2.10-1.4.0-SNAPSHOT.jar`
   * `spark-mllib_2.10-1.4.0-SNAPSHOT.jar`
   * `spark-network-common_2.10-1.4.0-SNAPSHOT.jar`
   * `spark-network-shuffle_2.10-1.4.0-SNAPSHOT.jar`
   * `spark-repl_2.10-1.4.0-SNAPSHOT.jar`
   * `spark-sql_2.10-1.4.0-SNAPSHOT.jar`
   * `spark-streaming_2.10-1.4.0-SNAPSHOT.jar`
   * `stax-api-1.0.1.jar`
   * `xz-1.0.jar`

 * This patch **removes the following dependencies:**
   * `breeze-macros_2.10-0.3.1.jar`
   * `breeze_2.10-0.10.jar`
   * `commons-codec-1.5.jar`
   * `commons-el-1.0.jar`
   * `commons-io-2.4.jar`
   * `commons-lang-2.4.jar`
   * `hadoop-client-1.0.4.jar`
   * `hadoop-core-1.0.4.jar`
   * `hsqldb-1.8.0.10.jar`
   * `jackson-annotations-2.3.0.jar`
   * `jackson-core-2.3.0.jar`
   * `jackson-databind-2.3.0.jar`
   * `jblas-1.2.3.jar`
   * `spark-bagel_2.10-1.3.0-SNAPSHOT.jar`
   * `spark-catalyst_2.10-1.3.0-SNAPSHOT.jar`
   * `spark-core_2.10-1.3.0-SNAPSHOT.jar`
   * `spark-graphx_2.10-1.3.0-SNAPSHOT.jar`
   * `spark-mllib_2.10-1.3.0-SNAPSHOT.jar`
   * `spark-network-common_2.10-1.3.0-SNAPSHOT.jar`
   * `spark-network-shuffle_2.10-1.3.0-SNAPSHOT.jar`
   * `spark-repl_2.10-1.3.0-SNAPSHOT.jar`
   * `spark-sql_2.10-1.3.0-SNAPSHOT.jar`
   * `spark-streaming_2.10-1.3.0-SNAPSHOT.jar`




[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...

2015-03-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4135#issuecomment-87967183
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29461/
Test FAILed.



[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4135#issuecomment-87967179
  
  [Test build #29461 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29461/consoleFull) for PR 4135 at commit [`e6724ec`](https://github.com/apache/spark/commit/e6724ecac594e818973a6d813c4d012fca1bd06d).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `trait TaskInterruptionListener extends EventListener `
  * `class TaskInterruptionListenerException(errorMessages: Seq[String]) extends Exception `
  * `class UDFRegistration(object):`

 * This patch does not change any dependencies.



[GitHub] spark pull request: [SPARK-6623][SQL] Alias DataFrame.na.drop and ...

2015-03-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5284#issuecomment-87966588
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29449/
Test PASSed.



[GitHub] spark pull request: [SPARK-6623][SQL] Alias DataFrame.na.drop and ...

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5284#issuecomment-87966582
  
  [Test build #29449 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29449/consoleFull) for PR 5284 at commit [`6618118`](https://github.com/apache/spark/commit/6618118312e8ec645a9cbe67fda4c60ede43218c).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class DataFrameNaFunctions(object):`

 * This patch does not change any dependencies.



[GitHub] spark pull request: [SPARK-6492][CORE] SparkContext.stop() can dea...

2015-03-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5277#issuecomment-87966531
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29460/
Test FAILed.



[GitHub] spark pull request: [SPARK-6492][CORE] SparkContext.stop() can dea...

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5277#issuecomment-87966526
  
  [Test build #29460 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29460/consoleFull) for PR 5277 at commit [`a0e2c70`](https://github.com/apache/spark/commit/a0e2c70b5039b0f3e90b627697406bb99d84c30a).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.



[GitHub] spark pull request: [SPARK-6255] [MLLIB] Support multiclass classi...

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5137#issuecomment-87964591
  
  [Test build #29448 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29448/consoleFull) for PR 5137 at commit [`0bd531e`](https://github.com/apache/spark/commit/0bd531eb0b52686cc381560e8171b2f8eb6e4216).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class LinearClassificationModel(LinearModel):`
  * `class LogisticRegressionModel(LinearClassificationModel):`
  * `class SVMModel(LinearClassificationModel):`

 * This patch does not change any dependencies.



[GitHub] spark pull request: [SPARK-6255] [MLLIB] Support multiclass classi...

2015-03-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5137#issuecomment-87964647
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29448/
Test PASSed.



[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4135#issuecomment-87964217
  
  [Test build #29461 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29461/consoleFull) for PR 4135 at commit [`e6724ec`](https://github.com/apache/spark/commit/e6724ecac594e818973a6d813c4d012fca1bd06d).



[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...

2015-03-30 Thread uncleGen

GitHub user uncleGen reopened a pull request:

https://github.com/apache/spark/pull/4135

[SPARK-5205][Streaming]:Inconsistent behaviour between Streaming job and 
others, when click kill link in WebUI

The "kill" link is used to kill a stage in a job. It works for every kind of 
Spark job except Spark Streaming. To be specific, we can only kill the stage 
that is used to run the "Receiver", but not the "Receivers" themselves. The 
stage can be killed and cleaned from the UI, but the receivers are still alive 
and receiving data. I think this does not fit common sense. IMHO, killing the 
"receiver" stage should mean killing the "receivers" and stopping receiving data.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/uncleGen/spark master-clean-150121

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/4135.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4135


commit c90a288ca6fde01008ff3ab5d04970c8f120c4b1
Author: uncleGen 
Date:   2015-01-21T08:58:10Z

BUG FIX: Inconsistent behaviour between Streaming job and others, when 
click kill link in WebUI

commit fc3be2c0d3cd3e8f2f15a4a4f0fa504b28164d38
Author: uncleGen 
Date:   2015-01-21T09:16:12Z

fix

commit d85dede41dce4a32748037871241da1620ada564
Author: uncleGen 
Date:   2015-01-21T09:23:59Z

style fix

commit f3162618d3eb1de3e6690e1403bf1943145d316a
Author: uncleGen 
Date:   2015-02-03T02:36:52Z

resolve merge conflicts

commit 168115cb364285867bc9e7af400d48922712c71c
Author: uncleGen 
Date:   2015-03-02T03:46:42Z

resolve merge conflicts

commit 047931c531b7d307de201ba74ed5dcb7da7c6559
Author: uncleGen 
Date:   2015-03-02T05:24:18Z

resolve merge conflicts

commit d8b57dfff760ce7a3b2413c3dcada3e1572fb7e8
Author: uncleGen 
Date:   2015-03-09T03:28:37Z

Merge branch 'master-clean' into master-clean-150121

commit 4417ff06aeeea63b1a2f4e8a12d736a050abf057
Author: uncleGen 
Date:   2015-03-09T08:19:02Z

update

commit 963556d4b64c81e56913a2860f596c6b25b1286d
Author: uncleGen 
Date:   2015-03-10T03:14:15Z

add unit test

commit f997698a0233d351941fd7ddd9577a659806281e
Author: uncleGen 
Date:   2015-03-10T03:42:03Z

 fix unit test

commit 705118453ab6f9869ac090ca2bac009f167869cd
Author: uncleGen 
Date:   2015-03-10T08:57:21Z

minor fix

commit 92fb864e384e298eead53f10844365ea1887a929
Author: uncleGen 
Date:   2015-03-12T11:41:33Z

roll back

commit 98166e7e8b43d1a912b2b6a468eb6d8f1f0297f2
Author: uncleGen 
Date:   2015-03-12T11:44:41Z

minor fix

commit baa175898df8d36b36ec9b9494ebbd117acfae87
Author: uncleGen 
Date:   2015-03-12T11:47:23Z

minor fix

commit fe3e5d52bf8680a021b00dd598cdc5a3f1c7df7e
Author: uncleGen 
Date:   2015-03-12T11:51:11Z

minor fix

commit fb9716d3feee4523a3ee4cddd0a9c0926e228099
Author: uncleGen 
Date:   2015-03-12T11:52:33Z

minor fix

commit 02bf9a936f798b7eb40dcd1183ea14b5c2125deb
Author: uncleGen 
Date:   2015-03-12T12:17:46Z

minor fix

commit 2bac6564a0a1153ec1366bfa4da085a6e89d07d5
Author: uncleGen 
Date:   2015-03-12T13:49:21Z

minor fix

commit 2544f8e5ad0e205e4043bd47646f57d5ed1f42a8
Author: uncleGen 
Date:   2015-03-17T13:52:36Z

roll back to original approach

commit b5237d86ddfcfd0b1b9b5bcd6952902ee1d843b6
Author: uncleGen 
Date:   2015-03-17T14:03:10Z

resolve merge conflict

commit 96f21820250bec9c7286489baa0f7dbd96758df6
Author: uncleGen 
Date:   2015-03-18T02:38:58Z

resolve scala style error

commit b2916fb7cdb6047eb618c2da767d18eb5a4e0caf
Author: uncleGen 
Date:   2015-03-18T09:20:55Z

minor fix

commit 77df65a1d96bb98ee7617590825d63c8ee6a26f9
Author: uncleGen 
Date:   2015-03-31T06:30:53Z

Merge branch 'master-clean-tmp' into master-clean-150121

Conflicts:

streaming/src/main/scala/org/apache/spark/streaming/receiver/ReceiverSupervisor.scala
streaming/src/test/scala/org/apache/spark/streaming/ReceiverSuite.scala

commit e6724ecac594e818973a6d813c4d012fca1bd06d
Author: uncleGen 
Date:   2015-03-31T06:40:01Z

minor fix





[GitHub] spark pull request: [SPARK-6492][CORE] SparkContext.stop() can dea...

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5277#issuecomment-87962731
  
  [Test build #29460 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29460/consoleFull) for PR 5277 at commit [`a0e2c70`](https://github.com/apache/spark/commit/a0e2c70b5039b0f3e90b627697406bb99d84c30a).



[GitHub] spark pull request: [SPARK-6492][CORE] SparkContext.stop() can dea...

2015-03-30 Thread zsxwing

Github user zsxwing commented on the pull request:

https://github.com/apache/spark/pull/5277#issuecomment-87962796
  
LGTM



[GitHub] spark pull request: [SPARK-6623][SQL] Alias DataFrame.na.drop and ...

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5284#issuecomment-87962919
  
  [Test build #29459 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29459/consoleFull) for PR 5284 at commit [`19f46b7`](https://github.com/apache/spark/commit/19f46b77b790d22eae78985405b8878b975da74f).



[GitHub] spark pull request: [SPARK-6623][SQL] Alias DataFrame.na.drop and ...

2015-03-30 Thread rxin

Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/5284#issuecomment-87961515
  
Jenkins, retest this please.



[GitHub] spark pull request: [SPARK-3533][Core][PySpark] Add saveAsTextFile...

2015-03-30 Thread ilganeli

Github user ilganeli commented on the pull request:

https://github.com/apache/spark/pull/4895#issuecomment-87961484
  
Anyone? I just hear crickets :)



[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...

2015-03-30 Thread uncleGen

Github user uncleGen closed the pull request at:

https://github.com/apache/spark/pull/4135



[GitHub] spark pull request: [SPARK-6492][CORE] SparkContext.stop() can dea...

2015-03-30 Thread ilganeli

Github user ilganeli commented on the pull request:

https://github.com/apache/spark/pull/5277#issuecomment-87958788
  
@zsxwing My reasoning for having both is not to allow other methods of sc 
to be called but to address the scenario where the context doesn't shut down 
completely during the shutdown step. `stopping` provides the 
synchronization and `stopped` is only set once all is complete. 
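
A minimal sketch of the two-flag idea described above, assuming a standalone class rather than the actual `SparkContext` code. The class name `TwoPhaseStop`, the `cleanup` parameter, and the `isStopped` accessor are hypothetical and exist only for illustration; the patch itself may use a lock or a different structure.

```scala
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical stand-in for a context with a two-phase shutdown.
// This is NOT the code from the pull request, only an illustration of the idea.
class TwoPhaseStop {
  // `stopping` provides the synchronization: only one caller can flip it.
  private val stopping = new AtomicBoolean(false)
  // `stopped` is set only after the cleanup has run to completion.
  @volatile private var stopped = false

  def isStopped: Boolean = stopped

  /** Returns true if this call ran the shutdown, false if another call had already claimed it. */
  def stop(cleanup: () => Unit): Boolean = {
    if (!stopping.compareAndSet(false, true)) {
      false // a shutdown is already in progress or finished
    } else {
      cleanup()      // if this throws, `stopped` stays false
      stopped = true // reached only when the cleanup completed
      true
    }
  }
}
```

With this shape, a second call to `stop` returns immediately instead of re-entering the shutdown path, and `isStopped` reports true only once the cleanup has finished.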



[GitHub] spark pull request: [SPARK-6627] Some clean-up in shuffle code.

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5286#issuecomment-87957319
  
  [Test build #29458 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29458/consoleFull) for PR 5286 at commit [`d1c0494`](https://github.com/apache/spark/commit/d1c049421511aacf7cfbd29936fbc200b828f028).



[GitHub] spark pull request: SPARK-4550. In sort-based shuffle, store map o...

2015-03-30 Thread pwendell

Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/4450#issuecomment-87956979
  
@sryza don't bother with my comments yet, still just taking a tour through 
this part of the code.



[GitHub] spark pull request: [SPARK-4233] [SQL] WIP:Simplify the UDAF API (...

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3247#issuecomment-87956702
  
  [Test build #29457 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29457/consoleFull)
 for   PR 3247 at commit 
[`13f4f15`](https://github.com/apache/spark/commit/13f4f15b59b4bca39409828576947f960704c18e).



[GitHub] spark pull request: [HOTFIX] Some clean-up in shuffle code.

2015-03-30 Thread rxin

Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/5286#issuecomment-87956681
  
At a high level LGTM



[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-03-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-87956612
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29444/
Test FAILed.



[GitHub] spark pull request: [HOTFIX] Some clean-up in shuffle code.

2015-03-30 Thread pwendell

Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/5286#issuecomment-87956528
  
We have to put that whenever we don't create a JIRA or else the scripts we 
use get messed up. I can just make a JIRA for the overall clean-up.



[GitHub] spark pull request: [HOTFIX] Some clean-up in shuffle code.

2015-03-30 Thread rxin

Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5286#discussion_r27455173
  
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ---
@@ -439,14 +439,10 @@ private[spark] class BlockManager(
 // As an optimization for map output fetches, if the block is for a shuffle, return it
 // without acquiring a lock; the disk store never deletes (recent) items so this should work
 if (blockId.isShuffle) {
-  val shuffleBlockManager = shuffleManager.shuffleBlockManager
-  shuffleBlockManager.getBytes(blockId.asInstanceOf[ShuffleBlockId]) match {
-    case Some(bytes) =>
-      Some(bytes)
-    case None =>
-      throw new BlockException(
-        blockId, s"Block $blockId not found on disk, though it should be")
-  }
+  val shuffleBlockManager = shuffleManager.shuffleBlockResolver
+  // TODO: This should gracefully handle case where local block is not available. Currently
+  // downstream code will throw an exception.
+  Some(shuffleBlockManager.getBlockData(blockId.asInstanceOf[ShuffleBlockId]).nioByteBuffer())
--- End diff --

can you change Some -> Option, just in case nioByteBuffer returns null?
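
The difference behind this suggestion can be sketched in isolation. The snippet below is only an illustration: `lookup` is a hypothetical stand-in for a call that may return `null`, not Spark's `getBlockData(...).nioByteBuffer()`. It shows that `Some(x)` wraps a `null` as-is, while `Option(x)` turns it into `None`.

```scala
import java.nio.ByteBuffer

object OptionVsSome {
  // Hypothetical lookup that may return null (an assumption for illustration only).
  def lookup(found: Boolean): ByteBuffer =
    if (found) ByteBuffer.wrap(Array[Byte](1, 2, 3)) else null

  // Some(...) wraps whatever it is given, including null.
  def wrapWithSome(found: Boolean): Option[ByteBuffer] = Some(lookup(found))

  // Option(...) maps a null argument to None.
  def wrapWithOption(found: Boolean): Option[ByteBuffer] = Option(lookup(found))
}
```

With `Some`, a caller that pattern-matches on `Some(bytes)` would receive a null buffer; with `Option`, the same caller falls through to the `None` case.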



[GitHub] spark pull request: [SPARK-6492][CORE] SparkContext.stop() can dea...

2015-03-30 Thread JoshRosen

Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/5277#issuecomment-87955997
  
Woah, very weird MiMa error:

```
[error]  * synthetic method org$apache$spark$SparkContext$$SPARK_CONTEXT_CONSTRUCTOR_LOCK()java.lang.Object in object org.apache.spark.SparkContext does not have a correspondent in new version
[error]filter with: ProblemFilters.exclude[MissingMethodProblem]("org.apache.spark.SparkContext.org$apache$spark$SparkContext$$SPARK_CONTEXT_CONSTRUCTOR_LOCK")
```



[GitHub] spark pull request: [SPARK-6492][CORE] SparkContext.stop() can dea...

2015-03-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5277#issuecomment-87955365
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29455/
Test FAILed.



[GitHub] spark pull request: [HOTFIX] Some clean-up in shuffle code.

2015-03-30 Thread rxin

Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/5286#issuecomment-87955385
  
Why is this a HOTFIX?



[GitHub] spark pull request: [SPARK-6492][CORE] SparkContext.stop() can dea...

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5277#issuecomment-87955357
  
  [Test build #29455 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29455/consoleFull)
 for   PR 5277 at commit 
[`76fc825`](https://github.com/apache/spark/commit/76fc8256a8799ab9fac5beb491627cc54fbc1f24).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.



[GitHub] spark pull request: [SPARK-5124][Core] Move StopCoordinator to the...

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5283#issuecomment-87954800
  
  [Test build #29446 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29446/consoleFull)
 for   PR 5283 at commit 
[`cf3e5a7`](https://github.com/apache/spark/commit/cf3e5a7766131de8d272664dbcdd065f821d4428).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.



[GitHub] spark pull request: [SPARK-5124][Core] Move StopCoordinator to the...

2015-03-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5283#issuecomment-87954806
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29446/
Test FAILed.



[GitHub] spark pull request: [HOTFIX] Some clean-up in shuffle code.

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5286#issuecomment-87954690
  
  [Test build #29456 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29456/consoleFull)
 for   PR 5286 at commit 
[`a406079`](https://github.com/apache/spark/commit/a4060797a83c591be288e8fca48affecb17412c6).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.



[GitHub] spark pull request: [HOTFIX] Some clean-up in shuffle code.

2015-03-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5286#issuecomment-87954699
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29456/
Test FAILed.



[GitHub] spark pull request: [HOTFIX] Some clean-up in shuffle code.

2015-03-30 Thread pwendell

Github user pwendell commented on a diff in the pull request:

https://github.com/apache/spark/pull/5286#discussion_r27454970
  
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ---
@@ -439,14 +439,10 @@ private[spark] class BlockManager(
 // As an optimization for map output fetches, if the block is for a shuffle, return it
 // without acquiring a lock; the disk store never deletes (recent) items so this should work
 if (blockId.isShuffle) {
-  val shuffleBlockManager = shuffleManager.shuffleBlockManager
-  shuffleBlockManager.getBytes(blockId.asInstanceOf[ShuffleBlockId]) match {
-    case Some(bytes) =>
-      Some(bytes)
-    case None =>
--- End diff --

This code path, I believe, could never be realized before, because all 
implementations returned `Some`.



[GitHub] spark pull request: [HOTFIX] Some clean-up in shuffle code.

2015-03-30 Thread pwendell

Github user pwendell commented on a diff in the pull request:

https://github.com/apache/spark/pull/5286#discussion_r27454958
  
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ---
@@ -439,14 +439,10 @@ private[spark] class BlockManager(
 // As an optimization for map output fetches, if the block is for a shuffle, return it
 // without acquiring a lock; the disk store never deletes (recent) items so this should work
 if (blockId.isShuffle) {
-  val shuffleBlockManager = shuffleManager.shuffleBlockManager
-  shuffleBlockManager.getBytes(blockId.asInstanceOf[ShuffleBlockId]) match {
-    case Some(bytes) =>
-      Some(bytes)
-    case None =>
-      throw new BlockException(
-        blockId, s"Block $blockId not found on disk, though it should be")
-  }
+  val shuffleBlockManager = shuffleManager.shuffleBlockResolver
+  // TODO: This should gracefully handle case where local block is not available. Currently
+  // downstream code will throw an exception.
+  Some(shuffleBlockManager.getBlockData(blockId.asInstanceOf[ShuffleBlockId]).nioByteBuffer())
--- End diff --

This behavior is kind of lame (just wrapping this in a Some), but it 
preserves what was there before.



[GitHub] spark pull request: [HOTFIX] Some clean-up in shuffle code.

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5286#issuecomment-87954350
  
  [Test build #29456 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29456/consoleFull)
 for   PR 5286 at commit 
[`a406079`](https://github.com/apache/spark/commit/a4060797a83c591be288e8fca48affecb17412c6).



[GitHub] spark pull request: [SPARK-6492][CORE] SparkContext.stop() can dea...

2015-03-30 Thread zsxwing

Github user zsxwing commented on the pull request:

https://github.com/apache/spark/pull/5277#issuecomment-87954451
  
Do we really need both `stopping` and `stopped`? From the usage of 
`assertNotStopped()`, I think if `sc.stop()` is called, calling other methods 
of `sc` should be rejected.



[GitHub] spark pull request: [HOTFIX] Some clean-up in shuffle code.

2015-03-30 Thread pwendell

Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/5286#issuecomment-87954124
  
/cc @aarondav and @rxin, with whom I discussed some of the existing design.



[GitHub] spark pull request: [HOTFIX] Some clean-up in shuffle code.

2015-03-30 Thread pwendell

GitHub user pwendell opened a pull request:

https://github.com/apache/spark/pull/5286

[HOTFIX] Some clean-up in shuffle code.

Before diving into review #4450 I did a look through the existing shuffle
code to learn how it works. Unfortunately, there are some very 
confusing things in this code. This patch makes a few small changes
to simplify things. It is not easy to describe the changes concisely
because of how convoluted the issues were, but they are fairly small
logically:

1. There is a trait named `ShuffleBlockManager` that only deals with
   one logical function which is retrieving shuffle block data given shuffle
   block coordinates. This trait has two implementors, FileShuffleBlockManager
   and IndexShuffleBlockManager. Confusingly the vast majority of those
   implementations have nothing to do with this particular functionality.
   So I've renamed the trait to ShuffleBlockResolver and documented it.
2. The aforementioned trait had two almost identical methods, for no good
   reason. I removed one method (getBytes) and modified callers to use the
   other one. I think the behavior is preserved in all cases.
3. The sort shuffle code uses an identifier "0" in the reduce slot of a
   BlockID as a placeholder. I made it into a constant since it needs to
   be consistent across multiple places.

I think for (3) there is actually a better solution that would avoid the
need to do this type of workaround/hack in the first place, but it's more
complex so I'm punting it for now.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pwendell/spark cleanup

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5286.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5286


commit a4060797a83c591be288e8fca48affecb17412c6
Author: Patrick Wendell 
Date:   2015-03-31T01:12:43Z

[HOTFIX] Some clean-up in shuffle code.

Before diving into review #4450 I did a look through the existing shuffle
code. Unfortunately, there are some very confusing things in this code.
This patch makes a few small changes to simplify things. It is not easy
to describe the changes concisely because of how convoluted the issues were:

1. There was a trait named ShuffleBlockManager that only deals with
   one logical function which is retrieving shuffle block data given shuffle
   block coordinates. This trait has two implementors, FileShuffleBlockManager
   and IndexShuffleBlockManager. Confusingly the vast majority of those
   implementations have nothing to do with this particular functionality.
   So I've renamed the trait to ShuffleBlockResolver and documented it.
2. The aforementioned trait had two almost identical methods, for no good
   reason. I removed one method (getBytes) and modified callers to use the
   other one. I think the behavior is preserved in all cases.
3. The sort shuffle code uses an identifier "0" in the reduce slot of a
   BlockID as a placeholder. I made it into a constant since it needs to
   be consistent across multiple places.

I think for (3) there is actually a better solution that would avoid the
need to do this type of workaround/hack in the first place, but it's more
complex so I'm punting it for now.





[GitHub] spark pull request: [SPARK-6492][CORE] SparkContext.stop() can dea...

2015-03-30 Thread zsxwing

Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/5277#discussion_r27454902
  
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -1394,30 +1397,40 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
 
   /** Shut down the SparkContext. */
   def stop() {
-    SparkContext.SPARK_CONTEXT_CONSTRUCTOR_LOCK.synchronized {
-      if (!stopped) {
-        stopped = true
-        postApplicationEnd()
-        ui.foreach(_.stop())
-        env.metricsSystem.report()
-        metadataCleaner.cancel()
-        cleaner.foreach(_.stop())
-        dagScheduler.stop()
-        dagScheduler = null
-        listenerBus.stop()
-        eventLogger.foreach(_.stop())
-        env.actorSystem.stop(heartbeatReceiver)
-        progressBar.foreach(_.stop())
-        taskScheduler = null
-        // TODO: Cache.stop()?
-        env.stop()
-        SparkEnv.set(null)
-        logInfo("Successfully stopped SparkContext")
-        SparkContext.clearActiveContext()
+    // Use the stopping variable to ensure no contention for the stop scenario.
+    // Still track the stopped variable for use elsewhere in the code.
+    if (!stopped.get()) {
+      if(!stopping.get()) {
--- End diff --

Here should be `stopping.compareAndSet(false, true)`
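
To illustrate why `compareAndSet` matters here: a plain `get()` followed by a later `set(true)` is a check-then-act sequence, so two threads can both observe `false` and both run the shutdown body. The sketch below is a minimal, self-contained example and not the code from the pull request; `StopOnce` and `shutdownRuns` are hypothetical names used only to count how often the guarded body runs.

```scala
import java.util.concurrent.atomic.{AtomicBoolean, AtomicInteger}

class StopOnce {
  private val stopping = new AtomicBoolean(false)

  // Counts executions of the guarded body (illustration only).
  val shutdownRuns = new AtomicInteger(0)

  def stop(): Unit = {
    // compareAndSet flips false -> true atomically and returns true
    // for exactly one caller; every other caller sees false and skips.
    if (stopping.compareAndSet(false, true)) {
      shutdownRuns.incrementAndGet() // stand-in for the real shutdown work
    }
  }
}
```

Calling `stop()` concurrently from several threads on one `StopOnce` instance should leave `shutdownRuns` at 1.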



[GitHub] spark pull request: [SPARK-6492][CORE] SparkContext.stop() can dea...

2015-03-30 Thread zsxwing

Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/5277#discussion_r27454877
  
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -95,10 +97,11 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
 
   val startTime = System.currentTimeMillis()
 
-  @volatile private var stopped: Boolean = false
+  @volatile private var stopped: AtomicBoolean = new AtomicBoolean(false)
--- End diff --

You can remove `@volatile` and use `val` for AtomicBoolean
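
A short sketch of the suggested form, with assumed names (`ContextSketch` is hypothetical, not the `SparkContext` class): the reference to the `AtomicBoolean` is never reassigned, so a `val` suffices, and `get`/`set` on `AtomicBoolean` already have volatile read/write semantics.

```scala
import java.util.concurrent.atomic.AtomicBoolean

class ContextSketch {
  // `val`: the reference is fixed; only the boolean inside it changes.
  // No @volatile is needed because AtomicBoolean handles visibility itself.
  private val stopped = new AtomicBoolean(false)

  def markStopped(): Unit = stopped.set(true)
  def isStopped: Boolean = stopped.get()
}
```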



[GitHub] spark pull request: [SPARK-6492][CORE] SparkContext.stop() can dea...

2015-03-30 Thread JoshRosen

Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/5277#discussion_r27454712
  
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -1791,6 +1804,11 @@ object SparkContext extends Logging {
   private val SPARK_CONTEXT_CONSTRUCTOR_LOCK = new Object()
 
   /**
+   * Lock to guard against deadlock when shutting down SparkContext.
+   */
+  private val shutdownLock = new ReentrantLock()
--- End diff --

Looks like this is now unused.



[GitHub] spark pull request: [SPARK-6618][SQL] HiveMetastoreCatalog.lookupR...

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5281#issuecomment-87949580
  
  [Test build #29454 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29454/consoleFull)
 for   PR 5281 at commit 
[`591b4be`](https://github.com/apache/spark/commit/591b4bea531c575809764197d1925e634f424b90).



[GitHub] spark pull request: [SPARK-6492][CORE] SparkContext.stop() can dea...

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5277#issuecomment-87949577
  
  [Test build #29455 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29455/consoleFull)
 for   PR 5277 at commit 
[`76fc825`](https://github.com/apache/spark/commit/76fc8256a8799ab9fac5beb491627cc54fbc1f24).



[GitHub] spark pull request: [SPARK-6618][SQL] HiveMetastoreCatalog.lookupR...

2015-03-30 Thread yhuai

Github user yhuai commented on the pull request:

https://github.com/apache/spark/pull/5281#issuecomment-87948825
  
test this please



[GitHub] spark pull request: [SPARK-6614] OutputCommitCoordinator should cl...

2015-03-30 Thread sryza

Github user sryza commented on the pull request:

https://github.com/apache/spark/pull/5276#issuecomment-87948212
  
LGTM as well



[GitHub] spark pull request: [SPARK-6614] OutputCommitCoordinator should cl...

2015-03-30 Thread sryza

Github user sryza commented on a diff in the pull request:

https://github.com/apache/spark/pull/5276#discussion_r27454573
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/OutputCommitCoordinator.scala ---
@@ -113,9 +113,11 @@ private[spark] class OutputCommitCoordinator(conf: SparkConf) extends Logging {
     logInfo(
       s"Task was denied committing, stage: $stage, partition: $partition, attempt: $attempt")
   case otherReason =>
-    logDebug(s"Authorized committer $attempt (stage=$stage, partition=$partition) failed;" +
-      s" clearing lock")
-    authorizedCommitters.remove(partition)
+    if (authorizedCommitters.get(partition).exists(_ == attempt)) {
--- End diff --

This is being pedantic, but `authorizedCommitters.exists(_ == (partition, 
attempt))` would be a little more concise.
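
The two forms can be compared on a plain map. This is a sketch under the assumption that `authorizedCommitters` maps a partition id (`Int`) to an attempt id (`Long`); `CommitterCheck` is a hypothetical helper, and the tuple is written with doubled parentheses so it is passed explicitly rather than relying on argument adaptation.

```scala
object CommitterCheck {
  // Form used in the diff: look up the partition, then compare the attempt.
  def viaGet(m: Map[Int, Long], partition: Int, attempt: Long): Boolean =
    m.get(partition).exists(_ == attempt)

  // Suggested form: test whether the (partition, attempt) entry is present.
  def viaExists(m: Map[Int, Long], partition: Int, attempt: Long): Boolean =
    m.exists(_ == ((partition, attempt)))
}
```

Both return the same answer. Note that `exists` on a map scans its entries, whereas `get` is a keyed lookup, so the shorter form trades a little efficiency for brevity.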



[GitHub] spark pull request: [SPARK-6625][SQL] Add common string filters to...

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5285#issuecomment-87948181
  
  [Test build #29453 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29453/consoleFull)
 for   PR 5285 at commit 
[`f021727`](https://github.com/apache/spark/commit/f021727507b1ef3a76da91c70cfd9d26009e5ba7).



[GitHub] spark pull request: [SPARK-6625][SQL] Add common string filters to...

2015-03-30 Thread marmbrus

Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/5285#discussion_r27454390
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala ---
@@ -149,9 +149,12 @@ trait PrunedScan {
 
 /**
  * ::DeveloperApi::
- * A BaseRelation that can eliminate unneeded columns and filter using selected
+ * A BaseRelation that can eliminate unneeded columns and filters using selected
--- End diff --

I think this was actually correct before.



[GitHub] spark pull request: [SPARK-6492][CORE] SparkContext.stop() can dea...

2015-03-30 Thread ilganeli

Github user ilganeli commented on the pull request:

https://github.com/apache/spark/pull/5277#issuecomment-87946825
  
OK, so it looks like we have a simple fix for the time being. In the long 
term, so that we aren't assigning to ```stopped``` before the entire process 
is complete, I'd prefer a slightly different solution. I would add a second 
AtomicBoolean, ```stopping```, and make the check ```if(!stopping && 
!stopped)```. `stopping` would be set to true at the beginning of the process 
and `stopped` would be set to true at the end. This would guard against 
incomplete stops.
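
The two-flag idea can be sketched roughly as follows. This is only an illustration under assumptions: `TwoPhaseStopper` and its `cleanup` parameter are hypothetical names, and this is not the actual `SparkContext` code.

```scala
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical class illustrating the stopping/stopped pair.
class TwoPhaseStopper(cleanup: () => Unit) {
  private val stopping = new AtomicBoolean(false)
  private val stopped = new AtomicBoolean(false)

  def isStopped: Boolean = stopped.get()

  /** Returns true if this call performed the shutdown. */
  def stop(): Boolean = {
    // compareAndSet folds the "!stopping" check and the assignment into one
    // atomic step, so at most one caller enters the shutdown body.
    if (!stopped.get() && stopping.compareAndSet(false, true)) {
      cleanup()
      // Only mark as stopped once every cleanup step has finished.
      stopped.set(true)
      true
    } else {
      false
    }
  }
}
```

In this sketch, if `cleanup` throws, `stopped` stays false while `stopping` stays true, which makes an incomplete stop observable rather than hiding it.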



[GitHub] spark pull request: [SPARK-6625][SQL] Add common string filters to...

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5285#issuecomment-87945974
  
  [Test build #29452 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29452/consoleFull)
 for   PR 5285 at commit 
[`7695a52`](https://github.com/apache/spark/commit/7695a5258667174922db467e93e6a155d283cbfa).



[GitHub] spark pull request: [SPARK-6492][CORE] SparkContext.stop() can dea...

2015-03-30 Thread JoshRosen

Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/5277#issuecomment-87945219
  
> `stopped` also need to be AtomicBoolean to make the reading and writing 
stopped atomic.

Ah, right, since we're doing a compare-and-swap here (I got mixed up 
thinking of a different patch; sorry).



[GitHub] spark pull request: [SPARK-6562][SQL] DataFrame.replace

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5282#issuecomment-87944734
  
  [Test build #29445 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29445/consoleFull)
 for   PR 5282 at commit 
[`06f2c63`](https://github.com/apache/spark/commit/06f2c63a4a0b0343e201461ec0ac47eff38c9136).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class AtLeastNNonNulls(n: Int, children: Seq[Expression]) extends 
Predicate `

 * This patch does not change any dependencies.



[GitHub] spark pull request: [SPARK-6562][SQL] DataFrame.replace

2015-03-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5282#issuecomment-87944746
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29445/
Test PASSed.



[GitHub] spark pull request: [SPARK-6625][SQL] Add common string filters to...

2015-03-30 Thread rxin

GitHub user rxin opened a pull request:

https://github.com/apache/spark/pull/5285

[SPARK-6625][SQL] Add common string filters to data sources.

Filters such as startsWith, endsWith, contains will be very useful for 
search-like data sources such as Succinct, Elastic Search, Solr, etc.
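
As a rough illustration of how a data source might evaluate such filters, here is a minimal sketch. The type names `StartsWith`, `EndsWith`, and `Contains`, and the modelling of a row as a `Map[String, String]`, are assumptions made for this example and are not necessarily the names or shapes used in the patch.

```scala
object StringFilters {
  // Hypothetical filter types; the names are illustrative only.
  sealed trait StringFilter { def attribute: String; def value: String }
  case class StartsWith(attribute: String, value: String) extends StringFilter
  case class EndsWith(attribute: String, value: String) extends StringFilter
  case class Contains(attribute: String, value: String) extends StringFilter

  /** Evaluates a filter against one row, modelled as column name -> string value. */
  def matches(filter: StringFilter, row: Map[String, String]): Boolean =
    row.get(filter.attribute).exists { s =>
      filter match {
        case StartsWith(_, v) => s.startsWith(v)
        case EndsWith(_, v)   => s.endsWith(v)
        case Contains(_, v)   => s.contains(v)
      }
    }
}
```

A search-oriented source would normally translate each filter into its own native query instead of scanning rows like this; the sketch only shows the semantics of each filter.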


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rxin/spark ds-string-filters

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5285.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5285


commit 7695a5258667174922db467e93e6a155d283cbfa
Author: Reynold Xin 
Date:   2015-03-31T05:36:02Z

[SPARK-6625][SQL] Add common string filters to data sources.





[GitHub] spark pull request: [SPARK-5203][SQL] fix union with different dec...

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4004#issuecomment-87943467
  
  [Test build #29451 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29451/consoleFull)
 for   PR 4004 at commit 
[`ba93753`](https://github.com/apache/spark/commit/ba93753a16d8168d3119a102f4b171a812635938).



[GitHub] spark pull request: [SPARK-6492][CORE] SparkContext.stop() can dea...

2015-03-30 Thread zsxwing

Github user zsxwing commented on the pull request:

https://github.com/apache/spark/pull/5277#issuecomment-87943393
  
> Maybe we can resolve this deadlock by removing this synchronized call?

Removing `synchronized` is not enough. `stopped` also needs to be an 
AtomicBoolean to make reading and writing `stopped` atomic.
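
To make the point concrete: with a `@volatile` flag, `if (!stopped) { stopped = true; ... }` is a separate read and write, so two threads can both pass the check. A minimal sketch of the atomic alternative follows; `StopOnce` and `winners` are hypothetical names used only for this illustration.

```scala
import java.util.concurrent.atomic.{AtomicBoolean, AtomicInteger}

object StopOnce {
  /** Starts `threads` threads that all try to stop; returns how many ran the stop body. */
  def winners(threads: Int): Int = {
    val stopped = new AtomicBoolean(false)
    val ran = new AtomicInteger(0)

    def stopOnce(): Unit = {
      // The check and the update are one atomic operation, so only the
      // first caller sees `false` and gets to run the body.
      if (stopped.compareAndSet(false, true)) ran.incrementAndGet()
    }

    val ts = (1 to threads).map(_ => new Thread(new Runnable {
      def run(): Unit = stopOnce()
    }))
    ts.foreach(_.start())
    ts.foreach(_.join())
    ran.get()
  }
}
```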



[GitHub] spark pull request: [SPARK-6623][SQL] Alias DataFrame.na.drop and ...

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5284#issuecomment-87943398
  
  [Test build #29450 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29450/consoleFull)
 for   PR 5284 at commit 
[`19f46b7`](https://github.com/apache/spark/commit/19f46b77b790d22eae78985405b8878b975da74f).



[GitHub] spark pull request: [SPARK-6492][CORE] SparkContext.stop() can dea...

2015-03-30 Thread JoshRosen

Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/5277#issuecomment-87943315
  
In fact, it looks like the `synchronized` call in `stop()` was added in a 
very early iteration on that patch but became unnecessary after the code was 
refactored during code review.



[GitHub] spark pull request: [SPARK-6492][CORE] SparkContext.stop() can dea...

2015-03-30 Thread JoshRosen

Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/5277#issuecomment-87942978
  
> From the comments and codes, only activeContext, contextBeingConstructed 
and stopped are protected by SPARK_CONTEXT_CONSTRUCTOR_LOCK.

@zsxwing, I took a closer look and I'm not even sure that we need the 
synchronization for guarding `stopped`, since it's volatile: 

```
  @volatile private var stopped: Boolean = false
```

I suppose that the `SPARK_CONTEXT_CONSTRUCTOR_LOCK` was preventing multiple 
threads from being in `stop()` at the same time, but I don't know that we need 
to enforce that requirement; it's not enforced in pre-1.2 versions of the code 
(such as 
https://github.com/apache/spark/blob/39761f515d65afff377873ee4701b9313c317a60/core/src/main/scala/org/apache/spark/SparkContext.scala#L1024).
  Maybe we can resolve this deadlock by removing this `synchronized` call?



[GitHub] spark pull request: [SPARK-6623][SQL] Alias DataFrame.na.drop and ...

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5284#issuecomment-87942910
  
  [Test build #29449 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29449/consoleFull)
 for   PR 5284 at commit 
[`6618118`](https://github.com/apache/spark/commit/6618118312e8ec645a9cbe67fda4c60ede43218c).



[GitHub] spark pull request: [SPARK-6618][SQL] HiveMetastoreCatalog.lookupR...

2015-03-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5281#issuecomment-87942853
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29447/
Test FAILed.



[GitHub] spark pull request: [SPARK-6618][SQL] HiveMetastoreCatalog.lookupR...

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5281#issuecomment-87942793
  
  [Test build #29443 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29443/consoleFull)
 for   PR 5281 at commit 
[`b3a9625`](https://github.com/apache/spark/commit/b3a9625431b4bea113d685c9cd47060a8fd40640).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.



[GitHub] spark pull request: [SPARK-6618][SQL] HiveMetastoreCatalog.lookupR...

2015-03-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5281#issuecomment-87942799
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29443/
Test PASSed.



[GitHub] spark pull request: [SPARK-6623][SQL] Alias DataFrame.na.drop and ...

2015-03-30 Thread rxin

GitHub user rxin opened a pull request:

https://github.com/apache/spark/pull/5284

[SPARK-6623][SQL] Alias DataFrame.na.drop and DataFrame.na.fill in Python.

To maintain consistency with the Scala API.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rxin/spark df-na-alias

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5284.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5284


commit 6618118312e8ec645a9cbe67fda4c60ede43218c
Author: Reynold Xin 
Date:   2015-03-31T05:22:19Z

[SPARK-6623][SQL] Alias DataFrame.na.drop and DataFrame.na.fill in Python.

To maintain consistency with the Scala API.





[GitHub] spark pull request: [SPARK-6255] [MLLIB] Support multiclass classi...

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5137#issuecomment-87942356
  
  [Test build #29448 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29448/consoleFull)
 for   PR 5137 at commit 
[`0bd531e`](https://github.com/apache/spark/commit/0bd531eb0b52686cc381560e8171b2f8eb6e4216).



[GitHub] spark pull request: [WIP] [SPARK-6620] Speed up toDF() and rdd() f...

2015-03-30 Thread rxin

Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/5279#issuecomment-87942167
  
cc @davies since you have both been changing this part of the code lately.




[GitHub] spark pull request: [WIP] [SPARK-6620] Speed up toDF() and rdd() f...

2015-03-30 Thread rxin

Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/5279#issuecomment-87942119
  
I know you are probably still working on this - any benchmark numbers?



[GitHub] spark pull request: [SPARK-5371][SQL] Propagate types after functi...

2015-03-30 Thread rxin

Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5278#discussion_r27453747
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicOperators.scala ---
@@ -80,7 +80,7 @@ case class Union(left: LogicalPlan, right: LogicalPlan) extends BinaryNode {
 
   override lazy val resolved: Boolean =
 childrenResolved &&
-!left.output.zip(right.output).exists { case (l,r) => l.dataType != r.dataType }
+left.output.zip(right.output).forall { case (l,r) => l.dataType == r.dataType }
--- End diff --

this does look better!
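
The rewrite in the diff relies on the identity `!xs.exists(p)` being the same as `xs.forall(!p)`. A small sketch, with data types modelled as plain strings (an assumption for illustration; Catalyst uses its own `DataType` values):

```scala
object UnionTypes {
  // Old form: "there is no pair of columns whose types differ".
  def resolvedOld(left: Seq[String], right: Seq[String]): Boolean =
    !left.zip(right).exists { case (l, r) => l != r }

  // New form: "every pair of columns has equal types".
  def resolvedNew(left: Seq[String], right: Seq[String]): Boolean =
    left.zip(right).forall { case (l, r) => l == r }
}
```

Note that `zip` truncates to the shorter sequence, so neither form on its own detects a difference in the number of columns.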



[GitHub] spark pull request: [SPARK-6492][CORE] SparkContext.stop() can dea...

2015-03-30 Thread zsxwing

Github user zsxwing commented on the pull request:

https://github.com/apache/spark/pull/5277#issuecomment-87941679
  
> I am worried about the case where calling EventLoop.stop() from the event 
loop thread itself (which can happen transitively here) leads to a one-threaded 
deadlock, but I guess your other patch addresses this?

Yes. Calling EventLoop.stop() from the event loop thread itself won't lead 
to a one-threaded deadlock.

@JoshRosen Your thoughts about moving these `stop`s out of the lock?




[GitHub] spark pull request: [SPARK-5124][Core] Move StopCoordinator to the...

2015-03-30 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/5283



[GitHub] spark pull request: [SPARK-5124][Core] Move StopCoordinator to the...

2015-03-30 Thread rxin

Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/5283#issuecomment-87941255
  
Merging in master.



[GitHub] spark pull request: [SPARK-5124][Core] Move StopCoordinator to the...

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5283#issuecomment-87940965
  
  [Test build #29446 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29446/consoleFull)
 for   PR 5283 at commit 
[`cf3e5a7`](https://github.com/apache/spark/commit/cf3e5a7766131de8d272664dbcdd065f821d4428).



[GitHub] spark pull request: [SPARK-5124][Core] Move StopCoordinator to the...

2015-03-30 Thread zsxwing

GitHub user zsxwing opened a pull request:

https://github.com/apache/spark/pull/5283

[SPARK-5124][Core] Move StopCoordinator to the receive method since it does 
not require a reply

Hotfix for #4588

cc @rxin

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zsxwing/spark hotfix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5283.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5283


commit cf3e5a7766131de8d272664dbcdd065f821d4428
Author: zsxwing 
Date:   2015-03-31T05:03:50Z

Move StopCoordinator to the receive method since it does not require a reply





[GitHub] spark pull request: [SPARK-6621][Core] Fix the bug that calling Ev...

2015-03-30 Thread zsxwing

Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/5280#discussion_r27453086
  
--- Diff: core/src/main/scala/org/apache/spark/util/EventLoop.scala ---
@@ -76,9 +76,21 @@ private[spark] abstract class EventLoop[E](name: String) extends Logging {
   def stop(): Unit = {
 if (stopped.compareAndSet(false, true)) {
   eventThread.interrupt()
-  eventThread.join()
-  // Call onStop after the event thread exits to make sure onReceive happens before onStop
-  onStop()
+  var onStopCalled = false
+  try {
+eventThread.join()
+// Call onStop after the event thread exits to make sure onReceive happens before onStop
+onStopCalled = true
+onStop()
+  } catch {
+case ie: InterruptedException =>
+  Thread.currentThread().interrupt()
+  if (!onStopCalled) {
+// ie is thrown from `eventThread.join()`. Otherwise, we should not call `onStop` since
--- End diff --

If the current thread is eventThread, then eventThread.interrupt() sets the 
interrupt flag, so `join` will throw `InterruptedException`.
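
The behaviour described here can be checked in isolation with a small sketch. `SelfJoin` is a hypothetical helper, not part of `EventLoop`; it runs a thread that interrupts itself and then joins on itself, and reports whether `join` threw.

```scala
import java.util.concurrent.atomic.AtomicBoolean

object SelfJoin {
  /** Returns true if joining on the current thread after interrupting it throws. */
  def selfJoinThrows(): Boolean = {
    val threw = new AtomicBoolean(false)
    val t = new Thread(new Runnable {
      def run(): Unit = {
        val self = Thread.currentThread()
        self.interrupt() // sets the interrupt flag on the calling thread
        try {
          // Joining on oneself would never return, but the pending
          // interrupt makes the call throw instead of blocking.
          self.join()
        } catch {
          case _: InterruptedException => threw.set(true)
        }
      }
    })
    t.setDaemon(true) // do not keep the JVM alive if the assumption is wrong
    t.start()
    t.join(5000)      // bounded wait from the outside
    threw.get()
  }
}
```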



[GitHub] spark pull request: [SPARK-6621][Core] Fix the bug that calling Ev...

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5280#issuecomment-87938423
  
  [Test build #29442 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29442/consoleFull)
 for   PR 5280 at commit 
[`521125e`](https://github.com/apache/spark/commit/521125e825bd4919c0a36c1092aaf789a42539bb).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class UDFRegistration(object):`

 * This patch does not change any dependencies.



[GitHub] spark pull request: [SPARK-6621][Core] Fix the bug that calling Ev...

2015-03-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5280#issuecomment-87938429
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29442/
Test PASSed.



[GitHub] spark pull request: [SPARK-6608] [SQL] Makes DataFrame.rdd a lazy ...

2015-03-30 Thread rxin

Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5265#discussion_r27452963
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala ---
@@ -941,10 +941,10 @@ class DataFrame private[sql](
   
/
 
   /**
-   * Returns the content of the [[DataFrame]] as an [[RDD]] of [[Row]]s.
+   * Represents the content of the [[DataFrame]] as an [[RDD]] of [[Row]]s.
--- End diff --

can you update the doc to say that the RDD is memoized, i.e. once called, 
even if you change the spark sql configuration, it won't change the plan 
anymore?
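
The memoization in question is the ordinary behaviour of a Scala `lazy val`. A minimal sketch, in which `Frame`, `setting`, and `planCount` are hypothetical stand-ins and not `DataFrame` internals:

```scala
object LazyRddSketch {
  // Hypothetical stand-in for a DataFrame whose plan depends on a mutable setting.
  class Frame(var setting: String) {
    var planCount = 0

    // Evaluated on first access only; later accesses return the cached value,
    // even if `setting` has changed in the meantime.
    lazy val rdd: String = {
      planCount += 1
      s"plan built with setting=$setting"
    }
  }
}
```

After the first access to `rdd`, changing `setting` has no effect on the value returned, which is the behaviour the doc comment is being asked to spell out.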



[GitHub] spark pull request: [SPARK-6492][CORE] SparkContext.stop() can dea...

2015-03-30 Thread JoshRosen

Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/5277#issuecomment-87938070
  
> Do you mean calling EventLoop.stop in dagScheduler.onError?

I was referring to the race that I reported in 
[SPARK-6492](https://issues.apache.org/jira/browse/SPARK-6492) (referred to by 
this PR), where the EventLoop.onError() calls SparkContext.stop(), which blocks 
on the SPARK_CONTEXT_CONSTRUCTOR_LOCK lock, while another thread has 
simultaneously called SparkContext.stop(), acquired the 
SPARK_CONTEXT_CONSTRUCTOR_LOCK, and called `DAGScheduler.stop()`, which calls 
`EventLoop.stop()`.  My comment was referring to this `EventLoop.stop()` call 
blocking indefinitely while waiting to `join()` on the event processing thread, 
which is blocked on acquiring the `SPARK_CONTEXT_CONSTRUCTOR_LOCK`.

Based on the comments upthread from @ilganeli, I don't think that we should 
adopt my earlier suggestion of making `EventLoop.stop()` a no-op as long 
as some other thread is in the process of stopping the EventLoop, since this 
could result in a scenario where in-progress cleanup is still happening after a 
user's call to SparkContext.stop() has returned, which could lead to cleanup 
being skipped were the JVM to exit at that point.

I am worried about the case where calling `EventLoop.stop()` from the event 
loop thread itself (which can happen transitively here) leads to a one-threaded 
deadlock, but I guess your other patch addresses this?
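
The one-threaded deadlock mentioned above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not Spark's `EventLoop` and not necessarily what the other patch does: `MiniEventLoop` and all of its methods are hypothetical. A thread that calls `join()` on itself waits for its own exit and never returns, so this sketch skips the join when `stop()` is invoked from the event thread.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical miniature event loop, used only to illustrate the self-join problem.
class MiniEventLoop {
    private final AtomicBoolean stopped = new AtomicBoolean(false);
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final Thread eventThread = new Thread(this::run, "mini-event-loop");

    void start() {
        eventThread.start();
    }

    void post(Runnable event) {
        queue.add(event);
    }

    private void run() {
        try {
            while (!stopped.get()) {
                queue.take().run(); // blocks until an event is available
            }
        } catch (InterruptedException e) {
            // stop() interrupted us while we were blocked in take(); just exit.
        }
    }

    void stop() {
        if (!stopped.compareAndSet(false, true)) {
            return; // another caller already initiated the stop
        }
        if (Thread.currentThread() == eventThread) {
            // Called from inside an event handler: joining ourselves would block forever.
            // The loop sees `stopped` once the current handler returns and then exits.
            return;
        }
        eventThread.interrupt();
        try {
            eventThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
        }
    }

    // Waits up to `millis` for the event thread to exit; returns true if it has exited.
    boolean awaitExit(long millis) {
        try {
            eventThread.join(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !eventThread.isAlive();
    }
}
```

Posting an event that itself calls `stop()` terminates the loop instead of hanging. Note that this sketch does not address the two-thread race on `SPARK_CONTEXT_CONSTRUCTOR_LOCK` described earlier in this comment.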



[GitHub] spark pull request: [WIP] [SPARK-6620] Speed up toDF() and rdd() f...

2015-03-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5279#issuecomment-87937314
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29441/



[GitHub] spark pull request: [WIP] [SPARK-6620] Speed up toDF() and rdd() f...

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5279#issuecomment-87937299
  
  [Test build #29441 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29441/consoleFull) for PR 5279 at commit [`ab7585b`](https://github.com/apache/spark/commit/ab7585baf23ac3cbc204ec791c1b667492e2cae2).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.



[GitHub] spark pull request: [SPARK-6621][Core] Fix the bug that calling Ev...

2015-03-30 Thread JoshRosen

Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/5280#discussion_r27452652
  
--- Diff: core/src/main/scala/org/apache/spark/util/EventLoop.scala ---
@@ -76,9 +76,21 @@ private[spark] abstract class EventLoop[E](name: String) extends Logging {
   def stop(): Unit = {
     if (stopped.compareAndSet(false, true)) {
       eventThread.interrupt()
-      eventThread.join()
-      // Call onStop after the event thread exits to make sure onReceive happens before onStop
-      onStop()
+      var onStopCalled = false
+      try {
+        eventThread.join()
+        // Call onStop after the event thread exits to make sure onReceive happens before onStop
+        onStopCalled = true
+        onStop()
+      } catch {
+        case ie: InterruptedException =>
+          Thread.currentThread().interrupt()
+          if (!onStopCalled) {
+            // ie is thrown from `eventThread.join()`. Otherwise, we should not call `onStop` since
--- End diff --

In the unit tests added here, when does `join()` throw 
`InterruptedException`? 
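
For background on the question, `Thread.join()` throws `InterruptedException` when the thread calling `join()` is interrupted, either before the call or while it is waiting. The Java sketch below demonstrates this in isolation. It makes no claim about what the PR's unit tests do, and `JoinInterruptDemo` is a hypothetical name.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class: shows that join() reacts to the *caller's* interrupt status.
class JoinInterruptDemo {
    // Returns true if join() on a live worker throws InterruptedException
    // when the calling thread's interrupt flag is already set.
    static boolean joinThrowsWhenCallerInterrupted() {
        CountDownLatch release = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            try {
                release.await(); // keep the worker alive until we release it
            } catch (InterruptedException ignored) {
                // not expected in this demo; the worker itself is never interrupted
            }
        });
        worker.start();

        Thread.currentThread().interrupt(); // interrupt the caller, not the worker
        boolean thrown = false;
        try {
            worker.join();
        } catch (InterruptedException e) {
            thrown = true; // throwing the exception also clears the caller's interrupt flag
        } finally {
            release.countDown(); // let the worker finish
        }

        try {
            worker.join(); // the flag is clear now, so this waits normally
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return thrown;
    }
}
```

In the diff above, this corresponds to the thread that calls `stop()` being interrupted while it waits on `eventThread.join()`.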



[GitHub] spark pull request: [SPARK-3454] [WIP] separate json endpoints for...

2015-03-30 Thread JoshRosen

Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/4435#issuecomment-87934275
  
(Don't worry, I have more review pending; I've got a bit of Jersey stuff I 
want to read up on and got preempted by a couple of other things, but I'll get 
to this very soon; sorry for the delays)



[GitHub] spark pull request: [SPARK-6578] [core] Fix thread-safety issue in...

2015-03-30 Thread normanmaurer

Github user normanmaurer commented on a diff in the pull request:

https://github.com/apache/spark/pull/5234#discussion_r27452092
  
--- Diff: network/common/src/main/java/org/apache/spark/network/protocol/MessageEncoder.java ---
@@ -72,9 +80,84 @@ public void encode(ChannelHandlerContext ctx, Message in, List<Object> out) {
     in.encode(header);
     assert header.writableBytes() == 0;
 
-    out.add(header);
     if (body != null && bodyLength > 0) {
-      out.add(body);
+      out.add(new MessageWithHeader(header, headerLength, body, bodyLength));
+    } else {
+      out.add(header);
     }
   }
+
+  /**
+   * A wrapper message that holds two separate pieces (a header and a body) to avoid
+   * copying the body's content.
+   */
+  private static class MessageWithHeader extends AbstractReferenceCounted implements FileRegion {
--- End diff --
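
The idea in the Javadoc above, keeping header and body as two separate pieces so the body never has to be copied into a combined buffer, can be illustrated with a plain `java.nio` gathering write. This is a swapped-in technique for illustration only, not the Netty `FileRegion` mechanism the patch uses; `HeaderBodyWrite` and its method are hypothetical names, and a temporary file stands in for the network channel.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo: writes a header and a body to one channel without concatenating them.
class HeaderBodyWrite {
    // Writes header then body to a temporary file via a gathering write and
    // returns the bytes that ended up in the file.
    static byte[] writeHeaderAndBody(byte[] header, byte[] body) {
        try {
            Path tmp = Files.createTempFile("header-body", ".bin");
            try (FileChannel channel = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
                // wrap() does not copy: each buffer is a view over the caller's array.
                ByteBuffer[] parts = { ByteBuffer.wrap(header), ByteBuffer.wrap(body) };
                // A single write may be partial, so loop until both buffers are drained.
                while (parts[0].hasRemaining() || parts[1].hasRemaining()) {
                    channel.write(parts);
                }
            }
            byte[] written = Files.readAllBytes(tmp);
            Files.delete(tmp);
            return written;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The receiver sees one contiguous message, while the sender only ever holds the two original arrays.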

I will take care of adding support in Netty.



> On 31.03.2015 at 05:46, Reynold Xin wrote:
> 
> In network/common/src/main/java/org/apache/spark/network/protocol/MessageEncoder.java:
> 
> >  }
> >}
> > +
> > +  /**
> > +   * A wrapper message that holds two separate pieces (a header and a body) to avoid
> > +   * copying the body's content.
> > +   */
> > +  private static class MessageWithHeader extends AbstractReferenceCounted implements FileRegion {
> Yes - I think epoll won't work and we should just disable it. We'd need more work on Netty proper to support epoll for this.




[GitHub] spark pull request: [SPARK-6562][SQL] DataFrame.replace

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5282#issuecomment-87928897
  
  [Test build #29445 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29445/consoleFull) for PR 5282 at commit [`06f2c63`](https://github.com/apache/spark/commit/06f2c63a4a0b0343e201461ec0ac47eff38c9136).



[GitHub] spark pull request: [SPARK-6562][SQL] DataFrame.replace

2015-03-30 Thread rxin

GitHub user rxin opened a pull request:

https://github.com/apache/spark/pull/5282

[SPARK-6562][SQL] DataFrame.replace

TODO:

Python support 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rxin/spark df-na-replace

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5282.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5282


commit 06f2c63a4a0b0343e201461ec0ac47eff38c9136
Author: Reynold Xin 
Date:   2015-03-31T04:06:04Z

[SPARK-6562][SQL] DataFrame.replace





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-87927735
  
  [Test build #29444 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29444/consoleFull) for PR 5096 at commit [`0e788c0`](https://github.com/apache/spark/commit/0e788c08f3b418acc05d4d27298b65de8b6f8407).



[GitHub] spark pull request: [SPARK-6618][SQL] HiveMetastoreCatalog.lookupR...

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5281#issuecomment-87927049
  
  [Test build #29443 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29443/consoleFull) for PR 5281 at commit [`b3a9625`](https://github.com/apache/spark/commit/b3a9625431b4bea113d685c9cd47060a8fd40640).



[GitHub] spark pull request: [SPARK-6618][SQL] HiveMetastoreCatalog.lookupR...

2015-03-30 Thread yhuai

GitHub user yhuai opened a pull request:

https://github.com/apache/spark/pull/5281

[SPARK-6618][SQL] HiveMetastoreCatalog.lookupRelation should use fine-grained lock

JIRA: https://issues.apache.org/jira/browse/SPARK-6618

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yhuai/spark lookupRelationLock

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5281.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5281


commit b3a9625431b4bea113d685c9cd47060a8fd40640
Author: Yin Huai 
Date:   2015-03-31T03:55:00Z

Just protect client.





[GitHub] spark pull request: [SPARK-5371][SQL] Propagate types after functi...

2015-03-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5278#issuecomment-87926115
  
  [Test build #29440 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29440/consoleFull) for PR 5278 at commit [`dc3581a`](https://github.com/apache/spark/commit/dc3581ae6d5c4d280527de962cd170b819f01600).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.



[GitHub] spark pull request: [SPARK-5371][SQL] Propagate types after functi...

2015-03-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5278#issuecomment-87926124
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29440/


