[GitHub] spark issue #15058: [SPARK-17505][MLLIB]Add setBins for BinaryClassification...

2016-09-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15058
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15058: [SPARK-17505][MLLIB]Add setBins for BinaryClassification...

2016-09-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15058
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65255/
Test PASSed.





[GitHub] spark pull request #15051: [SPARK-17499][ML][MLLib] make the default params ...

2016-09-12 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/15051#discussion_r78406032
  
--- Diff: R/pkg/R/mllib.R ---
@@ -694,8 +694,8 @@ setMethod("predict", signature(object = "KMeansModel"),
 #' }
 #' @note spark.mlp since 2.1.0
 setMethod("spark.mlp", signature(data = "SparkDataFrame"),
-  function(data, blockSize = 128, layers = c(3, 5, 2), solver = 
"l-bfgs", maxIter = 100,
-   tol = 0.5, stepSize = 1, seed = 1) {
+  function(data, blockSize = 128, layers, solver = "l-bfgs", 
maxIter = 100,
+   tol = 1E-6, stepSize = 0.03, seed = -763139545) {
--- End diff --

Yeah that sounds fine. 





[GitHub] spark issue #15000: [SPARK-17437] Add uiWebUrl to JavaSparkContext and pyspa...

2016-09-12 Thread apetresc
Github user apetresc commented on the issue:

https://github.com/apache/spark/pull/15000
  
Well, here's the use case I want it for: I'm building some plugins for 
JupyterHub to make it more Spark-aware, and I want to be able to link the user 
out to the right WebUI for their kernel. Short of somehow making the launcher 
override the user's own `SPARK_CONF_DIR` to set the port manually to one that 
I'm already sure is open, there's no other way to do that. But I _do_ have 
access to the SparkContext, so with this property I can create the link 
effortlessly.





[GitHub] spark issue #15000: [SPARK-17437] Add uiWebUrl to JavaSparkContext and pyspa...

2016-09-12 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/15000
  
Is this Java or PySpark? In Java you can still get this property directly 
from the underlying SparkContext.





[GitHub] spark issue #15000: [SPARK-17437] Add uiWebUrl to JavaSparkContext and pyspa...

2016-09-12 Thread apetresc
Github user apetresc commented on the issue:

https://github.com/apache/spark/pull/15000
  
PySpark. I don't think anyone runs Java through Jupyter, haha.





[GitHub] spark issue #15000: [SPARK-17437] Add uiWebUrl to JavaSparkContext and pyspa...

2016-09-12 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/15000
  
Ah, right, dumb question. Yeah, I think it makes some sense ... maybe not even 
for Java, because there are lots of methods we don't plumb through since you 
can easily access them directly from Scala. Python, OK.





[GitHub] spark pull request #15060: [SPARK-17507][ML][MLLib] check weight vector size...

2016-09-12 Thread WeichenXu123
GitHub user WeichenXu123 opened a pull request:

https://github.com/apache/spark/pull/15060

[SPARK-17507][ML][MLLib] check weight vector size in ANN

## What changes were proposed in this pull request?

As the TODO described, check the weight vector size and throw an exception if 
it is wrong.

## How was this patch tested?

existing tests.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/WeichenXu123/spark 
check_input_weight_size_of_ann

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/15060.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #15060


commit e89aef5fbd5be5b255e623cdfca8ae75ecb92ea3
Author: WeichenXu 
Date:   2016-09-12T10:15:25Z

update.







[GitHub] spark issue #15024: [SPARK-17470][SQL] unify path for data source table and ...

2016-09-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15024
  
**[Test build #65258 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65258/consoleFull)**
 for PR 15024 at commit 
[`d24d6ed`](https://github.com/apache/spark/commit/d24d6edc1b753a8cbbc317048375afbf74b28ff1).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15024: [SPARK-17470][SQL] unify path for data source table and ...

2016-09-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15024
  
Merged build finished. Test FAILed.





[GitHub] spark issue #15024: [SPARK-17470][SQL] unify path for data source table and ...

2016-09-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15024
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65258/
Test FAILed.





[GitHub] spark pull request #15060: [SPARK-17507][ML][MLLib] check weight vector size...

2016-09-12 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/15060#discussion_r78408999
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala ---
@@ -395,6 +395,14 @@ private[ann] class FeedForwardTopology private(val 
layers: Array[Layer]) extends
   override def model(weights: Vector): TopologyModel = 
FeedForwardModel(this, weights)
 
   override def model(seed: Long): TopologyModel = FeedForwardModel(this, 
seed)
+
+  def weightSize: Int = {
--- End diff --

Just `layers.map(_.weightSize).sum`? You don't even need a method really.
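
As a hedged sketch of the suggested one-liner (the `Layer` trait and `AffineLayer` below are illustrative stand-ins for the `org.apache.spark.ml.ann` internals, not the actual Spark code):

```scala
// Illustrative stand-in for a layer type that knows its own weight count
// (not the actual org.apache.spark.ml.ann.Layer).
trait Layer { def weightSize: Int }

// A dense layer: a numIn x numOut weight matrix plus a bias vector.
case class AffineLayer(numIn: Int, numOut: Int) extends Layer {
  def weightSize: Int = numIn * numOut + numOut
}

val layers: Array[Layer] = Array(AffineLayer(3, 5), AffineLayer(5, 2))

// The suggested one-liner: sum the per-layer sizes; no dedicated method needed.
val totalWeightSize: Int = layers.map(_.weightSize).sum
// (3*5 + 5) + (5*2 + 2) = 32
```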





[GitHub] spark pull request #15060: [SPARK-17507][ML][MLLib] check weight vector size...

2016-09-12 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/15060#discussion_r78409055
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala ---
@@ -545,7 +553,9 @@ private[ann] object FeedForwardModel {
* @return model
*/
   def apply(topology: FeedForwardTopology, weights: Vector): 
FeedForwardModel = {
-// TODO: check that weights size is equal to sum of layers sizes
+if (weights.size != topology.weightSize) {
+  throw new Exception("Input weight vector has illegal size.")
--- End diff --

Throwing Exception is never great ... use `require` for consistency.
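
For context, Scala's `require` throws an `IllegalArgumentException` whose message includes the supplied string, which is the idiomatic way to validate arguments. A minimal sketch, with illustrative names rather than the actual Spark code:

```scala
// Sketch of the suggested check: require instead of a bare Exception.
// checkWeights and its parameter names are illustrative, not Spark internals.
def checkWeights(weightsSize: Int, expectedSize: Int): Unit = {
  require(weightsSize == expectedSize,
    s"Input weight vector has size $weightsSize but expected $expectedSize.")
}

checkWeights(10, 10) // passes silently
// checkWeights(9, 10) would throw IllegalArgumentException
```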





[GitHub] spark issue #15060: [SPARK-17507][ML][MLLib] check weight vector size in ANN

2016-09-12 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/15060
  
Is there any other arg checking we can tighten up here? There are a couple of 
places where weights need to match something else in size.





[GitHub] spark issue #15024: [SPARK-17470][SQL] unify path for data source table and ...

2016-09-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15024
  
Merged build finished. Test FAILed.





[GitHub] spark issue #15024: [SPARK-17470][SQL] unify path for data source table and ...

2016-09-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15024
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65260/
Test FAILed.





[GitHub] spark issue #15024: [SPARK-17470][SQL] unify path for data source table and ...

2016-09-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15024
  
**[Test build #65260 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65260/consoleFull)**
 for PR 15024 at commit 
[`35def5b`](https://github.com/apache/spark/commit/35def5b8a9ed38be4350083cdf0cf43e50bfe204).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15059: [SPARK-17506][SQL] Improve the check double values equal...

2016-09-12 Thread WeichenXu123
Github user WeichenXu123 commented on the issue:

https://github.com/apache/spark/pull/15059
  
But `relTol` is defined in mllib and sql does not reference it; wouldn't it be 
better to move it to the spark-core project?
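
For readers unfamiliar with the helper under discussion: a relative-tolerance comparison treats two doubles as equal when their difference is small relative to their magnitudes. A hedged sketch of the idea (illustrative code, not the actual `relTol` implementation in Spark's test utilities):

```scala
// Relative-tolerance equality for doubles: the difference must be small
// compared to the larger magnitude. Illustrative, not Spark's relTol code.
def approxEqualRel(x: Double, y: Double, eps: Double = 1e-8): Boolean =
  if (x == y) true
  else math.abs(x - y) < eps * math.max(math.abs(x), math.abs(y))

approxEqualRel(1000.0, 1000.0000001) // true: difference is tiny relative to 1000
approxEqualRel(1.0, 1.1)             // false: 10% relative difference
```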





[GitHub] spark issue #15059: [SPARK-17506][SQL] Improve the check double values equal...

2016-09-12 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/15059
  
Oh I see. Yes if we can move it to Spark's core test module that would be 
nicer.





[GitHub] spark issue #15035: [SPARK-17477]: SparkSQL cannot handle schema evolution f...

2016-09-12 Thread wgtmac
Github user wgtmac commented on the issue:

https://github.com/apache/spark/pull/15035
  
@HyukjinKwon This is not Parquet-specific; it applies to other data sources 
as well.
1. Change the reading path for Parquet: it does not solve the problem. Some 
queries need to read all Parquet files.
2. Make changes in the row: yes, I have to change it per row because some 
Parquet files have int while some have long. We can't know which 
row is good or problematic. 
3. Vectorized Parquet reader: this is a good catch. I haven't considered 
this yet.

It would be great if you can come up with other good ideas and continue to 
work on it. Feedback and discussion are welcome. Thanks!





[GitHub] spark issue #10655: SPARK-12639 SQL Improve Explain for Datasources with Han...

2016-09-12 Thread yhuai
Github user yhuai commented on the issue:

https://github.com/apache/spark/pull/10655
  
test this please





[GitHub] spark issue #15035: [SPARK-17477]: SparkSQL cannot handle schema evolution f...

2016-09-12 Thread wgtmac
Github user wgtmac commented on the issue:

https://github.com/apache/spark/pull/15035
  
@JoshRosen Yes, it may mask overflow risk. This conversion happens when the 
user-provided schema or Hive metastore schema has Long but the Parquet files 
have Int as the schema. We cannot avoid this risk in that case.





[GitHub] spark issue #14678: [MINOR][SQL] Add missing functions for some options in S...

2016-09-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14678
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65263/
Test PASSed.





[GitHub] spark issue #14678: [MINOR][SQL] Add missing functions for some options in S...

2016-09-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14678
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14678: [MINOR][SQL] Add missing functions for some options in S...

2016-09-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14678
  
**[Test build #65263 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65263/consoleFull)**
 for PR 14678 at commit 
[`a26c08e`](https://github.com/apache/spark/commit/a26c08e402daf7c972e7267b25230defcf2ebdb9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #15061: [SPARK-14818] Post-2.0 MiMa exclucsion and build ...

2016-09-12 Thread JoshRosen
GitHub user JoshRosen opened a pull request:

https://github.com/apache/spark/pull/15061

[SPARK-14818] Post-2.0 MiMa exclucsion and build changes

This patch makes a handful of post-Spark-2.0 MiMa exclusion and build 
updates. It should be merged to master and a subset of it should be picked into 
branch-2.0 in order to test Spark 2.0.1-SNAPSHOT.

- Remove `sketch`, `mllibLocal`, and `streamingKafka010` from the list 
of excluded subprojects so that MiMa checks them.
- Remove now-unnecessary special-case handling of the Kafka 0.8 artifact in 
`mimaSettings`.
- Move the exclusion added in SPARK-14743 from `v20excludes` to 
`v21excludes`, since that patch was only merged into master and not branch-2.0.
- Add exclusions for an API change introduced by SPARK-17096 / #14675.
- Add missing exclusions for the `o.a.spark.internal` and 
`o.a.spark.sql.internal` packages.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/JoshRosen/spark post-2.0-mima-changes

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/15061.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #15061


commit 1224e758fc4cf69e27f013615d52b5c96696506b
Author: Josh Rosen 
Date:   2016-09-11T01:32:27Z

Post-2.0 MiMa changes.







[GitHub] spark pull request #15061: [SPARK-14818] Post-2.0 MiMa exclucsion and build ...

2016-09-12 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/15061#discussion_r78415480
  
--- Diff: project/MimaExcludes.scala ---
@@ -787,9 +792,10 @@ object MimaExcludes {
   
ProblemFilters.exclude[IncompatibleResultTypeProblem]("org.apache.spark.sql.SQLContext.parquetFile"),
   
ProblemFilters.exclude[IncompatibleResultTypeProblem]("org.apache.spark.sql.SQLContext.applySchema")
 ) ++ Seq(
-// [SPARK-14743] Improve delegation token handling in secure 
cluster
-
ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.deploy.SparkHadoopUtil.getTimeFromNowToRenewal")
-  )
+  // SPARK-17096: Improve exception string reported through the 
StreamingQueryListener
--- End diff --

/cc @tdas, @zsxwing, I've added exclusions for this change from #14675. 
I wanted to ping you here in case this change was unintentional and you think 
we'll need to restore compatibility.





[GitHub] spark issue #14452: [SPARK-16849][SQL][WIP] Improve subquery execution by de...

2016-09-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14452
  
**[Test build #65253 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65253/consoleFull)**
 for PR 14452 at commit 
[`df8b490`](https://github.com/apache/spark/commit/df8b4909c11f29e2d13cf112b8850cdf1afb6237).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14452: [SPARK-16849][SQL][WIP] Improve subquery execution by de...

2016-09-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14452
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65253/
Test PASSed.





[GitHub] spark issue #14452: [SPARK-16849][SQL][WIP] Improve subquery execution by de...

2016-09-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14452
  
Merged build finished. Test PASSed.





[GitHub] spark pull request #15062: SPARK-17424: Fix unsound substitution bug in Scal...

2016-09-12 Thread rdblue
GitHub user rdblue opened a pull request:

https://github.com/apache/spark/pull/15062

SPARK-17424: Fix unsound substitution bug in ScalaReflection.

## What changes were proposed in this pull request?

This method gets a type's primary constructor and fills in its type parameters 
with concrete types, for example `MapPartitions[T, U] -> MapPartitions[Int, 
String]`. This substitution fails when the actual type args are empty because 
they are still unknown. Instead, when there are no resolved types to substitute, 
this returns the original args with unresolved type parameters.

## How was this patch tested?

This doesn't affect substitutions where the type args are determined. This 
fixes our case where the actual type args are empty and our job runs 
successfully.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rdblue/spark 
SPARK-17424-fix-unsound-reflect-substitution

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/15062.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #15062









[GitHub] spark pull request #15009: [SPARK-17443][SPARK-11035] Stop Spark Application...

2016-09-12 Thread tgravescs
Github user tgravescs commented on a diff in the pull request:

https://github.com/apache/spark/pull/15009#discussion_r78417431
  
--- Diff: 
launcher/src/main/java/org/apache/spark/launcher/SparkLauncher.java ---
@@ -538,6 +539,71 @@ public SparkAppHandle 
startApplication(SparkAppHandle.Listener... listeners) thr
 return handle;
   }
 
+  private String getAppName() throws IOException {
+String appName = 
builder.getEffectiveConfig().get(CHILD_PROCESS_LOGGER_NAME);
+if (appName == null) {
+  if (builder.appName != null) {
+appName = builder.appName;
+  } else if (builder.mainClass != null) {
+int dot = builder.mainClass.lastIndexOf(".");
+if (dot >= 0 && dot < builder.mainClass.length() - 1) {
+  appName = builder.mainClass.substring(dot + 1, 
builder.mainClass.length());
+} else {
+  appName = builder.mainClass;
+}
+  } else if (builder.appResource != null) {
+appName = new File(builder.appResource).getName();
+  } else {
+appName = String.valueOf(COUNTER.incrementAndGet());
+  }
+}
+return appName;
+  }
+
+  /**
+   * Starts a Spark application.
+   * 
+   * This method returns a handle that provides information about the 
running application and can
+   * be used to do basic interaction with it.
+   * 
+   * The returned handle assumes that the application will instantiate a 
single SparkContext
+   * during its lifetime. Once that context reports a final state (one 
that indicates the
+   * SparkContext has stopped), the handle will not perform new state 
transitions, so anything
+   * that happens after that cannot be monitored. The underlying 
application is launched as
+   * a Thread, {@link SparkAppHandle#kill()} can still be used to kill the 
spark application.
+   * 
+   * @since 2.1.0
+   * @param listeners Listeners to add to the handle before the app is 
launched.
+   * @return A handle for the launched application.
+   */
+  public SparkAppHandle 
startApplicationInProcess(SparkAppHandle.Listener... listeners) throws 
IOException {
--- End diff --

Instead of having a separate API, perhaps we can just add an option to the 
builder pattern that indicates the application should be started in a thread.
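The builder-option alternative suggested in this comment could look roughly like the following sketch. The `launchInProcess` flag, class name, and `start` behavior are hypothetical illustrations of the pattern, not the actual `SparkLauncher` API.

```java
// Hypothetical sketch of the reviewer's suggestion: a builder flag instead of
// a separate startApplicationInProcess() method. Names are illustrative and
// not part of the real SparkLauncher API.
public class LauncherSketch {
    private boolean inProcess = false;  // default: launch as a child process

    // Builder-style option: callers opt in to in-process (thread) launching.
    public LauncherSketch launchInProcess(boolean inProcess) {
        this.inProcess = inProcess;
        return this;  // return this so calls can be chained
    }

    public String start() {
        // A real implementation would spawn a Thread or a child process here;
        // this sketch only reports which path would be taken.
        return inProcess ? "thread" : "process";
    }
}
```

With this shape, `new LauncherSketch().launchInProcess(true).start()` selects the in-process path while existing callers keep the default child-process behavior.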





[GitHub] spark issue #14750: [SPARK-17183][SQL] put hive serde table schema to table ...

2016-09-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14750
  
**[Test build #65262 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65262/consoleFull)**
 for PR 14750 at commit 
[`edafaa6`](https://github.com/apache/spark/commit/edafaa6ca8018009c361f6057f2b11a16091c8b8).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14750: [SPARK-17183][SQL] put hive serde table schema to table ...

2016-09-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14750
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14750: [SPARK-17183][SQL] put hive serde table schema to table ...

2016-09-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14750
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65262/
Test PASSed.





[GitHub] spark issue #14969: [SPARK-17406][WEB UI] limit timeline executor events

2016-09-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14969
  
**[Test build #65261 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65261/consoleFull)**
 for PR 14969 at commit 
[`ac99524`](https://github.com/apache/spark/commit/ac995249a58d297b45a08f2f88ff6bf87240df6e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14992: [SPARK-17378] [HOTFIX] Upgrade snappy-java to 1.1.2.6 --...

2016-09-12 Thread yhuai
Github user yhuai commented on the issue:

https://github.com/apache/spark/pull/14992
  
Thanks!





[GitHub] spark issue #14969: [SPARK-17406][WEB UI] limit timeline executor events

2016-09-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14969
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14969: [SPARK-17406][WEB UI] limit timeline executor events

2016-09-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14969
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65261/
Test PASSed.





[GitHub] spark pull request #15061: [SPARK-14818] Post-2.0 MiMa exclucsion and build ...

2016-09-12 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/15061#discussion_r78419280
  
--- Diff: project/MimaExcludes.scala ---
@@ -787,9 +792,10 @@ object MimaExcludes {
   
ProblemFilters.exclude[IncompatibleResultTypeProblem]("org.apache.spark.sql.SQLContext.parquetFile"),
   
ProblemFilters.exclude[IncompatibleResultTypeProblem]("org.apache.spark.sql.SQLContext.applySchema")
 ) ++ Seq(
-// [SPARK-14743] Improve delegation token handling in secure 
cluster
-
ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.deploy.SparkHadoopUtil.getTimeFromNowToRenewal")
-  )
+  // SPARK-17096: Improve exception string reported through the 
StreamingQueryListener
--- End diff --

@JoshRosen Yes, that break is intentional. Thanks!





[GitHub] spark issue #13762: [SPARK-14926] [ML] OneVsRest labelMetadata uses incorrec...

2016-09-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13762
  
**[Test build #65256 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65256/consoleFull)**
 for PR 13762 at commit 
[`b4badae`](https://github.com/apache/spark/commit/b4badae8bce863558eaeb831293ac8bbfc87b179).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #14834: [SPARK-17163][ML] Unified LogisticRegression inte...

2016-09-12 Thread dbtsai
Github user dbtsai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14834#discussion_r78419383
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
 ---
@@ -311,8 +350,28 @@ class LogisticRegression @Since("1.2.0") (
 
 val histogram = labelSummarizer.histogram
 val numInvalid = labelSummarizer.countInvalid
-val numClasses = histogram.length
 val numFeatures = summarizer.mean.size
+val numFeaturesPlusIntercept = if (getFitIntercept) numFeatures + 1 
else numFeatures
+
+val numClasses = 
MetadataUtils.getNumClasses(dataset.schema($(labelCol))) match {
+  case Some(n: Int) =>
+require(n >= histogram.length, s"Specified number of classes $n 
was " +
+  s"less than the number of unique labels ${histogram.length}.")
+n
+  case None => histogram.length
+}
+
+val isBinaryClassification = numClasses == 1 || numClasses == 2
+val isMultinomial = $(family) match {
+  case "binomial" =>
+require(isBinaryClassification, s"Binomial family only supports 1 
or 2 " +
+s"outcome classes but found $numClasses.")
+false
+  case "multinomial" => true
+  case "auto" => !isBinaryClassification
+  case other => throw new IllegalArgumentException(s"Unsupported 
family: $other")
+}
--- End diff --

Oh, I originally thought it was a typo. `isBinaryClassification` is not used 
anywhere else, and I think it tends to lead people to interpret it as 
`isBinomial`. Maybe we should just remove it.





[GitHub] spark issue #13762: [SPARK-14926] [ML] OneVsRest labelMetadata uses incorrec...

2016-09-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13762
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65256/
Test PASSed.





[GitHub] spark issue #13762: [SPARK-14926] [ML] OneVsRest labelMetadata uses incorrec...

2016-09-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13762
  
Merged build finished. Test PASSed.





[GitHub] spark pull request #14834: [SPARK-17163][ML] Unified LogisticRegression inte...

2016-09-12 Thread dbtsai
Github user dbtsai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14834#discussion_r78420053
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
 ---
@@ -460,33 +577,74 @@ class LogisticRegression @Since("1.2.0") (
as a result, no scaling is needed.
  */
 val rawCoefficients = state.x.toArray.clone()
-var i = 0
-while (i < numFeatures) {
-  rawCoefficients(i) *= { if (featuresStd(i) != 0.0) 1.0 / 
featuresStd(i) else 0.0 }
-  i += 1
+val coefficientArray = Array.tabulate(numCoefficientSets * 
numFeatures) { i =>
+  // flatIndex will loop though rawCoefficients, and skip the 
intercept terms.
+  val flatIndex = if ($(fitIntercept)) i + i / numFeatures else i
+  val featureIndex = i % numFeatures
+  if (featuresStd(featureIndex) != 0.0) {
+rawCoefficients(flatIndex) / featuresStd(featureIndex)
+  } else {
+0.0
+  }
+}
+val coefficientMatrix =
+  new DenseMatrix(numCoefficientSets, numFeatures, 
coefficientArray, isTransposed = true)
+
+if ($(regParam) == 0.0 && isMultinomial) {
+  /*
+When no regularization is applied, the coefficients lack 
identifiability because
+we do not use a pivot class. We can add any constant value to 
the coefficients and
+get the same likelihood. So here, we choose the mean centered 
coefficients for
+reproducibility. This method follows the approach in glmnet, 
described here:
+
+Friedman, et al. "Regularization Paths for Generalized Linear 
Models via
+  Coordinate Descent," 
https://core.ac.uk/download/files/153/6287975.pdf
+   */
+  val coefficientMean = coefficientMatrix.values.sum / 
coefficientMatrix.values.length
+  coefficientMatrix.update(_ - coefficientMean)
 }
-bcFeaturesStd.destroy(blocking = false)
 
-if ($(fitIntercept)) {
-  (Vectors.dense(rawCoefficients.dropRight(1)).compressed, 
rawCoefficients.last,
-arrayBuilder.result())
+val interceptsArray: Array[Double] = if ($(fitIntercept)) {
+  Array.tabulate(numCoefficientSets) { i =>
+val coefIndex = (i + 1) * numFeaturesPlusIntercept - 1
+rawCoefficients(coefIndex)
+  }
+} else {
+  Array[Double]()
+}
+/*
+  The intercepts are never regularized, so we always center the 
mean.
+ */
+val interceptVector = if (interceptsArray.nonEmpty && 
isMultinomial) {
+  val interceptMean = interceptsArray.sum / numClasses
+  interceptsArray.indices.foreach { i => interceptsArray(i) -= 
interceptMean }
+  Vectors.dense(interceptsArray)
+} else if (interceptsArray.length == 1) {
+  Vectors.dense(interceptsArray)
 } else {
-  (Vectors.dense(rawCoefficients).compressed, 0.0, 
arrayBuilder.result())
+  Vectors.sparse(numCoefficientSets, Seq())
 }
+(coefficientMatrix, interceptVector, arrayBuilder.result())
--- End diff --

One way to unblock this without implementing `compress` on matrices: for 
binary classification, a vector can be created and `compress` called to return 
a dense or sparse vector. If it's a sparse vector, you can put it into a CSR 
matrix without changing the internal data structure.
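The dense-vs-sparse decision behind a `compress` call can be sketched with a simple size heuristic: count the nonzeros and pick whichever storage layout is smaller. The byte costs below are illustrative assumptions, not Spark's exact `Vector.compressed` formula.

```java
// Sketch of the "compress then wrap" idea: pick a dense or sparse
// representation for a single row of coefficients before treating it as a
// one-row matrix. The size heuristic is a simplified stand-in for Spark's
// Vector.compressed, not the real implementation.
public class CompressSketch {
    // Returns "sparse" when storing (index, value) pairs is estimated to be
    // cheaper than the dense array.
    public static String chooseFormat(double[] row) {
        int nnz = 0;
        for (double v : row) {
            if (v != 0.0) nnz++;
        }
        // Assumed costs: ~12 bytes per nonzero for sparse (int index + double
        // value), 8 bytes per entry for dense. Illustrative numbers only.
        return 12.0 * nnz < 8.0 * row.length ? "sparse" : "dense";
    }
}
```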





[GitHub] spark pull request #15063: [SPARK-17463][Core]Make CollectionAccumulator and...

2016-09-12 Thread zsxwing
GitHub user zsxwing opened a pull request:

https://github.com/apache/spark/pull/15063

[SPARK-17463][Core]Make CollectionAccumulator and SetAccumulator's value 
can be read thread-safely

## What changes were proposed in this pull request?

Make the values of `CollectionAccumulator` and `SetAccumulator` readable in a 
thread-safe way, to fix the `ConcurrentModificationException` reported in 
[JIRA](https://issues.apache.org/jira/browse/SPARK-17463).

## How was this patch tested?

Existing tests.
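The idea behind this kind of fix — hand readers a snapshot taken under the same lock that guards mutation, so iteration never races with concurrent adds — can be sketched as follows. This is an illustration of the pattern, not the actual Spark patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Minimal sketch: guard mutation and reads of the backing list with the same
// lock, and return an unmodifiable copy so callers can iterate without
// risking ConcurrentModificationException. Illustrative only; the real fix
// lives in Spark's AccumulatorV2 subclasses.
public class SnapshotAccumulator<T> {
    private final List<T> values = new ArrayList<>();

    public synchronized void add(T v) {
        values.add(v);
    }

    // Copy under the lock, then expose an unmodifiable view of the copy:
    // later add() calls cannot invalidate an in-progress iteration.
    public synchronized List<T> value() {
        return Collections.unmodifiableList(new ArrayList<>(values));
    }
}
```

The trade-off is an allocation per read, which is acceptable here because accumulator values are typically read far less often than they are updated.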

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zsxwing/spark SPARK-17463

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/15063.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #15063


commit 4b8c277ebb1c8ff966e6d3c2676dfc34a1f0c483
Author: Shixiong Zhu 
Date:   2016-09-12T17:45:35Z

Make CollectionAccumulator and SetAccumulator's value can be read 
thread-safely







[GitHub] spark issue #15057: [BUILD] Closing some stale PRs and ones suggested to be ...

2016-09-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15057
  
**[Test build #65254 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65254/consoleFull)**
 for PR 15057 at commit 
[`50a8a6c`](https://github.com/apache/spark/commit/50a8a6cc6da52e67553d3193e65cbf8fff83a475).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15057: [BUILD] Closing some stale PRs and ones suggested to be ...

2016-09-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15057
  
Merged build finished. Test PASSed.





[GitHub] spark issue #15057: [BUILD] Closing some stale PRs and ones suggested to be ...

2016-09-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15057
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65254/
Test PASSed.





[GitHub] spark issue #14961: [SPARK-17379] [BUILD] Upgrade netty-all to 4.0.41 final ...

2016-09-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14961
  
**[Test build #65264 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65264/consoleFull)**
 for PR 14961 at commit 
[`502ebf4`](https://github.com/apache/spark/commit/502ebf45f4fa9791cbf26ec5ea7e0167ecbc68a0).





[GitHub] spark issue #15063: [SPARK-17463][Core]Make CollectionAccumulator and SetAcc...

2016-09-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15063
  
**[Test build #65270 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65270/consoleFull)**
 for PR 15063 at commit 
[`4b8c277`](https://github.com/apache/spark/commit/4b8c277ebb1c8ff966e6d3c2676dfc34a1f0c483).





[GitHub] spark issue #15063: [SPARK-17463][Core]Make CollectionAccumulator and SetAcc...

2016-09-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15063
  
**[Test build #65271 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65271/consoleFull)**
 for PR 15063 at commit 
[`21b6b4d`](https://github.com/apache/spark/commit/21b6b4dbb9cbc977c6e4aa8527532b3e933bf7c2).





[GitHub] spark issue #15060: [SPARK-17507][ML][MLLib] check weight vector size in ANN

2016-09-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15060
  
**[Test build #65266 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65266/consoleFull)**
 for PR 15060 at commit 
[`e89aef5`](https://github.com/apache/spark/commit/e89aef5fbd5be5b255e623cdfca8ae75ecb92ea3).





[GitHub] spark issue #15059: [SPARK-17506][SQL] Improve the check double values equal...

2016-09-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15059
  
**[Test build #65265 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65265/consoleFull)**
 for PR 15059 at commit 
[`78f3733`](https://github.com/apache/spark/commit/78f37334164a015605d5c23ff7217a131c3ea3a7).





[GitHub] spark issue #15063: [SPARK-17463][Core]Make CollectionAccumulator and SetAcc...

2016-09-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15063
  
Merged build finished. Test FAILed.





[GitHub] spark issue #15063: [SPARK-17463][Core]Make CollectionAccumulator and SetAcc...

2016-09-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15063
  
**[Test build #65271 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65271/consoleFull)**
 for PR 15063 at commit 
[`21b6b4d`](https://github.com/apache/spark/commit/21b6b4dbb9cbc977c6e4aa8527532b3e933bf7c2).
 * This patch **fails build dependency tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class SetAccumulator[T] extends AccumulatorV2[T, java.util.Set[T]] 
`





[GitHub] spark issue #15063: [SPARK-17463][Core]Make CollectionAccumulator and SetAcc...

2016-09-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15063
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65271/
Test FAILed.





[GitHub] spark pull request #15056: [SPARK-17503][Core] Fix memory leak in Memory sto...

2016-09-12 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/15056#discussion_r78424222
  
--- Diff: 
core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala ---
@@ -663,31 +663,43 @@ private[spark] class MemoryStore(
 private[storage] class PartiallyUnrolledIterator[T](
 memoryStore: MemoryStore,
 unrollMemory: Long,
-unrolled: Iterator[T],
+private[this] var unrolled: Iterator[T],
 rest: Iterator[T])
   extends Iterator[T] {
 
-  private[this] var unrolledIteratorIsConsumed: Boolean = false
-  private[this] var iter: Iterator[T] = {
-val completionIterator = CompletionIterator[T, Iterator[T]](unrolled, {
-  unrolledIteratorIsConsumed = true
-  memoryStore.releaseUnrollMemoryForThisTask(MemoryMode.ON_HEAP, 
unrollMemory)
-})
-completionIterator ++ rest
--- End diff --

Let's see if I understand the problem. 

Because BlockManager may call `close` early, we cannot rely on 
`CompletionIterator` to free the memory, since we would never actually consume 
all of the elements of `unrolled`, right?
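The failure mode under discussion — a completion callback that only fires on full consumption, versus an explicit early `close()` — can be sketched like this. The names and structure are illustrative, not Spark's actual `PartiallyUnrolledIterator`.

```java
import java.util.Iterator;

// Sketch: a completion callback alone fires only when the iterator is fully
// consumed, so an explicit, idempotent close() is needed to release resources
// when the consumer stops early. Illustrative names only.
public class ReleasingIterator<T> implements Iterator<T> {
    private Iterator<T> underlying;
    private final Runnable release;   // stands in for releasing unroll memory
    private boolean released = false;

    public ReleasingIterator(Iterator<T> underlying, Runnable release) {
        this.underlying = underlying;
        this.release = release;
    }

    @Override
    public boolean hasNext() {
        boolean more = underlying != null && underlying.hasNext();
        if (!more) {
            close();  // fully consumed: release exactly once
        }
        return more;
    }

    @Override
    public T next() {
        return underlying.next();
    }

    // Safe to call early (consumer abandons iteration) or after exhaustion;
    // the released flag makes the call idempotent.
    public void close() {
        if (!released) {
            released = true;
            underlying = null;  // drop the reference so memory can be reclaimed
            release.run();
        }
    }
}
```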





[GitHub] spark issue #15043: [SPARK-17491] Close serialization stream to fix wrong an...

2016-09-12 Thread JoshRosen
Github user JoshRosen commented on the issue:

https://github.com/apache/spark/pull/15043
  
#15056 also touches this code and creates a new test suite for this 
component so I'd prefer to merge that PR first.





[GitHub] spark pull request #14842: [SPARK-10747][SQL] Support NULLS FIRST|LAST claus...

2016-09-12 Thread ericl
Github user ericl commented on a diff in the pull request:

https://github.com/apache/spark/pull/14842#discussion_r78424193
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SortOrder.scala
 ---
@@ -71,12 +95,38 @@ case class SortPrefix(child: SortOrder) extends 
UnaryExpression {
 
   val nullValue = child.child.dataType match {
 case BooleanType | DateType | TimestampType | _: IntegralType =>
-  Long.MinValue
+  if (nullAsSmallest) {
+  Long.MinValue
+  } else {
+Long.MaxValue
+  }
 case dt: DecimalType if dt.precision - dt.scale <= 
Decimal.MAX_LONG_DIGITS =>
-  Long.MinValue
+  if (nullAsSmallest) {
+Long.MinValue
+  } else {
+Long.MaxValue
+  }
 case _: DecimalType =>
-  DoublePrefixComparator.computePrefix(Double.NegativeInfinity)
-case _ => 0L
+  if (nullAsSmallest) {
+DoublePrefixComparator.computePrefix(Double.NegativeInfinity)
+  } else {
+DoublePrefixComparator.computePrefix(Double.NaN)
+  }
+case _ =>
+  if (nullAsSmallest) {
+0L
+  } else {
+-1L
+  }
+  }
+
+  def nullAsSmallest: Boolean = {
--- End diff --

private. Also, you don't need the if else
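The "you don't need the if else" point amounts to using a conditional expression instead of branching statements. A minimal sketch for the long-prefixed case (a Java ternary here for illustration, since the quoted code is Scala, where `if`/`else` is itself an expression):

```java
// Sketch of the reviewer's point for long-prefixed types: the null sentinel
// can be chosen by a single conditional expression. Illustrative only;
// Spark's SortPrefix handles several more data types.
public class NullPrefix {
    public static long longNullValue(boolean nullAsSmallest) {
        // NULLS FIRST under ascending order maps nulls to the smallest prefix,
        // NULLS LAST to the largest.
        return nullAsSmallest ? Long.MIN_VALUE : Long.MAX_VALUE;
    }
}
```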





[GitHub] spark pull request #14842: [SPARK-10747][SQL] Support NULLS FIRST|LAST claus...

2016-09-12 Thread ericl
Github user ericl commented on a diff in the pull request:

https://github.com/apache/spark/pull/14842#discussion_r78424255
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SortOrder.scala ---
@@ -71,12 +95,38 @@ case class SortPrefix(child: SortOrder) extends 
UnaryExpression {
 
   val nullValue = child.child.dataType match {
 case BooleanType | DateType | TimestampType | _: IntegralType =>
-  Long.MinValue
+  if (nullAsSmallest) {
+  Long.MinValue
--- End diff --

nit: indentation





[GitHub] spark pull request #14842: [SPARK-10747][SQL] Support NULLS FIRST|LAST claus...

2016-09-12 Thread ericl
Github user ericl commented on a diff in the pull request:

https://github.com/apache/spark/pull/14842#discussion_r78424378
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SortOrder.scala ---
@@ -21,26 +21,44 @@ import org.apache.spark.sql.catalyst.InternalRow
 import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
 import org.apache.spark.sql.catalyst.expressions.codegen.{CodegenContext, 
ExprCode}
 import org.apache.spark.sql.types._
-import 
org.apache.spark.util.collection.unsafe.sort.PrefixComparators.BinaryPrefixComparator
-import 
org.apache.spark.util.collection.unsafe.sort.PrefixComparators.DoublePrefixComparator
+import org.apache.spark.util.collection.unsafe.sort.PrefixComparators._
 
 abstract sealed class SortDirection {
   def sql: String
+  def defaultNullOrdering: NullOrdering
+}
+
+abstract sealed class NullOrdering {
+  def sql: String
 }
 
 case object Ascending extends SortDirection {
   override def sql: String = "ASC"
+  override def defaultNullOrdering: NullOrdering = NullsFirst
 }
 
+// default null order is last for desc
--- End diff --

remove





[GitHub] spark issue #15063: [SPARK-17463][Core]Make CollectionAccumulator and SetAcc...

2016-09-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15063
  
**[Test build #3255 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3255/consoleFull)**
 for PR 15063 at commit 
[`21b6b4d`](https://github.com/apache/spark/commit/21b6b4dbb9cbc977c6e4aa8527532b3e933bf7c2).





[GitHub] spark issue #15062: SPARK-17424: Fix unsound substitution bug in ScalaReflec...

2016-09-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15062
  
**[Test build #65269 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65269/consoleFull)**
 for PR 15062 at commit 
[`931f156`](https://github.com/apache/spark/commit/931f156450da83f82bddc4356fb14babd56ec625).





[GitHub] spark issue #15061: [SPARK-14818] Post-2.0 MiMa exclusion and build changes

2016-09-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15061
  
**[Test build #65268 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65268/consoleFull)**
 for PR 15061 at commit 
[`1224e75`](https://github.com/apache/spark/commit/1224e758fc4cf69e27f013615d52b5c96696506b).





[GitHub] spark issue #10655: [SPARK-12639][SQL] Improve Explain for Datasources with ...

2016-09-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/10655
  
**[Test build #65267 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65267/consoleFull)**
 for PR 10655 at commit 
[`5a0daf6`](https://github.com/apache/spark/commit/5a0daf6590a711f376494419d5419ed2a2b7b26d).





[GitHub] spark issue #14842: [SPARK-10747][SQL] Support NULLS FIRST|LAST clause in OR...

2016-09-12 Thread ericl
Github user ericl commented on the issue:

https://github.com/apache/spark/pull/14842
  
A few more minor comments but otherwise the prefix parts look good to me!





[GitHub] spark issue #14842: [SPARK-10747][SQL] Support NULLS FIRST|LAST clause in OR...

2016-09-12 Thread xwu0226
Github user xwu0226 commented on the issue:

https://github.com/apache/spark/pull/14842
  
@ericl Thanks so much for the detailed review and suggestions. I will fix 
the last comments. 





[GitHub] spark pull request #15056: [SPARK-17503][Core] Fix memory leak in Memory sto...

2016-09-12 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/15056#discussion_r78425128
  
--- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala ---
@@ -663,31 +663,43 @@ private[spark] class MemoryStore(
 private[storage] class PartiallyUnrolledIterator[T](
 memoryStore: MemoryStore,
 unrollMemory: Long,
-unrolled: Iterator[T],
+private[this] var unrolled: Iterator[T],
 rest: Iterator[T])
   extends Iterator[T] {
 
-  private[this] var unrolledIteratorIsConsumed: Boolean = false
-  private[this] var iter: Iterator[T] = {
-val completionIterator = CompletionIterator[T, Iterator[T]](unrolled, {
-  unrolledIteratorIsConsumed = true
-  memoryStore.releaseUnrollMemoryForThisTask(MemoryMode.ON_HEAP, 
unrollMemory)
-})
-completionIterator ++ rest
--- End diff --

I think the problem here is that the completion iterator is releasing the 
bookkeeping memory for the iterator as soon as the iterator is fully iterated, 
but the on-heap objects are being retained by the reference in the `unrolled` 
field. 
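The leak described above can be sketched in Python: a two-phase iterator that drops its reference to the already-consumed first iterator, so the objects it retained become garbage-collectible. Names here are illustrative stand-ins for Spark's Scala `PartiallyUnrolledIterator`, not its API.

```python
class PartiallyUnrolledIterator:
    def __init__(self, unrolled, rest, release_unroll_memory):
        self._unrolled = iter(unrolled)
        self._rest = iter(rest)
        self._release = release_unroll_memory  # bookkeeping callback

    def __iter__(self):
        return self

    def __next__(self):
        if self._unrolled is not None:
            try:
                return next(self._unrolled)
            except StopIteration:
                # Release the bookkeeping memory *and* the reference,
                # mirroring the `unrolled = null` idea in the Scala change.
                self._unrolled = None
                self._release()
        return next(self._rest)
```

Without `self._unrolled = None`, the callback would release the accounted memory while the field still pins the unrolled objects on the heap, which is exactly the mismatch described in the comment above.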





[GitHub] spark issue #15063: [SPARK-17463][Core]Make CollectionAccumulator and SetAcc...

2016-09-12 Thread JoshRosen
Github user JoshRosen commented on the issue:

https://github.com/apache/spark/pull/15063
  
I think that we may also want to do this for `BlockStatusesAccumulator`





[GitHub] spark issue #10655: [SPARK-12639][SQL] Improve Explain for Datasources with ...

2016-09-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/10655
  
**[Test build #65267 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65267/consoleFull)**
 for PR 10655 at commit 
[`5a0daf6`](https://github.com/apache/spark/commit/5a0daf6590a711f376494419d5419ed2a2b7b26d).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #10655: [SPARK-12639][SQL] Improve Explain for Datasources with ...

2016-09-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/10655
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65267/
Test FAILed.





[GitHub] spark issue #10655: [SPARK-12639][SQL] Improve Explain for Datasources with ...

2016-09-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/10655
  
Merged build finished. Test FAILed.





[GitHub] spark pull request #15056: [SPARK-17503][Core] Fix memory leak in Memory sto...

2016-09-12 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/15056#discussion_r78426904
  
--- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala ---
@@ -663,31 +663,43 @@ private[spark] class MemoryStore(
 private[storage] class PartiallyUnrolledIterator[T](
 memoryStore: MemoryStore,
 unrollMemory: Long,
-unrolled: Iterator[T],
+private[this] var unrolled: Iterator[T],
 rest: Iterator[T])
   extends Iterator[T] {
 
-  private[this] var unrolledIteratorIsConsumed: Boolean = false
-  private[this] var iter: Iterator[T] = {
-val completionIterator = CompletionIterator[T, Iterator[T]](unrolled, {
-  unrolledIteratorIsConsumed = true
-  memoryStore.releaseUnrollMemoryForThisTask(MemoryMode.ON_HEAP, 
unrollMemory)
-})
-completionIterator ++ rest
--- End diff --

oh, yea, you are right.





[GitHub] spark issue #15056: [SPARK-17503][Core] Fix memory leak in Memory store when...

2016-09-12 Thread JoshRosen
Github user JoshRosen commented on the issue:

https://github.com/apache/spark/pull/15056
  
LGTM as well, so I'm going to merge this to master and branch-2.0 
(2.0.1-SNAPSHOT). Thanks!





[GitHub] spark pull request #15056: [SPARK-17503][Core] Fix memory leak in Memory sto...

2016-09-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/15056





[GitHub] spark pull request #15036: [SPARK-17483] Refactoring in BlockManager status ...

2016-09-12 Thread ericl
Github user ericl commented on a diff in the pull request:

https://github.com/apache/spark/pull/15036#discussion_r78430377
  
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ---
@@ -1316,21 +1303,31 @@ private[spark] class BlockManager(
 // The block has already been removed; do nothing.
 logWarning(s"Asked to remove block $blockId, which does not exist")
   case Some(info) =>
-// Removals are idempotent in disk store and memory store. At 
worst, we get a warning.
-val removedFromMemory = memoryStore.remove(blockId)
-val removedFromDisk = diskStore.remove(blockId)
-if (!removedFromMemory && !removedFromDisk) {
-  logWarning(s"Block $blockId could not be removed as it was not 
found in either " +
-"the disk, memory, or external block store")
-}
-blockInfoManager.removeBlock(blockId)
-val removeBlockStatus = getCurrentBlockStatus(blockId, info)
-if (tellMaster && info.tellMaster) {
-  reportBlockStatus(blockId, info, removeBlockStatus)
-}
-Option(TaskContext.get()).foreach { c =>
-  c.taskMetrics().incUpdatedBlockStatuses(blockId -> 
removeBlockStatus)
-}
+removeBlockInternal(blockId, tellMaster = tellMaster && 
info.tellMaster)
+addUpdatedBlockStatusToTaskMetrics(blockId, BlockStatus.empty)
+}
+  }
+
+  /**
+   * Internal version of [[removeBlock()]] which assumes that the caller 
already holds a write
+   * lock on the block.
+   */
+  private def removeBlockInternal(blockId: BlockId, tellMaster: Boolean): 
Unit = {
+// Removals are idempotent in disk store and memory store. At worst, 
we get a warning.
+val removedFromMemory = memoryStore.remove(blockId)
+val removedFromDisk = diskStore.remove(blockId)
+if (!removedFromMemory && !removedFromDisk) {
+  logWarning(s"Block $blockId could not be removed as it was not found 
on disk or in memory")
+}
+blockInfoManager.removeBlock(blockId)
+if (tellMaster) {
+  reportBlockStatus(blockId, BlockStatus.empty)
+}
+  }
+
+  private def addUpdatedBlockStatusToTaskMetrics(blockId: BlockId, status: 
BlockStatus): Unit = {
+Option(TaskContext.get()).foreach { c =>
--- End diff --

looks good





[GitHub] spark pull request #15026: [SPARK-17472] [PYSPARK] Better error message for ...

2016-09-12 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/15026#discussion_r78432125
  
--- Diff: python/pyspark/broadcast.py ---
@@ -75,7 +75,13 @@ def __init__(self, sc=None, value=None, 
pickle_registry=None, path=None):
 self._path = path
 
 def dump(self, value, f):
-pickle.dump(value, f, 2)
+try:
+pickle.dump(value, f, 2)
+except pickle.PickleError:
+raise
+except Exception as e:
+msg = "Could not serialize broadcast: " + e.__class__.__name__ 
+ ": " + e.message
+raise pickle.PicklingError(msg)
--- End diff --

Can we log the stacktrace here?
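A hedged sketch of the suggested change, with the original stack trace logged before the wrap (illustrative, not the actual `pyspark/broadcast.py` patch):

```python
import pickle
import traceback

def dump(value, f):
    try:
        pickle.dump(value, f, 2)
    except pickle.PickleError:
        raise
    except Exception as e:
        # Log the original stack trace before wrapping, so the root
        # cause survives the re-raise as a PicklingError.
        traceback.print_exc()
        msg = "Could not serialize broadcast: %s: %s" % (
            e.__class__.__name__, str(e))
        raise pickle.PicklingError(msg)
```

Note `str(e)` rather than the diff's `e.message`: the latter exists only on Python 2.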





[GitHub] spark pull request #15026: [SPARK-17472] [PYSPARK] Better error message for ...

2016-09-12 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/15026#discussion_r78432383
  
--- Diff: python/pyspark/cloudpickle.py ---
@@ -109,6 +109,15 @@ def dump(self, obj):
 if 'recursion' in e.args[0]:
 msg = """Could not pickle object as excessively deep 
recursion required."""
 raise pickle.PicklingError(msg)
+except pickle.PickleError:
+raise
+except Exception as e:
+if "'i' format requires" in e.message:
+msg = "Object too large to serialize: " + e.message
+else:
+msg = "Could not serialize object: " + 
e.__class__.__name__ + ": " + e.message
+raise pickle.PicklingError(msg)
--- End diff --

Same here





[GitHub] spark issue #15026: [SPARK-17472] [PYSPARK] Better error message for seriali...

2016-09-12 Thread davies
Github user davies commented on the issue:

https://github.com/apache/spark/pull/15026
  
Once the original stacktrace is logged, this looks good to me.





[GitHub] spark issue #14467: [SPARK-16861][PYSPARK][CORE] Refactor PySpark accumulato...

2016-09-12 Thread holdenk
Github user holdenk commented on the issue:

https://github.com/apache/spark/pull/14467
  
Ping @MLnick / @srowen ?





[GitHub] spark pull request #15063: [SPARK-17463][Core]Make CollectionAccumulator and...

2016-09-12 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/15063#discussion_r78432833
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/debug/package.scala ---
@@ -107,18 +109,20 @@ package object debug {
   case class DebugExec(child: SparkPlan) extends UnaryExecNode with 
CodegenSupport {
 def output: Seq[Attribute] = child.output
 
-class SetAccumulator[T] extends AccumulatorV2[T, HashSet[T]] {
-  private val _set = new HashSet[T]()
+class SetAccumulator[T] extends AccumulatorV2[T, java.util.Set[T]] {
+  private val _set = Collections.synchronizedSet(new 
java.util.HashSet[T]())
--- End diff --

If you use `Collections.synchronized*`, will serialization of those objects 
also be thread-safe (i.e. will `writeObject` synchronize properly)? What about 
if Kryo is used?





[GitHub] spark issue #15063: [SPARK-17463][Core]Make CollectionAccumulator and SetAcc...

2016-09-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15063
  
**[Test build #65272 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65272/consoleFull)**
 for PR 15063 at commit 
[`5a7183b`](https://github.com/apache/spark/commit/5a7183b8281dd4ced90c4c9d522c96cc6d8e9fb3).





[GitHub] spark pull request #15024: [SPARK-17470][SQL] unify path for data source tab...

2016-09-12 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15024#discussion_r78433410
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -447,17 +461,10 @@ private[spark] class HiveExternalCatalog(conf: 
SparkConf, hadoopConf: Configurat
 } else {
   getProviderFromTableProperties(table).map { provider =>
 assert(provider != "hive", "Hive serde table should not save 
provider in table properties.")
-// SPARK-15269: Persisted data source tables always store the 
location URI as a storage
-// property named "path" instead of standard Hive `dataLocation`, 
because Hive only
-// allows directory paths as location URIs while Spark SQL data 
source tables also
-// allows file paths. So the standard Hive `dataLocation` is 
meaningless for Spark SQL
-// data source tables.
-// Spark SQL may also save external data source in Hive compatible 
format when
-// possible, so that these tables can be directly accessed by 
Hive. For these tables,
-// `dataLocation` is still necessary. Here we also check for input 
format because only
-// these Hive compatible tables set this field.
-val storage = if (table.tableType == EXTERNAL && 
table.storage.inputFormat.isEmpty) {
-  table.storage.copy(locationUri = None)
+// Data source tables always put their location URI (if any) in table properties, to work
+// around a Hive metastore issue. We should read it back before returning the table metadata.
+val storage = if (table.tableType == EXTERNAL) {
--- End diff --

For managed tables, do we need to set  
`table.properties.get(DATASOURCE_LOCATION)` to `locationUri`? Previously, Hive 
does it for us. Now, we explicitly remove it at multiple places. For example, 
`alterTable` removes it for both external tables and managed tables.





[GitHub] spark pull request #15063: [SPARK-17463][Core]Make CollectionAccumulator and...

2016-09-12 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/15063#discussion_r78433588
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/debug/package.scala ---
@@ -107,18 +109,20 @@ package object debug {
   case class DebugExec(child: SparkPlan) extends UnaryExecNode with 
CodegenSupport {
 def output: Seq[Attribute] = child.output
 
-class SetAccumulator[T] extends AccumulatorV2[T, HashSet[T]] {
-  private val _set = new HashSet[T]()
+class SetAccumulator[T] extends AccumulatorV2[T, java.util.Set[T]] {
+  private val _set = Collections.synchronizedSet(new 
java.util.HashSet[T]())
--- End diff --

For Java serialization, it's synchronized. See: 
http://www.grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8u40-b25/java/util/Collections.java#2080

Do we use Kryo to serialize Heartbeat?
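The JVM discussion above hinges on one property: the same lock guards both mutation and the snapshot taken for serialization. A Python analogue of that property (a hypothetical sketch, not Spark's `SetAccumulator`):

```python
import threading

class SetAccumulator:
    def __init__(self):
        self._lock = threading.Lock()
        self._set = set()

    def add(self, v):
        with self._lock:
            self._set.add(v)

    def value(self):
        # Copy under the lock so a concurrent add() can never be observed
        # half-applied; the same guarantee synchronizedSet's writeObject
        # gives under Java serialization.
        with self._lock:
            return set(self._set)
```

A serializer that does not go through this method (as Kryo can bypass `writeObject`) would need its own synchronization, which is the open question in the thread.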





[GitHub] spark issue #14857: [SPARK-17261][PYSPARK] Using HiveContext after re-creati...

2016-09-12 Thread coderfi
Github user coderfi commented on the issue:

https://github.com/apache/spark/pull/14857
  
Awesome, we ran into this problem as well, and finally had some bandwidth 
to track down the cause well enough to be able to search for this pull request. 
Looking forward to testing this in our fork.





[GitHub] spark pull request #15030: [SPARK-17474] [SQL] fix python udf in TakeOrdered...

2016-09-12 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/15030#discussion_r78434025
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala ---
@@ -148,8 +148,8 @@ case class TakeOrderedAndProjectExec(
 localTopK, child.output, SinglePartition, serializer))
 shuffled.mapPartitions { iter =>
   val topK = 
org.apache.spark.util.collection.Utils.takeOrdered(iter.map(_.copy()), 
limit)(ord)
-  if (projectList.isDefined) {
-val proj = UnsafeProjection.create(projectList.get, child.output)
+  if (AttributeSet(projectList) != child.outputSet) {
--- End diff --

Should this be an order-insensitive, set-based comparison, or should it be
using `AttributeSeq` instead? I'm wondering whether we could hit a bug in case 
the project happens to permute the child output columns, since in that case I 
think we'd end up skipping the final column-reordering projection.





[GitHub] spark issue #15060: [SPARK-17507][ML][MLLib] check weight vector size in ANN

2016-09-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15060
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65266/
Test PASSed.





[GitHub] spark issue #15060: [SPARK-17507][ML][MLLib] check weight vector size in ANN

2016-09-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15060
  
**[Test build #65266 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65266/consoleFull)**
 for PR 15060 at commit 
[`e89aef5`](https://github.com/apache/spark/commit/e89aef5fbd5be5b255e623cdfca8ae75ecb92ea3).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15060: [SPARK-17507][ML][MLLib] check weight vector size in ANN

2016-09-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15060
  
Merged build finished. Test PASSed.





[GitHub] spark pull request #15030: [SPARK-17474] [SQL] fix python udf in TakeOrdered...

2016-09-12 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/15030#discussion_r78435023
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala ---
@@ -148,8 +148,8 @@ case class TakeOrderedAndProjectExec(
 localTopK, child.output, SinglePartition, serializer))
 shuffled.mapPartitions { iter =>
   val topK = 
org.apache.spark.util.collection.Utils.takeOrdered(iter.map(_.copy()), 
limit)(ord)
-  if (projectList.isDefined) {
-val proj = UnsafeProjection.create(projectList.get, child.output)
+  if (AttributeSet(projectList) != child.outputSet) {
--- End diff --

Good point, we should just compare it against `child.output` as a Seq directly.
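The pitfall under discussion can be illustrated outside Spark: a set-based comparison of attributes cannot detect a projection that merely reorders columns, while an ordered-sequence comparison can. A minimal sketch (column names are illustrative, not Spark code):

```python
# Child output and a projection that permutes the same columns.
child_output = ["a", "b", "c"]
project_list = ["c", "a", "b"]  # same attributes, different order

# Set-based check (analogous to AttributeSet(projectList) != child.outputSet):
# the attribute sets are identical, so the reordering projection is skipped.
needs_projection_by_set = set(project_list) != set(child_output)

# Sequence-based check (analogous to projectList != child.output):
# the permutation is detected, so the projection is kept.
needs_projection_by_seq = project_list != child_output

print(needs_projection_by_set)  # False -> projection wrongly skipped
print(needs_projection_by_seq)  # True  -> projection correctly applied
```

This is why comparing against `child.output` as a Seq is the safer condition.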





[GitHub] spark issue #15023: Backport [SPARK-5847] Allow for configuring MetricsSyste...

2016-09-12 Thread marmbrus
Github user marmbrus commented on the issue:

https://github.com/apache/spark/pull/15023
  
Thanks for spending the time to backport this, but it does seem a little 
risky to include changes to the configuration system in a maintenance release.  
As such, I'd probably err on the side of caution and close this PR unless 
there are a lot of 1.6 users clamoring for this functionality.





[GitHub] spark issue #15030: [SPARK-17474] [SQL] fix python udf in TakeOrderedAndProj...

2016-09-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15030
  
**[Test build #65273 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65273/consoleFull)**
 for PR 15030 at commit 
[`1e319d8`](https://github.com/apache/spark/commit/1e319d8f4ef1adf69b4fffa928bc1ac0c0f21805).





[GitHub] spark issue #15024: [SPARK-17470][SQL] unify path for data source table and ...

2016-09-12 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/15024
  

[`showDataSourceTableOptions`](https://github.com/apache/spark/blob/c0ae6bc6ea38909730fad36e653d3c7ab0a84b44/sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala#L866-L873)
 in SHOW CREATE TABLE also needs an update. 





[GitHub] spark issue #12601: [SPARK-14525][SQL] Make DataFrameWrite.save work for jdb...

2016-09-12 Thread JustinPihony
Github user JustinPihony commented on the issue:

https://github.com/apache/spark/pull/12601
  
@srowen Documentation added.





[GitHub] spark pull request #14644: [MESOS] Enable GPU support with Mesos

2016-09-12 Thread skonto
Github user skonto commented on a diff in the pull request:

https://github.com/apache/spark/pull/14644#discussion_r78438011
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala
 ---
@@ -103,6 +103,7 @@ private[spark] class MesosCoarseGrainedSchedulerBackend(
   private val stateLock = new ReentrantLock
 
   val extraCoresPerExecutor = conf.getInt("spark.mesos.extra.cores", 0)
+  val maxGpus = conf.getInt("spark.mesos.gpus.max", 0)
--- End diff --

@tnachen I think it is only a threshold per offer, not necessarily per node. 
You may get multiple offers for GPUs from the same node, correct? Each offer is 
just checked against the max; there is no accounting of how many GPUs have 
already been assigned per node. 
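The concern is that a per-offer cap is not a per-node cap: without cumulative accounting, repeated offers from one node can each pass the check individually while the total exceeds the intended maximum. A hedged sketch of the difference (numbers and names are illustrative, not the actual Mesos scheduler logic):

```python
max_gpus = 4
offers_from_same_node = [3, 3]  # two offers arriving from one node

# Per-offer check only: each offer is compared against the cap in isolation,
# so both pass and the node ends up over-allocated.
accepted_no_accounting = [g for g in offers_from_same_node if g <= max_gpus]
total_without_accounting = sum(accepted_no_accounting)  # 6, exceeds max_gpus

# With cumulative accounting, later offers only take what remains under the cap.
assigned = 0
accepted_with_accounting = []
for g in offers_from_same_node:
    take = min(g, max_gpus - assigned)  # remaining headroom under the cap
    if take > 0:
        accepted_with_accounting.append(take)
        assigned += take
total_with_accounting = sum(accepted_with_accounting)  # 4, respects max_gpus

print(total_without_accounting, total_with_accounting)  # 6 4
```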





[GitHub] spark issue #14467: [SPARK-16861][PYSPARK][CORE] Refactor PySpark accumulato...

2016-09-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14467
  
**[Test build #65274 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65274/consoleFull)**
 for PR 14467 at commit 
[`6169c3c`](https://github.com/apache/spark/commit/6169c3c6ad0e17566467876edc43898e668037ce).





[GitHub] spark issue #11105: [SPARK-12469][CORE] Data Property accumulators for Spark

2016-09-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/11105
  
**[Test build #65276 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65276/consoleFull)**
 for PR 11105 at commit 
[`491499d`](https://github.com/apache/spark/commit/491499d34e8481cfb9ef43a8871b52bbee3f4638).





[GitHub] spark issue #12601: [SPARK-14525][SQL] Make DataFrameWrite.save work for jdb...

2016-09-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/12601
  
**[Test build #65275 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65275/consoleFull)**
 for PR 12601 at commit 
[`7ef7a48`](https://github.com/apache/spark/commit/7ef7a489b27fa6bd5d79ee4d428874162fd813de).





[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...

2016-09-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13758
  
**[Test build #65277 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65277/consoleFull)**
 for PR 13758 at commit 
[`45ae9bf`](https://github.com/apache/spark/commit/45ae9bf8295acad26b3b017a9653533843ec39b7).




