date:20170414

[GitHub] spark issue #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17581
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75819/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17581
  
Merged build finished. Test PASSed.



[GitHub] spark issue #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-14 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17581
  
**[Test build #75819 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75819/testReport)**
 for PR 17581 at commit 
[`f8f85a3`](https://github.com/apache/spark/commit/f8f85a3c70e00d53195c95c7d884d6d8ef6a469a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #17630: [SPARK-20318][SQL] Use Catalyst type for min/max in Colu...

2017-04-14 Thread wzhfy

Github user wzhfy commented on the issue:

https://github.com/apache/spark/pull/17630
  
> are we storing UTF8Strings directly in the catalog for statistics? That 
doesn't make sense ... if we are not, then we are not using internal types.

@rxin By "in the catalog for statistics", do you mean the statistics in the 
metastore? We still use external types for statistics in the metastore. What 
this PR changes is the type of min/max in `ColumnStat`, so we don't have 
this problem here.

> My concern is that the internal types are specific to the physical 
execution path and stats/CBO are independent of that. We can in the future 
change the internal data types without changing CBO.

Since literal values are internal, stats/CBO need to be consistent with 
them to do estimation, so it's hard for CBO to be independent of the internal 
types. If the internal types change in the future, we can update the 
conversion contract defined in `ColumnStat` to match.
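
To illustrate the consistency argument, here is a minimal Scala sketch. It 
assumes, as in Spark SQL, that a date is held internally as an `Int` counting 
days since the epoch. `InternalStats`, `toInternalDate`, and 
`mayMatchLessThan` are hypothetical names used only for illustration; they are 
not part of `ColumnStat` or of this PR.

```scala
import java.time.LocalDate

// Hypothetical stand-in for column statistics; not Spark's ColumnStat.
// min and max are kept in the internal form (days since 1970-01-01).
final case class InternalStats(min: Int, max: Int)

object StatsSketch {
  // Conversion contract: external date value -> internal Int (epoch days).
  def toInternalDate(d: LocalDate): Int = d.toEpochDay.toInt

  // A predicate `col < literal` can only match rows when the literal is
  // greater than the column minimum. Both sides are already internal Ints,
  // so the estimator compares them directly without converting per call.
  def mayMatchLessThan(stats: InternalStats, internalLiteral: Int): Boolean =
    stats.min < internalLiteral
}
```

If the internal representation of dates changed, only `toInternalDate` would 
need to change in this sketch; the comparison logic would stay the same.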



[GitHub] spark issue #17623: [SPARK-20292][SQL] Clean up string representation of Tre...

2017-04-14 Thread viirya

Github user viirya commented on the issue:

https://github.com/apache/spark/pull/17623
  
cc @cloud-fan 



[GitHub] spark issue #17623: [SPARK-20292][SQL][WIP] Clean up string representation o...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17623
  
Merged build finished. Test PASSed.



[GitHub] spark issue #17623: [SPARK-20292][SQL][WIP] Clean up string representation o...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17623
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75818/
Test PASSed.



[GitHub] spark issue #17623: [SPARK-20292][SQL][WIP] Clean up string representation o...

2017-04-14 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17623
  
**[Test build #75818 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75818/testReport)**
 for PR 17623 at commit 
[`a21675d`](https://github.com/apache/spark/commit/a21675d37d66a0fbf1a15a7e714bfe596814431d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #17642: [SPARK-20343][BUILD] Force Avro 1.7.7 in sbt build to re...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17642
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75817/
Test FAILed.



[GitHub] spark issue #17642: [SPARK-20343][BUILD] Force Avro 1.7.7 in sbt build to re...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17642
  
Merged build finished. Test FAILed.



[GitHub] spark issue #17642: [SPARK-20343][BUILD] Force Avro 1.7.7 in sbt build to re...

2017-04-14 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17642
  
**[Test build #75817 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75817/testReport)**
 for PR 17642 at commit 
[`1ae57f2`](https://github.com/apache/spark/commit/1ae57f2e569462734d89d9c8c77e765859ce8393).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #17556: [SPARK-16957][MLlib] Use weighted midpoints for split va...

2017-04-14 Thread facaiy

Github user facaiy commented on the issue:

https://github.com/apache/spark/pull/17556
  
@sethah Perhaps it's hard to compare R's behavior with Spark's, since many 
factors are involved. I'd like to read R GBM's code and verify that both 
sides' designs are consistent on split criteria. Is that OK?



[GitHub] spark pull request #17556: [SPARK-16957][MLlib] Use weighted midpoints for s...

2017-04-14 Thread facaiy

Github user facaiy commented on a diff in the pull request:

https://github.com/apache/spark/pull/17556#discussion_r111656245
  
--- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala ---
@@ -104,6 +104,18 @@ class RandomForestSuite extends SparkFunSuite with MLlibTestSparkContext {
   assert(splits.distinct.length === splits.length)
 }
 
+// SPARK-16957: Use weighted midpoints for split values.
+{
+  val fakeMetadata = new DecisionTreeMetadata(1, 0, 0, 0,
+    Map(), Set(),
+    Array(2), Gini, QuantileStrategy.Sort,
+    0, 0, 0.0, 0, 0
+  )
+  val featureSamples = Array(0, 1, 0, 0, 1, 0, 1, 1).map(_.toDouble)
+  val splits = RandomForest.findSplitsForContinuousFeature(featureSamples, fakeMetadata, 0)
+  assert(splits === Array(0.5))
--- End diff --

Added a new test case.



[GitHub] spark pull request #17556: [SPARK-16957][MLlib] Use weighted midpoints for s...

2017-04-14 Thread facaiy

Github user facaiy commented on a diff in the pull request:

https://github.com/apache/spark/pull/17556#discussion_r111656240
  
--- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala ---
@@ -126,9 +138,10 @@ class RandomForestSuite extends SparkFunSuite with MLlibTestSparkContext {
 Array(3), Gini, QuantileStrategy.Sort,
 0, 0, 0.0, 0, 0
   )
-  val featureSamples = Array(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 4, 5).map(_.toDouble)
+  val featureSamples = Array(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 4, 5)
+    .map(_.toDouble)
   val splits = RandomForest.findSplitsForContinuousFeature(featureSamples, fakeMetadata, 0)
-  assert(splits === Array(2.0, 3.0))
+  assert(splits === Array(2.0625, 3.5))
--- End diff --

done.



[GitHub] spark pull request #17556: [SPARK-16957][MLlib] Use weighted midpoints for s...

2017-04-14 Thread facaiy

Github user facaiy commented on a diff in the pull request:

https://github.com/apache/spark/pull/17556#discussion_r111656235
  
--- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala ---
@@ -112,9 +124,9 @@ class RandomForestSuite extends SparkFunSuite with MLlibTestSparkContext {
 Array(5), Gini, QuantileStrategy.Sort,
 0, 0, 0.0, 0, 0
   )
-  val featureSamples = Array(1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3).map(_.toDouble)
+  val featureSamples = Array(1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3).map(_.toDouble)
   val splits = RandomForest.findSplitsForContinuousFeature(featureSamples, fakeMetadata, 0)
-  assert(splits === Array(1.0, 2.0))
+  assert(splits === Array(1.8, 2.2))
--- End diff --

done.
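
For readers following the expected values in the updated assertions, the 
arithmetic can be sketched in plain Scala. This is only an illustration of the 
weighted-midpoint idea, assuming the threshold between two adjacent distinct 
values is their mean weighted by sample counts. `WeightedMidpoints` and 
`candidateThresholds` are hypothetical names, and the sketch deliberately 
leaves out the bin-count limiting that 
`RandomForest.findSplitsForContinuousFeature` also performs.

```scala
object WeightedMidpoints {
  // For each pair of adjacent distinct values, return the midpoint weighted
  // by how many samples carry each value.
  def candidateThresholds(samples: Array[Double]): Array[Double] = {
    // Distinct values in ascending order, each paired with its sample count.
    val counts: Array[(Double, Int)] =
      samples.groupBy(identity).map { case (v, vs) => (v, vs.length) }.toArray.sortBy(_._1)

    // sliding(2) yields adjacent pairs; a lone value produces no threshold.
    counts.sliding(2).collect {
      case Array((prev, prevCount), (cur, curCount)) =>
        (prev * prevCount + cur * curCount) / (prevCount + curCount)
    }.toArray
  }
}
```

For the samples `1, 1, 2 (x8), 3, 3` this gives `(1*2 + 2*8) / 10 = 1.8` and 
`(2*8 + 3*2) / 10 = 2.2`, matching the new assertion above. The plain 
midpoints would have been 1.5 and 2.5.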



[GitHub] spark pull request #17637: [SPARK-20337][CORE] Support upgrade a jar depende...

2017-04-14 Thread wangyum

Github user wangyum closed the pull request at:

https://github.com/apache/spark/pull/17637



[GitHub] spark issue #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-14 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17581
  
**[Test build #75819 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75819/testReport)**
 for PR 17581 at commit 
[`f8f85a3`](https://github.com/apache/spark/commit/f8f85a3c70e00d53195c95c7d884d6d8ef6a469a).



[GitHub] spark pull request #17642: [SPARK-20343][BUILD] Force Avro 1.7.7 in sbt buil...

2017-04-14 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/17642#discussion_r111654148
  
--- Diff: project/SparkBuild.scala ---
@@ -448,7 +448,9 @@ object DockerIntegrationTests {
  */
 object DependencyOverrides {
   lazy val settings = Seq(
-dependencyOverrides += "com.google.guava" % "guava" % "14.0.1")
+dependencyOverrides ++= Set(
--- End diff --

Using `Seq` produces an error as below:

```
[error] .../spark/project/SparkBuild.scala:451: No implicit for Append.Values[Set[sbt.ModuleID], Seq[sbt.ModuleID]] found,
[error]   so Seq[sbt.ModuleID] cannot be appended to Set[sbt.ModuleID]
[error] dependencyOverrides ++= Seq(
[error] ^
[error] one error found
```
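
For context, a sketch of the shape that compiles under sbt 0.13, where 
`dependencyOverrides` is a `SettingKey[Set[ModuleID]]`. The Guava coordinates 
come from the diff above. The Avro coordinates are an assumption based on the 
PR title, not copied from the patch.

```scala
// Fragment of an sbt 0.13 build definition (for example project/SparkBuild.scala).
import sbt._
import sbt.Keys._

object DependencyOverrides {
  lazy val settings = Seq(
    // The key holds a Set[ModuleID] in sbt 0.13, so the appended collection
    // must also be a Set; appending a Seq triggers the error quoted above.
    dependencyOverrides ++= Set(
      "com.google.guava" % "guava" % "14.0.1",
      // Assumed coordinates for the Avro override named in the PR title.
      "org.apache.avro" % "avro" % "1.7.7"))
}
```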



[GitHub] spark issue #17642: [SPARK-20343][BUILD] Force Avro 1.7.7 in sbt build to re...

2017-04-14 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17642
  
cc @srowen and @vanzin. I think this is apparently a similar issue to 
[SPARK-11538](https://issues.apache.org/jira/browse/SPARK-11538). Could you 
check if it makes sense?

I think this will resolve the problem as a safe workaround.



[GitHub] spark pull request #17642: [SPARK-20343][BUILD] Force Avro 1.7.7 in sbt buil...

2017-04-14 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/17642#discussion_r111653875
  
--- Diff: project/SparkBuild.scala ---
@@ -448,7 +448,9 @@ object DockerIntegrationTests {
  */
 object DependencyOverrides {
   lazy val settings = Seq(
-dependencyOverrides += "com.google.guava" % "guava" % "14.0.1")
+dependencyOverrides ++= Set(
--- End diff --

It seems this requires `Set`.



[GitHub] spark issue #17623: [SPARK-20292][SQL][WIP] Clean up string representation o...

2017-04-14 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17623
  
**[Test build #75818 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75818/testReport)**
 for PR 17623 at commit 
[`a21675d`](https://github.com/apache/spark/commit/a21675d37d66a0fbf1a15a7e714bfe596814431d).



[GitHub] spark issue #17642: [SPARK-20343][BUILD] Force Avro 1.7.7 in sbt build to re...

2017-04-14 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17642
  
**[Test build #75817 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75817/testReport)**
 for PR 17642 at commit 
[`1ae57f2`](https://github.com/apache/spark/commit/1ae57f2e569462734d89d9c8c77e765859ce8393).



[GitHub] spark pull request #17642: [SPARK-20343][BUILD] Force Avro 1.7.7 in sbt buil...

2017-04-14 Thread HyukjinKwon

GitHub user HyukjinKwon opened a pull request:

https://github.com/apache/spark/pull/17642

[SPARK-20343][BUILD] Force Avro 1.7.7 in sbt build to resolve build failure 
in SBT Hadoop 2.6 master on Jenkins

## What changes were proposed in this pull request?

Currently, the SBT master build fails, but only for Hadoop 2.6. 
It seems dependency resolution can differ for that build.

```
[error] /home/jenkins/workspace/spark-master-test-sbt-hadoop-2.6/core/src/main/scala/org/apache/spark/serializer/GenericAvroSerializer.scala:123: value createDatumWriter is not a member of org.apache.avro.generic.GenericData
[error] writerCache.getOrElseUpdate(schema, GenericData.get.createDatumWriter(schema))
[error] 
```


https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.6/2770/consoleFull

## How was this patch tested?

I tried many ways but was unable to reproduce this locally. Sean also 
tried the same approach and was also unable to reproduce it.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/HyukjinKwon/spark SPARK-20343

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17642.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17642


commit 1ae57f2e569462734d89d9c8c77e765859ce8393
Author: hyukjinkwon 
Date:   2017-04-15T00:25:33Z

Explicitly override Avro version





[GitHub] spark pull request #17596: [SPARK-12837][CORE] Do not send the accumulator n...

2017-04-14 Thread viirya

Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/17596#discussion_r111653739
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala ---
@@ -537,3 +539,27 @@ class ParquetFilterSuite extends QueryTest with ParquetTest with SharedSQLContex
 }
   }
 }
+
+class NumRowGroupsAcc extends AccumulatorV2[Integer, Integer] {
--- End diff --

oh. This approach looks good.



[GitHub] spark issue #17149: [SPARK-19257][SQL]location for table/partition/database ...

2017-04-14 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17149
  
@cnauroth Thank you so much for your help.



[GitHub] spark issue #17149: [SPARK-19257][SQL]location for table/partition/database ...

2017-04-14 Thread cnauroth

Github user cnauroth commented on the issue:

https://github.com/apache/spark/pull/17149
  
@HyukjinKwon , nice to meet you!  I see I got notified here for a bit of 
Hadoop `Path` knowledge, and particularly on Windows.

> Is it okay to use both URIs and local file paths for the input string for 
org.apache.hadoop.fs.Path in general (when they are expected to be unescaped)?

Yes, this is correct.

Specifically on the topic of Windows, `Path` has special case logic for 
handling a Windows-specific local file path.  (This logic is only triggered if 
it detects the runtime OS is Windows.)  On Windows, I expect a call like `new 
Path("C:\\foo\\bar").toUri` to yield a correct `URI` pointing at that local 
file path, and further calling `toString` yields a correct `String` 
representation of the path.  Hadoop code often needs to take a path string that 
is possibly a relative path and pass it through `Path` to make it absolute and 
escape it according to Hadoop code expectations.

The standard invocation for doing this in the Hadoop code is `new 
Path(...).toUri();` or `new Path(...).toUri().toString();`.  This works across 
all platforms.  I don't have any knowledge of the Spark codebase, but I see 
this patch uses similar invocations, so I expect it's good.
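
As a concrete illustration of that invocation, here is a minimal sketch in 
Scala, to match the Spark codebase. It assumes `hadoop-common` is on the 
classpath. `PathSketch` and `toUriString` are hypothetical helper names, not 
part of Hadoop or of this patch.

```scala
// Requires hadoop-common on the classpath.
import org.apache.hadoop.fs.Path

object PathSketch {
  // Route a raw path string (a URI or a local file path) through Path and
  // return the string form of the resulting URI. Characters that are not
  // legal in a URI, such as spaces, come back escaped.
  def toUriString(raw: String): String = new Path(raw).toUri.toString
}
```

The same call shape is what the quoted Java invocation 
`new Path(...).toUri().toString()` does; only the syntax differs.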





[GitHub] spark issue #17416: [SPARK-20075][CORE][WIP] Support classifier, packaging i...

2017-04-14 Thread BryanCutler

Github user BryanCutler commented on the issue:

https://github.com/apache/spark/pull/17416
  
@srowen , I finally had some time to look into this and I was able to get 
the correct jar on the classpath.  The fix was to use the code you had in the 
previous commit for `SparkSubmit.addDependenciesToIvy` so that the 
extraAttributes is set with `dd.addDependencyArtifact` and doesn't need to be 
in the `ModuleRevisionId` - so it was my bad advice that probably screwed this 
up :<

The reason is that when the `DefaultDependencyDescriptor` gets resolved in 
DefaultModuleDescriptor.java, if there are no artifacts defined, it adds one 
default artifact but does not copy over the `extraAttributes`, which is why the 
resolve report doesn't know about them.  But if there are artifacts (which come 
from `addDependencyArtifact`), then the `extraAttributes` are carried over.  
Wow, this is really confusing, but hopefully it makes sense. See the code 
below: `BasicResolver.getDependency(DependencyDescriptor dd, ResolveData data)` 
calls `DefaultModuleDescriptor.newDefaultInstance`.

```java
public static DefaultModuleDescriptor newDefaultInstance(ModuleRevisionId mrid,
        DependencyArtifactDescriptor[] artifacts) {
    DefaultModuleDescriptor moduleDescriptor = new DefaultModuleDescriptor(mrid, "release",
            null, true);
    moduleDescriptor.addConfiguration(new Configuration(DEFAULT_CONFIGURATION));
    if (artifacts != null && artifacts.length > 0) {
        for (int i = 0; i < artifacts.length; i++) {
            moduleDescriptor.addArtifact(DEFAULT_CONFIGURATION,
                new MDArtifact(moduleDescriptor, artifacts[i].getName(),
                        artifacts[i].getType(), artifacts[i].getExt(), artifacts[i].getUrl(),
                        artifacts[i].getExtraAttributes()));
        }
    } else {
        moduleDescriptor.addArtifact(DEFAULT_CONFIGURATION, new MDArtifact(moduleDescriptor,
                mrid.getName(), "jar", "jar"));
    }
    moduleDescriptor.setLastModified(System.currentTimeMillis());
    return moduleDescriptor;
}
```

I think that some other code you added in the second commit was also 
required, which is maybe why it didn't work for you in the first place, but 
give it another try.  Here is the output from my test, looks like it should 
work now:

```
bin/spark-submit --packages edu.stanford.nlp:stanford-corenlp:jar:models:3.4.1 -v examples/src/main/python/pi.py
Using properties file: /home/bryan/git/spark/conf/spark-defaults.conf
Adding default property: spark.history.fs.logDirectory=/home/bryan/git/spark/logs/history
Adding default property: spark.eventLog.dir=/home/bryan/git/spark/logs/history
Adding default property: drill.enable_unsafe_memory_access=false
Warning: Ignoring non-spark config property: drill.enable_unsafe_memory_access=false
Parsed arguments:
  master                  local[*]
  deployMode              null
  executorMemory          null
  executorCores           null
  totalExecutorCores      null
  propertiesFile          /home/bryan/git/spark/conf/spark-defaults.conf
  driverMemory            null
  driverCores             null
  driverExtraClassPath    null
  driverExtraLibraryPath  null
  driverExtraJavaOptions  null
  supervise               false
  queue                   null
  numExecutors            null
  files                   null
  pyFiles                 null
  archives                null
  mainClass               null
  primaryResource         file:/home/bryan/git/spark/examples/src/main/python/pi.py
  name                    pi.py
  childArgs               []
  jars                    null
  packages                edu.stanford.nlp:stanford-corenlp:jar:models:3.4.1
  packagesExclusions      null
  repositories            null
  verbose                 true

Spark properties used, including those specified through
 --conf and those from the properties file /home/bryan/git/spark/conf/spark-defaults.conf:
  (spark.history.fs.logDirectory,/home/bryan/git/spark/logs/history)
  (spark.eventLog.dir,/home/bryan/git/spark/logs/history)


Ivy Default Cache set to: /home/bryan/.ivy2/cache
The jars for the packages stored in: /home/bryan/.ivy2/jars
:: loading settings :: url = jar:file:/home/bryan/git/spark/assembly/target/scala-2.11/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
edu.stanford.nlp#stanford-corenlp added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
confs: [default]
found edu.stanford.nlp#stanford-corenlp;3.4.1 in central
downloading

[GitHub] spark issue #17506: [SPARK-20189][DStream] Fix spark kinesis testcases to re...

2017-04-14 Thread yssharma

Github user yssharma commented on the issue:

https://github.com/apache/spark/pull/17506
  
Thanks @srowen 



[GitHub] spark issue #17641: [SPARK-20329][SQL] Make timezone aware expression withou...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17641
  
Merged build finished. Test FAILed.



[GitHub] spark issue #17641: [SPARK-20329][SQL] Make timezone aware expression withou...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17641
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75816/



[GitHub] spark issue #17641: [SPARK-20329][SQL] Make timezone aware expression withou...

2017-04-14 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17641
  
**[Test build #75816 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75816/testReport)**
 for PR 17641 at commit 
[`0654409`](https://github.com/apache/spark/commit/0654409677dc8f569950fb54eb1d1d1239cdf870).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class ResolveTimeZone(conf: SQLConf) extends Rule[LogicalPlan] `



[GitHub] spark issue #17640: [SPARK-17608][SPARKR]:Long type has incorrect serializat...

2017-04-14 Thread wangmiao1981

Github user wangmiao1981 commented on the issue:

https://github.com/apache/spark/pull/17640
  
cc @felixcheung 



[GitHub] spark issue #17640: [SPARK-17608][SPARKR]:Long type has incorrect serializat...

2017-04-14 Thread wangmiao1981

Github user wangmiao1981 commented on the issue:

https://github.com/apache/spark/pull/17640
  
I will add some bounds checking and error handling.



[GitHub] spark issue #17640: [SPARK-17608][SPARKR]:Long type has incorrect serializat...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17640
  
Merged build finished. Test PASSed.



[GitHub] spark issue #17640: [SPARK-17608][SPARKR]:Long type has incorrect serializat...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17640
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75815/



[GitHub] spark issue #17640: [SPARK-17608][SPARKR]:Long type has incorrect serializat...

2017-04-14 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17640
  
**[Test build #75815 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75815/testReport)**
 for PR 17640 at commit 
[`03b82ac`](https://github.com/apache/spark/commit/03b82ac19dcbe17a70d9e45790dd24210b6d4f07).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #17582: [SPARK-20239][Core] Improve HistoryServer's ACL mechanis...

2017-04-14 Thread ajbozarth

Github user ajbozarth commented on the issue:

https://github.com/apache/spark/pull/17582
  
I've been following this but haven't had time to do a proper review. 
@tgravescs, since you brought up the UI vs. API question: as of 2.0 the UI gets 
its list from the API, so that's where the security has to be handled.



[GitHub] spark issue #17641: [SPARK-20329][SQL] Make timezone aware expression withou...

2017-04-14 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17641
  
**[Test build #75816 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75816/testReport)**
 for PR 17641 at commit 
[`0654409`](https://github.com/apache/spark/commit/0654409677dc8f569950fb54eb1d1d1239cdf870).



[GitHub] spark pull request #17641: [SPARK-20329][SQL] Make timezone aware expression...

2017-04-14 Thread hvanhovell

GitHub user hvanhovell opened a pull request:

https://github.com/apache/spark/pull/17641

[SPARK-20329][SQL] Make timezone aware expression without timezone 
unresolved.

## What changes were proposed in this pull request?
TBD

## How was this patch tested?
TBD

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hvanhovell/spark SPARK-20329

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17641.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17641


commit 0654409677dc8f569950fb54eb1d1d1239cdf870
Author: Herman van Hovell 
Date:   2017-04-14T20:23:32Z

Make timezone aware expression without timezone unresolved.





[GitHub] spark issue #17540: [SPARK-20213][SQL][UI] Fix DataFrameWriter operations in...

2017-04-14 Thread rdblue

Github user rdblue commented on the issue:

https://github.com/apache/spark/pull/17540
  
@cloud-fan, could you have another look at this?

There are a few new changes:
* withNewExecutionId now warns instead of throwing an exception, but still 
throws exceptions if spark.testing is defined
* SQLExecution.nested allows nested execution IDs without test failures or 
warnings. This is needed because several places will nest when 
withNewExecutionId is called at the high-level operations. CacheTableCommand is 
an example.
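
The warn-versus-throw behavior described in the two bullets above can be sketched roughly as follows. This is an illustration in Python, not the Scala code in the PR: `check_execution_id`, its parameters, and the message text are all hypothetical names chosen for the sketch, and `testing` stands in for whether `spark.testing` is defined.

```python
import warnings


def check_execution_id(current_id, testing, allow_nested=False):
    """Decide what to do when a new execution starts while one is already active.

    All names here are hypothetical; this is not Spark's implementation.
    """
    if current_id is None or allow_nested:
        # No active execution, or nesting was explicitly permitted.
        return
    message = f"execution ID is already set: {current_id}"
    if testing:
        # Strict mode: fail loudly so tests catch accidental nesting.
        raise RuntimeError(message)
    # Production mode: report the problem but let the query proceed.
    warnings.warn(message)
```

Under these assumptions, a nested call in production only emits a warning, the same call under test raises, and a caller that opts in through `allow_nested` is never flagged.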

Over the last week, I've fixed nearly all of the tests. The remaining 
failure, SQLExecutionSuite.concurrent query execution (SPARK-10548), is fixed 
in maven, but fails in SBT. The problem is that exceptions are now only thrown 
if `spark.testing` is defined, and for some reason adding it to the test's 
SparkSession or SparkContext doesn't work on Jenkins. Because this test is 
reproducing a case that now will never happen for two reasons (the original 
multi-threading fix and throw only if spark.testing), I'd like to simply remove 
it. Let me know what you think about that.

Other changes to look at:
* `SQLMetricsSuite.save metrics` started failing because there is a nested 
execution ID. This is because there are two SQL physical plans. The first, 
`ExecutedCommandExec`, links in a logical plan that is turned into a second 
physical plan *at runtime*. This means that the inner plan can't report the 
metrics that will be collected when analyzing the outer plan because it doesn't 
exist yet. The long-term solution is to fix `ExecutedCommandExec`, but for now 
this accepts any metrics created by the inner plan.
* `StreamExecution` wasn't calling `withNewExecutionId` and was caught by 
the new assertion. I added the call around the entire execution so that there 
isn't a new SQL execution for every batch. This required creating a special 
`queryExecution` to pass in.
* `DataFrameCallbackSuite` had to be updated to include commands that were 
previously not registered in the SQL tab. The new SQL executions are for 
dropping tables, so the result looks more correct than before.



[GitHub] spark issue #17568: [SPARK-20254][SQL] Remove unnecessary data conversion fo...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17568
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75814/



[GitHub] spark issue #17568: [SPARK-20254][SQL] Remove unnecessary data conversion fo...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17568
  
Merged build finished. Test FAILed.



[GitHub] spark issue #17568: [SPARK-20254][SQL] Remove unnecessary data conversion fo...

2017-04-14 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17568
  
**[Test build #75814 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75814/testReport)**
 for PR 17568 at commit 
[`b47c1f4`](https://github.com/apache/spark/commit/b47c1f4e3f22febc2955c61c38ef794c6ecce158).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class EliminateMapObjectsSuite extends PlanTest `



[GitHub] spark issue #17087: [SPARK-19372][SQL] Fix throwing a Java exception at df.f...

2017-04-14 Thread kiszk

Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/17087
  
@marmbrus could you please take a look?



[GitHub] spark issue #17640: [SPARK-17608][SPARKR]:Long type has incorrect serializat...

2017-04-14 Thread frreiss

Github user frreiss commented on the issue:

https://github.com/apache/spark/pull/17640
  
Overall, this looks like a sensible approach to a messy problem.
You might want to think about adding some overflow handling to the SQL-to-R 
translation. That is, if a DataFrame contains a `bigint` value that cannot be 
expressed exactly as a `Double`, it would be safer to convert that value to NaN 
instead of stripping the low-order bits off the `bigint`. The `bigint` column 
in the source DataFrame could hold a unique identifier or a hash value.
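
To make the concern concrete, here is a minimal sketch in Python, whose `float` is an IEEE 754 double like R's numeric type. The helper name `bigint_to_double` and the round-trip check are assumptions made for illustration; they are not part of the PR or of SparkR.

```python
import math


def bigint_to_double(value: int) -> float:
    """Convert an integer to a double, or NaN if the conversion is inexact.

    Hypothetical helper: a double has a 53-bit significand, so integers
    beyond 2**53 in magnitude may lose their low-order bits.
    """
    as_double = float(value)
    # Round-trip check: if converting back changes the integer,
    # low-order bits were dropped during the conversion.
    if int(as_double) != value:
        return math.nan
    return as_double
```

With this check, `2**53` converts exactly, while `2**53 + 1` and `2**63 - 1` both come back as NaN rather than silently turning into a neighboring value, which matters when the column holds identifiers or hashes.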



[GitHub] spark issue #17540: [SPARK-20213][SQL][UI] Fix DataFrameWriter operations in...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17540
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75813/



[GitHub] spark issue #17540: [SPARK-20213][SQL][UI] Fix DataFrameWriter operations in...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17540
  
Build finished. Test FAILed.



[GitHub] spark issue #17540: [SPARK-20213][SQL][UI] Fix DataFrameWriter operations in...

2017-04-14 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17540
  
**[Test build #75813 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75813/testReport)**
 for PR 17540 at commit 
[`30fa4fc`](https://github.com/apache/spark/commit/30fa4fc8603e68f9295fc65e573f96140bb04ac6).
 * This patch **fails Spark unit tests**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.



[GitHub] spark pull request #17568: [SPARK-20254][SQL] Remove unnecessary data conver...

2017-04-14 Thread kiszk

Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/17568#discussion_r111614833
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/DatasetPrimitiveSuite.scala ---
@@ -253,4 +256,27 @@ class DatasetPrimitiveSuite extends QueryTest with 
SharedSQLContext {
 checkDataset(Seq(PackageClass(1)).toDS(), PackageClass(1))
   }
 
+  test("SPARK-20254: Remove unnecessary data conversion for primitive 
array") {
--- End diff --

Thank you for pointing it out. I implemented non-e2e tests.



[GitHub] spark pull request #17568: [SPARK-20254][SQL] Remove unnecessary data conver...

2017-04-14 Thread kiszk

Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/17568#discussion_r111614744
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/objects.scala
 ---
@@ -96,3 +99,32 @@ object CombineTypedFilters extends Rule[LogicalPlan] {
 }
   }
 }
+
+/**
+ * Removes MapObjects when the following conditions are satisfied
+ *   1. Mapobject(e) where e is lambdavariable(), which means types for 
input output
+ *  are primitive types
+ *   2. no custom collection class specified
+ * representation of data item.  For example back to back map operations.
+ */
+object EliminateMapObjects extends Rule[LogicalPlan] {
+  private def convertDataTypeToArrayClass(dt: DataType): Class[_] = dt 
match {
+case IntegerType => classOf[Array[Int]]
+case LongType => classOf[Array[Long]]
+case DoubleType => classOf[Array[Double]]
+case FloatType => classOf[Array[Float]]
+case ShortType => classOf[Array[Short]]
+case ByteType => classOf[Array[Byte]]
+case BooleanType => classOf[Array[Boolean]]
+  }
+
+  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
+case _ @ DeserializeToObject(_ @ Invoke(
--- End diff --

Yes, I can do that.



[GitHub] spark pull request #17568: [SPARK-20254][SQL] Remove unnecessary data conver...

2017-04-14 Thread kiszk

Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/17568#discussion_r111614710
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
 ---
@@ -368,6 +369,8 @@ case class NullPropagation(conf: SQLConf) extends 
Rule[LogicalPlan] {
   case EqualNullSafe(Literal(null, _), r) => IsNull(r)
   case EqualNullSafe(l, Literal(null, _)) => IsNull(l)
 
+  case a @ AssertNotNull(c, _) if !c.nullable => c
--- End diff --

Good catch. Done.



[GitHub] spark issue #17530: [SPARK-5158] Access kerberized HDFS from Spark standalon...

2017-04-14 Thread mgummelt

Github user mgummelt commented on the issue:

https://github.com/apache/spark/pull/17530
  
> Right now the PR doesn't set that, so it needs to be set under the user's 
HADOOP_CONF even though it had no real effect. That probably should be changed.

Yep, same problem I'm seeing.  Thanks.




[GitHub] spark issue #17621: [SPARK-6227][MLLIB][PYSPARK] Implement PySpark wrappers ...

2017-04-14 Thread MechCoder

Github user MechCoder commented on the issue:

https://github.com/apache/spark/pull/17621
  
Thanks @MLnick !



[GitHub] spark issue #17640: [SPARK-17608][SPARKR]:Long type has incorrect serializat...

2017-04-14 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17640
  
**[Test build #75815 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75815/testReport)**
 for PR 17640 at commit 
[`03b82ac`](https://github.com/apache/spark/commit/03b82ac19dcbe17a70d9e45790dd24210b6d4f07).



[GitHub] spark pull request #17640: [SPARK-17608][SPARKR]:Long type has incorrect ser...

2017-04-14 Thread wangmiao1981

GitHub user wangmiao1981 opened a pull request:

https://github.com/apache/spark/pull/17640

[SPARK-17608][SPARKR]:Long type has incorrect serialization/deserialization

## What changes were proposed in this pull request?
`bigint` is not supported in the schema, and it is not serialized as `Double`.

This PR adds `bigint` support to the schema and serializes and deserializes it as `Double`.

This fix is orthogonal to the precision problem in 
https://issues.apache.org/jira/browse/SPARK-12360  

## How was this patch tested?

Add a new unit test.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/wangmiao1981/spark summary

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17640.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17640


commit 03b82ac19dcbe17a70d9e45790dd24210b6d4f07
Author: wm...@hotmail.com 
Date:   2017-04-14T17:43:35Z

add bigint support





[GitHub] spark issue #17568: [SPARK-20254][SQL] Remove unnecessary data conversion fo...

2017-04-14 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17568
  
**[Test build #75814 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75814/testReport)**
 for PR 17568 at commit 
[`b47c1f4`](https://github.com/apache/spark/commit/b47c1f4e3f22febc2955c61c38ef794c6ecce158).



[GitHub] spark issue #17630: [SPARK-20318][SQL] Use Catalyst type for min/max in Colu...

2017-04-14 Thread rxin

Github user rxin commented on the issue:

https://github.com/apache/spark/pull/17630
  
Wait - are we storing UTF8Strings directly in the catalog for statistics? 
That doesn't make sense ... if we are not, then we are not using internal 
types. In that case we should document clearly what's happening.

My concern is that the internal types are specific to the physical 
execution path and stats/CBO are independent of that. We can in the future 
change the internal data types without changing CBO, and completely screw 
ourselves.




[GitHub] spark issue #17633: [SPARK-20331][SQL] Enhanced Hive partition pruning predi...

2017-04-14 Thread rxin

Github user rxin commented on the issue:

https://github.com/apache/spark/pull/17633
  
Then it should work.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #17623: [SPARK-20292][SQL][WIP] Clean up string representation o...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17623
  
Merged build finished. Test PASSed.



[GitHub] spark issue #17623: [SPARK-20292][SQL][WIP] Clean up string representation o...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17623
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75811/



[GitHub] spark issue #17623: [SPARK-20292][SQL][WIP] Clean up string representation o...

2017-04-14 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17623
  
**[Test build #75811 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75811/testReport)**
 for PR 17623 at commit 
[`c83396e`](https://github.com/apache/spark/commit/c83396e0906e781a493648d70067a91880f9cf8f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #17637: [SPARK-20337][CORE] Support upgrade a jar dependency and...

2017-04-14 Thread vanzin

Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/17637
  
This does not work. Any classes that have already been loaded from the old 
jar will not be unloaded. So you're going to end up with really odd issues when 
two classes from different jars don't agree with each other.



[GitHub] spark issue #17623: [SPARK-20292][SQL][WIP] Clean up string representation o...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17623
  
Merged build finished. Test PASSed.



[GitHub] spark issue #17623: [SPARK-20292][SQL][WIP] Clean up string representation o...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17623
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75810/



[GitHub] spark issue #17623: [SPARK-20292][SQL][WIP] Clean up string representation o...

2017-04-14 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17623
  
**[Test build #75810 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75810/testReport)**
 for PR 17623 at commit 
[`5c057ba`](https://github.com/apache/spark/commit/5c057ba7eb2a68a276387988d5c3eb6419a0cba8).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #17540: [SPARK-20213][SQL][UI] Fix DataFrameWriter operations in...

2017-04-14 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17540
  
**[Test build #75813 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75813/testReport)**
 for PR 17540 at commit 
[`30fa4fc`](https://github.com/apache/spark/commit/30fa4fc8603e68f9295fc65e573f96140bb04ac6).



[GitHub] spark issue #13440: [SPARK-15699] [ML] Implement a Chi-Squared test statisti...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13440
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75812/



[GitHub] spark issue #13440: [SPARK-15699] [ML] Implement a Chi-Squared test statisti...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13440
  
Merged build finished. Test FAILed.



[GitHub] spark issue #13440: [SPARK-15699] [ML] Implement a Chi-Squared test statisti...

2017-04-14 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13440
  
**[Test build #75812 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75812/testReport)**
 for PR 13440 at commit 
[`6762a18`](https://github.com/apache/spark/commit/6762a18dd558b61d5b292787115d6e8c8768ed12).
 * This patch **fails to generate documentation**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #13440: [SPARK-15699] [ML] Implement a Chi-Squared test statisti...

2017-04-14 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13440
  
**[Test build #75812 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75812/testReport)**
 for PR 13440 at commit 
[`6762a18`](https://github.com/apache/spark/commit/6762a18dd558b61d5b292787115d6e8c8768ed12).



[GitHub] spark issue #17633: [SPARK-20331][SQL] Enhanced Hive partition pruning predi...

2017-04-14 Thread mallman

Github user mallman commented on the issue:

https://github.com/apache/spark/pull/17633
  
> Does this work for non-Hive tables?

This is geared towards Hive partitioned tables. If we have another system 
that prunes table partitions based on a stringified pruning predicate, I'm 
unaware of it. Do you have one in mind?



[GitHub] spark issue #13440: [SPARK-15699] [ML] Implement a Chi-Squared test statisti...

2017-04-14 Thread erikerlandson

Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/13440
  
test this please



[GitHub] spark issue #17639: [SPARK-19716][SQL][follow-up] UnresolvedMapObjects shoul...

2017-04-14 Thread koertkuipers

Github user koertkuipers commented on the issue:

https://github.com/apache/spark/pull/17639
  
@cloud-fan thanks for doing this



[GitHub] spark issue #17639: [SPARK-19716][SQL][follow-up] UnresolvedMapObjects shoul...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17639
  
Merged build finished. Test PASSed.



[GitHub] spark issue #17639: [SPARK-19716][SQL][follow-up] UnresolvedMapObjects shoul...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17639
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75809/



[GitHub] spark issue #17639: [SPARK-19716][SQL][follow-up] UnresolvedMapObjects shoul...

2017-04-14 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17639
  
**[Test build #75809 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75809/testReport)**
 for PR 17639 at commit 
[`bb0a14a`](https://github.com/apache/spark/commit/bb0a14a391e7316a1d5caee900f295c9487c6e8a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class UnresolvedMapObjects(`



[GitHub] spark issue #17623: [SPARK-20292][SQL][WIP] Clean up string representation o...

2017-04-14 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17623
  
**[Test build #75811 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75811/testReport)**
 for PR 17623 at commit 
[`c83396e`](https://github.com/apache/spark/commit/c83396e0906e781a493648d70067a91880f9cf8f).



[GitHub] spark issue #17623: [SPARK-20292][SQL][WIP] Clean up string representation o...

2017-04-14 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17623
  
**[Test build #75810 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75810/testReport)**
 for PR 17623 at commit 
[`5c057ba`](https://github.com/apache/spark/commit/5c057ba7eb2a68a276387988d5c3eb6419a0cba8).



[GitHub] spark issue #17582: [SPARK-20239][Core] Improve HistoryServer's ACL mechanis...

2017-04-14 Thread tgravescs

Github user tgravescs commented on the issue:

https://github.com/apache/spark/pull/17582
  
Sorry, but again the wording above and all the different configs leave me a 
bit confused as to what the real issues are here.

>Here actually has two list of acls, one is controlled by 
spark.acls.enabled, if user "A" is not added 
to this acl list, then user "A" cannot see the app list 
(//api/v1/applications). But if this app is run by user 
"A", then user "A" could still see the details of app, like 
(//api/v1/applications//jobs), this acl is 
controlled by "spark.history.ui.acls.enabled", and user "A" is automatically in 
the acl list (because of run by him).

You are mixing things here. You say that if user "A" is not added to the acl 
list, he cannot see the app list. That is broken, then, and I assume it only 
applies to the REST API, not the UI? But I'm not sure what that has to do with 
your second sentence: if user "A" ran the app, then of course he can see the 
details of the app; that is intended. What does that have to do with the first 
issue? If you don't have spark.history.ui.acls.enable set, then it is up to 
what the user set. Generally, in any secure environment you should set 
spark.history.ui.acls.enable=true, and it should enforce acls no matter what 
the user set. It might help for you to describe these in terms of configs: 
which exact configs are set on the history server, which exact configs are set 
on the application side, and which exact APIs are being used (REST vs. web UI).


So all the URLs you list are the REST API. Is this only an issue with the REST 
API, or with the actual web UI as well? It sounds like things are definitely 
broken there, but I'm not sure that requires changing the configs, just fixing 
the things that are broken.

It's supposed to be that if spark.history.ui.acls.enable is enabled, it 
doesn't matter what the setting of spark.acls.enable is; acls should always be 
enforced on the history server. See the description: 
https://spark.apache.org/docs/latest/monitoring.html

Certain UI pages don't contain information that should be sensitive. I thought 
the list of applications was one of those: all users should be able to see the 
entire list of applications. There is nothing sensitive there, but once you 
look at the application details, that should be acl'd. If someone added 
something sensitive, then it should be protected or moved off that page.

My opinions on your response to @vanzin:
1. No, there shouldn't be sensitive information there, and many times a user 
is looking for a job run by, say, a headless user or some other user. I guess 
you could show only the jobs that the user has acls for, but that makes it more 
complicated. Do you have a concrete reason it should be protected? Note that 
this follows how other Hadoop UIs work.

2. That is just broken; the event log should be protected.
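
As a rough illustration of the intended behavior described above (a sketch, not the History Server's actual code), the access check could look like the following. The class, function, and parameter names are hypothetical; only the config names `spark.history.ui.acls.enable` and `spark.acls.enable` come from the discussion. Admin acls and group acls are left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class AppInfo:
    """Hypothetical record for one application known to the history server."""
    owner: str
    acls_enabled: bool = False  # the app's own spark.acls.enable setting
    view_acls: frozenset = field(default_factory=frozenset)


def can_list_applications(user: str) -> bool:
    # The application list is not considered sensitive in the comment above,
    # so every user may see it.
    return True


def can_view_details(user: str, app: AppInfo, history_acls_enabled: bool) -> bool:
    # If spark.history.ui.acls.enable is set on the history server, acls are
    # enforced regardless of the app's own setting; otherwise it is up to
    # what the user set when the application ran.
    enforce = history_acls_enabled or app.acls_enabled
    if not enforce:
        return True
    # The user who ran the app can always see its details.
    return user == app.owner or user in app.view_acls
```

Under these assumptions, a user who is neither the owner nor in the view acls can still list applications but is refused the details whenever the history server setting is on, whatever the application itself configured.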



[GitHub] spark issue #17592: [SPARK-20243][TESTS] DebugFilesystem.assertNoOpenStreams...

2017-04-14 Thread hvanhovell

Github user hvanhovell commented on the issue:

https://github.com/apache/spark/pull/17592
  
I cherry-picked this into branch-2.1.



[GitHub] spark issue #17623: [SPARK-20292][SQL][WIP] Clean up string representation o...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17623
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75808/



[GitHub] spark issue #17623: [SPARK-20292][SQL][WIP] Clean up string representation o...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17623
  
Merged build finished. Test FAILed.



[GitHub] spark issue #17623: [SPARK-20292][SQL][WIP] Clean up string representation o...

2017-04-14 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17623
  
**[Test build #75808 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75808/testReport)**
 for PR 17623 at commit 
[`0be2db8`](https://github.com/apache/spark/commit/0be2db809b28d7e9debbc319145d2928201798c2).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #17639: [SPARK-19716][SQL][follow-up] UnresolvedMapObjects shoul...

2017-04-14 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/17639
  
cc @koertkuipers 



[GitHub] spark issue #17639: [SPARK-19716][SQL][follow-up] UnresolvedMapObjects shoul...

2017-04-14 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17639
  
**[Test build #75809 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75809/testReport)**
 for PR 17639 at commit 
[`bb0a14a`](https://github.com/apache/spark/commit/bb0a14a391e7316a1d5caee900f295c9487c6e8a).



[GitHub] spark pull request #17639: [SPARK-19716][SQL][follow-up] UnresolvedMapObject...

2017-04-14 Thread cloud-fan

GitHub user cloud-fan opened a pull request:

https://github.com/apache/spark/pull/17639

[SPARK-19716][SQL][follow-up] UnresolvedMapObjects should always be 
serializable

## What changes were proposed in this pull request?

In https://github.com/apache/spark/pull/17398 we introduced 
`UnresolvedMapObjects` as a placeholder for `MapObjects`. Unfortunately, 
`UnresolvedMapObjects` is not serializable, as its `function` may reference 
Scala `Type`, which is not serializable.

Ideally this is fine, as we will never serialize and send unresolved 
expressions to executors. However, users may accidentally do this, e.g. by 
mistakenly referencing an encoder instance when implementing `Aggregator`. We 
should fix it so that this is just a performance issue (more network traffic) 
and does not fail the query.
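
The failure mode can be illustrated outside Spark. The sketch below is a Python analogy using `pickle`, not the Scala change in this pull request: the `placeholder` dict, its field names, and the `transient` parameter are all hypothetical stand-ins for an expression that carries a non-serializable function. Leaving that field out of the serialized state is an assumption about the general kind of fix, chosen only to show how serialization can be made to succeed.

```python
import pickle


def serialize(state: dict, transient: tuple = ()) -> bytes:
    """Pickle a dict of fields, leaving out any field named in `transient`."""
    kept = {k: v for k, v in state.items() if k not in transient}
    return pickle.dumps(kept)


def can_serialize(state: dict, transient: tuple = ()) -> bool:
    """Report whether `state` survives serialization."""
    try:
        serialize(state, transient)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# Hypothetical stand-in for an unresolved expression: `function` is a lambda,
# which pickle cannot serialize, much like a field that references a
# non-serializable type.
placeholder = {"child": "input[0]", "function": lambda x: x}
```

Here `can_serialize(placeholder)` fails because of the lambda, while `can_serialize(placeholder, transient=("function",))` succeeds. The serialized object is larger than strictly necessary, but serializing it no longer raises an error.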

## How was this patch tested?

N/A

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cloud-fan/spark minor

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17639.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17639


commit bb0a14a391e7316a1d5caee900f295c9487c6e8a
Author: Wenchen Fan 
Date:   2017-04-14T13:15:36Z

UnresolvedMapObjects should always be serializable





[GitHub] spark issue #17637: [SPARK-20337][CORE] Support upgrade a jar dependency and...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17637
  
Merged build finished. Test PASSed.



[GitHub] spark issue #17637: [SPARK-20337][CORE] Support upgrade a jar dependency and...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17637
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75806/



[GitHub] spark issue #17637: [SPARK-20337][CORE] Support upgrade a jar dependency and...

2017-04-14 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17637
  
**[Test build #75806 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75806/testReport)**
 for PR 17637 at commit 
[`eb4cb86`](https://github.com/apache/spark/commit/eb4cb8653565fcb66d7c7222cc7b765383bfce45).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #17568: [SPARK-20254][SQL] Remove unnecessary data conversion fo...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17568
  
Merged build finished. Test FAILed.



[GitHub] spark issue #17568: [SPARK-20254][SQL] Remove unnecessary data conversion fo...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17568
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75807/



[GitHub] spark issue #17568: [SPARK-20254][SQL] Remove unnecessary data conversion fo...

2017-04-14 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17568
  
**[Test build #75807 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75807/testReport)**
 for PR 17568 at commit 
[`1515947`](https://github.com/apache/spark/commit/1515947d7a8497bb1f9365d40e1534dff44f0f04).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #17533: [WIP][SPARK-20219] Schedule tasks based on size of input...

2017-04-14 Thread jinxing64

Github user jinxing64 commented on the issue:

https://github.com/apache/spark/pull/17533
  
I think the failed unit test can be fixed in 
https://github.com/apache/spark/pull/17634 and 
https://github.com/apache/spark/pull/17603



[GitHub] spark issue #17636: [SPARK-20334][SQL] Return a better error message when co...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17636
  
Merged build finished. Test PASSed.



[GitHub] spark issue #17636: [SPARK-20334][SQL] Return a better error message when co...

2017-04-14 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17636
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75805/
Test PASSed.


[GitHub] spark issue #17636: [SPARK-20334][SQL] Return a better error message when co...

2017-04-14 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17636
  
**[Test build #75805 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75805/testReport)**
 for PR 17636 at commit 
[`c4e1a01`](https://github.com/apache/spark/commit/c4e1a010c16d753360c6bc576518d71820de1243).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


[GitHub] spark issue #17374: [SPARK-19019][PYTHON][BRANCH-2.0] Fix hijacked `collecti...

2017-04-14 Thread jbloom22

Github user jbloom22 commented on the issue:

https://github.com/apache/spark/pull/17374
  
Our users (https://hail.is) are running into this bug. Will the backport be 
merged soon? Thanks!


[GitHub] spark issue #17364: [SPARK-20038] [SQL]: FileFormatWriter.ExecuteWriteTask.r...

2017-04-14 Thread steveloughran

Github user steveloughran commented on the issue:

https://github.com/apache/spark/pull/17364
  
thanks.


[GitHub] spark issue #17623: [SPARK-20292][SQL][WIP] Clean up string representation o...

2017-04-14 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17623
  
**[Test build #75808 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75808/testReport)**
 for PR 17623 at commit 
[`0be2db8`](https://github.com/apache/spark/commit/0be2db809b28d7e9debbc319145d2928201798c2).


[GitHub] spark issue #17477: [SPARK-18692][BUILD][DOCS] Test Java 8 unidoc build on J...

2017-04-14 Thread srowen

Github user srowen commented on the issue:

https://github.com/apache/spark/pull/17477
  
The build should use Avro 1.7.7, yes. Hadoop pulls in 1.7.4, but it does so 
in both 2.6 and 2.7. And the SBT and Maven builds seem to get that right as 
intended because the POM directly overrides this version. (The only component 
on a different Avro version is the Flume module, but that's not the problem here.)

I also can't reproduce this locally. It builds fine for me too with the 
same commands.

I am open to workarounds, though I also don't know what will be sufficient 
because we can't reproduce it. I am pretty sure the Avro 1.7.4 dependency is 
coming from `hadoop-common` but no idea why only in 2.6.

sbt-unidoc has a newer version, 0.4.0, but updating it requires other 
changes I don't know how to make and I don't see a reason to think it's the 
problem.

I wonder if the problem is that `core` does not directly declare a 
dependency on `org.apache.avro:avro` but uses it. If so then adding this might 
do the trick in the core POM:

```
<dependency>
  <groupId>org.apache.avro</groupId>
  <artifactId>avro</artifactId>
</dependency>
```


