[GitHub] spark pull request #21771: [SPARK-24807][CORE] Adding files/jars twice: outp...

2018-07-14 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21771#discussion_r202531999
  
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -1555,6 +1559,9 @@ class SparkContext(config: SparkConf) extends Logging {
      Utils.fetchFile(uri.toString, new File(SparkFiles.getRootDirectory()), conf,
        env.securityManager, hadoopConfiguration, timestamp, useCache = false)
      postEnvironmentUpdate()
+    } else {
+      logWarning(s"The path $path has been added already. Overwriting of added paths " +
--- End diff --

I was wondering how common it is for a user to add the same jar expecting it to be overwritten, since we mostly consider those cases immutable resources.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21771: [SPARK-24807][CORE] Adding files/jars twice: output a wa...

2018-07-14 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21771
  
We could have updated the doc for `spark.files.overwrite` too, since the confusion is probably with this configuration.


---




[GitHub] spark pull request #21771: [SPARK-24807][CORE] Adding files/jars twice: outp...

2018-07-14 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21771#discussion_r202531920
  
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -1849,6 +1858,9 @@ class SparkContext(config: SparkConf) extends Logging {
     if (addedJars.putIfAbsent(key, timestamp).isEmpty) {
       logInfo(s"Added JAR $path at $key with timestamp $timestamp")
       postEnvironmentUpdate()
+    } else {
+      logWarning(s"The jar $path has been added already. Overwriting of added jars " +
--- End diff --

This could be confused with what `spark.files.overwrite` means.


---




[GitHub] spark pull request #21771: [SPARK-24807][CORE] Adding files/jars twice: outp...

2018-07-14 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21771#discussion_r202531879
  
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -1555,6 +1559,9 @@ class SparkContext(config: SparkConf) extends Logging {
      Utils.fetchFile(uri.toString, new File(SparkFiles.getRootDirectory()), conf,
        env.securityManager, hadoopConfiguration, timestamp, useCache = false)
      postEnvironmentUpdate()
+    } else {
+      logWarning(s"The path $path has been added already. Overwriting of added paths " +
--- End diff --

I meant a comment.


---




[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21770
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...

2018-07-14 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21770
  
**[Test build #93013 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93013/testReport)** for PR 21770 at commit [`d817f9d`](https://github.com/apache/spark/commit/d817f9d046ad1d6cf63e4f7d929b449cc6d83552).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `public final class BufferHolder `


---




[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21770
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93013/
Test FAILed.


---




[GitHub] spark pull request #21740: [SPARK-18230][MLLib]Throw a better exception, if ...

2018-07-14 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/21740#discussion_r202531693
  
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/recommendation/MatrixFactorizationModel.scala ---
@@ -75,10 +75,22 @@ class MatrixFactorizationModel @Since("0.8.0") (
     }
   }

+  /** Check for the invalid user. */
+  private def validateUser(user: Int): Unit = {
+    require(userFeatures.lookup(user).nonEmpty, s"userId: $user not found in the model")
--- End diff --

The problem is you now look up the user and product twice. Instead, the 
callers should probably check whether `userFeatures.lookup(user)` is empty 
before calling `.head` and otherwise throw the exception.
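The single-lookup pattern being suggested can be sketched in Python; the dict, `predict`, and the message text below are hypothetical stand-ins for `userFeatures.lookup(user)` and the model's API, not actual Spark code:

```python
# Hypothetical sketch: a plain dict stands in for the userFeatures RDD, whose
# lookup() is expensive -- the point is to call it exactly once per prediction.
user_features = {1: [0.1, 0.2], 2: [0.3, 0.4]}

def predict(user):
    feats = user_features.get(user)  # single lookup, instead of validate + .head
    if feats is None:
        raise ValueError("userId: %d not found in the model" % user)
    return feats
```

The caller validates and consumes the same lookup result, avoiding the second RDD scan that a separate `validateUser` call would incur.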


---




[GitHub] spark pull request #21740: [SPARK-18230][MLLib]Throw a better exception, if ...

2018-07-14 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/21740#discussion_r202531686
  
--- Diff: mllib/src/test/scala/org/apache/spark/mllib/recommendation/MatrixFactorizationModelSuite.scala ---
@@ -72,6 +72,22 @@ class MatrixFactorizationModelSuite extends SparkFunSuite with MLlibTestSparkCon
     }
   }

+  test("invalid user and product") {
+    val model = new MatrixFactorizationModel(rank, userFeatures, prodFeatures)
+    assert(intercept[IllegalArgumentException] {
--- End diff --

I don't think it's necessary to assert about the exact message string. 
Maybe assert it contains the user ID. But just checking for the exception also 
seems close enough.
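The looser assertion style suggested here (check the exception type, optionally that the message mentions the ID, but not the exact string) can be sketched in Python; `validate_user` is a hypothetical stand-in for the model's `require` check:

```python
def validate_user(user):
    # Hypothetical stand-in for require(userFeatures.lookup(user).nonEmpty, ...)
    known_users = {1, 2}
    if user not in known_users:
        raise ValueError("userId: %d not found in the model" % user)

def assert_raises_mentioning(fn, exc_type, fragment):
    # Assert only that the exception is raised and that its message mentions
    # the given fragment, rather than matching the full message string.
    try:
        fn()
    except exc_type as e:
        assert fragment in str(e)
    else:
        raise AssertionError("expected %s" % exc_type.__name__)
```

Usage: `assert_raises_mentioning(lambda: validate_user(42), ValueError, "42")` -- the Scala equivalent would be `intercept[IllegalArgumentException]` plus a `getMessage.contains` check.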


---




[GitHub] spark issue #21556: [SPARK-24549][SQL] Support Decimal type push down to the...

2018-07-14 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21556
  
**[Test build #93014 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93014/testReport)** for PR 21556 at commit [`e31c201`](https://github.com/apache/spark/commit/e31c2010fa7cd8ade77691b59940108465df4b54).


---




[GitHub] spark issue #21556: [SPARK-24549][SQL] Support Decimal type push down to the...

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21556
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/960/
Test PASSed.


---




[GitHub] spark issue #21556: [SPARK-24549][SQL] Support Decimal type push down to the...

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21556
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #21771: [SPARK-24807][CORE] Adding files/jars twice: outp...

2018-07-14 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21771


---




[GitHub] spark issue #21764: [SPARK-24802] Optimization Rule Exclusion

2018-07-14 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/21764
  
ok, thx for the kind explanation.


---




[GitHub] spark pull request #21771: [SPARK-24807][CORE] Adding files/jars twice: outp...

2018-07-14 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/21771#discussion_r202531555
  
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -1555,6 +1559,9 @@ class SparkContext(config: SparkConf) extends Logging {
      Utils.fetchFile(uri.toString, new File(SparkFiles.getRootDirectory()), conf,
        env.securityManager, hadoopConfiguration, timestamp, useCache = false)
      postEnvironmentUpdate()
+    } else {
+      logWarning(s"The path $path has been added already. Overwriting of added paths " +
--- End diff --

The notes are also reasonable to me. This is a common user error. 


---




[GitHub] spark issue #21771: [SPARK-24807][CORE] Adding files/jars twice: output a wa...

2018-07-14 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21771
  
Thanks! Merged to master.


---




[GitHub] spark issue #21771: [SPARK-24807][CORE] Adding files/jars twice: output a wa...

2018-07-14 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21771
  
LGTM


---




[GitHub] spark pull request #21771: [SPARK-24807][CORE] Adding files/jars twice: outp...

2018-07-14 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/21771#discussion_r202531553
  
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -1555,6 +1559,9 @@ class SparkContext(config: SparkConf) extends Logging {
      Utils.fetchFile(uri.toString, new File(SparkFiles.getRootDirectory()), conf,
        env.securityManager, hadoopConfiguration, timestamp, useCache = false)
      postEnvironmentUpdate()
+    } else {
+      logWarning(s"The path $path has been added already. Overwriting of added paths " +
--- End diff --

We normally do not post the JIRA number in the message.


---




[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...

2018-07-14 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21770
  
**[Test build #93013 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93013/testReport)** for PR 21770 at commit [`d817f9d`](https://github.com/apache/spark/commit/d817f9d046ad1d6cf63e4f7d929b449cc6d83552).


---




[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21770
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21770
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/959/
Test PASSed.


---




[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...

2018-07-14 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/21770
  
retest this please


---




[GitHub] spark issue #21764: [SPARK-24802] Optimization Rule Exclusion

2018-07-14 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21764
  
@maropu This is for advanced end users or Spark developers. An external conf looks fine, but I have to admit this might be rarely used. BTW, after having this conf, we can deprecate a few internal configurations that are used for disabling specific optimizer rules in Spark 3.0.


---




[GitHub] spark pull request #21768: [SPARK-24776][SQL]Avro unit test: deduplicate cod...

2018-07-14 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21768


---




[GitHub] spark pull request #21711: [SPARK-24681][SQL] Verify nested column names in ...

2018-07-14 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/21711#discussion_r202531134
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -138,17 +138,35 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
   }

   /**
-   * Checks the validity of data column names. Hive metastore disallows the table to use comma in
-   * data column names. Partition columns do not have such a restriction. Views do not have such
-   * a restriction.
+   * Checks the validity of data column names. Hive metastore disallows the table to use some
+   * special characters (',', ':', and ';') in data column names. Partition columns do not have
+   * such a restriction. Views do not have such a restriction.
    */
   private def verifyDataSchema(
       tableName: TableIdentifier, tableType: CatalogTableType, dataSchema: StructType): Unit = {
     if (tableType != VIEW) {
-      dataSchema.map(_.name).foreach { colName =>
-        if (colName.contains(",")) {
-          throw new AnalysisException("Cannot create a table having a column whose name contains " +
-            s"commas in Hive metastore. Table: $tableName; Column: $colName")
+      val invalidChars = Seq(",", ":", ";")
+      def verifyNestedColumnNames(schema: StructType): Unit = schema.foreach { f =>
+        f.dataType match {
+          case st: StructType => verifyNestedColumnNames(st)
+          case _ if invalidChars.exists(f.name.contains) =>
+            throw new AnalysisException("Cannot create a table having a nested column whose name " +
+              s"contains invalid characters (${invalidChars.map(c => s"'$c'").mkString(", ")}) " +
--- End diff --

something wrong, right?
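For reference, the recursion in this hunk can be sketched in Python, with nested structs modeled as lists of `(name, dtype)` pairs. Note that this sketch checks the name of a struct-typed field itself before recursing, whereas the quoted Scala guard `case _ if ...` only checks names of non-struct fields; whether that difference is intended is unclear from the excerpt:

```python
INVALID_CHARS = (",", ":", ";")

def verify_nested_column_names(schema):
    """schema: list of (name, dtype); dtype is a nested list for a struct."""
    for name, dtype in schema:
        # Check every field name, at every nesting level.
        if any(c in name for c in INVALID_CHARS):
            raise ValueError("invalid character in column name: %r" % name)
        if isinstance(dtype, list):  # recurse into struct fields
            verify_nested_column_names(dtype)
```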


---




[GitHub] spark pull request #21711: [SPARK-24681][SQL] Verify nested column names in ...

2018-07-14 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/21711#discussion_r202531122
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -138,17 +138,35 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
   }

   /**
-   * Checks the validity of data column names. Hive metastore disallows the table to use comma in
-   * data column names. Partition columns do not have such a restriction. Views do not have such
-   * a restriction.
+   * Checks the validity of data column names. Hive metastore disallows the table to use some
+   * special characters (',', ':', and ';') in data column names. Partition columns do not have
--- End diff --

> in data column names.
->
> in data column names, including nested column names. 


---




[GitHub] spark pull request #21711: [SPARK-24681][SQL] Verify nested column names in ...

2018-07-14 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/21711#discussion_r202531090
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala ---
@@ -2005,6 +2005,24 @@ class SQLQuerySuite extends QueryTest with SQLTestUtils with TestHiveSingleton {
     }
   }

+  test("SPARK-24681 checks if nested column names do not include ',', ':', and ';'") {
--- End diff --

Move it to HiveDDLSuite?


---




[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21770
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93012/
Test FAILed.


---




[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...

2018-07-14 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21770
  
**[Test build #93012 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93012/testReport)** for PR 21770 at commit [`d817f9d`](https://github.com/apache/spark/commit/d817f9d046ad1d6cf63e4f7d929b449cc6d83552).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `public final class BufferHolder `


---




[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21770
  
Merged build finished. Test FAILed.


---




[GitHub] spark pull request #21771: [SPARK-24807][CORE] Adding files/jars twice: outp...

2018-07-14 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21771#discussion_r202530515
  
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -1555,6 +1559,9 @@ class SparkContext(config: SparkConf) extends Logging {
      Utils.fetchFile(uri.toString, new File(SparkFiles.getRootDirectory()), conf,
        env.securityManager, hadoopConfiguration, timestamp, useCache = false)
      postEnvironmentUpdate()
+    } else {
+      logWarning(s"The path $path has been added already. Overwriting of added paths " +
--- End diff --

Shall we leave a JIRA link as a comment, for example SPARK-16787 and/or SPARK-19417?


---




[GitHub] spark issue #21771: [SPARK-24807][CORE] Adding files/jars twice: output a wa...

2018-07-14 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21771
  
Seems fine to me.


---




[GitHub] spark pull request #21771: [SPARK-24807][CORE] Adding files/jars twice: outp...

2018-07-14 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21771#discussion_r202530492
  
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -1555,6 +1559,9 @@ class SparkContext(config: SparkConf) extends Logging {
      Utils.fetchFile(uri.toString, new File(SparkFiles.getRootDirectory()), conf,
        env.securityManager, hadoopConfiguration, timestamp, useCache = false)
      postEnvironmentUpdate()
+    } else {
+      logWarning(s"The path $path has been added already. Overwriting of added paths " +
--- End diff --

eh, @MaxGekk, how about we just give warnings without notes for now?


---




[GitHub] spark pull request #21771: [SPARK-24807][CORE] Adding files/jars twice: outp...

2018-07-14 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21771#discussion_r202530444
  
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -1555,6 +1559,9 @@ class SparkContext(config: SparkConf) extends Logging {
      Utils.fetchFile(uri.toString, new File(SparkFiles.getRootDirectory()), conf,
        env.securityManager, hadoopConfiguration, timestamp, useCache = false)
      postEnvironmentUpdate()
+    } else {
+      logWarning(s"The path $path has been added already. Overwriting of added paths " +
--- End diff --

How does it relate to `spark.files.overwrite`?


---




[GitHub] spark pull request #21771: [SPARK-24807][CORE] Adding files/jars twice: outp...

2018-07-14 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21771#discussion_r202530411
  
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -1555,6 +1559,9 @@ class SparkContext(config: SparkConf) extends Logging {
      Utils.fetchFile(uri.toString, new File(SparkFiles.getRootDirectory()), conf,
        env.securityManager, hadoopConfiguration, timestamp, useCache = false)
      postEnvironmentUpdate()
+    } else {
+      logWarning(s"The path $path has been added already. Overwriting of added paths " +
--- End diff --

How does it relate to `spark.files.overwrite`, and how likely is it that we reach here?
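The behavior under discussion can be sketched in Python: a second add of the same path is a no-op plus a warning. This is separate from `spark.files.overwrite`, which (per the Spark configuration docs) concerns whether an already-fetched file on the executor side may be overwritten when its contents change. Names below are illustrative, not Spark's API:

```python
added_files = {}  # path -> timestamp, mimicking SparkContext's addedFiles map

def add_file(path, timestamp):
    """Returns a warning string if the path was already added, else None."""
    if path in added_files:  # mimics putIfAbsent(...) returning an existing entry
        return ("The path %s has been added already. "
                "Overwriting of added paths is not supported." % path)
    added_files[path] = timestamp
    return None
```

The first add wins; a repeated add leaves the original timestamp in place and only emits the warning.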


---




[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21770
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/958/
Test PASSed.


---




[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...

2018-07-14 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21770
  
**[Test build #93012 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93012/testReport)** for PR 21770 at commit [`d817f9d`](https://github.com/apache/spark/commit/d817f9d046ad1d6cf63e4f7d929b449cc6d83552).


---




[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21770
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #21741: [SPARK-24718][SQL] Timestamp support pushdown to ...

2018-07-14 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21741


---




[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...

2018-07-14 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/21770
  
retest this please


---




[GitHub] spark issue #21741: [SPARK-24718][SQL] Timestamp support pushdown to parquet...

2018-07-14 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21741
  
Merged to master.


---




[GitHub] spark issue #21657: [SPARK-24676][SQL] Project required data from CSV parsed...

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21657
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93010/
Test PASSed.


---




[GitHub] spark issue #21657: [SPARK-24676][SQL] Project required data from CSV parsed...

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21657
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21657: [SPARK-24676][SQL] Project required data from CSV parsed...

2018-07-14 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21657
  
**[Test build #93010 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93010/testReport)** for PR 21657 at commit [`dd5bb59`](https://github.com/apache/spark/commit/dd5bb595fe282035e0cf8996cfea15724fe0f674).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21770
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93011/
Test FAILed.


---




[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21770
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...

2018-07-14 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21770
  
**[Test build #93011 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93011/testReport)** for PR 21770 at commit [`d817f9d`](https://github.com/apache/spark/commit/d817f9d046ad1d6cf63e4f7d929b449cc6d83552).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `public final class BufferHolder `


---




[GitHub] spark issue #21764: [SPARK-24802] Optimization Rule Exclusion

2018-07-14 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/21764
  
@gatorsmile aha, ok. Do we need to make this option not `internal` but `external`?

BTW, the interfaces for adding and deleting optimizer rules (addition via `ExperimentalMethods` and deletion via `SQLConf`) are different; is this design ok?


---




[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...

2018-07-14 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/21770
  
Also, `SparkException` needs to extend `RuntimeException` instead of `Exception`, because some generated code has no code to catch the exception. But that change causes MiMa test failures, so it has been removed from this PR for now.


---




[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21770
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21770
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/957/
Test PASSed.


---




[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...

2018-07-14 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21770
  
**[Test build #93011 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93011/testReport)** for PR 21770 at commit [`d817f9d`](https://github.com/apache/spark/commit/d817f9d046ad1d6cf63e4f7d929b449cc6d83552).


---




[GitHub] spark issue #21766: [SPARK-24803][SQL] add support for numeric

2018-07-14 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/21766
  
Aha, I didn't know that; PostgreSQL uses them interchangeably, too.


---




[GitHub] spark pull request #21657: [SPARK-24676][SQL] Project required data from CSV...

2018-07-14 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/21657#discussion_r202528977
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala
 ---
@@ -197,15 +202,21 @@ class UnivocityParser(
*/
   def parse(input: String): InternalRow = doParse(input)
 
+  private val getToken = if (options.columnPruning) {
+(tokens: Array[String], index: Int) => tokens(index)
+  } else {
+(tokens: Array[String], index: Int) => tokens(tokenIndexArr(index))
+  }
+
   private def convert(tokens: Array[String]): InternalRow = {
-if (tokens.length != schema.length) {
+if (tokens.length != parsedSchema.length) {
--- End diff --

will do


---




[GitHub] spark pull request #21657: [SPARK-24676][SQL] Project required data from CSV...

2018-07-14 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/21657#discussion_r202528933
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala
 ---
@@ -38,24 +38,29 @@ class UnivocityParser(
 requiredSchema: StructType,
 val options: CSVOptions) extends Logging {
   require(requiredSchema.toSet.subsetOf(dataSchema.toSet),
-"requiredSchema should be the subset of schema.")
+s"requiredSchema (${requiredSchema.catalogString}) should be the 
subset of " +
+  s"dataSchema (${dataSchema.catalogString}).")
 
   def this(schema: StructType, options: CSVOptions) = this(schema, schema, 
options)
 
   // A `ValueConverter` is responsible for converting the given value to a 
desired type.
   private type ValueConverter = String => Any
 
+  // This index is used to reorder parsed tokens
+  private val tokenIndexArr =
+requiredSchema.map(f => 
java.lang.Integer.valueOf(dataSchema.indexOf(f))).toArray
+
   val tokenizer = {
 val parserSetting = options.asParserSettings
 if (options.columnPruning && requiredSchema.length < 
dataSchema.length) {
-  val tokenIndexArr = requiredSchema.map(f => 
java.lang.Integer.valueOf(dataSchema.indexOf(f)))
   parserSetting.selectIndexes(tokenIndexArr: _*)
 }
 new CsvParser(parserSetting)
   }
-  private val schema = if (options.columnPruning) requiredSchema else 
dataSchema
 
-  private val row = new GenericInternalRow(schema.length)
+  private val parsedSchema = if (options.columnPruning) requiredSchema 
else dataSchema
--- End diff --

ok


---




[GitHub] spark pull request #21657: [SPARK-24676][SQL] Project required data from CSV...

2018-07-14 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/21657#discussion_r202528931
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala
 ---
@@ -223,8 +234,8 @@ class UnivocityParser(
 } else {
   try {
 var i = 0
-while (i < schema.length) {
-  row(i) = valueConverters(i).apply(tokens(i))
+while (i < requiredSchema.length) {
--- End diff --

ok


---




[GitHub] spark pull request #21657: [SPARK-24676][SQL] Project required data from CSV...

2018-07-14 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/21657#discussion_r202528391
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala
 ---
@@ -197,15 +202,21 @@ class UnivocityParser(
*/
   def parse(input: String): InternalRow = doParse(input)
 
+  private val getToken = if (options.columnPruning) {
+(tokens: Array[String], index: Int) => tokens(index)
+  } else {
+(tokens: Array[String], index: Int) => tokens(tokenIndexArr(index))
+  }
+
   private def convert(tokens: Array[String]): InternalRow = {
-if (tokens.length != schema.length) {
+if (tokens.length != parsedSchema.length) {
--- End diff --

If possible, could you add a test case that satisfies `tokens.length != 
parsedSchema.length`?


---




[GitHub] spark pull request #21657: [SPARK-24676][SQL] Project required data from CSV...

2018-07-14 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/21657#discussion_r202528267
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala
 ---
@@ -38,24 +38,29 @@ class UnivocityParser(
 requiredSchema: StructType,
 val options: CSVOptions) extends Logging {
   require(requiredSchema.toSet.subsetOf(dataSchema.toSet),
-"requiredSchema should be the subset of schema.")
+s"requiredSchema (${requiredSchema.catalogString}) should be the 
subset of " +
+  s"dataSchema (${dataSchema.catalogString}).")
 
   def this(schema: StructType, options: CSVOptions) = this(schema, schema, 
options)
 
   // A `ValueConverter` is responsible for converting the given value to a 
desired type.
   private type ValueConverter = String => Any
 
+  // This index is used to reorder parsed tokens
+  private val tokenIndexArr =
+requiredSchema.map(f => 
java.lang.Integer.valueOf(dataSchema.indexOf(f))).toArray
+
   val tokenizer = {
 val parserSetting = options.asParserSettings
 if (options.columnPruning && requiredSchema.length < 
dataSchema.length) {
-  val tokenIndexArr = requiredSchema.map(f => 
java.lang.Integer.valueOf(dataSchema.indexOf(f)))
   parserSetting.selectIndexes(tokenIndexArr: _*)
 }
 new CsvParser(parserSetting)
   }
-  private val schema = if (options.columnPruning) requiredSchema else 
dataSchema
 
-  private val row = new GenericInternalRow(schema.length)
+  private val parsedSchema = if (options.columnPruning) requiredSchema 
else dataSchema
--- End diff --

Add a comment like

> // When column pruning is enabled, the parser only parses the required 
columns based on their positions in the data schema. 


---




[GitHub] spark pull request #21657: [SPARK-24676][SQL] Project required data from CSV...

2018-07-14 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/21657#discussion_r202528372
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala
 ---
@@ -223,8 +234,8 @@ class UnivocityParser(
 } else {
   try {
 var i = 0
-while (i < schema.length) {
-  row(i) = valueConverters(i).apply(tokens(i))
+while (i < requiredSchema.length) {
--- End diff --

Add the comment like

> // When the length of the returned tokens is identical to the length of 
the parsed schema, we just need to convert the tokens that correspond to the 
required columns.
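The token selection being discussed can be sketched in plain Scala (an assumption: this mirrors the idea of `tokenIndexArr` in the diff, not the actual `UnivocityParser` class). With column pruning disabled, the parser returns one token per `dataSchema` field, and the index array picks out and reorders the tokens for the required columns:

```scala
// Schemas modeled as ordered field-name sequences.
val dataSchema     = Seq("a", "b", "c", "d")
val requiredSchema = Seq("c", "a")

// For each required field, its position in the full data schema.
val tokenIndexArr = requiredSchema.map(dataSchema.indexOf).toArray  // Array(2, 0)

// Column pruning off: the parser yields tokens for every dataSchema field.
val tokens   = Array("1", "2", "3", "4")
val selected = tokenIndexArr.map(i => tokens(i))  // tokens for "c" then "a"
```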


---




[GitHub] spark pull request #21657: [SPARK-24676][SQL] Project required data from CSV...

2018-07-14 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/21657#discussion_r202527982
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala
 ---
@@ -1579,4 +1579,20 @@ class CSVSuite extends QueryTest with 
SharedSQLContext with SQLTestUtils with Te
   }
 }
   }
+
+  test("SPARK-24676 project required data from parsed data when 
columnPruning disabled") {
+withSQLConf(SQLConf.CSV_PARSER_COLUMN_PRUNING.key -> "false") {
+  withTempPath { path =>
+val dir = path.getAbsolutePath
+spark.range(10).selectExpr("id % 2 AS p", "id AS c0", "id AS 
c1").write.partitionBy("p")
+  .option("header", "true").csv(dir)
+var df = spark.read.option("header", 
true).csv(dir).selectExpr("sum(p)", "count(c0)")
--- End diff --

Normally, we do not use `var` for a DataFrame, even in test cases.


---




[GitHub] spark pull request #21657: [SPARK-24676][SQL] Project required data from CSV...

2018-07-14 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/21657#discussion_r202528247
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala
 ---
@@ -38,24 +38,29 @@ class UnivocityParser(
 requiredSchema: StructType,
 val options: CSVOptions) extends Logging {
   require(requiredSchema.toSet.subsetOf(dataSchema.toSet),
-"requiredSchema should be the subset of schema.")
+s"requiredSchema (${requiredSchema.catalogString}) should be the 
subset of " +
+  s"dataSchema (${dataSchema.catalogString}).")
 
   def this(schema: StructType, options: CSVOptions) = this(schema, schema, 
options)
 
   // A `ValueConverter` is responsible for converting the given value to a 
desired type.
   private type ValueConverter = String => Any
 
+  // This index is used to reorder parsed tokens
+  private val tokenIndexArr =
+requiredSchema.map(f => 
java.lang.Integer.valueOf(dataSchema.indexOf(f))).toArray
+
   val tokenizer = {
 val parserSetting = options.asParserSettings
 if (options.columnPruning && requiredSchema.length < 
dataSchema.length) {
--- End diff --

Can be simplified to
> // When to-be-parsed schema is shorter than the to-be-read data schema, 
we let Univocity CSV parser select a sequence of fields for reading by their 
positions.
> `if (parsedSchema.length < dataSchema.length)`


---




[GitHub] spark pull request #21657: [SPARK-24676][SQL] Project required data from CSV...

2018-07-14 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/21657#discussion_r202527784
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala
 ---
@@ -38,24 +38,29 @@ class UnivocityParser(
 requiredSchema: StructType,
--- End diff --

Could you add the parameter descriptions of `dataSchema` and 
`requiredSchema` above class `UnivocityParser`?


---




[GitHub] spark pull request #20611: [SPARK-23425][SQL]Support wildcard in HDFS path f...

2018-07-14 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/20611#discussion_r202528097
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ---
@@ -303,94 +303,44 @@ case class LoadDataCommand(
   s"partitioned, but a partition spec was provided.")
   }
 }
-
-val loadPath =
+val loadPath = {
   if (isLocal) {
-val uri = Utils.resolveURI(path)
-val file = new File(uri.getPath)
-val exists = if (file.getAbsolutePath.contains("*")) {
-  val fileSystem = FileSystems.getDefault
-  val dir = file.getParentFile.getAbsolutePath
-  if (dir.contains("*")) {
-throw new AnalysisException(
-  s"LOAD DATA input path allows only filename wildcard: $path")
-  }
-
-  // Note that special characters such as "*" on Windows are not 
allowed as a path.
-  // Calling `WindowsFileSystem.getPath` throws an exception if 
there are in the path.
-  val dirPath = fileSystem.getPath(dir)
-  val pathPattern = new File(dirPath.toAbsolutePath.toString, 
file.getName).toURI.getPath
-  val safePathPattern = if (Utils.isWindows) {
-// On Windows, the pattern should not start with slashes for 
absolute file paths.
-pathPattern.stripPrefix("/")
-  } else {
-pathPattern
-  }
-  val files = new File(dir).listFiles()
-  if (files == null) {
-false
-  } else {
-val matcher = fileSystem.getPathMatcher("glob:" + 
safePathPattern)
-files.exists(f => 
matcher.matches(fileSystem.getPath(f.getAbsolutePath)))
-  }
-} else {
-  new File(file.getAbsolutePath).exists()
-}
-if (!exists) {
-  throw new AnalysisException(s"LOAD DATA input path does not 
exist: $path")
-}
-uri
+val localFS = FileContext.getLocalFSFileContext()
+localFS.makeQualified(new Path(path))
   } else {
-val uri = new URI(path)
-val hdfsUri = if (uri.getScheme() != null && uri.getAuthority() != 
null) {
-  uri
-} else {
-  // Follow Hive's behavior:
-  // If no schema or authority is provided with non-local inpath,
-  // we will use hadoop configuration "fs.defaultFS".
-  val defaultFSConf = 
sparkSession.sessionState.newHadoopConf().get("fs.defaultFS")
-  val defaultFS = if (defaultFSConf == null) {
-new URI("")
-  } else {
-new URI(defaultFSConf)
-  }
-
-  val scheme = if (uri.getScheme() != null) {
--- End diff --

My point is really about using a quasi-private method here. The 
implementation of that method actually looks like the old code here, although 
not entirely the same. I'm wondering whether the old code was therefore 
already based on the implementation of the method you're trying to call?
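The filename-only wildcard check in the old code above can be sketched with `java.nio` glob matching (an assumption: the paths are illustrative). The directory part must be literal; only the last path segment may contain `*`, and a `*` glob does not cross directory boundaries:

```scala
import java.nio.file.FileSystems

val fs      = FileSystems.getDefault
// Glob pattern with a wildcard only in the filename component.
val matcher = fs.getPathMatcher("glob:/data/input/part-*.csv")

val hit  = matcher.matches(fs.getPath("/data/input/part-0001.csv"))  // matches
val miss = matcher.matches(fs.getPath("/data/other/part-0001.csv")) // wrong dir
```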


---




[GitHub] spark issue #21657: [SPARK-24676][SQL] Project required data from CSV parsed...

2018-07-14 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21657
  
**[Test build #93010 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93010/testReport)**
 for PR 21657 at commit 
[`dd5bb59`](https://github.com/apache/spark/commit/dd5bb595fe282035e0cf8996cfea15724fe0f674).


---




[GitHub] spark issue #21657: [SPARK-24676][SQL] Project required data from CSV parsed...

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21657
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/956/
Test PASSed.


---




[GitHub] spark issue #21657: [SPARK-24676][SQL] Project required data from CSV parsed...

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21657
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21769: [SPARK-24805][SQL] Do not ignore avro files without exte...

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21769
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93009/
Test PASSed.


---




[GitHub] spark issue #21769: [SPARK-24805][SQL] Do not ignore avro files without exte...

2018-07-14 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21769
  
**[Test build #93009 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93009/testReport)**
 for PR 21769 at commit 
[`3b75c27`](https://github.com/apache/spark/commit/3b75c2719abc1a6295f9fd64646853b4f1928575).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21769: [SPARK-24805][SQL] Do not ignore avro files without exte...

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21769
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21657: [SPARK-24676][SQL] Project required data from CSV parsed...

2018-07-14 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21657
  
ok to test


---




[GitHub] spark issue #21657: [SPARK-24676][SQL] Project required data from CSV parsed...

2018-07-14 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21657
  
retest this please


---




[GitHub] spark issue #21766: [SPARK-24803][SQL] add support for numeric

2018-07-14 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21766
  
They are not exactly the same in ANSI SQL, although MS SQL Server appears to 
use them interchangeably. 
https://docs.microsoft.com/en-us/sql/t-sql/data-types/decimal-and-numeric-transact-sql?view=sql-server-2017

> 26) NUMERIC specifies the data type exact numeric, with the decimal 
precision and scale specified by the <precision> and <scale>.
> 27) DECIMAL specifies the data type exact numeric, with the decimal scale 
specified by the <scale> and the implementation-defined decimal precision equal 
to or greater than the value of the specified <precision>.
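The exact-numeric semantics quoted above can be illustrated with `java.math.BigDecimal` (an assumption: this models precision/scale behavior generically, not Spark's `DecimalType`). A NUMERIC(5, 2)-style value fixes the scale exactly, while `precision()` counts all significant digits:

```scala
import java.math.{BigDecimal => JBigDecimal, RoundingMode}

val v = new JBigDecimal("123.456")
// Force scale 2, as a NUMERIC(5, 2) column would, rounding the extra digit.
val asNumeric5x2 = v.setScale(2, RoundingMode.HALF_UP)  // 123.46

val precision = asNumeric5x2.precision()  // total significant digits
val scale     = asNumeric5x2.scale()      // digits after the decimal point
```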


---




[GitHub] spark issue #21769: [SPARK-24805][SQL] Do not ignore avro files without exte...

2018-07-14 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21769
  
**[Test build #93009 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93009/testReport)**
 for PR 21769 at commit 
[`3b75c27`](https://github.com/apache/spark/commit/3b75c2719abc1a6295f9fd64646853b4f1928575).


---




[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...

2018-07-14 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21770
  
cc @rednaxelafx 


---




[GitHub] spark issue #21769: [SPARK-24805][SQL] Do not ignore avro files without exte...

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21769
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21769: [SPARK-24805][SQL] Do not ignore avro files without exte...

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21769
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93008/
Test FAILed.


---




[GitHub] spark issue #21769: [SPARK-24805][SQL] Do not ignore avro files without exte...

2018-07-14 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21769
  
**[Test build #93008 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93008/testReport)**
 for PR 21769 at commit 
[`a7d078e`](https://github.com/apache/spark/commit/a7d078e439a8ea79260377a966dc31c2d3ae38f7).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21769: [SPARK-24805][SQL] Do not ignore avro files without exte...

2018-07-14 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21769
  
**[Test build #93008 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93008/testReport)**
 for PR 21769 at commit 
[`a7d078e`](https://github.com/apache/spark/commit/a7d078e439a8ea79260377a966dc31c2d3ae38f7).


---




[GitHub] spark pull request #21769: [SPARK-24805][SQL] Do not ignore avro files witho...

2018-07-14 Thread MaxGekk
Github user MaxGekk commented on a diff in the pull request:

https://github.com/apache/spark/pull/21769#discussion_r202526423
  
--- Diff: 
external/avro/src/main/scala/org/apache/spark/sql/avro/AvroFileFormat.scala ---
@@ -64,7 +64,7 @@ private[avro] class AvroFileFormat extends FileFormat 
with DataSourceRegister {
 // Schema evolution is not supported yet. Here we only pick a single 
random sample file to
 // figure out the schema of the whole dataset.
 val sampleFile =
-  if 
(conf.getBoolean(AvroFileFormat.IgnoreFilesWithoutExtensionProperty, true)) {
+  if (AvroFileFormat.ignoreFilesWithoutExtensions(conf)) {
--- End diff --

The Hadoop config can be changed like this:
```scala
spark
  .sqlContext
  .sparkContext
  .hadoopConfiguration
  .set("avro.mapred.ignore.inputs.without.extension", "true")
```


---




[GitHub] spark issue #21698: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-07-14 Thread mridulm
Github user mridulm commented on the issue:

https://github.com/apache/spark/pull/21698
  

Taking a step back and analyzing the proposed solution for the problem at 
hand, there are three main issues with the proposal:

* It does not solve the problem in a general manner.
  * I gave the examples of `zip` and `sample` - it applies to any 
order-sensitive closure.
* It does not fix the issue when a child stage has one or more completed 
tasks.
  * Even if we assume it is a specific fix for repartition/coalesce - even 
there it does not solve the problem and can cause data loss.
* It causes a performance regression for the existing workarounds.
  * The common workaround for this issue is to checkpoint + action, or to do 
a local/global sort (I believe SQL does the latter now?).
  * The proposal causes a performance regression for these existing 
workarounds.

The corner case where the proposal works is if:
a) the order-sensitive stage has finished, and
b) no task in the child stage has finished fetching its shuffle input.

This is a fairly narrow subset, which is why I don't believe the current 
approach helps.
Having said that, if it is possible to enhance the approach, that would be 
great!
This is a fairly nasty issue which hurts users, and typically people who 
are aware of the problem tend to always pay a performance cost to avoid the 
corner case.
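The order sensitivity at the heart of this issue can be sketched in plain Scala (an assumption: this is a simplified model of round-robin repartitioning, not Spark's implementation). Which partition a record lands in depends on its input position, so a recomputed upstream task that emits the same records in a different order redistributes them differently:

```scala
// Assign each record to a partition by its position, round-robin style.
def roundRobin[T](records: Seq[T], numPartitions: Int): Map[Int, Seq[T]] =
  records.zipWithIndex
    .groupBy { case (_, i) => i % numPartitions }
    .map { case (p, xs) => p -> xs.map(_._1) }

val firstRun   = roundRobin(Seq("a", "b", "c", "d"), 2)
val retriedRun = roundRobin(Seq("b", "a", "c", "d"), 2)  // same data, new order
// firstRun(0) == Seq("a", "c") but retriedRun(0) == Seq("b", "c"): a child
// partition that was already fetched now disagrees with its recomputed twin.
```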


---




[GitHub] spark pull request #21750: [SPARK-24754][ML] Minhash integer overflow

2018-07-14 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21750


---




[GitHub] spark issue #21750: [SPARK-24754][ML] Minhash integer overflow

2018-07-14 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/21750
  
Merged to master


---




[GitHub] spark pull request #21769: [SPARK-24805][SQL] Do not ignore avro files witho...

2018-07-14 Thread MaxGekk
Github user MaxGekk commented on a diff in the pull request:

https://github.com/apache/spark/pull/21769#discussion_r202525034
  
--- Diff: 
external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala ---
@@ -623,7 +624,7 @@ class AvroSuite extends SparkFunSuite {
   spark.read.avro("*/*/*/*/*/*/*/something.avro")
 }
 
-intercept[FileNotFoundException] {
+intercept[java.io.IOException] {
   TestUtils.withTempDir { dir =>
 FileUtils.touch(new File(dir, "test"))
 spark.read.avro(dir.toString)
--- End diff --

I would actually remove this piece of code from the test and write a 
separate test (reading a folder with files without `.avro` extensions) in which 
the Hadoop parameter `avro.mapred.ignore.inputs.without.extension` is set to 
`true` explicitly. Unfortunately, there is no method like `withSQLConf` for 
Hadoop configs.
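Such a helper could look roughly like the following (an assumption: the name `withHadoopConf` is hypothetical, and a plain mutable `Map` stands in for a real `org.apache.hadoop.conf.Configuration`). In the spirit of `withSQLConf`, it sets entries, runs the body, then restores the previous values even if the body throws:

```scala
import scala.collection.mutable

// Set the given key/value pairs, run `body`, then restore prior values.
def withHadoopConf(conf: mutable.Map[String, String])
                  (pairs: (String, String)*)(body: => Unit): Unit = {
  val previous = pairs.map { case (k, _) => k -> conf.get(k) }
  pairs.foreach { case (k, v) => conf(k) = v }
  try body finally previous.foreach {
    case (k, Some(old)) => conf(k) = old  // restore the overwritten value
    case (k, None)      => conf.remove(k) // the key did not exist before
  }
}
```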


---




[GitHub] spark pull request #21769: [SPARK-24805][SQL] Do not ignore avro files witho...

2018-07-14 Thread MaxGekk
Github user MaxGekk commented on a diff in the pull request:

https://github.com/apache/spark/pull/21769#discussion_r202524884
  
--- Diff: 
external/avro/src/main/scala/org/apache/spark/sql/avro/AvroFileFormat.scala ---
@@ -64,7 +64,7 @@ private[avro] class AvroFileFormat extends FileFormat 
with DataSourceRegister {
 // Schema evolution is not supported yet. Here we only pick a single 
random sample file to
 // figure out the schema of the whole dataset.
 val sampleFile =
-  if 
(conf.getBoolean(AvroFileFormat.IgnoreFilesWithoutExtensionProperty, true)) {
+  if (AvroFileFormat.ignoreFilesWithoutExtensions(conf)) {
--- End diff --

Here is how people have used the option so far: 
https://github.com/databricks/spark-avro/issues/71#issuecomment-280844562 . 
We should probably discuss, separately from this PR, how we are going to fix 
the "bug" and break backward compatibility.


---




[GitHub] spark pull request #21769: [SPARK-24805][SQL] Do not ignore avro files witho...

2018-07-14 Thread MaxGekk
Github user MaxGekk commented on a diff in the pull request:

https://github.com/apache/spark/pull/21769#discussion_r202524696
  
--- Diff: 
external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala ---
@@ -809,4 +810,16 @@ class AvroSuite extends SparkFunSuite {
   assert(readDf.collect().sameElements(writeDf.collect()))
 }
   }
+
+  test("SPARK-24805: reading files without .avro extension") {
--- End diff --

Doesn't it just introduce an unnecessary dependency here and overcomplicate 
the test? I can create a small (just one row) Avro file without the `.avro` 
extension specifically for this test. 


---




[GitHub] spark issue #21556: [SPARK-24549][SQL] Support Decimal type push down to the...

2018-07-14 Thread rdblue
Github user rdblue commented on the issue:

https://github.com/apache/spark/pull/21556
  
I misunderstood how it was safe as well. It was Yuming's clarification that 
helped.


---




[GitHub] spark pull request #21769: [SPARK-24805][SQL] Do not ignore avro files witho...

2018-07-14 Thread MaxGekk
Github user MaxGekk commented on a diff in the pull request:

https://github.com/apache/spark/pull/21769#discussion_r202524552
  
--- Diff: 
external/avro/src/main/scala/org/apache/spark/sql/avro/AvroFileFormat.scala ---
@@ -64,7 +64,7 @@ private[avro] class AvroFileFormat extends FileFormat 
with DataSourceRegister {
 // Schema evolution is not supported yet. Here we only pick a single 
random sample file to
 // figure out the schema of the whole dataset.
 val sampleFile =
-  if 
(conf.getBoolean(AvroFileFormat.IgnoreFilesWithoutExtensionProperty, true)) {
+  if (AvroFileFormat.ignoreFilesWithoutExtensions(conf)) {
--- End diff --

`avro.mapred.ignore.inputs.without.extension` is a Hadoop parameter. 
This PR aims to change the default behavior only; I don't want to convert the 
Hadoop parameter into an Avro datasource option here. 


---




[GitHub] spark issue #21771: [SPARK-24807][CORE] Adding files/jars twice: output a wa...

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21771
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93007/
Test PASSed.


---




[GitHub] spark issue #21771: [SPARK-24807][CORE] Adding files/jars twice: output a wa...

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21771
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21771: [SPARK-24807][CORE] Adding files/jars twice: output a wa...

2018-07-14 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21771
  
**[Test build #93007 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93007/testReport)**
 for PR 21771 at commit 
[`4a583c6`](https://github.com/apache/spark/commit/4a583c60e6fdf0f98b0d7b5afa4e81af64cb5ebb).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21771: [SPARK-24807][CORE] Adding files/jars twice: output a wa...

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21771
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93005/
Test PASSed.


---




[GitHub] spark issue #21771: [SPARK-24807][CORE] Adding files/jars twice: output a wa...

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21771
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21771: [SPARK-24807][CORE] Adding files/jars twice: output a wa...

2018-07-14 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21771
  
**[Test build #93005 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93005/testReport)** for PR 21771 at commit [`0942fd2`](https://github.com/apache/spark/commit/0942fd26e47d685280a85bff465d5ee01708f608).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21741: [SPARK-24718][SQL] Timestamp support pushdown to parquet...

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21741
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21741: [SPARK-24718][SQL] Timestamp support pushdown to parquet...

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21741
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93006/
Test PASSed.


---




[GitHub] spark issue #21741: [SPARK-24718][SQL] Timestamp support pushdown to parquet...

2018-07-14 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21741
  
**[Test build #93006 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93006/testReport)** for PR 21741 at commit [`f206457`](https://github.com/apache/spark/commit/f20645750017e7d78bd3ccbadce8ee579bfaeee8).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `  class SerializableConfiguration(@transient var value: Configuration)`
   * `  class IncompatibleSchemaException(msg: String, ex: Throwable = null) extends Exception(msg, ex)`
   * `  case class SchemaType(dataType: DataType, nullable: Boolean)`
   * `  implicit class AvroDataFrameWriter[T](writer: DataFrameWriter[T]) `
   * `  implicit class AvroDataFrameReader(reader: DataFrameReader) `
   * `class KMeansModel (@Since("1.0.0") val clusterCenters: Array[Vector],`
   * `abstract class ArraySetLike extends BinaryArrayExpressionWithImplicitCast `
   * `case class ArrayUnion(left: Expression, right: Expression) extends ArraySetLike `


---




[GitHub] spark issue #21102: [SPARK-23913][SQL] Add array_intersect function

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21102
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93002/
Test PASSed.


---




[GitHub] spark issue #21102: [SPARK-23913][SQL] Add array_intersect function

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21102
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21102: [SPARK-23913][SQL] Add array_intersect function

2018-07-14 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21102
  
**[Test build #93002 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93002/testReport)** for PR 21102 at commit [`5492572`](https://github.com/apache/spark/commit/5492572bb83b314cd31d116d3b344ae3c4596dbd).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21103: [SPARK-23915][SQL] Add array_except function

2018-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21103
  
Merged build finished. Test PASSed.


---



