[GitHub] [spark] cloud-fan commented on a change in pull request #29152: [SPARK-32356][SQL] Forbid create view with null type

2020-07-19 Thread GitBox


cloud-fan commented on a change in pull request #29152:
URL: https://github.com/apache/spark/pull/29152#discussion_r457078454



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/command/PlanResolutionSuite.scala
##
@@ -1557,6 +1557,17 @@ class PlanResolutionSuite extends AnalysisTest {
 checkFailure("testcat.tab", "foo")
   }
 
+  test("SPARK-32356: forbid null type in create view") {
+val sql1 = "create view v as select null as c"
+val sql2 = "alter view v as select null as c"
+Seq(sql1, sql2).foreach { sql =>
+  val msg = intercept[AnalysisException] {
+parseAndResolve(sql)
+  }.getMessage
+  assert(msg.contains(s"Cannot create tables with ${NullType.simpleString} type."))

Review comment:
   shall we update the error message to be `tables/views`?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] cloud-fan commented on a change in pull request #29152: [SPARK-32356][SQL] Forbid create view with null type

2020-07-19 Thread GitBox


cloud-fan commented on a change in pull request #29152:
URL: https://github.com/apache/spark/pull/29152#discussion_r457078204



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/command/PlanResolutionSuite.scala
##
@@ -1557,6 +1557,17 @@ class PlanResolutionSuite extends AnalysisTest {
 checkFailure("testcat.tab", "foo")
   }
 
+  test("SPARK-32356: forbid null type in create view") {
+val sql1 = "create view v as select null as c"
+val sql2 = "alter view v as select null as c"

Review comment:
   can we test temp view as well? also the df api `df.createTempView`








[GitHub] [spark] huaxingao commented on pull request #29159: [SPARK-32310][ML][PySpark] ML params default value parity

2020-07-19 Thread GitBox


huaxingao commented on pull request #29159:
URL: https://github.com/apache/spark/pull/29159#issuecomment-660820452


   cc @srowen @viirya @zhengruifeng 






[GitHub] [spark] huaxingao commented on a change in pull request #29139: [SPARK-32339][ML][DOC] Improve MLlib BLAS native acceleration docs

2020-07-19 Thread GitBox


huaxingao commented on a change in pull request #29139:
URL: https://github.com/apache/spark/pull/29139#discussion_r457076755



##
File path: docs/ml-guide.md
##
@@ -62,23 +62,13 @@ The primary Machine Learning API for Spark is now the [DataFrame](sql-programmin
 
 # Dependencies
 
-MLlib uses the linear algebra package [Breeze](http://www.scalanlp.org/), which depends on
-[netlib-java](https://github.com/fommil/netlib-java) for optimised numerical processing.
-If native libraries[^1] are not available at runtime, you will see a warning message and a pure JVM
-implementation will be used instead.
+MLlib uses linear algebra packages [Breeze](http://www.scalanlp.org/) and [netlib-java](https://github.com/fommil/netlib-java) for optimised numerical processing[^1]. Those packages may call native acceleration libraries such as [Intel MKL](https://software.intel.com/content/www/us/en/develop/tools/math-kernel-library.html) or [OpenBLAS](http://www.openblas.net) if they are available as system libraries or in runtime library paths.
 
-Due to licensing issues with runtime proprietary binaries, we do not include `netlib-java`'s native
-proxies by default.
-To configure `netlib-java` / Breeze to use system optimised binaries, include
-`com.github.fommil.netlib:all:1.1.2` (or build Spark with `-Pnetlib-lgpl`) as a dependency of your
-project and read the [netlib-java](https://github.com/fommil/netlib-java) documentation for your
-platform's additional installation instructions.
-
-The most popular native BLAS such as [Intel MKL](https://software.intel.com/en-us/mkl), [OpenBLAS](http://www.openblas.net), can use multiple threads in a single operation, which can conflict with Spark's execution model.
-
-Configuring these BLAS implementations to use a single thread for operations may actually improve performance (see [SPARK-21305](https://issues.apache.org/jira/browse/SPARK-21305)). It is usually optimal to match this to the number of cores each Spark task is configured to use, which is 1 by default and typically left at 1.
-
-Please refer to resources like the following to understand how to configure the number of threads these BLAS implementations use: [Intel MKL](https://software.intel.com/en-us/articles/recommended-settings-for-calling-intel-mkl-routines-from-multi-threaded-applications) or [Intel oneMKL](https://software.intel.com/en-us/onemkl-linux-developer-guide-improving-performance-with-threading) and [OpenBLAS](https://github.com/xianyi/OpenBLAS/wiki/faq#multi-threaded). Note that if nativeBLAS is not properly configured in system, java implementation(f2jBLAS) will be used as fallback option.
+Due to differing OSS licenses, `netlib-java`'s native proxies can't be distributed with Spark. See [MLlib Linear Algebra Acceleration Guide](ml-linalg-guide.md) for how to enable accelerated linear algebra processing. If accelerated native libraries are not enabled, you will see a warning message below and a pure JVM implementation will be used instead:

Review comment:
   `ml-linalg-guide.html` instead of `ml-linalg-guide.md`?








[GitHub] [spark] cloud-fan commented on pull request #29101: [SPARK-32302][SQL] Partially push down disjunctive predicates through Join/Partitions

2020-07-19 Thread GitBox


cloud-fan commented on pull request #29101:
URL: https://github.com/apache/spark/pull/29101#issuecomment-660817399


   The GitHub Actions checks have all passed. We don't need to wait for Jenkins. @wangyum can you do the final sign-off?






[GitHub] [spark] cloud-fan commented on a change in pull request #29158: [SPARK-32362][SQL][TEST] AdaptiveQueryExecSuite misses verifying AE results

2020-07-19 Thread GitBox


cloud-fan commented on a change in pull request #29158:
URL: https://github.com/apache/spark/pull/29158#discussion_r457071998



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala
##
@@ -68,7 +68,9 @@ class AdaptiveQueryExecSuite
 val result = dfAdaptive.collect()
 withSQLConf(SQLConf.ADAPTIVE_EXECUTION_ENABLED.key -> "false") {
   val df = sql(query)
-  QueryTest.sameRows(result.toSeq, df.collect().toSeq)
+  QueryTest.sameRows(result.toSeq, df.collect().toSeq).foreach {

Review comment:
   good catch! can we use `checkAnswer(df, result)`?
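
The problem this hunk fixes can be sketched without Spark. The `.foreach` added in the diff suggests that `QueryTest.sameRows` reports a mismatch through its return value rather than by throwing, so calling it as a bare statement verifies nothing. Below is a minimal sketch of that pitfall; `sameRows` here is a hypothetical stand-in written for illustration (comparing `Seq[Int]` instead of rows), not Spark's actual helper.

```scala
object SameRowsSketch {
  // Hypothetical stand-in for a comparison helper that returns an error
  // message on mismatch instead of throwing. Not Spark's actual code.
  def sameRows(expected: Seq[Int], actual: Seq[Int]): Option[String] =
    if (expected.sorted == actual.sorted) None
    else Some(s"Results do not match: expected $expected, got $actual")

  // Buggy pattern: the returned Option is discarded, so a mismatch
  // goes unnoticed and the "check" always passes.
  def uncheckedCompare(expected: Seq[Int], actual: Seq[Int]): Unit = {
    sameRows(expected, actual)
    ()
  }

  // Fixed pattern: turn a returned error message into a failure.
  def checkedCompare(expected: Seq[Int], actual: Seq[Int]): Unit =
    sameRows(expected, actual).foreach { msg =>
      throw new AssertionError(msg)
    }

  def main(args: Array[String]): Unit = {
    uncheckedCompare(Seq(1, 2), Seq(1, 3)) // silently "passes"
    checkedCompare(Seq(1, 2), Seq(2, 1))   // passes: same rows, order ignored
    val failed =
      try { checkedCompare(Seq(1, 2), Seq(1, 3)); false }
      catch { case _: AssertionError => true }
    println(s"mismatch detected: $failed")
  }
}
```

A helper that asserts internally, such as the `checkAnswer(df, result)` suggested above, avoids this class of mistake because there is no return value to forget.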








[GitHub] [spark] cloud-fan commented on pull request #29158: [SPARK-32362][SQL][TEST] AdaptiveQueryExecSuite misses verifying AE results

2020-07-19 Thread GitBox


cloud-fan commented on pull request #29158:
URL: https://github.com/apache/spark/pull/29158#issuecomment-660816143


   cc @maryannxue @JkSelf 






[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29085: [SPARK-32106][SQL]Implement SparkScriptTransformationExec in sql/core

2020-07-19 Thread GitBox


AngersZhuuuu commented on a change in pull request #29085:
URL: https://github.com/apache/spark/pull/29085#discussion_r457071605



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala
##
@@ -713,13 +714,18 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder(conf) {
 }
 (Seq.empty, Option(name), props.toSeq, recordHandler)
 
-  case null =>
+  case null if conf.getConf(CATALOG_IMPLEMENTATION).equals("hive") =>
 // Use default (serde) format.
 val name = conf.getConfString("hive.script.serde",
   "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe")
 val props = Seq("field.delim" -> "\t")
 val recordHandler = Option(conf.getConfString(configKey, defaultConfigValue))
 (Nil, Option(name), props, recordHandler)
+
+  // SPARK-32106: When there is no definition about format, we return empty result
+  // to use a built-in default Serde in SparkScriptTransformationExec.
+  case null =>
+(Nil, None, Seq.empty, None)

Review comment:
   > > CalendarIntervalType/ArrayType/MapType/StructType as input of hive default serde will throw error
   > 
   > btw, we already have end-2-end tests for the unsupported cases in the hive side?
   
   Added 








[GitHub] [spark] gengliangwang commented on pull request #29101: [SPARK-32302][SQL] Partially push down disjunctive predicates through Join/Partitions

2020-07-19 Thread GitBox


gengliangwang commented on pull request #29101:
URL: https://github.com/apache/spark/pull/29101#issuecomment-660814078


   @maropu I checked the output of the optimized query plan of the 3 queries 
and they are equivalent.  I think the performance result should be consistent. 
   [after.txt](https://github.com/apache/spark/files/4945705/after.txt)
   [before.txt](https://github.com/apache/spark/files/4945706/before.txt)
   
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29159: [SPARK-32310][ML][PySpark] ML params default value parity

2020-07-19 Thread GitBox


AmplabJenkins removed a comment on pull request #29159:
URL: https://github.com/apache/spark/pull/29159#issuecomment-660812454










[GitHub] [spark] AmplabJenkins commented on pull request #29159: [SPARK-32310][ML][PySpark] ML params default value parity

2020-07-19 Thread GitBox


AmplabJenkins commented on pull request #29159:
URL: https://github.com/apache/spark/pull/29159#issuecomment-660812454










[GitHub] [spark] SparkQA commented on pull request #29159: [SPARK-32310][ML][PySpark] ML params default value parity

2020-07-19 Thread GitBox


SparkQA commented on pull request #29159:
URL: https://github.com/apache/spark/pull/29159#issuecomment-660812005


   **[Test build #126152 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126152/testReport)** for PR 29159 at commit [`f657d77`](https://github.com/apache/spark/commit/f657d7778f2574914e955b461d0e4dd8d92c7bcf).






[GitHub] [spark] huaxingao opened a new pull request #29159: [SPARK-32310][ML][PySpark] ML params default value parity

2020-07-19 Thread GitBox


huaxingao opened a new pull request #29159:
URL: https://github.com/apache/spark/pull/29159


   
   
   ### What changes were proposed in this pull request?
   Backport the changes to 3.0: set param default values in trait Params for feature and tuning in both Scala and Python.
   
   ### Why are the changes needed?
   Make ML use the same default param values for an estimator and its corresponding transformer, and also between Scala and Python.
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Existing and modified tests
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29117: [SPARK-32363][PYTHON][BUILD] Avoid using --user in pip installation test and explicitly choose conda and source for (de)activat

2020-07-19 Thread GitBox


AmplabJenkins removed a comment on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660808078


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/126151/
   Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #29117: [SPARK-32363][PYTHON][BUILD] Avoid using --user in pip installation test and explicitly choose conda and source for (de)activate

2020-07-19 Thread GitBox


SparkQA removed a comment on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660805371


   **[Test build #126151 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126151/testReport)** for PR 29117 at commit [`ece8906`](https://github.com/apache/spark/commit/ece89067ebaf67b84f6d3c108ec15c6b569957a1).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29117: [SPARK-32363][PYTHON][BUILD] Avoid using --user in pip installation test and explicitly choose conda and source for (de)activat

2020-07-19 Thread GitBox


AmplabJenkins removed a comment on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660808073


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA commented on pull request #29117: [SPARK-32363][PYTHON][BUILD] Avoid using --user in pip installation test and explicitly choose conda and source for (de)activate

2020-07-19 Thread GitBox


SparkQA commented on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660808046


   **[Test build #126151 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126151/testReport)** for PR 29117 at commit [`ece8906`](https://github.com/apache/spark/commit/ece89067ebaf67b84f6d3c108ec15c6b569957a1).
* This patch **fails PySpark pip packaging tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins commented on pull request #29117: [SPARK-32363][PYTHON][BUILD] Avoid using --user in pip installation test and explicitly choose conda and source for (de)activate

2020-07-19 Thread GitBox


AmplabJenkins commented on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660808073










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29117: [SPARK-32363][PYTHON][BUILD] Avoid using --user in pip installation test and explicitly choose conda and source for (de)activat

2020-07-19 Thread GitBox


AmplabJenkins removed a comment on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660805680










[GitHub] [spark] AmplabJenkins commented on pull request #29117: [SPARK-32363][PYTHON][BUILD] Avoid using --user in pip installation test and explicitly choose conda and source for (de)activate

2020-07-19 Thread GitBox


AmplabJenkins commented on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660805680










[GitHub] [spark] SparkQA commented on pull request #29117: [SPARK-32363][PYTHON][BUILD] Avoid using --user in pip installation test and explicitly choose conda and source for (de)activate

2020-07-19 Thread GitBox


SparkQA commented on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660805371


   **[Test build #126151 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126151/testReport)** for PR 29117 at commit [`ece8906`](https://github.com/apache/spark/commit/ece89067ebaf67b84f6d3c108ec15c6b569957a1).






[GitHub] [spark] SparkQA removed a comment on pull request #29117: [SPARK-32363][PYTHON][BUILD] Avoid using --user in pip installation test and explicitly choose conda and source for (de)activate

2020-07-19 Thread GitBox


SparkQA removed a comment on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660797226


   **[Test build #126149 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126149/testReport)** for PR 29117 at commit [`7a9cf67`](https://github.com/apache/spark/commit/7a9cf6718ae2b7d266ba2e67923fa7fe8ccf8fae).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29117: [SPARK-32363][PYTHON][BUILD] Avoid using --user in pip installation test and explicitly choose conda and source for (de)activat

2020-07-19 Thread GitBox


AmplabJenkins removed a comment on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660804936










[GitHub] [spark] AmplabJenkins commented on pull request #29117: [SPARK-32363][PYTHON][BUILD] Avoid using --user in pip installation test and explicitly choose conda and source for (de)activate

2020-07-19 Thread GitBox


AmplabJenkins commented on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660804936










[GitHub] [spark] SparkQA commented on pull request #29117: [SPARK-32363][PYTHON][BUILD] Avoid using --user in pip installation test and explicitly choose conda and source for (de)activate

2020-07-19 Thread GitBox


SparkQA commented on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660804807


   **[Test build #126149 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126149/testReport)** for PR 29117 at commit [`7a9cf67`](https://github.com/apache/spark/commit/7a9cf6718ae2b7d266ba2e67923fa7fe8ccf8fae).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] dbtsai commented on pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-07-19 Thread GitBox


dbtsai commented on pull request #28708:
URL: https://github.com/apache/spark/pull/28708#issuecomment-660802517


   Thanks! This is a great milestone. 






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29117: [SPARK-32363][PYTHON][BUILD] Avoid using --user in pip installation test and explicitly choose conda and source for (de)activat

2020-07-19 Thread GitBox


AmplabJenkins removed a comment on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660801711


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/126150/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29117: [SPARK-32363][PYTHON][BUILD] Avoid using --user in pip installation test and explicitly choose conda and source for (de)activat

2020-07-19 Thread GitBox


AmplabJenkins removed a comment on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660801706


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #29117: [SPARK-32363][PYTHON][BUILD] Avoid using --user in pip installation test and explicitly choose conda and source for (de)activate

2020-07-19 Thread GitBox


SparkQA removed a comment on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660799121


   **[Test build #126150 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126150/testReport)** for PR 29117 at commit [`7690935`](https://github.com/apache/spark/commit/769093576cf0f5d79e2069df82031310f498e017).






[GitHub] [spark] AmplabJenkins commented on pull request #29117: [SPARK-32363][PYTHON][BUILD] Avoid using --user in pip installation test and explicitly choose conda and source for (de)activate

2020-07-19 Thread GitBox


AmplabJenkins commented on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660801706










[GitHub] [spark] SparkQA commented on pull request #29117: [SPARK-32363][PYTHON][BUILD] Avoid using --user in pip installation test and explicitly choose conda and source for (de)activate

2020-07-19 Thread GitBox


SparkQA commented on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660801689


   **[Test build #126150 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126150/testReport)** for PR 29117 at commit [`7690935`](https://github.com/apache/spark/commit/769093576cf0f5d79e2069df82031310f498e017).
* This patch **fails PySpark pip packaging tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-19 Thread GitBox


AmplabJenkins removed a comment on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660799407










[GitHub] [spark] AmplabJenkins commented on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-19 Thread GitBox


AmplabJenkins commented on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660799407










[GitHub] [spark] SparkQA commented on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-19 Thread GitBox


SparkQA commented on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660799121


   **[Test build #126150 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126150/testReport)**
 for PR 29117 at commit 
[`7690935`](https://github.com/apache/spark/commit/769093576cf0f5d79e2069df82031310f498e017).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29104: [SPARK-32290][SQL] NotInSubquery SingleColumn Optimize

2020-07-19 Thread GitBox


AmplabJenkins removed a comment on pull request #29104:
URL: https://github.com/apache/spark/pull/29104#issuecomment-660798223


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/126138/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29104: [SPARK-32290][SQL] NotInSubquery SingleColumn Optimize

2020-07-19 Thread GitBox


AmplabJenkins removed a comment on pull request #29104:
URL: https://github.com/apache/spark/pull/29104#issuecomment-660798219


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #29104: [SPARK-32290][SQL] NotInSubquery SingleColumn Optimize

2020-07-19 Thread GitBox


AmplabJenkins commented on pull request #29104:
URL: https://github.com/apache/spark/pull/29104#issuecomment-660798219










[GitHub] [spark] SparkQA commented on pull request #29104: [SPARK-32290][SQL] NotInSubquery SingleColumn Optimize

2020-07-19 Thread GitBox


SparkQA commented on pull request #29104:
URL: https://github.com/apache/spark/pull/29104#issuecomment-660797872


   **[Test build #126138 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126138/testReport)**
 for PR 29104 at commit 
[`dc76141`](https://github.com/apache/spark/commit/dc761417a17530fa198d0471902605e6acd70995).
* This patch **fails PySpark pip packaging tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #29104: [SPARK-32290][SQL] NotInSubquery SingleColumn Optimize

2020-07-19 Thread GitBox


SparkQA removed a comment on pull request #29104:
URL: https://github.com/apache/spark/pull/29104#issuecomment-660766081


   **[Test build #126138 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126138/testReport)**
 for PR 29104 at commit 
[`dc76141`](https://github.com/apache/spark/commit/dc761417a17530fa198d0471902605e6acd70995).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-19 Thread GitBox


AmplabJenkins removed a comment on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660797488










[GitHub] [spark] AmplabJenkins commented on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-19 Thread GitBox


AmplabJenkins commented on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660797488










[GitHub] [spark] SparkQA commented on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-19 Thread GitBox


SparkQA commented on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660797226


   **[Test build #126149 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126149/testReport)**
 for PR 29117 at commit 
[`7a9cf67`](https://github.com/apache/spark/commit/7a9cf6718ae2b7d266ba2e67923fa7fe8ccf8fae).






[GitHub] [spark] holdenk commented on pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-07-19 Thread GitBox


holdenk commented on pull request #28708:
URL: https://github.com/apache/spark/pull/28708#issuecomment-660796492


   Merged to dev branch






[GitHub] [spark] asfgit closed pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-07-19 Thread GitBox


asfgit closed pull request #28708:
URL: https://github.com/apache/spark/pull/28708


   






[GitHub] [spark] HyukjinKwon commented on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-19 Thread GitBox


HyukjinKwon commented on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660796154


   retest this please






[GitHub] [spark] LantaoJin commented on a change in pull request #29021: [SPARK-32201][SQL] More general skew join pattern matching

2020-07-19 Thread GitBox


LantaoJin commented on a change in pull request #29021:
URL: https://github.com/apache/spark/pull/29021#discussion_r457039081



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala
##
@@ -340,3 +340,28 @@ case class BroadcastPartitioning(mode: BroadcastMode) 
extends Partitioning {
 case _ => false
   }
 }
+
+/**

Review comment:
   Hi @JkSelf, I will provide another approach that removes this 
`CoalescedHashPartitioning` and simplifies the code. However, the current 
implementation with `CoalescedHashPartitioning` might be more general and cover more cases.








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29085: [SPARK-32106][SQL]Implement SparkScriptTransformationExec in sql/core

2020-07-19 Thread GitBox


AmplabJenkins removed a comment on pull request #29085:
URL: https://github.com/apache/spark/pull/29085#issuecomment-660794069


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/126144/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29085: [SPARK-32106][SQL]Implement SparkScriptTransformationExec in sql/core

2020-07-19 Thread GitBox


AmplabJenkins removed a comment on pull request #29085:
URL: https://github.com/apache/spark/pull/29085#issuecomment-660794065


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #29085: [SPARK-32106][SQL]Implement SparkScriptTransformationExec in sql/core

2020-07-19 Thread GitBox


SparkQA removed a comment on pull request #29085:
URL: https://github.com/apache/spark/pull/29085#issuecomment-660786632


   **[Test build #126144 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126144/testReport)**
 for PR 29085 at commit 
[`72b2155`](https://github.com/apache/spark/commit/72b215558b5d3e326ebe2416367a9d33455f9d58).






[GitHub] [spark] AmplabJenkins commented on pull request #29085: [SPARK-32106][SQL]Implement SparkScriptTransformationExec in sql/core

2020-07-19 Thread GitBox


AmplabJenkins commented on pull request #29085:
URL: https://github.com/apache/spark/pull/29085#issuecomment-660794065










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29158: [SPARK-32362][SQL][TEST] AdaptiveQueryExecSuite misses verifying AE results

2020-07-19 Thread GitBox


AmplabJenkins removed a comment on pull request #29158:
URL: https://github.com/apache/spark/pull/29158#issuecomment-660793719










[GitHub] [spark] SparkQA commented on pull request #29085: [SPARK-32106][SQL]Implement SparkScriptTransformationExec in sql/core

2020-07-19 Thread GitBox


SparkQA commented on pull request #29085:
URL: https://github.com/apache/spark/pull/29085#issuecomment-660793992


   **[Test build #126144 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126144/testReport)**
 for PR 29085 at commit 
[`72b2155`](https://github.com/apache/spark/commit/72b215558b5d3e326ebe2416367a9d33455f9d58).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins commented on pull request #29158: [SPARK-32362][SQL][TEST] AdaptiveQueryExecSuite misses verifying AE results

2020-07-19 Thread GitBox


AmplabJenkins commented on pull request #29158:
URL: https://github.com/apache/spark/pull/29158#issuecomment-660793719










[GitHub] [spark] SparkQA commented on pull request #29158: [SPARK-32362][SQL][TEST] AdaptiveQueryExecSuite misses verifying AE results

2020-07-19 Thread GitBox


SparkQA commented on pull request #29158:
URL: https://github.com/apache/spark/pull/29158#issuecomment-660793434


   **[Test build #126148 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126148/testReport)**
 for PR 29158 at commit 
[`e6d3083`](https://github.com/apache/spark/commit/e6d308335ef7b4a78a5fcc9cda83e623214d9990).






[GitHub] [spark] LantaoJin commented on pull request #29158: [SPARK-32362][SQL][TEST] AdaptiveQueryExecSuite misses verifying AE results

2020-07-19 Thread GitBox


LantaoJin commented on pull request #29158:
URL: https://github.com/apache/spark/pull/29158#issuecomment-660793421


   cc @cloud-fan 






[GitHub] [spark] SparkQA removed a comment on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-19 Thread GitBox


SparkQA removed a comment on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660786642


   **[Test build #126143 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126143/testReport)**
 for PR 29117 at commit 
[`7a9cf67`](https://github.com/apache/spark/commit/7a9cf6718ae2b7d266ba2e67923fa7fe8ccf8fae).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-19 Thread GitBox


AmplabJenkins removed a comment on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660792705










[GitHub] [spark] LantaoJin opened a new pull request #29158: [SPARK-32362][SQL][TEST] AdaptiveQueryExecSuite misses verifying AE results

2020-07-19 Thread GitBox


LantaoJin opened a new pull request #29158:
URL: https://github.com/apache/spark/pull/29158


   ### What changes were proposed in this pull request?
   Verify results for `AdaptiveQueryExecSuite`
   
   
   ### Why are the changes needed?
   `AdaptiveQueryExecSuite` misses verifying AE results
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   Existing unit tests.






[GitHub] [spark] AmplabJenkins commented on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-19 Thread GitBox


AmplabJenkins commented on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660792705










[GitHub] [spark] SparkQA commented on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-19 Thread GitBox


SparkQA commented on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660792637


   **[Test build #126143 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126143/testReport)**
 for PR 29117 at commit 
[`7a9cf67`](https://github.com/apache/spark/commit/7a9cf6718ae2b7d266ba2e67923fa7fe8ccf8fae).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-19 Thread GitBox


AmplabJenkins removed a comment on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660789389










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29085: [SPARK-32106][SQL]Implement SparkScriptTransformationExec in sql/core

2020-07-19 Thread GitBox


AmplabJenkins removed a comment on pull request #29085:
URL: https://github.com/apache/spark/pull/29085#issuecomment-660790160










[GitHub] [spark] AmplabJenkins commented on pull request #29085: [SPARK-32106][SQL]Implement SparkScriptTransformationExec in sql/core

2020-07-19 Thread GitBox


AmplabJenkins commented on pull request #29085:
URL: https://github.com/apache/spark/pull/29085#issuecomment-660790160










[GitHub] [spark] maropu commented on pull request #29101: [SPARK-32302][SQL] Partially push down disjunctive predicates through Join/Partitions

2020-07-19 Thread GitBox


maropu commented on pull request #29101:
URL: https://github.com/apache/spark/pull/29101#issuecomment-660790062


   This PR itself looks okay, so just to check: have you confirmed that this PR 
can get the same performance gain? 
   ```
   SQL | Before this PR | After this PR
   --- | --- | ---
   TPCDS 5T Q13 | 84s | 21s
   TPCDS 5T q85 | 66s | 34s
   TPCH 1T q19 | 37s | 32s
   ```
   https://github.com/apache/spark/pull/28733#issue-428291092






[GitHub] [spark] SparkQA commented on pull request #29085: [SPARK-32106][SQL]Implement SparkScriptTransformationExec in sql/core

2020-07-19 Thread GitBox


SparkQA commented on pull request #29085:
URL: https://github.com/apache/spark/pull/29085#issuecomment-660789956


   **[Test build #126147 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126147/testReport)**
 for PR 29085 at commit 
[`e16c136`](https://github.com/apache/spark/commit/e16c13620032f8062cb0fcd6ecad9836c97febf7).






[GitHub] [spark] SparkQA removed a comment on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-19 Thread GitBox


SparkQA removed a comment on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660783515


   **[Test build #126142 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126142/testReport)**
 for PR 29117 at commit 
[`7a9cf67`](https://github.com/apache/spark/commit/7a9cf6718ae2b7d266ba2e67923fa7fe8ccf8fae).






[GitHub] [spark] AmplabJenkins commented on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-19 Thread GitBox


AmplabJenkins commented on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660789389










[GitHub] [spark] SparkQA commented on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-19 Thread GitBox


SparkQA commented on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660789324


   **[Test build #126142 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126142/testReport)**
 for PR 29117 at commit 
[`7a9cf67`](https://github.com/apache/spark/commit/7a9cf6718ae2b7d266ba2e67923fa7fe8ccf8fae).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins commented on pull request #29142: [SPARK-32360][SQL] Add MaxMinBy to support eliminate sorts

2020-07-19 Thread GitBox


AmplabJenkins commented on pull request #29142:
URL: https://github.com/apache/spark/pull/29142#issuecomment-660788635










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29085: [SPARK-32106][SQL]Implement SparkScriptTransformationExec in sql/core

2020-07-19 Thread GitBox


AmplabJenkins removed a comment on pull request #29085:
URL: https://github.com/apache/spark/pull/29085#issuecomment-660788675










[GitHub] [spark] AmplabJenkins commented on pull request #29085: [SPARK-32106][SQL]Implement SparkScriptTransformationExec in sql/core

2020-07-19 Thread GitBox


AmplabJenkins commented on pull request #29085:
URL: https://github.com/apache/spark/pull/29085#issuecomment-660788675










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29142: [SPARK-32360][SQL] Add MaxMinBy to support eliminate sorts

2020-07-19 Thread GitBox


AmplabJenkins removed a comment on pull request #29142:
URL: https://github.com/apache/spark/pull/29142#issuecomment-660788635










[GitHub] [spark] SparkQA commented on pull request #29085: [SPARK-32106][SQL]Implement SparkScriptTransformationExec in sql/core

2020-07-19 Thread GitBox


SparkQA commented on pull request #29085:
URL: https://github.com/apache/spark/pull/29085#issuecomment-660788345


   **[Test build #126146 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126146/testReport)**
 for PR 29085 at commit 
[`a3628ac`](https://github.com/apache/spark/commit/a3628ac576ef9fbe06e87ad4ff36043897e0056a).






[GitHub] [spark] SparkQA commented on pull request #29142: [SPARK-32360][SQL] Add MaxMinBy to support eliminate sorts

2020-07-19 Thread GitBox


SparkQA commented on pull request #29142:
URL: https://github.com/apache/spark/pull/29142#issuecomment-660788332


   **[Test build #126145 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126145/testReport)** for PR 29142 at commit [`cd93b70`](https://github.com/apache/spark/commit/cd93b707dfd9e033a0580d688a19fe044af379f9).






[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29085: [SPARK-32106][SQL]Implement SparkScriptTransformationExec in sql/core

2020-07-19 Thread GitBox


AngersZhuuuu commented on a change in pull request #29085:
URL: https://github.com/apache/spark/pull/29085#discussion_r457025605



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/BaseScriptTransformationSuite.scala
##
@@ -0,0 +1,390 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import java.sql.{Date, Timestamp}
+
+import org.json4s.DefaultFormats
+import org.json4s.JsonDSL._
+import org.json4s.jackson.JsonMethods._
+import org.scalatest.Assertions._
+import org.scalatest.BeforeAndAfterEach
+import org.scalatest.exceptions.TestFailedException
+
+import org.apache.spark.{SparkException, TaskContext, TestUtils}
+import org.apache.spark.rdd.RDD
+import org.apache.spark.sql.{Column, Row}
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.expressions.{Attribute, AttributeReference, Expression, GenericInternalRow}
+import org.apache.spark.sql.catalyst.plans.physical.Partitioning
+import org.apache.spark.sql.functions._
+import org.apache.spark.sql.test.SQLTestUtils
+import org.apache.spark.sql.types._
+import org.apache.spark.unsafe.types.CalendarInterval
+
+abstract class BaseScriptTransformationSuite extends SparkPlanTest with SQLTestUtils
+  with BeforeAndAfterEach {
+  import testImplicits._
+  import ScriptTransformationIOSchema._
+
+  protected val uncaughtExceptionHandler = new TestUncaughtExceptionHandler
+
+  private var defaultUncaughtExceptionHandler: Thread.UncaughtExceptionHandler = _
+
+  protected override def beforeAll(): Unit = {
+super.beforeAll()
+defaultUncaughtExceptionHandler = Thread.getDefaultUncaughtExceptionHandler
+Thread.setDefaultUncaughtExceptionHandler(uncaughtExceptionHandler)
+  }
+
+  protected override def afterAll(): Unit = {
+super.afterAll()
+Thread.setDefaultUncaughtExceptionHandler(defaultUncaughtExceptionHandler)
+  }
+
+  override protected def afterEach(): Unit = {
+super.afterEach()
+uncaughtExceptionHandler.cleanStatus()
+  }
+
+  def isHive23OrSpark: Boolean
+
+  def createScriptTransformationExec(
+  input: Seq[Expression],
+  script: String,
+  output: Seq[Attribute],
+  child: SparkPlan,
+  ioschema: ScriptTransformationIOSchema): BaseScriptTransformationExec
+
+  test("cat without SerDe") {
+assume(TestUtils.testCommandAvailable("/bin/bash"))
+
+val rowsDf = Seq("a", "b", "c").map(Tuple1.apply).toDF("a")
+checkAnswer(
+  rowsDf,
+  (child: SparkPlan) => createScriptTransformationExec(
+input = Seq(rowsDf.col("a").expr),
+script = "cat",
+output = Seq(AttributeReference("a", StringType)()),
+child = child,
+ioschema = defaultIOSchema
+  ),
+  rowsDf.collect())
+assert(uncaughtExceptionHandler.exception.isEmpty)
+  }
+
+  test("script transformation should not swallow errors from upstream operators (no serde)") {
+assume(TestUtils.testCommandAvailable("/bin/bash"))
+
+val rowsDf = Seq("a", "b", "c").map(Tuple1.apply).toDF("a")
+val e = intercept[TestFailedException] {
+  checkAnswer(
+rowsDf,
+(child: SparkPlan) => createScriptTransformationExec(
+  input = Seq(rowsDf.col("a").expr),
+  script = "cat",
+  output = Seq(AttributeReference("a", StringType)()),
+  child = ExceptionInjectingOperator(child),
+  ioschema = defaultIOSchema
+),
+rowsDf.collect())
+}
+assert(e.getMessage().contains("intentional exception"))
+// Before SPARK-25158, uncaughtExceptionHandler will catch IllegalArgumentException
+assert(uncaughtExceptionHandler.exception.isEmpty)
+  }
+
+  test("SPARK-25990: TRANSFORM should handle different data types correctly") {
+assume(TestUtils.testCommandAvailable("python"))
+val scriptFilePath = getTestResourcePath("test_script.py")
+
+withTempView("v") {
+  val df = Seq(
+(1, "1", 1.0, BigDecimal(1.0), new Timestamp(1)),
+(2, "2", 2.0, BigDecimal(2.0), new Timestamp(2)),
+(3, "3", 3.0, BigDecimal(3.0), new Timestamp(3))
+  ).toDF("a", "b", "c", "d", "e") // Note column d's data type is Decimal(38, 18)
+

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29085: [SPARK-32106][SQL]Implement SparkScriptTransformationExec in sql/core

2020-07-19 Thread GitBox


AngersZhuuuu commented on a change in pull request #29085:
URL: https://github.com/apache/spark/pull/29085#discussion_r457025094



##
File path: sql/core/src/test/resources/sql-tests/inputs/transform.sql
##
@@ -0,0 +1,49 @@
+-- Test data.
+CREATE OR REPLACE TEMPORARY VIEW t1 AS SELECT * FROM VALUES
+('a'), ('b'), ('v')
+as t1(a);
+
+CREATE OR REPLACE TEMPORARY VIEW t2 AS SELECT * FROM VALUES
+('1', true, unhex('537061726B2053514C'), tinyint(1), array_position(array(3, 2, 1), 1), float(1.0), 1.0, Decimal(1.0), timestamp(1), current_date),
+('2', false, unhex('537061726B2053514C'), tinyint(2),  array_position(array(3, 2, 1), 2), float(2.0), 2.0, Decimal(2.0), timestamp(2), current_date),
+('3', true, unhex('537061726B2053514C'), tinyint(3),  array_position(array(3, 2, 1), 1), float(3.0), 3.0, Decimal(3.0), timestamp(3), current_date)
+as t2(a,b,c,d,e,f,g,h,i,j);
+
+SELECT TRANSFORM(a)

Review comment:
   Added some cases without serde.








[GitHub] [spark] AngersZhuuuu commented on a change in pull request #29085: [SPARK-32106][SQL]Implement SparkScriptTransformationExec in sql/core

2020-07-19 Thread GitBox


AngersZhuuuu commented on a change in pull request #29085:
URL: https://github.com/apache/spark/pull/29085#discussion_r457025094



##
File path: sql/core/src/test/resources/sql-tests/inputs/transform.sql
##
@@ -0,0 +1,49 @@
+-- Test data.
+CREATE OR REPLACE TEMPORARY VIEW t1 AS SELECT * FROM VALUES
+('a'), ('b'), ('v')
+as t1(a);
+
+CREATE OR REPLACE TEMPORARY VIEW t2 AS SELECT * FROM VALUES
+('1', true, unhex('537061726B2053514C'), tinyint(1), array_position(array(3, 2, 1), 1), float(1.0), 1.0, Decimal(1.0), timestamp(1), current_date),
+('2', false, unhex('537061726B2053514C'), tinyint(2),  array_position(array(3, 2, 1), 2), float(2.0), 2.0, Decimal(2.0), timestamp(2), current_date),
+('3', true, unhex('537061726B2053514C'), tinyint(3),  array_position(array(3, 2, 1), 1), float(3.0), 3.0, Decimal(3.0), timestamp(3), current_date)
+as t2(a,b,c,d,e,f,g,h,i,j);
+
+SELECT TRANSFORM(a)

Review comment:
   Added some cases without serde.
   Cases with a serde produce different output depending on whether Hive is available.








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29085: [SPARK-32106][SQL]Implement SparkScriptTransformationExec in sql/core

2020-07-19 Thread GitBox


AmplabJenkins removed a comment on pull request #29085:
URL: https://github.com/apache/spark/pull/29085#issuecomment-660786972










[GitHub] [spark] AmplabJenkins commented on pull request #29085: [SPARK-32106][SQL]Implement SparkScriptTransformationExec in sql/core

2020-07-19 Thread GitBox


AmplabJenkins commented on pull request #29085:
URL: https://github.com/apache/spark/pull/29085#issuecomment-660786972










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-19 Thread GitBox


AmplabJenkins removed a comment on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660786626










[GitHub] [spark] SparkQA commented on pull request #29085: [SPARK-32106][SQL]Implement SparkScriptTransformationExec in sql/core

2020-07-19 Thread GitBox


SparkQA commented on pull request #29085:
URL: https://github.com/apache/spark/pull/29085#issuecomment-660786632


   **[Test build #126144 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126144/testReport)** for PR 29085 at commit [`72b2155`](https://github.com/apache/spark/commit/72b215558b5d3e326ebe2416367a9d33455f9d58).






[GitHub] [spark] AmplabJenkins commented on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-19 Thread GitBox


AmplabJenkins commented on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660786626










[GitHub] [spark] SparkQA commented on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-19 Thread GitBox


SparkQA commented on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660786642


   **[Test build #126143 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126143/testReport)** for PR 29117 at commit [`7a9cf67`](https://github.com/apache/spark/commit/7a9cf6718ae2b7d266ba2e67923fa7fe8ccf8fae).






[GitHub] [spark] SparkQA removed a comment on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-19 Thread GitBox


SparkQA removed a comment on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660779916


   **[Test build #126140 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126140/testReport)** for PR 29117 at commit [`7a9cf67`](https://github.com/apache/spark/commit/7a9cf6718ae2b7d266ba2e67923fa7fe8ccf8fae).






[GitHub] [spark] SparkQA commented on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-19 Thread GitBox


SparkQA commented on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660786336


   **[Test build #126140 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126140/testReport)** for PR 29117 at commit [`7a9cf67`](https://github.com/apache/spark/commit/7a9cf6718ae2b7d266ba2e67923fa7fe8ccf8fae).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-19 Thread GitBox


AmplabJenkins removed a comment on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660784319










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29157: [SPARK-32344][SQL][2.4] Unevaluable expr is set to FIRST/LAST ignoreNullsExpr in distinct aggregates

2020-07-19 Thread GitBox


AmplabJenkins removed a comment on pull request #29157:
URL: https://github.com/apache/spark/pull/29157#issuecomment-660784274










[GitHub] [spark] AmplabJenkins commented on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-19 Thread GitBox


AmplabJenkins commented on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660784319










[GitHub] [spark] AmplabJenkins commented on pull request #29157: [SPARK-32344][SQL][2.4] Unevaluable expr is set to FIRST/LAST ignoreNullsExpr in distinct aggregates

2020-07-19 Thread GitBox


AmplabJenkins commented on pull request #29157:
URL: https://github.com/apache/spark/pull/29157#issuecomment-660784274










[GitHub] [spark] SparkQA commented on pull request #29157: [SPARK-32344][SQL][2.4] Unevaluable expr is set to FIRST/LAST ignoreNullsExpr in distinct aggregates

2020-07-19 Thread GitBox


SparkQA commented on pull request #29157:
URL: https://github.com/apache/spark/pull/29157#issuecomment-660783497


   **[Test build #126141 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126141/testReport)** for PR 29157 at commit [`c190886`](https://github.com/apache/spark/commit/c190886bed931f7084439b0a737c4a1cfeb90bc3).






[GitHub] [spark] SparkQA commented on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-19 Thread GitBox


SparkQA commented on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660783515


   **[Test build #126142 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126142/testReport)** for PR 29117 at commit [`7a9cf67`](https://github.com/apache/spark/commit/7a9cf6718ae2b7d266ba2e67923fa7fe8ccf8fae).






[GitHub] [spark] HyukjinKwon removed a comment on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-19 Thread GitBox


HyukjinKwon removed a comment on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660778721










[GitHub] [spark] HyukjinKwon commented on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-19 Thread GitBox


HyukjinKwon commented on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660783001


   retest this please






[GitHub] [spark] maropu opened a new pull request #29157: [SPARK-32344][SQL][2.4] Unevaluable expr is set to FIRST/LAST ignoreNullsExpr in distinct aggregates

2020-07-19 Thread GitBox


maropu opened a new pull request #29157:
URL: https://github.com/apache/spark/pull/29157


   
   
   ### What changes were proposed in this pull request?
   
   This PR fixes a bug in distinct FIRST/LAST aggregates in v2.4.6:
   ```
   scala> sql("SELECT FIRST(DISTINCT v) FROM VALUES 1, 2, 3 t(v)").show()
   ...
   Caused by: java.lang.UnsupportedOperationException: Cannot evaluate expression: false#37
     at org.apache.spark.sql.catalyst.expressions.Unevaluable$class.eval(Expression.scala:258)
     at org.apache.spark.sql.catalyst.expressions.AttributeReference.eval(namedExpressions.scala:226)
     at org.apache.spark.sql.catalyst.expressions.aggregate.First.ignoreNulls(First.scala:68)
     at org.apache.spark.sql.catalyst.expressions.aggregate.First.updateExpressions$lzycompute(First.scala:82)
     at org.apache.spark.sql.catalyst.expressions.aggregate.First.updateExpressions(First.scala:81)
     at org.apache.spark.sql.execution.aggregate.HashAggregateExec$$anonfun$15.apply(HashAggregateExec.scala:268)
   ```
   The root cause of this bug is that the `Aggregation` strategy replaces the
   foldable boolean `ignoreNullsExpr` expression with an `Unevaluable` expression
   (`AttributeReference`) for distinct FIRST/LAST aggregate functions. This
   replacement must not be allowed, because the `Analyzer` has already checked
   that the expression is foldable:
   https://github.com/apache/spark/blob/ffdbbae1d465fe2c710d020de62ca1a6b0b924d9/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/First.scala#L74-L76
   So, this PR proposes to change the variable for `IGNORE NULLS` from
   `Expression` to `Boolean` to avoid this case.
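   
   As a rough illustration of the failure mode described above, here is a toy model. It is only a sketch: `Expr`, `Lit`, `AttrRef`, `FirstWithExpr` and `FirstWithFlag` are hypothetical stand-ins invented for this example, not Spark's real classes, and the rewrite is simulated with a plain `copy`.
   
   ```scala
   // Toy model only: simplified stand-ins, not Spark's Catalyst classes.
   sealed trait Expr { def eval(): Any }
   
   // A foldable constant that can always be evaluated.
   final case class Lit(value: Any) extends Expr {
     def eval(): Any = value
   }
   
   // Stand-in for an attribute reference: it cannot be evaluated on its own.
   final case class AttrRef(name: String) extends Expr {
     def eval(): Any =
       throw new UnsupportedOperationException(s"Cannot evaluate expression: $name")
   }
   
   // Before: the flag is an expression, so a planner rewrite can swap it out.
   final case class FirstWithExpr(child: Expr, ignoreNullsExpr: Expr) {
     def ignoreNulls: Boolean = ignoreNullsExpr.eval().asInstanceOf[Boolean]
   }
   
   // After: the flag is a plain Boolean, so there is no expression to rewrite.
   final case class FirstWithFlag(child: Expr, ignoreNulls: Boolean)
   
   object IgnoreNullsSketch {
     // Returns true when reading the flag fails after the simulated rewrite.
     def rewrittenFlagFails(): Boolean = {
       val original = FirstWithExpr(AttrRef("v"), Lit(false))
       // Simulate the literal being replaced with an attribute reference.
       val rewritten = original.copy(ignoreNullsExpr = AttrRef("false#37"))
       try { rewritten.ignoreNulls; false }
       catch { case _: UnsupportedOperationException => true }
     }
   }
   ```
   
   In this model, `FirstWithFlag` cannot hit the error because its flag is never an expression that a later rewrite could replace.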
   
   ### Why are the changes needed?
   
   Bugfix.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Added a test in `DataFrameAggregateSuite`.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-19 Thread GitBox


AmplabJenkins removed a comment on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660780221










[GitHub] [spark] AmplabJenkins commented on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-19 Thread GitBox


AmplabJenkins commented on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660780221










[GitHub] [spark] SparkQA commented on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-19 Thread GitBox


SparkQA commented on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660779916


   **[Test build #126140 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126140/testReport)** for PR 29117 at commit [`7a9cf67`](https://github.com/apache/spark/commit/7a9cf6718ae2b7d266ba2e67923fa7fe8ccf8fae).






[GitHub] [spark] gengliangwang commented on a change in pull request #29137: [SPARK-32337][SQL] Show initial plan in AQE plan tree string

2020-07-19 Thread GitBox


gengliangwang commented on a change in pull request #29137:
URL: https://github.com/apache/spark/pull/29137#discussion_r457011061



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala
##
@@ -288,25 +291,59 @@ case class AdaptiveSparkPlanExec(
   addSuffix,
   maxFields,
   printNodeId)
-currentPhysicalPlan.generateTreeString(
+plans.zipWithIndex.foreach { case ((name, plan), i) =>

Review comment:
   Since there are always only two plans, shall we just call 
`initialPlan.generateTreeString`  and `currentPhysicalPlan.generateTreeString`  
here?
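   
   A minimal sketch of what that suggestion might look like, using a toy type. `ToyPlan`, its simplified `generateTreeString` signature, and the section labels are assumptions made up for illustration; they are not the actual `AdaptiveSparkPlanExec` code.
   
   ```scala
   // Hypothetical stand-in for a plan node; not Spark's SparkPlan.
   final case class ToyPlan(label: String) {
     // Appends one indented line for this node.
     def generateTreeString(depth: Int, append: String => Unit): Unit =
       append(("   " * depth) + "+- " + label + "\n")
   }
   
   object TwoPlansSketch {
     // Two explicit calls instead of iterating over a sequence of (name, plan) pairs.
     def render(initialPlan: ToyPlan, currentPhysicalPlan: ToyPlan): String = {
       val sb = new StringBuilder
       val append: String => Unit = s => { sb.append(s); () }
       append("+- == Current Plan ==\n")
       currentPhysicalPlan.generateTreeString(1, append)
       append("+- == Initial Plan ==\n")
       initialPlan.generateTreeString(1, append)
       sb.toString
     }
   }
   ```
   
   With exactly two plans, the explicit calls avoid the index bookkeeping that `zipWithIndex` needs, at the cost of repeating the call.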








[GitHub] [spark] gengliangwang commented on a change in pull request #29137: [SPARK-32337][SQL] Show initial plan in AQE plan tree string

2020-07-19 Thread GitBox


gengliangwang commented on a change in pull request #29137:
URL: https://github.com/apache/spark/pull/29137#discussion_r457011061



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala
##
@@ -288,25 +291,59 @@ case class AdaptiveSparkPlanExec(
   addSuffix,
   maxFields,
   printNodeId)
-currentPhysicalPlan.generateTreeString(
+plans.zipWithIndex.foreach { case ((name, plan), i) =>

Review comment:
   Since there are always only two plans, shall we just call 
`initialPlan.generateTreeString`  and `currentPhysicalPlan.generateTreeString`  
here?








[GitHub] [spark] HyukjinKwon commented on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-19 Thread GitBox


HyukjinKwon commented on pull request #29117:
URL: https://github.com/apache/spark/pull/29117#issuecomment-660778707









