[GitHub] [spark] beliefer commented on a change in pull request #29891: [SPARK-30796][SQL] Add parameter position for REGEXP_REPLACE

2020-10-12 Thread GitBox


beliefer commented on a change in pull request #29891:
URL: https://github.com/apache/spark/pull/29891#discussion_r503684400



##
File path: sql/core/src/test/resources/sql-tests/results/regexp-functions.sql.out
##
@@ -252,3 +252,53 @@ SELECT regexp_extract_all('a 2b 14m', '(\\d+)?([a-z]+)', 1)
 struct>
 -- !query output
 ["","2","14"]
+
+
+-- !query
+SELECT regexp_replace('healthy, wealthy, and wise', '\\w+thy', 'something')
+-- !query schema
+struct
+-- !query output
+something, something, and wise
+
+
+-- !query
+SELECT regexp_replace('healthy, wealthy, and wise', '\\w+thy', 'something', -2)
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.AnalysisException
+cannot resolve 'regexp_replace('healthy, wealthy, and wise', '\\w+thy', 'something', -2)' due to data type mismatch: Position expression must be positive, but got: -2; line 1 pos 7

Review comment:
   This does not seem to be needed.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #29891: [SPARK-30796][SQL] Add parameter position for REGEXP_REPLACE

2020-10-12 Thread GitBox


beliefer commented on a change in pull request #29891:
URL: https://github.com/apache/spark/pull/29891#discussion_r503683723



##
File path: sql/core/src/test/resources/sql-tests/inputs/regexp-functions.sql
##
@@ -31,3 +31,11 @@ SELECT regexp_extract_all('1a 2b 14m', '(\\d+)([a-z]+)', 3);
 SELECT regexp_extract_all('1a 2b 14m', '(\\d+)([a-z]+)', -1);
 SELECT regexp_extract_all('1a 2b 14m', '(\\d+)?([a-z]+)', 1);
 SELECT regexp_extract_all('a 2b 14m', '(\\d+)?([a-z]+)', 1);
+
+-- regexp_replace
+SELECT regexp_replace('healthy, wealthy, and wise', '\\w+thy', 'something');
+SELECT regexp_replace('healthy, wealthy, and wise', '\\w+thy', 'something', -2);
+SELECT regexp_replace('healthy, wealthy, and wise', '\\w+thy', 'something', 0);
+SELECT regexp_replace('healthy, wealthy, and wise', '\\w+thy', 'something', 1);
+SELECT regexp_replace('healthy, wealthy, and wise', '\\w+thy', 'something', 2);
+SELECT regexp_replace('healthy, wealthy, and wise', '\\w+thy', 'something', 8);

Review comment:
   OK








[GitHub] [spark] SparkQA commented on pull request #29950: [SPARK-32945][SQL] Avoid collapsing projects if reaching max allowed common exprs

2020-10-12 Thread GitBox


SparkQA commented on pull request #29950:
URL: https://github.com/apache/spark/pull/29950#issuecomment-707502613


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34333/
   






[GitHub] [spark] SparkQA commented on pull request #29933: [SPARK-26533][SQL] Support query auto timeout cancel on thriftserver

2020-10-12 Thread GitBox


SparkQA commented on pull request #29933:
URL: https://github.com/apache/spark/pull/29933#issuecomment-707501088


   **[Test build #129728 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129728/testReport)** for PR 29933 at commit [`27fcb7f`](https://github.com/apache/spark/commit/27fcb7f9f25eb3fd2497809c064c0226a4ea238b).






[GitHub] [spark] maropu commented on a change in pull request #29933: [SPARK-26533][SQL] Support query auto timeout cancel on thriftserver

2020-10-12 Thread GitBox


maropu commented on a change in pull request #29933:
URL: https://github.com/apache/spark/pull/29933#discussion_r503676101



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##
@@ -893,6 +893,16 @@ object SQLConf {
   .booleanConf
   .createWithDefault(false)
 
+  val THRIFTSERVER_QUERY_TIMEOUT =
+    buildConf("spark.sql.thriftServer.queryTimeout")
+      .doc("Specifies a global timeout value for Thrift Server. A query will be cancelled " +
+        "automatically after the specified time. If a timeout value is set for each statement " +
+        "(e.g., via `java.sql.Statement.setQueryTimeout`), the value takes precedence. " +
+        "If the value is zero or negative, no timeout happens.")

Review comment:
   Updated, could you check this again?
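
   For readers following the thread, the precedence rule described in the doc string can be sketched in Python as below. This is only an illustration of the wording in the hunk, not the Thrift Server implementation. The function name `effective_timeout` and the choice to treat a per-statement value as "set" only when it is positive are assumptions.

```python
from typing import Optional


def effective_timeout(global_timeout: int,
                      statement_timeout: Optional[int] = None) -> Optional[int]:
    """Return the timeout value to enforce, or None when no timeout applies.

    Hypothetical helper: models only the doc string in the hunk above.
    """
    # Assumption: a per-statement value counts as "set" only when positive.
    if statement_timeout is not None and statement_timeout > 0:
        return statement_timeout
    # Otherwise fall back to the global value; zero or negative disables it.
    if global_timeout > 0:
        return global_timeout
    return None
```

   Under these assumptions, a statement-level value wins whenever it is present, and a zero or negative global value simply means that no timeout is enforced.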








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29881: [SPARK-32852][SQL] spark.sql.hive.metastore.jars support HDFS location

2020-10-12 Thread GitBox


AmplabJenkins removed a comment on pull request #29881:
URL: https://github.com/apache/spark/pull/29881#issuecomment-707497982










[GitHub] [spark] AmplabJenkins commented on pull request #29881: [SPARK-32852][SQL] spark.sql.hive.metastore.jars support HDFS location

2020-10-12 Thread GitBox


AmplabJenkins commented on pull request #29881:
URL: https://github.com/apache/spark/pull/29881#issuecomment-707497982










[GitHub] [spark] SparkQA removed a comment on pull request #29881: [SPARK-32852][SQL] spark.sql.hive.metastore.jars support HDFS location

2020-10-12 Thread GitBox


SparkQA removed a comment on pull request #29881:
URL: https://github.com/apache/spark/pull/29881#issuecomment-707462326


   **[Test build #129726 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129726/testReport)** for PR 29881 at commit [`4016327`](https://github.com/apache/spark/commit/4016327a018cde039433d0b74b4dab2daacad45e).






[GitHub] [spark] SparkQA commented on pull request #29881: [SPARK-32852][SQL] spark.sql.hive.metastore.jars support HDFS location

2020-10-12 Thread GitBox


SparkQA commented on pull request #29881:
URL: https://github.com/apache/spark/pull/29881#issuecomment-707497354


   **[Test build #129726 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129726/testReport)** for PR 29881 at commit [`4016327`](https://github.com/apache/spark/commit/4016327a018cde039433d0b74b4dab2daacad45e).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] maropu commented on a change in pull request #29933: [SPARK-26533][SQL] Support query auto timeout cancel on thriftserver

2020-10-12 Thread GitBox


maropu commented on a change in pull request #29933:
URL: https://github.com/apache/spark/pull/29933#discussion_r503673867



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##
@@ -893,6 +893,16 @@ object SQLConf {
   .booleanConf
   .createWithDefault(false)
 
+  val THRIFTSERVER_QUERY_TIMEOUT =
+    buildConf("spark.sql.thriftServer.queryTimeout")
+      .doc("Specifies a global timeout value for Thrift Server. A query will be cancelled " +
+        "automatically after the specified time. If a timeout value is set for each statement " +
+        "(e.g., via `java.sql.Statement.setQueryTimeout`), the value takes precedence. " +
+        "If the value is zero or negative, no timeout happens.")

Review comment:
   Sure.








[GitHub] [spark] maropu commented on a change in pull request #29933: [SPARK-26533][SQL] Support query auto timeout cancel on thriftserver

2020-10-12 Thread GitBox


maropu commented on a change in pull request #29933:
URL: https://github.com/apache/spark/pull/29933#discussion_r503673867



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##
@@ -893,6 +893,16 @@ object SQLConf {
   .booleanConf
   .createWithDefault(false)
 
+  val THRIFTSERVER_QUERY_TIMEOUT =
+    buildConf("spark.sql.thriftServer.queryTimeout")
+      .doc("Specifies a global timeout value for Thrift Server. A query will be cancelled " +
+        "automatically after the specified time. If a timeout value is set for each statement " +
+        "(e.g., via `java.sql.Statement.setQueryTimeout`), the value takes precedence. " +
+        "If the value is zero or negative, no timeout happens.")

Review comment:
   okay.








[GitHub] [spark] maropu commented on a change in pull request #29891: [SPARK-30796][SQL] Add parameter position for REGEXP_REPLACE

2020-10-12 Thread GitBox


maropu commented on a change in pull request #29891:
URL: https://github.com/apache/spark/pull/29891#discussion_r503673727



##
File path: sql/core/src/test/resources/sql-tests/results/regexp-functions.sql.out
##
@@ -252,3 +252,53 @@ SELECT regexp_extract_all('a 2b 14m', '(\\d+)?([a-z]+)', 1)
 struct>
 -- !query output
 ["","2","14"]
+
+
+-- !query
+SELECT regexp_replace('healthy, wealthy, and wise', '\\w+thy', 'something')
+-- !query schema
+struct
+-- !query output
+something, something, and wise
+
+
+-- !query
+SELECT regexp_replace('healthy, wealthy, and wise', '\\w+thy', 'something', -2)
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.AnalysisException
+cannot resolve 'regexp_replace('healthy, wealthy, and wise', '\\w+thy', 'something', -2)' due to data type mismatch: Position expression must be positive, but got: -2; line 1 pos 7

Review comment:
   Hm, I see. But other existing functions (e.g., `StringLocate`) accept a negative value for a position parameter. So how about following the Spark way?
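
   To make the behaviour under discussion concrete, here is a minimal Python sketch of a `regexp_replace` with a 1-based `position`, using the standard `re` module. It mirrors the "must be positive" check from the test output above. Everything else is an assumption for illustration and not Spark's implementation: the prefix before `position` is left untouched, and a position past the end of the input returns the input unchanged.

```python
import re


def regexp_replace(source: str, pattern: str, replacement: str,
                   position: int = 1) -> str:
    """Replace every match of `pattern` found at or after a 1-based position.

    Hypothetical model of the function under review, for illustration only.
    """
    if position < 1:
        # Mirrors the validation shown in the test output. Whether negative
        # values should be accepted instead is what the thread is debating.
        raise ValueError(
            f"Position expression must be positive, but got: {position}")
    start = position - 1
    if start >= len(source):
        # Assumption: nothing is left to search, so return the input as-is.
        return source
    # Text before `position` is kept; only the remainder is searched.
    return source[:start] + re.sub(pattern, replacement, source[start:])
```

   With the default position, `regexp_replace('healthy, wealthy, and wise', r'\w+thy', 'something')` yields `something, something, and wise`, which matches the first expected result in the hunk.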








[GitHub] [spark] zhengruifeng commented on pull request #30013: [SPARK-32455][ML][Follow-Up] LogisticRegressionModel prediction optimization - fix incorrect initialization

2020-10-12 Thread GitBox


zhengruifeng commented on pull request #30013:
URL: https://github.com/apache/spark/pull/30013#issuecomment-707491859


   Merged to master, thanks all!






[GitHub] [spark] zhengruifeng closed pull request #30013: [SPARK-32455][ML][Follow-Up] LogisticRegressionModel prediction optimization - fix incorrect initialization

2020-10-12 Thread GitBox


zhengruifeng closed pull request #30013:
URL: https://github.com/apache/spark/pull/30013


   






[GitHub] [spark] SparkQA commented on pull request #29950: [SPARK-32945][SQL] Avoid collapsing projects if reaching max allowed common exprs

2020-10-12 Thread GitBox


SparkQA commented on pull request #29950:
URL: https://github.com/apache/spark/pull/29950#issuecomment-707488433


   **[Test build #129727 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129727/testReport)** for PR 29950 at commit [`9bfafc7`](https://github.com/apache/spark/commit/9bfafc75d66987fd550052e6b3d53efe703b55ec).






[GitHub] [spark] viirya commented on a change in pull request #29950: [SPARK-32945][SQL] Avoid collapsing projects if reaching max allowed common exprs

2020-10-12 Thread GitBox


viirya commented on a change in pull request #29950:
URL: https://github.com/apache/spark/pull/29950#discussion_r503666979



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##
@@ -1926,6 +1926,19 @@ object SQLConf {
 .booleanConf
 .createWithDefault(true)
 
+  val MAX_COMMON_EXPRS_IN_COLLAPSE_PROJECT =
+    buildConf("spark.sql.optimizer.maxCommonExprsInCollapseProject")
+      .doc("An integer number indicates the maximum allowed number of a common expression " +

Review comment:
   Not exactly the same, but I revised the doc.

##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##
@@ -1926,6 +1926,19 @@ object SQLConf {
 .booleanConf
 .createWithDefault(true)
 
+  val MAX_COMMON_EXPRS_IN_COLLAPSE_PROJECT =
+    buildConf("spark.sql.optimizer.maxCommonExprsInCollapseProject")
+      .doc("An integer number indicates the maximum allowed number of a common expression " +
+        "can be collapsed into upper Project from lower Project by optimizer rule " +
+        "`CollapseProject`. Normally `CollapseProject` will collapse adjacent Project " +
+        "and merge expressions. But in some edge cases, expensive expressions might be " +
+        "duplicated many times in merged Project by this optimization. This config sets " +
+        "a maximum number. Once an expression is duplicated more than this number " +
+        "if merging two Project, Spark SQL will skip the merging.")
+      .version("3.1.0")
+      .intConf

Review comment:
   Added, thanks.








[GitHub] [spark] viirya commented on a change in pull request #29950: [SPARK-32945][SQL] Avoid collapsing projects if reaching max allowed common exprs

2020-10-12 Thread GitBox


viirya commented on a change in pull request #29950:
URL: https://github.com/apache/spark/pull/29950#discussion_r503666896



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##
@@ -216,7 +216,7 @@ abstract class Optimizer(catalogManager: CatalogManager)
 // The following batch should be executed after batch "Join Reorder" and "LocalRelation".
 Batch("Check Cartesian Products", Once,
   CheckCartesianProducts) :+
-Batch("RewriteSubquery", Once,
+Batch("RewriteSubquery", fixedPoint,

Review comment:
   Added one comment.








[GitHub] [spark] AmplabJenkins removed a comment on pull request #30012: [SPARK-XXX][INFRA] Rebalance GitHub Action jobs

2020-10-12 Thread GitBox


AmplabJenkins removed a comment on pull request #30012:
URL: https://github.com/apache/spark/pull/30012#issuecomment-707486155










[GitHub] [spark] AmplabJenkins commented on pull request #30012: [SPARK-XXX][INFRA] Rebalance GitHub Action jobs

2020-10-12 Thread GitBox


AmplabJenkins commented on pull request #30012:
URL: https://github.com/apache/spark/pull/30012#issuecomment-707486155










[GitHub] [spark] SparkQA commented on pull request #30012: [SPARK-XXX][INFRA] Rebalance GitHub Action jobs

2020-10-12 Thread GitBox


SparkQA commented on pull request #30012:
URL: https://github.com/apache/spark/pull/30012#issuecomment-707485524


   **[Test build #129721 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129721/testReport)** for PR 30012 at commit [`44fe3ae`](https://github.com/apache/spark/commit/44fe3ae41f213e0afde8257e883bf3f2f8839cec).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #30012: [SPARK-XXX][INFRA] Rebalance GitHub Action jobs

2020-10-12 Thread GitBox


SparkQA removed a comment on pull request #30012:
URL: https://github.com/apache/spark/pull/30012#issuecomment-707445221


   **[Test build #129721 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129721/testReport)** for PR 30012 at commit [`44fe3ae`](https://github.com/apache/spark/commit/44fe3ae41f213e0afde8257e883bf3f2f8839cec).






[GitHub] [spark] viirya commented on a change in pull request #29950: [SPARK-32945][SQL] Avoid collapsing projects if reaching max allowed common exprs

2020-10-12 Thread GitBox


viirya commented on a change in pull request #29950:
URL: https://github.com/apache/spark/pull/29950#discussion_r503665670



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##
@@ -760,12 +756,43 @@ object CollapseProject extends Rule[LogicalPlan] {
       s.copy(child = p2.copy(projectList = buildCleanedProjectList(l1, p2.projectList)))
   }
 
+  private def collapseProjects(plan: LogicalPlan): LogicalPlan = plan match {
+    case p1 @ Project(_, p2: Project) =>
+      val maxCommonExprs = SQLConf.get.maxCommonExprsInCollapseProject
+
+      if (haveCommonNonDeterministicOutput(p1.projectList, p2.projectList) ||
+          getLargestNumOfCommonOutput(p1.projectList, p2.projectList) > maxCommonExprs) {
+        p1
+      } else {
+        collapseProjects(
+          p2.copy(projectList = buildCleanedProjectList(p1.projectList, p2.projectList)))
+      }
+    case _ => plan
+  }
+
   private def collectAliases(projectList: Seq[NamedExpression]): AttributeMap[Alias] = {
     AttributeMap(projectList.collect {
       case a: Alias => a.toAttribute -> a
     })
   }
 
+  // Counts for the largest times common outputs from lower operator are used in upper operators.
+  private def getLargestNumOfCommonOutput(

Review comment:
   The two places look similar, but their parameters differ slightly. We could make them share the same code, but it is only a few lines and the refactoring would need more changes, so it does not seem worth it to me.
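
   As a rough illustration of the check being discussed, the Python sketch below counts how often each aliased output of a lower projection is referenced by the upper projection and compares the largest count to a limit. It is a simplified model, not the Catalyst code. Expressions are reduced to lists of referenced attribute names, the names `largest_num_of_common_output` and `should_collapse` are hypothetical, and the non-deterministic-output check from the hunk is left out.

```python
from collections import Counter
from typing import List, Sequence, Set


def largest_num_of_common_output(upper_refs: Sequence[List[str]],
                                 lower_aliases: Set[str]) -> int:
    """Largest number of times any lower-project alias is used above it.

    `upper_refs` holds, per upper expression, the attribute names it
    references (repeats included). Hypothetical simplification.
    """
    counts = Counter(
        name
        for refs in upper_refs
        for name in refs
        if name in lower_aliases
    )
    # No shared outputs means nothing would be duplicated by collapsing.
    return max(counts.values(), default=0)


def should_collapse(upper_refs: Sequence[List[str]],
                    lower_aliases: Set[str],
                    max_common_exprs: int) -> bool:
    """Collapse only if no aliased expression is duplicated past the limit."""
    return largest_num_of_common_output(upper_refs, lower_aliases) <= max_common_exprs
```

   In this model, an upper projection that references the same aliased expression three times would not be collapsed when the limit is two, which is the kind of duplication the config is meant to cap.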








[GitHub] [spark] maropu commented on pull request #29891: [SPARK-30796][SQL] Add parameter position for REGEXP_REPLACE

2020-10-12 Thread GitBox


maropu commented on pull request #29891:
URL: https://github.com/apache/spark/pull/29891#issuecomment-707485073


   I meant `RegExpExtract` in Spark. Don't we need to add a `position` param to `RegExpExtract` and `RegExpExtractAll`, too?






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30022: SPARK-33090 Upgrade Google Guava to 29.0-jre

2020-10-12 Thread GitBox


AmplabJenkins removed a comment on pull request #30022:
URL: https://github.com/apache/spark/pull/30022#issuecomment-707483670


   Can one of the admins verify this patch?






[GitHub] [spark] AmplabJenkins commented on pull request #30022: SPARK-33090 Upgrade Google Guava to 29.0-jre

2020-10-12 Thread GitBox


AmplabJenkins commented on pull request #30022:
URL: https://github.com/apache/spark/pull/30022#issuecomment-707484086


   Can one of the admins verify this patch?






[GitHub] [spark] AmplabJenkins commented on pull request #30022: SPARK-33090 Upgrade Google Guava to 29.0-jre

2020-10-12 Thread GitBox


AmplabJenkins commented on pull request #30022:
URL: https://github.com/apache/spark/pull/30022#issuecomment-707483670


   Can one of the admins verify this patch?






[GitHub] [spark] sfcoy opened a new pull request #30022: SPARK-33090 Upgrade Google Guava to 29.0-jre

2020-10-12 Thread GitBox


sfcoy opened a new pull request #30022:
URL: https://github.com/apache/spark/pull/30022


* Compatible with Hadoop > 3.2.0
* Future proof for a while
   
   ### What changes were proposed in this pull request?
   
   Upgrade the Google Guava dependency for compatibility with Hadoop 3.2.1 and Hadoop 3.3.0.
   
   ### Why are the changes needed?
   
   Spark fails at runtime with NoSuchMethodExceptions when built/run in conjunction with these versions, which make use of com.google.common.base.Preconditions methods that are not present in the version of Guava currently specified for Spark.
   
   
   ### Does this PR introduce _any_ user-facing change?
   This change introduces new dependencies into the build which are imported by the guava pom file.
   
   ### How was this patch tested?
   We are currently running ETL production processes using Spark builds with this Guava version.
   
   






[GitHub] [spark] AmplabJenkins commented on pull request #30021: [SPARK-33125][SQL] Improve the error when Lead and Lag are not allowed to specify window frame

2020-10-12 Thread GitBox


AmplabJenkins commented on pull request #30021:
URL: https://github.com/apache/spark/pull/30021#issuecomment-707481533










[GitHub] [spark] SparkQA commented on pull request #30021: [SPARK-33125][SQL] Improve the error when Lead and Lag are not allowed to specify window frame

2020-10-12 Thread GitBox


SparkQA commented on pull request #30021:
URL: https://github.com/apache/spark/pull/30021#issuecomment-707481521


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34332/
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30021: [SPARK-33125][SQL] Improve the error when Lead and Lag are not allowed to specify window frame

2020-10-12 Thread GitBox


AmplabJenkins removed a comment on pull request #30021:
URL: https://github.com/apache/spark/pull/30021#issuecomment-707481533










[GitHub] [spark] viirya commented on a change in pull request #29950: [SPARK-32945][SQL] Avoid collapsing projects if reaching max allowed common exprs

2020-10-12 Thread GitBox


viirya commented on a change in pull request #29950:
URL: https://github.com/apache/spark/pull/29950#discussion_r503662637



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##
@@ -216,7 +216,7 @@ abstract class Optimizer(catalogManager: CatalogManager)
 // The following batch should be executed after batch "Join Reorder" and "LocalRelation".
 Batch("Check Cartesian Products", Once,
   CheckCartesianProducts) :+
-Batch("RewriteSubquery", Once,
+Batch("RewriteSubquery", fixedPoint,

Review comment:
   Doesn't `FixedPoint(1)` also run the batch only once?
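
   The question can be illustrated with a small Python model of a batch runner, in which a strategy is nothing more than a cap on iterations. This is a sketch under that assumption and not Spark's `RuleExecutor`; the name `run_batch` and its signature are hypothetical.

```python
from typing import Callable, Sequence, Tuple, TypeVar

Plan = TypeVar("Plan")


def run_batch(plan: Plan,
              rules: Sequence[Callable[[Plan], Plan]],
              max_iterations: int) -> Tuple[Plan, int]:
    """Apply `rules` in order until the plan stops changing or the cap is hit.

    Returns the final plan and the number of passes that were executed.
    """
    iterations = 0
    while iterations < max_iterations:
        new_plan = plan
        for rule in rules:
            new_plan = rule(new_plan)
        iterations += 1
        if new_plan == plan:
            # Fixed point reached: another pass would change nothing.
            break
        plan = new_plan
    return plan, iterations
```

   In this model, a cap of 1 always performs exactly one pass, so it is indistinguishable from a "run once" strategy. A larger cap only matters when a single pass leaves further work for the rules.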








[GitHub] [spark] viirya commented on a change in pull request #30020: [SPARK-33123][TESTS] Ignore `GitHub Action file` change in Amplab Jenkins

2020-10-12 Thread GitBox


viirya commented on a change in pull request #30020:
URL: https://github.com/apache/spark/pull/30020#discussion_r503660823



##
File path: dev/run-tests.py
##
@@ -55,6 +55,10 @@ def determine_modules_for_files(filenames):
     for filename in filenames:
         if filename in ("appveyor.yml",):
             continue
+        if (os.environ.get("AMPLAB_JENKINS") and
+                filename in (".github/workflows/build_and_test.yml",

Review comment:
   sounds good.
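
   For readers following along, the check under discussion can be sketched as a small standalone function. This is only an illustration under assumptions: `is_ignored_change` is a hypothetical helper rather than a function in `dev/run-tests.py`, the environment is passed in as a dict so the logic can be tried without touching `os.environ`, and the file tuple lists only the one path visible in the quoted diff (the rest of that line is cut off above).

```python
def is_ignored_change(filename, environ):
    """Return True if a changed file should not trigger any test module.

    Hypothetical helper mirroring the two checks in the quoted diff.
    """
    # Already skipped before the change under review.
    if filename in ("appveyor.yml",):
        return True
    # New check: on Amplab Jenkins, changes to the GitHub Actions workflow
    # file are ignored. Only the path visible in the quoted diff is listed.
    if environ.get("AMPLAB_JENKINS") and filename in (
            ".github/workflows/build_and_test.yml",):
        return True
    return False


on_jenkins = {"AMPLAB_JENKINS": "1"}
elsewhere = {}

skipped_on_jenkins = is_ignored_change(
    ".github/workflows/build_and_test.yml", on_jenkins)
skipped_elsewhere = is_ignored_change(
    ".github/workflows/build_and_test.yml", elsewhere)
```

   The workflow file is ignored only when the `AMPLAB_JENKINS` variable is set, so the same change still counts as relevant in any other environment.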








[GitHub] [spark] AmplabJenkins removed a comment on pull request #30021: [SPARK-33125][SQL] Improve the error when Lead and Lag are not allowed to specify window frame

2020-10-12 Thread GitBox


AmplabJenkins removed a comment on pull request #30021:
URL: https://github.com/apache/spark/pull/30021#issuecomment-707479180


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/129725/
   Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #30021: [SPARK-33125][SQL] Improve the error when Lead and Lag are not allowed to specify window frame

2020-10-12 Thread GitBox


SparkQA removed a comment on pull request #30021:
URL: https://github.com/apache/spark/pull/30021#issuecomment-707462315


   **[Test build #129725 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129725/testReport)** for PR 30021 at commit [`e3c90cf`](https://github.com/apache/spark/commit/e3c90cff4058cea3cec49905cab28022d8c46d96).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30021: [SPARK-33125][SQL] Improve the error when Lead and Lag are not allowed to specify window frame

2020-10-12 Thread GitBox


AmplabJenkins removed a comment on pull request #30021:
URL: https://github.com/apache/spark/pull/30021#issuecomment-707479171


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #30021: [SPARK-33125][SQL] Improve the error when Lead and Lag are not allowed to specify window frame

2020-10-12 Thread GitBox


AmplabJenkins commented on pull request #30021:
URL: https://github.com/apache/spark/pull/30021#issuecomment-707479171










[GitHub] [spark] AmplabJenkins removed a comment on pull request #30009: [SPARK-32907][ML] adaptively blockify instances - LinearSVC

2020-10-12 Thread GitBox


AmplabJenkins removed a comment on pull request #30009:
URL: https://github.com/apache/spark/pull/30009#issuecomment-707478934










[GitHub] [spark] AmplabJenkins commented on pull request #30009: [SPARK-32907][ML] adaptively blockify instances - LinearSVC

2020-10-12 Thread GitBox


AmplabJenkins commented on pull request #30009:
URL: https://github.com/apache/spark/pull/30009#issuecomment-707478934










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29881: [SPARK-32852][SQL] spark.sql.hive.metastore.jars support HDFS location

2020-10-12 Thread GitBox


AmplabJenkins removed a comment on pull request #29881:
URL: https://github.com/apache/spark/pull/29881#issuecomment-707478598


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/34331/
   Test FAILed.






[GitHub] [spark] SparkQA commented on pull request #30021: [SPARK-33125][SQL] Improve the error when Lead and Lag are not allowed to specify window frame

2020-10-12 Thread GitBox


SparkQA commented on pull request #30021:
URL: https://github.com/apache/spark/pull/30021#issuecomment-707479031


   **[Test build #129725 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129725/testReport)** for PR 30021 at commit [`e3c90cf`](https://github.com/apache/spark/commit/e3c90cff4058cea3cec49905cab28022d8c46d96).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29881: [SPARK-32852][SQL] spark.sql.hive.metastore.jars support HDFS location

2020-10-12 Thread GitBox


AmplabJenkins removed a comment on pull request #29881:
URL: https://github.com/apache/spark/pull/29881#issuecomment-707478589


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA commented on pull request #29881: [SPARK-32852][SQL] spark.sql.hive.metastore.jars support HDFS location

2020-10-12 Thread GitBox


SparkQA commented on pull request #29881:
URL: https://github.com/apache/spark/pull/29881#issuecomment-707478580


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34331/
   






[GitHub] [spark] AmplabJenkins commented on pull request #29881: [SPARK-32852][SQL] spark.sql.hive.metastore.jars support HDFS location

2020-10-12 Thread GitBox


AmplabJenkins commented on pull request #29881:
URL: https://github.com/apache/spark/pull/29881#issuecomment-707478589










[GitHub] [spark] SparkQA commented on pull request #30009: [SPARK-32907][ML] adaptively blockify instances - LinearSVC

2020-10-12 Thread GitBox


SparkQA commented on pull request #30009:
URL: https://github.com/apache/spark/pull/30009#issuecomment-707478699


   **[Test build #129724 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129724/testReport)** for PR 30009 at commit [`9cd1053`](https://github.com/apache/spark/commit/9cd10535b962b18421a6a21a79564fe6e7fae157).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] lanisyutin commented on pull request #11459: [SPARK-13025] Allow users to set initial model in logistic regression

2020-10-12 Thread GitBox


lanisyutin commented on pull request #11459:
URL: https://github.com/apache/spark/pull/11459#issuecomment-707478601


   Hi all, what is the status of this issue? Why was it not merged? Is there 
any way to set `initialWeights` now?






[GitHub] [spark] SparkQA removed a comment on pull request #30009: [SPARK-32907][ML] adaptively blockify instances - LinearSVC

2020-10-12 Thread GitBox


SparkQA removed a comment on pull request #30009:
URL: https://github.com/apache/spark/pull/30009#issuecomment-707460083


   **[Test build #129724 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129724/testReport)** for PR 30009 at commit [`9cd1053`](https://github.com/apache/spark/commit/9cd10535b962b18421a6a21a79564fe6e7fae157).






[GitHub] [spark] SparkQA commented on pull request #30021: [SPARK-33125][SQL] Improve the error when Lead and Lag are not allowed to specify window frame

2020-10-12 Thread GitBox


SparkQA commented on pull request #30021:
URL: https://github.com/apache/spark/pull/30021#issuecomment-707476349


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34332/
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30009: [SPARK-32907][ML] adaptively blockify instances - LinearSVC

2020-10-12 Thread GitBox


AmplabJenkins removed a comment on pull request #30009:
URL: https://github.com/apache/spark/pull/30009#issuecomment-707475324










[GitHub] [spark] AmplabJenkins commented on pull request #30009: [SPARK-32907][ML] adaptively blockify instances - LinearSVC

2020-10-12 Thread GitBox


AmplabJenkins commented on pull request #30009:
URL: https://github.com/apache/spark/pull/30009#issuecomment-707475324










[GitHub] [spark] SparkQA commented on pull request #30009: [SPARK-32907][ML] adaptively blockify instances - LinearSVC

2020-10-12 Thread GitBox


SparkQA commented on pull request #30009:
URL: https://github.com/apache/spark/pull/30009#issuecomment-707475316


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34330/
   






[GitHub] [spark] AmplabJenkins commented on pull request #30021: [SPARK-33125][SQL] Improve the error when Lead and Lag are not allowed to specify window frame

2020-10-12 Thread GitBox


AmplabJenkins commented on pull request #30021:
URL: https://github.com/apache/spark/pull/30021#issuecomment-707474058










[GitHub] [spark] AmplabJenkins removed a comment on pull request #30021: [SPARK-33125][SQL] Improve the error when Lead and Lag are not allowed to specify window frame

2020-10-12 Thread GitBox


AmplabJenkins removed a comment on pull request #30021:
URL: https://github.com/apache/spark/pull/30021#issuecomment-707474058










[GitHub] [spark] SparkQA commented on pull request #30021: [SPARK-33125][SQL] Improve the error when Lead and Lag are not allowed to specify window frame

2020-10-12 Thread GitBox


SparkQA commented on pull request #30021:
URL: https://github.com/apache/spark/pull/30021#issuecomment-707474046


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34329/
   






[GitHub] [spark] SparkQA commented on pull request #29881: [SPARK-32852][SQL] spark.sql.hive.metastore.jars support HDFS location

2020-10-12 Thread GitBox


SparkQA commented on pull request #29881:
URL: https://github.com/apache/spark/pull/29881#issuecomment-707471876


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34331/
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30011: [WIP][SPARK-32281][SQL] Spark keep SORTED spec in metastore

2020-10-12 Thread GitBox


AmplabJenkins removed a comment on pull request #30011:
URL: https://github.com/apache/spark/pull/30011#issuecomment-707471528










[GitHub] [spark] AmplabJenkins commented on pull request #30011: [WIP][SPARK-32281][SQL] Spark keep SORTED spec in metastore

2020-10-12 Thread GitBox


AmplabJenkins commented on pull request #30011:
URL: https://github.com/apache/spark/pull/30011#issuecomment-707471528










[GitHub] [spark] SparkQA commented on pull request #30011: [WIP][SPARK-32281][SQL] Spark keep SORTED spec in metastore

2020-10-12 Thread GitBox


SparkQA commented on pull request #30011:
URL: https://github.com/apache/spark/pull/30011#issuecomment-707471512


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34328/
   






[GitHub] [spark] SparkQA commented on pull request #30009: [SPARK-32907][ML] adaptively blockify instances - LinearSVC

2020-10-12 Thread GitBox


SparkQA commented on pull request #30009:
URL: https://github.com/apache/spark/pull/30009#issuecomment-707470333


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34330/
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30021: [SPARK-33125][SQL] Improve the error when Lead and Lag are not allowed to specify window frame

2020-10-12 Thread GitBox


AmplabJenkins removed a comment on pull request #30021:
URL: https://github.com/apache/spark/pull/30021#issuecomment-707468073


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/129723/
   Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #30021: [SPARK-33125][SQL] Improve the error when Lead and Lag are not allowed to specify window frame

2020-10-12 Thread GitBox


SparkQA removed a comment on pull request #30021:
URL: https://github.com/apache/spark/pull/30021#issuecomment-707457646


   **[Test build #129723 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129723/testReport)** for PR 30021 at commit [`a10ae2f`](https://github.com/apache/spark/commit/a10ae2fd1ac2a14b11b6675c01014d6ea0ad16c9).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30021: [SPARK-33125][SQL] Improve the error when Lead and Lag are not allowed to specify window frame

2020-10-12 Thread GitBox


AmplabJenkins removed a comment on pull request #30021:
URL: https://github.com/apache/spark/pull/30021#issuecomment-707468068


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #30021: [SPARK-33125][SQL] Improve the error when Lead and Lag are not allowed to specify window frame

2020-10-12 Thread GitBox


AmplabJenkins commented on pull request #30021:
URL: https://github.com/apache/spark/pull/30021#issuecomment-707468068










[GitHub] [spark] SparkQA commented on pull request #30021: [SPARK-33125][SQL] Improve the error when Lead and Lag are not allowed to specify window frame

2020-10-12 Thread GitBox


SparkQA commented on pull request #30021:
URL: https://github.com/apache/spark/pull/30021#issuecomment-707467921


   **[Test build #129723 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129723/testReport)** for PR 30021 at commit [`a10ae2f`](https://github.com/apache/spark/commit/a10ae2fd1ac2a14b11b6675c01014d6ea0ad16c9).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #30021: [SPARK-33125][SQL] Improve the error when Lead and Lag are not allowed to specify window frame

2020-10-12 Thread GitBox


SparkQA commented on pull request #30021:
URL: https://github.com/apache/spark/pull/30021#issuecomment-707467211


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34329/
   






[GitHub] [spark] SparkQA commented on pull request #30011: [WIP][SPARK-32281][SQL] Spark keep SORTED spec in metastore

2020-10-12 Thread GitBox


SparkQA commented on pull request #30011:
URL: https://github.com/apache/spark/pull/30011#issuecomment-707466369


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34328/
   






[GitHub] [spark] beliefer commented on a change in pull request #29891: [SPARK-30796][SQL] Add parameter position for REGEXP_REPLACE

2020-10-12 Thread GitBox


beliefer commented on a change in pull request #29891:
URL: https://github.com/apache/spark/pull/29891#discussion_r503648480



##
File path: sql/core/src/test/resources/sql-tests/results/regexp-functions.sql.out
##
@@ -252,3 +252,53 @@ SELECT regexp_extract_all('a 2b 14m', '(\\d+)?([a-z]+)', 1)
 struct>
 -- !query output
 ["","2","14"]
+
+
+-- !query
+SELECT regexp_replace('healthy, wealthy, and wise', '\\w+thy', 'something')
+-- !query schema
+struct
+-- !query output
+something, something, and wise
+
+
+-- !query
+SELECT regexp_replace('healthy, wealthy, and wise', '\\w+thy', 'something', -2)
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.AnalysisException
+cannot resolve 'regexp_replace('healthy, wealthy, and wise', '\\w+thy', 'something', -2)' due to data type mismatch: Position expression must be positive, but got: -2; line 1 pos 7

Review comment:
   Yes. All the databases specify that position must be a positive integer.








[GitHub] [spark] beliefer commented on a change in pull request #29891: [SPARK-30796][SQL] Add parameter position for REGEXP_REPLACE

2020-10-12 Thread GitBox


beliefer commented on a change in pull request #29891:
URL: https://github.com/apache/spark/pull/29891#discussion_r503648228



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##
@@ -318,16 +320,49 @@ case class StringSplit(str: Expression, regex: Expression, limit: Expression)
  */
 // scalastyle:off line.size.limit
 @ExpressionDescription(
-  usage = "_FUNC_(str, regexp, rep) - Replaces all substrings of `str` that match `regexp` with `rep`.",
+  usage = "_FUNC_(str, regexp, rep[, position]) - Replaces all substrings of `str` that match `regexp` with `rep`.",
+  arguments = """
+Arguments:
+  * str - a string expression to search for a regular expression pattern match.
+  * regexp - a string representing a regular expression. The regex string should be a
+  Java regular expression.
+
+  Since Spark 2.0, string literals (including regex patterns) are unescaped in our SQL
+  parser. For example, to match "\abc", a regular expression for `regexp` can be
+  "^\\abc$".
+
+  There is a SQL config 'spark.sql.parser.escapedStringLiterals' that can be used to
+  fallback to the Spark 1.6 behavior regarding string literal parsing. For example,
+  if the config is enabled, the `regexp` that can match "\abc" is "^\abc$".
+  * rep - a string expression to replace matched substrings.
+  * position - a positive integer expression that indicates the position within `str` to begin searching.
+  The default is 1. If position is greater than the number of characters in `str`, the result is `str`.
+  """,
   examples = """
 Examples:
   > SELECT _FUNC_('100-200', '(\\d+)', 'num');
 num-num
   """,
   since = "1.5.0")
 // scalastyle:on line.size.limit
-case class RegExpReplace(subject: Expression, regexp: Expression, rep: Expression)
-  extends TernaryExpression with ImplicitCastInputTypes with NullIntolerant {
+case class RegExpReplace(subject: Expression, regexp: Expression, rep: Expression, pos: Expression)
+  extends QuaternaryExpression with ImplicitCastInputTypes with NullIntolerant {
+
+  def this(subject: Expression, regexp: Expression, rep: Expression) =
+this(subject, regexp, rep, Literal(1))
+
+  override def checkInputDataTypes(): TypeCheckResult = {
+if (!pos.foldable) {

Review comment:
   Yes, because all the databases specify that pos must be positive.
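
   The `position` semantics described in the argument docs above can be sketched in a few lines of Python. This is an approximation under stated assumptions, not the code in this PR: `regexp_replace_from` is a hypothetical name, Python's `re` module stands in for Java regular expressions (pattern and replacement syntax differ in places), and the search is restricted by slicing the string, which may treat anchors and lookbehind differently from a matcher that starts at an offset.

```python
import re


def regexp_replace_from(s, pattern, rep, position=1):
    """Replace every match of `pattern` in `s` with `rep`, searching only
    from the 1-based `position` onwards (hypothetical sketch)."""
    if position < 1:
        # Same wording as the error in the quoted test output.
        raise ValueError(
            "Position expression must be positive, but got: %d" % position)
    if position > len(s):
        # Nothing left to search, so the input is returned unchanged.
        return s
    start = position - 1
    # Text before `position` is copied through untouched.
    return s[:start] + re.sub(pattern, rep, s[start:])


text = "healthy, wealthy, and wise"
from_start = regexp_replace_from(text, r"\w+thy", "something")
from_ninth = regexp_replace_from(text, r"\w+thy", "something", 9)
past_end = regexp_replace_from(text, r"\w+thy", "something", 100)
```

   Starting at position 9 skips the first word, so only `wealthy` is replaced, and a position past the end of the string returns the input as-is. A non-positive position is rejected up front, which corresponds to the analysis-time check being discussed here.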








[GitHub] [spark] SparkQA commented on pull request #30012: [SPARK-XXX][INFRA] Rebalance GitHub Action jobs

2020-10-12 Thread GitBox


SparkQA commented on pull request #30012:
URL: https://github.com/apache/spark/pull/30012#issuecomment-707463803


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34327/
   






[GitHub] [spark] beliefer commented on a change in pull request #29891: [SPARK-30796][SQL] Add parameter position for REGEXP_REPLACE

2020-10-12 Thread GitBox


beliefer commented on a change in pull request #29891:
URL: https://github.com/apache/spark/pull/29891#discussion_r503647723



##
File path: sql/core/src/main/scala/org/apache/spark/sql/functions.scala
##
@@ -2538,7 +2538,7 @@ object functions {
* @since 1.5.0
*/
   def regexp_replace(e: Column, pattern: String, replacement: String): Column = withExpr {
-RegExpReplace(e.expr, lit(pattern).expr, lit(replacement).expr)
+new RegExpReplace(e.expr, lit(pattern).expr, lit(replacement).expr)

Review comment:
   OK
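
For readers wondering why the diff switches to `new RegExpReplace(...)`: the three-argument form is now an auxiliary constructor, and in Scala 2 the synthesized companion `apply` only covers the primary (four-argument) constructor. A minimal sketch of the same pattern, using a made-up `Replace` class with simple field types rather than Catalyst expressions:

```scala
// Illustrative stand-in for the four-argument case class in the diff;
// `Replace` and its field types are made up for this example.
final case class Replace(subject: String, regexp: String, rep: String, pos: Int) {
  // Auxiliary constructor that supplies the default position of 1.
  def this(subject: String, regexp: String, rep: String) =
    this(subject, regexp, rep, 1)
}

object AuxConstructorSketch {
  // The synthesized companion `apply` takes all four arguments...
  val explicit: Replace = Replace("100-200", "(\\d+)", "num", 1)
  // ...while the three-argument form goes through the auxiliary constructor via `new`.
  val defaulted: Replace = new Replace("100-200", "(\\d+)", "num")
}
```

Both values compare equal, since the auxiliary constructor only fills in the default.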








[GitHub] [spark] AmplabJenkins commented on pull request #30012: [SPARK-XXX][INFRA] Rebalance GitHub Action jobs

2020-10-12 Thread GitBox


AmplabJenkins commented on pull request #30012:
URL: https://github.com/apache/spark/pull/30012#issuecomment-707463818










[GitHub] [spark] AmplabJenkins removed a comment on pull request #30012: [SPARK-XXX][INFRA] Rebalance GitHub Action jobs

2020-10-12 Thread GitBox


AmplabJenkins removed a comment on pull request #30012:
URL: https://github.com/apache/spark/pull/30012#issuecomment-707463818










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29933: [SPARK-26533][SQL] Support query auto timeout cancel on thriftserver

2020-10-12 Thread GitBox


AmplabJenkins removed a comment on pull request #29933:
URL: https://github.com/apache/spark/pull/29933#issuecomment-707463501










[GitHub] [spark] AmplabJenkins commented on pull request #29933: [SPARK-26533][SQL] Support query auto timeout cancel on thriftserver

2020-10-12 Thread GitBox


AmplabJenkins commented on pull request #29933:
URL: https://github.com/apache/spark/pull/29933#issuecomment-707463501










[GitHub] [spark] SparkQA removed a comment on pull request #29933: [SPARK-26533][SQL] Support query auto timeout cancel on thriftserver

2020-10-12 Thread GitBox


SparkQA removed a comment on pull request #29933:
URL: https://github.com/apache/spark/pull/29933#issuecomment-707386154


   **[Test build #129714 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129714/testReport)**
 for PR 29933 at commit 
[`b3fc95e`](https://github.com/apache/spark/commit/b3fc95e7c00e3321670963e8542277ccc5bf061c).






[GitHub] [spark] SparkQA commented on pull request #29933: [SPARK-26533][SQL] Support query auto timeout cancel on thriftserver

2020-10-12 Thread GitBox


SparkQA commented on pull request #29933:
URL: https://github.com/apache/spark/pull/29933#issuecomment-707462834


   **[Test build #129714 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129714/testReport)**
 for PR 29933 at commit 
[`b3fc95e`](https://github.com/apache/spark/commit/b3fc95e7c00e3321670963e8542277ccc5bf061c).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #29881: [SPARK-32852][SQL] spark.sql.hive.metastore.jars support HDFS location

2020-10-12 Thread GitBox


SparkQA commented on pull request #29881:
URL: https://github.com/apache/spark/pull/29881#issuecomment-707462326


   **[Test build #129726 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129726/testReport)**
 for PR 29881 at commit 
[`4016327`](https://github.com/apache/spark/commit/4016327a018cde039433d0b74b4dab2daacad45e).






[GitHub] [spark] SparkQA commented on pull request #30021: [SPARK-33125][SQL] Improve the error when Lead and Lag are not allowed to specify window frame

2020-10-12 Thread GitBox


SparkQA commented on pull request #30021:
URL: https://github.com/apache/spark/pull/30021#issuecomment-707462315


   **[Test build #129725 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129725/testReport)**
 for PR 30021 at commit 
[`e3c90cf`](https://github.com/apache/spark/commit/e3c90cff4058cea3cec49905cab28022d8c46d96).






[GitHub] [spark] beliefer commented on pull request #29891: [SPARK-30796][SQL] Add parameter position for REGEXP_REPLACE

2020-10-12 Thread GitBox


beliefer commented on pull request #29891:
URL: https://github.com/apache/spark/pull/29891#issuecomment-707461906


   > How about the other regex functions, e.g., `RegExpExtract`? It looks like all the regex-like functions in Oracle have a position parameter.
   
   Oracle does not have `RegExpExtract`.






[GitHub] [spark] AngersZhuuuu commented on pull request #30016: [SPARK-33119][SQL] ScalarSubquery should returns the first two rows to avoid Driver OOM

2020-10-12 Thread GitBox


AngersZhuuuu commented on pull request #30016:
URL: https://github.com/apache/spark/pull/30016#issuecomment-707461957


   Nice






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30012: [SPARK-XXX][INFRA] Rebalance GitHub Action jobs

2020-10-12 Thread GitBox


AmplabJenkins removed a comment on pull request #30012:
URL: https://github.com/apache/spark/pull/30012#issuecomment-707461708










[GitHub] [spark] AngersZhuuuu removed a comment on pull request #30016: [SPARK-33119][SQL] ScalarSubquery should returns the first two rows to avoid Driver OOM

2020-10-12 Thread GitBox


AngersZhuuuu removed a comment on pull request #30016:
URL: https://github.com/apache/spark/pull/30016#issuecomment-707461957


   Nice






[GitHub] [spark] AmplabJenkins commented on pull request #30012: [SPARK-XXX][INFRA] Rebalance GitHub Action jobs

2020-10-12 Thread GitBox


AmplabJenkins commented on pull request #30012:
URL: https://github.com/apache/spark/pull/30012#issuecomment-707461708










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29092: [SPARK-32295][SQL] Add not null and size > 0 filters before inner explode/inline to benefit from predicate pushdown

2020-10-12 Thread GitBox


AmplabJenkins removed a comment on pull request #29092:
URL: https://github.com/apache/spark/pull/29092#issuecomment-707460800


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/34326/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29092: [SPARK-32295][SQL] Add not null and size > 0 filters before inner explode/inline to benefit from predicate pushdown

2020-10-12 Thread GitBox


AmplabJenkins removed a comment on pull request #29092:
URL: https://github.com/apache/spark/pull/29092#issuecomment-707460791


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #30012: [SPARK-XXX][INFRA] Rebalance GitHub Action jobs

2020-10-12 Thread GitBox


SparkQA removed a comment on pull request #30012:
URL: https://github.com/apache/spark/pull/30012#issuecomment-707417045


   **[Test build #129716 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129716/testReport)**
 for PR 30012 at commit 
[`6c74d7e`](https://github.com/apache/spark/commit/6c74d7e9e772d0a6868bf4520f15c13af7c0f29c).






[GitHub] [spark] SparkQA commented on pull request #30012: [SPARK-XXX][INFRA] Rebalance GitHub Action jobs

2020-10-12 Thread GitBox


SparkQA commented on pull request #30012:
URL: https://github.com/apache/spark/pull/30012#issuecomment-707460912


   **[Test build #129716 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129716/testReport)**
 for PR 30012 at commit 
[`6c74d7e`](https://github.com/apache/spark/commit/6c74d7e9e772d0a6868bf4520f15c13af7c0f29c).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins commented on pull request #29092: [SPARK-32295][SQL] Add not null and size > 0 filters before inner explode/inline to benefit from predicate pushdown

2020-10-12 Thread GitBox


AmplabJenkins commented on pull request #29092:
URL: https://github.com/apache/spark/pull/29092#issuecomment-707460791










[GitHub] [spark] SparkQA commented on pull request #29092: [SPARK-32295][SQL] Add not null and size > 0 filters before inner explode/inline to benefit from predicate pushdown

2020-10-12 Thread GitBox


SparkQA commented on pull request #29092:
URL: https://github.com/apache/spark/pull/29092#issuecomment-707460775


   Kubernetes integration test status failure
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34326/
   






[GitHub] [spark] AngersZhuuuu commented on pull request #29881: [SPARK-32852][SQL] spark.sql.hive.metastore.jars support HDFS location

2020-10-12 Thread GitBox


AngersZhuuuu commented on pull request #29881:
URL: https://github.com/apache/spark/pull/29881#issuecomment-707459944


   Any more updates?






[GitHub] [spark] SparkQA commented on pull request #30009: [SPARK-32907][ML] adaptively blockify instances - LinearSVC

2020-10-12 Thread GitBox


SparkQA commented on pull request #30009:
URL: https://github.com/apache/spark/pull/30009#issuecomment-707460083


   **[Test build #129724 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129724/testReport)**
 for PR 30009 at commit 
[`9cd1053`](https://github.com/apache/spark/commit/9cd10535b962b18421a6a21a79564fe6e7fae157).






[GitHub] [spark] beliefer commented on a change in pull request #30021: [SPARK-33125][SQL] Improve the error when Lead and Lag are not allowed to specify window frame

2020-10-12 Thread GitBox


beliefer commented on a change in pull request #30021:
URL: https://github.com/apache/spark/pull/30021#discussion_r503643546



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
##
@@ -2974,9 +2974,9 @@ class Analyzer(
*/
   object ResolveWindowFrame extends Rule[LogicalPlan] {
 def apply(plan: LogicalPlan): LogicalPlan = plan resolveExpressions {
-  case WindowExpression(wf: WindowFunction, WindowSpecDefinition(_, _, f: SpecifiedWindowFrame))

Review comment:
   OK








[GitHub] [spark] cloud-fan commented on a change in pull request #30021: [SPARK-33125][SQL] Improve the error when Lead and Lag are not allowed to specify window frame

2020-10-12 Thread GitBox


cloud-fan commented on a change in pull request #30021:
URL: https://github.com/apache/spark/pull/30021#discussion_r503643542



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
##
@@ -2974,9 +2974,9 @@ class Analyzer(
*/
   object ResolveWindowFrame extends Rule[LogicalPlan] {
 def apply(plan: LogicalPlan): LogicalPlan = plan resolveExpressions {
-  case WindowExpression(wf: WindowFunction, WindowSpecDefinition(_, _, f: SpecifiedWindowFrame))
-  if wf.frame != UnspecifiedFrame && wf.frame != f =>
-failAnalysis(s"Window Frame $f must match the required frame ${wf.frame}")
+  case WindowExpression(owf: OffsetWindowFunction,
+WindowSpecDefinition(_, _, _: SpecifiedWindowFrame)) =>

Review comment:
   Can `OffsetWindowFunction` use `UnspecifiedFrame` now?








[GitHub] [spark] SparkQA commented on pull request #30012: [SPARK-XXX][INFRA] Rebalance GitHub Action jobs

2020-10-12 Thread GitBox


SparkQA commented on pull request #30012:
URL: https://github.com/apache/spark/pull/30012#issuecomment-707458370


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34327/
   






[GitHub] [spark] cloud-fan commented on a change in pull request #30021: [SPARK-33125][SQL] Improve the error when Lead and Lag are not allowed to specify window frame

2020-10-12 Thread GitBox


cloud-fan commented on a change in pull request #30021:
URL: https://github.com/apache/spark/pull/30021#discussion_r503643245



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
##
@@ -2974,9 +2974,9 @@ class Analyzer(
*/
   object ResolveWindowFrame extends Rule[LogicalPlan] {
 def apply(plan: LogicalPlan): LogicalPlan = plan resolveExpressions {
-  case WindowExpression(wf: WindowFunction, WindowSpecDefinition(_, _, f: SpecifiedWindowFrame))

Review comment:
   We should still keep this case; other functions may hit it.








[GitHub] [spark] beliefer commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

2020-10-12 Thread GitBox


beliefer commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-707458123


   @maropu If there are a lot of `like` predicates, `reduceLeft` will construct a very deep tree. Traversing that tree recurses once per level, so the thread stack keeps growing until it overflows.
   ```
   at org.apache.spark.sql.catalyst.trees.TreeNode.foreach(TreeNode.scala:175)
   at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreach$1(TreeNode.scala:175)
   at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreach$1$adapted(TreeNode.scala:175)
   at scala.collection.immutable.List.foreach(List.scala:392)
   at org.apache.spark.sql.catalyst.trees.TreeNode.foreach(TreeNode.scala:175)
   at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreach$1(TreeNode.scala:175)
   at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreach$1$adapted(TreeNode.scala:175)
   at scala.collection.immutable.List.foreach(List.scala:392)
   at org.apache.spark.sql.catalyst.trees.TreeNode.foreach(TreeNode.scala:175)
   at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreach$1(TreeNode.scala:175)
   at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreach$1$adapted(TreeNode.scala:175)
   at scala.collection.immutable.List.foreach(List.scala:392)
   ..
   
   ```
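
The depth problem described above can be reproduced with a toy model. The sketch below uses made-up `Expr`, `Leaf`, and `And` types rather than Catalyst classes, and only assumes that the predicates are combined pairwise with `reduceLeft`; it shows that the resulting tree is as deep as the number of predicates, which is what a recursive traversal then has to walk.

```scala
// Toy expression tree; these types are illustrative stand-ins, not Catalyst classes.
sealed trait Expr
final case class Leaf(pattern: String) extends Expr
final case class And(left: Expr, right: Expr) extends Expr

object DeepTreeSketch {
  // Recursive depth computation, analogous to a recursive tree traversal.
  def depth(e: Expr): Int = e match {
    case Leaf(_)   => 1
    case And(l, r) => 1 + math.max(depth(l), depth(r))
  }

  // Combine n predicates pairwise from the left, producing a left-deep chain.
  def leftDeep(n: Int): Expr = {
    val leaves: Seq[Expr] = (1 to n).map(i => Leaf(s"p$i"))
    leaves.reduceLeft[Expr]((l, r) => And(l, r))
  }
}
```

Here `DeepTreeSketch.depth(DeepTreeSketch.leftDeep(n))` equals `n`, so the recursion depth grows linearly with the number of predicates. A single expression that holds all patterns in a flat list avoids building such a chain.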






[GitHub] [spark] SparkQA commented on pull request #30021: [SPARK-33125][SQL] Improve the error when Lead and Lag are not allowed to specify window frame

2020-10-12 Thread GitBox


SparkQA commented on pull request #30021:
URL: https://github.com/apache/spark/pull/30021#issuecomment-707457646


   **[Test build #129723 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129723/testReport)**
 for PR 30021 at commit 
[`a10ae2f`](https://github.com/apache/spark/commit/a10ae2fd1ac2a14b11b6675c01014d6ea0ad16c9).






[GitHub] [spark] AngersZhuuuu commented on pull request #30011: [WIP][SPARK-32281][SQL] Spark keep SORTED spec in metastore

2020-10-12 Thread GitBox


AngersZhuuuu commented on pull request #30011:
URL: https://github.com/apache/spark/pull/30011#issuecomment-707456632


   @dongjoon-hyun @cloud-fan 
   Do you know why we do not support bucket `DESC`? It seems that supporting it would be the best way to solve this problem.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30011: [WIP][SPARK-32281][SQL] Spark keep SORTED spec in metastore

2020-10-12 Thread GitBox


AmplabJenkins removed a comment on pull request #30011:
URL: https://github.com/apache/spark/pull/30011#issuecomment-707455575










[GitHub] [spark] SparkQA commented on pull request #30011: [WIP][SPARK-32281][SQL] Spark keep SORTED spec in metastore

2020-10-12 Thread GitBox


SparkQA commented on pull request #30011:
URL: https://github.com/apache/spark/pull/30011#issuecomment-707455567


   Kubernetes integration test status success
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34325/
   






[GitHub] [spark] AmplabJenkins commented on pull request #30011: [WIP][SPARK-32281][SQL] Spark keep SORTED spec in metastore

2020-10-12 Thread GitBox


AmplabJenkins commented on pull request #30011:
URL: https://github.com/apache/spark/pull/30011#issuecomment-707455575









