[GitHub] [spark] AmplabJenkins commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


AmplabJenkins commented on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633577959







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins removed a comment on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


AmplabJenkins removed a comment on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633577959










[GitHub] [spark] SparkQA commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


SparkQA commented on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633577087


   **[Test build #123080 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123080/testReport)** for PR 28633 at commit [`7eeea97`](https://github.com/apache/spark/commit/7eeea97ef6dba0638652ec94a07a792d08ade8f4).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


SparkQA removed a comment on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633461749


   **[Test build #123080 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123080/testReport)** for PR 28633 at commit [`7eeea97`](https://github.com/apache/spark/commit/7eeea97ef6dba0638652ec94a07a792d08ade8f4).






[GitHub] [spark] AmplabJenkins commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


AmplabJenkins commented on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633575285










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


AmplabJenkins removed a comment on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633575285










[GitHub] [spark] SparkQA commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


SparkQA commented on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633574333


   **[Test build #123079 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123079/testReport)** for PR 28633 at commit [`bc6a2c0`](https://github.com/apache/spark/commit/bc6a2c0afe3eeb20c1b60849a9d986c658e08c35).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


SparkQA removed a comment on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633454615


   **[Test build #123079 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123079/testReport)** for PR 28633 at commit [`bc6a2c0`](https://github.com/apache/spark/commit/bc6a2c0afe3eeb20c1b60849a9d986c658e08c35).






[GitHub] [spark] maropu commented on a change in pull request #28540: [SPARK-31719][SQL] Refactor JoinSelection

2020-05-25 Thread GitBox


maropu commented on a change in pull request #28540:
URL: https://github.com/apache/spark/pull/28540#discussion_r429928063



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
##
@@ -208,3 +208,161 @@ object ExtractPythonUDFFromJoinCondition extends Rule[LogicalPlan] with Predicat
   }
   }
 }
+
+sealed abstract class BuildSide
+
+case object BuildRight extends BuildSide
+
+case object BuildLeft extends BuildSide
+
+trait JoinSelectionHelper {
+
+  def getBroadcastBuildSide(
+      left: LogicalPlan,
+      right: LogicalPlan,
+      joinType: JoinType,
+      hint: JoinHint,
+      onlyLookingAtHint: Boolean,

Review comment:
   Ah, `hintOnly` looks nice to me.








[GitHub] [spark] cloud-fan commented on a change in pull request #28540: [SPARK-31719][SQL] Refactor JoinSelection

2020-05-25 Thread GitBox


cloud-fan commented on a change in pull request #28540:
URL: https://github.com/apache/spark/pull/28540#discussion_r429927305



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
##
@@ -208,3 +208,161 @@ object ExtractPythonUDFFromJoinCondition extends Rule[LogicalPlan] with Predicat
   }
   }
 }
+
+sealed abstract class BuildSide
+
+case object BuildRight extends BuildSide
+
+case object BuildLeft extends BuildSide
+
+trait JoinSelectionHelper {
+
+  def getBroadcastBuildSide(
+      left: LogicalPlan,
+      right: LogicalPlan,
+      joinType: JoinType,
+      hint: JoinHint,
+      onlyLookingAtHint: Boolean,

Review comment:
   If we want to be super clear, maybe "pickJoinSideByHintOnly"? Or simply "hintOnly".








[GitHub] [spark] AmplabJenkins removed a comment on pull request #28636: [SPARK-31818][SQL][test-hive1.2] Fix pushing down filters with `java.time.Instant` values in ORC

2020-05-25 Thread GitBox


AmplabJenkins removed a comment on pull request #28636:
URL: https://github.com/apache/spark/pull/28636#issuecomment-633556652










[GitHub] [spark] SparkQA commented on pull request #28636: [SPARK-31818][SQL][test-hive1.2] Fix pushing down filters with `java.time.Instant` values in ORC

2020-05-25 Thread GitBox


SparkQA commented on pull request #28636:
URL: https://github.com/apache/spark/pull/28636#issuecomment-633559538


   **[Test build #123084 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123084/testReport)** for PR 28636 at commit [`5290d96`](https://github.com/apache/spark/commit/5290d9642cd5d25bf26c0087753d053194f55de1).






[GitHub] [spark] AmplabJenkins commented on pull request #28636: [SPARK-31818][SQL][test-hive1.2] Fix pushing down filters with `java.time.Instant` values in ORC

2020-05-25 Thread GitBox


AmplabJenkins commented on pull request #28636:
URL: https://github.com/apache/spark/pull/28636#issuecomment-633556652










[GitHub] [spark] MaxGekk commented on pull request #28636: [SPARK-31818][SQL][test-hive1.2] Fix pushing down filters with `java.time.Instant` values in ORC

2020-05-25 Thread GitBox


MaxGekk commented on pull request #28636:
URL: https://github.com/apache/spark/pull/28636#issuecomment-633556551


   @cloud-fan @HyukjinKwon @dongjoon-hyun Please review this PR. It is similar to https://github.com/apache/spark/pull/28261 but for timestamps.






[GitHub] [spark] MaxGekk opened a new pull request #28636: [SPARK-31818][SQL][test-hive1.2] Fix pushing down filters with `java.time.Instant` values in ORC

2020-05-25 Thread GitBox


MaxGekk opened a new pull request #28636:
URL: https://github.com/apache/spark/pull/28636


   ### What changes were proposed in this pull request?
   Convert `java.time.Instant` to `java.sql.Timestamp` in filters pushed down to the ORC datasource when the Java 8 time API is enabled.
   
   ### Why are the changes needed?
   The changes fix the exception raised while pushing down timestamp filters when `spark.sql.datetime.java8API.enabled` is set to `true`:
   ```
   java.lang.IllegalArgumentException: Wrong value class java.time.Instant for TIMESTAMP.EQUALS leaf
       at org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$PredicateLeafImpl.checkLiteralType(SearchArgumentImpl.java:192)
       at org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$PredicateLeafImpl.<init>(SearchArgumentImpl.java:75)
   ```
   
   ### Does this PR introduce any user-facing change?
   Yes
   
   ### How was this patch tested?
   Added tests to `OrcFilterSuite`.
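
The conversion described in this PR can be illustrated with a minimal sketch. It assumes only the JDK's `java.sql.Timestamp.from`; the names `OrcLiteralSketch` and `toOrcLiteral` are hypothetical and are not taken from the patch, whose actual code in Spark's ORC filter builder may differ (for example in how it handles time zones or calendar rebasing).

```scala
import java.sql.Timestamp
import java.time.Instant

// Hypothetical helper, not Spark's actual code: normalise a filter literal
// before it is handed to an ORC search-argument builder, so that the builder
// never sees a java.time.Instant where it expects a java.sql.Timestamp.
object OrcLiteralSketch {
  def toOrcLiteral(value: Any): Any = value match {
    // Instant values appear when the Java 8 time API is enabled.
    case instant: Instant => Timestamp.from(instant)
    // Every other literal is passed through unchanged.
    case other => other
  }
}
```

With a helper of this shape, a literal that reaches the filter as an `Instant` is turned into a `Timestamp` first, which is the type the `TIMESTAMP.EQUALS` leaf check in the stack trace above expects.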






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


AmplabJenkins removed a comment on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633549317










[GitHub] [spark] AmplabJenkins commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


AmplabJenkins commented on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633549317










[GitHub] [spark] SparkQA removed a comment on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


SparkQA removed a comment on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633429823


   **[Test build #123077 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123077/testReport)** for PR 28633 at commit [`c5ffc27`](https://github.com/apache/spark/commit/c5ffc27b6064190c6e7e4cf85765abe304d637c7).






[GitHub] [spark] SparkQA commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


SparkQA commented on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633548522


   **[Test build #123077 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123077/testReport)** for PR 28633 at commit [`c5ffc27`](https://github.com/apache/spark/commit/c5ffc27b6064190c6e7e4cf85765abe304d637c7).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28551: [SPARK-31393][SQL][FOLLOW-UP] Show the correct alias in schema for expression

2020-05-25 Thread GitBox


AmplabJenkins removed a comment on pull request #28551:
URL: https://github.com/apache/spark/pull/28551#issuecomment-633543877










[GitHub] [spark] AmplabJenkins commented on pull request #28551: [SPARK-31393][SQL][FOLLOW-UP] Show the correct alias in schema for expression

2020-05-25 Thread GitBox


AmplabJenkins commented on pull request #28551:
URL: https://github.com/apache/spark/pull/28551#issuecomment-633543877










[GitHub] [spark] SparkQA commented on pull request #28551: [SPARK-31393][SQL][FOLLOW-UP] Show the correct alias in schema for expression

2020-05-25 Thread GitBox


SparkQA commented on pull request #28551:
URL: https://github.com/apache/spark/pull/28551#issuecomment-633543378


   **[Test build #123083 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123083/testReport)** for PR 28551 at commit [`53cdbb7`](https://github.com/apache/spark/commit/53cdbb76f81d250bebb8e4b6ed43afa6db25b8c0).






[GitHub] [spark] gaborgsomogyi commented on pull request #28635: [SPARK-31337][SQL]Support MS SQL Kerberos login in JDBC connector

2020-05-25 Thread GitBox


gaborgsomogyi commented on pull request #28635:
URL: https://github.com/apache/spark/pull/28635#issuecomment-633543073


   cc @HeartSaVioR 






[GitHub] [spark] gaborgsomogyi edited a comment on pull request #28635: [SPARK-31337][SQL]Support MS SQL Kerberos login in JDBC connector

2020-05-25 Thread GitBox


gaborgsomogyi edited a comment on pull request #28635:
URL: https://github.com/apache/spark/pull/28635#issuecomment-633511928


   Apart from manual testing, I tried to add a Docker integration test (which failed) and tried out the following:
   * Set up an Active Directory Docker container => not yet supported; please see [this](https://github.com/microsoft/mssql-docker/issues/165) link for details.
   * Use Hadoop `MiniKdc` => it simply never authenticated; furthermore, MS never claimed that it works.
   
   If a working Docker image with an Active Directory instance becomes available, we can try again. In the meantime, if somebody has an idea how to overcome this, feel free to add it.
   






[GitHub] [spark] HyukjinKwon commented on pull request #28551: [SPARK-31393][SQL][FOLLOW-UP] Show the correct alias in schema for expression

2020-05-25 Thread GitBox


HyukjinKwon commented on pull request #28551:
URL: https://github.com/apache/spark/pull/28551#issuecomment-633541832










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


AmplabJenkins removed a comment on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633536412










[GitHub] [spark] AmplabJenkins commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


AmplabJenkins commented on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633536412










[GitHub] [spark] SparkQA removed a comment on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


SparkQA removed a comment on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633413383


   **[Test build #123074 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123074/testReport)** for PR 28633 at commit [`c5ffc27`](https://github.com/apache/spark/commit/c5ffc27b6064190c6e7e4cf85765abe304d637c7).






[GitHub] [spark] gaborgsomogyi commented on a change in pull request #28635: [SPARK-31337][SQL]Support MS SQL Kerberos login in JDBC connector

2020-05-25 Thread GitBox


gaborgsomogyi commented on a change in pull request #28635:
URL: https://github.com/apache/spark/pull/28635#discussion_r429886742



##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/MSSQLConnectionProvider.scala
##
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources.jdbc.connection
+
+import java.security.PrivilegedExceptionAction
+import java.sql.{Connection, Driver}
+import java.util.Properties
+
+import org.apache.hadoop.security.UserGroupInformation
+
+import org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions
+
+private[sql] class MSSQLConnectionProvider(
+    driver: Driver,
+    options: JDBCOptions,
+    parserMethod: String = "parseAndMergeProperties"
+  ) extends SecureConnectionProvider(driver, options) {
+  override val appEntry: String = {
+    val configName = "jaasConfigurationName"
+    val appEntryDefault = "SQLJDBCDriver"
+
+    val parseURL = try {

Review comment:
   There are basically 2 approaches to parse the URL to get `jaasConfigurationName`:
   * Try to call the private [parseAndMergeProperties](https://github.com/microsoft/mssql-jdbc/blob/0d4e97f401dc0e55779460d9709dd7ee399246be/src/main/java/com/microsoft/sqlserver/jdbc/SQLServerDriver.java#L831-L854) => tried first
   * Parse the URL manually => used as fallback
   
   Both ways have been tested in `MSSQLConnectionProviderSuite`.
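
The two approaches listed above can be sketched as follows. This is a simplified illustration, not the code from the PR: the object `JaasConfigNameSketch` and its methods are hypothetical, the reflective call assumes that the private parser takes a `(String, Properties)` pair and returns `Properties`, and the manual parser assumes a URL of the form `jdbc:sqlserver://host:port;key=value;...` with exact, case-sensitive key matching.

```scala
import java.util.Properties
import scala.util.Try

// Hypothetical sketch: resolve the JAAS application entry name from a JDBC URL.
object JaasConfigNameSketch {
  val ConfigName = "jaasConfigurationName"
  val Default = "SQLJDBCDriver"

  // Approach 1: reflectively invoke a private parser on the driver instance.
  // Any reflection failure (missing method, wrong signature, parser exception)
  // yields None so that the caller can fall back to manual parsing.
  def viaReflection(driver: AnyRef, parserMethod: String, url: String): Option[String] =
    Try {
      val method = driver.getClass.getDeclaredMethod(
        parserMethod, classOf[String], classOf[Properties])
      method.setAccessible(true)
      val props = method.invoke(driver, url, new Properties()).asInstanceOf[Properties]
      Option(props.getProperty(ConfigName))
    }.toOption.flatten

  // Approach 2 (fallback): split the URL on ';' and look for the key by hand.
  def viaManualParse(url: String): Option[String] =
    url.split(';').iterator.drop(1)
      .map(_.split("=", 2))
      .collectFirst { case Array(key, value) if key.trim == ConfigName => value.trim }

  // Try reflection first, then manual parsing, then the default entry name.
  def appEntry(driver: AnyRef, parserMethod: String, url: String): String =
    viaReflection(driver, parserMethod, url)
      .orElse(viaManualParse(url))
      .getOrElse(Default)
}
```

Keeping the reflective call behind a `Try` means a driver version without the private method degrades to the manual parser instead of failing the connection setup.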
   

##
File path: external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/MsSqlServerIntegrationSuite.scala
##
@@ -27,7 +27,7 @@ import org.apache.spark.tags.DockerTest
 @DockerTest
 class MsSqlServerIntegrationSuite extends DockerJDBCIntegrationSuite {
   override val db = new DatabaseOnDocker {
-    override val imageName = "mcr.microsoft.com/mssql/server:2017-GA-ubuntu"
+    override val imageName = "mcr.microsoft.com/mssql/server:2019-GA-ubuntu-16.04"

Review comment:
   This is not strictly necessary; if we think so, we can extract it into a new PR. I thought that would be overkill.

##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/MSSQLConnectionProvider.scala
##
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources.jdbc.connection
+
+import java.security.PrivilegedExceptionAction
+import java.sql.{Connection, Driver}
+import java.util.Properties
+
+import org.apache.hadoop.security.UserGroupInformation
+
+import org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions
+
+private[sql] class MSSQLConnectionProvider(

Review comment:
   The implementation is based on [this](https://docs.microsoft.com/en-us/sql/connect/jdbc/using-kerberos-integrated-authentication-to-connect-to-sql-server?view=sql-server-ver15).

##
File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/MSSQLConnectionProviderSuite.scala
##
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a 

[GitHub] [spark] SparkQA commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


SparkQA commented on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633535658


   **[Test build #123074 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123074/testReport)** for PR 28633 at commit [`c5ffc27`](https://github.com/apache/spark/commit/c5ffc27b6064190c6e7e4cf85765abe304d637c7).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28630: [SPARK-31806][SQL][TESTS] Check reading date/timestamp from legacy parquet: dictionary encoding, w/o Spark version

2020-05-25 Thread GitBox


AmplabJenkins removed a comment on pull request #28630:
URL: https://github.com/apache/spark/pull/28630#issuecomment-633530671










[GitHub] [spark] AmplabJenkins commented on pull request #28630: [SPARK-31806][SQL][TESTS] Check reading date/timestamp from legacy parquet: dictionary encoding, w/o Spark version

2020-05-25 Thread GitBox


AmplabJenkins commented on pull request #28630:
URL: https://github.com/apache/spark/pull/28630#issuecomment-633530671










[GitHub] [spark] SparkQA removed a comment on pull request #28630: [SPARK-31806][SQL][TESTS] Check reading date/timestamp from legacy parquet: dictionary encoding, w/o Spark version

2020-05-25 Thread GitBox


SparkQA removed a comment on pull request #28630:
URL: https://github.com/apache/spark/pull/28630#issuecomment-633413390


   **[Test build #123075 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123075/testReport)**
 for PR 28630 at commit 
[`0add1a2`](https://github.com/apache/spark/commit/0add1a22dfff851bae71d9330f984ace41e5663f).






[GitHub] [spark] SparkQA commented on pull request #28630: [SPARK-31806][SQL][TESTS] Check reading date/timestamp from legacy parquet: dictionary encoding, w/o Spark version

2020-05-25 Thread GitBox


SparkQA commented on pull request #28630:
URL: https://github.com/apache/spark/pull/28630#issuecomment-633529870


   **[Test build #123075 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123075/testReport)**
 for PR 28630 at commit 
[`0add1a2`](https://github.com/apache/spark/commit/0add1a22dfff851bae71d9330f984ace41e5663f).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #28635: [SPARK-31337][SQL]Support MS SQL Kerberos login in JDBC connector

2020-05-25 Thread GitBox


SparkQA commented on pull request #28635:
URL: https://github.com/apache/spark/pull/28635#issuecomment-633525575


   **[Test build #123082 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123082/testReport)**
 for PR 28635 at commit 
[`88306f0`](https://github.com/apache/spark/commit/88306f00cf6aea7636f432e3c9a04c0f44137770).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28635: [SPARK-31337][SQL]Support MS SQL Kerberos login in JDBC connector

2020-05-25 Thread GitBox


AmplabJenkins removed a comment on pull request #28635:
URL: https://github.com/apache/spark/pull/28635#issuecomment-633512198


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/123081/
   Test FAILed.






[GitHub] [spark] SparkQA commented on pull request #28635: [SPARK-31337][SQL]Support MS SQL Kerberos login in JDBC connector

2020-05-25 Thread GitBox


SparkQA commented on pull request #28635:
URL: https://github.com/apache/spark/pull/28635#issuecomment-633512184


   **[Test build #123081 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123081/testReport)**
 for PR 28635 at commit 
[`3c30c4a`](https://github.com/apache/spark/commit/3c30c4aae5ea5c210967ae30c52a68544db2d048).
* This patch **fails build dependency tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins commented on pull request #28635: [SPARK-31337][SQL]Support MS SQL Kerberos login in JDBC connector

2020-05-25 Thread GitBox


AmplabJenkins commented on pull request #28635:
URL: https://github.com/apache/spark/pull/28635#issuecomment-633512192










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28635: [SPARK-31337][SQL]Support MS SQL Kerberos login in JDBC connector

2020-05-25 Thread GitBox


AmplabJenkins removed a comment on pull request #28635:
URL: https://github.com/apache/spark/pull/28635#issuecomment-633511059










[GitHub] [spark] SparkQA removed a comment on pull request #28635: [SPARK-31337][SQL]Support MS SQL Kerberos login in JDBC connector

2020-05-25 Thread GitBox


SparkQA removed a comment on pull request #28635:
URL: https://github.com/apache/spark/pull/28635#issuecomment-633510744


   **[Test build #123081 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123081/testReport)**
 for PR 28635 at commit 
[`3c30c4a`](https://github.com/apache/spark/commit/3c30c4aae5ea5c210967ae30c52a68544db2d048).






[GitHub] [spark] gaborgsomogyi commented on pull request #28635: [SPARK-31337][SQL]Support MS SQL Kerberos login in JDBC connector

2020-05-25 Thread GitBox


gaborgsomogyi commented on pull request #28635:
URL: https://github.com/apache/spark/pull/28635#issuecomment-633511928


   Apart from manual testing, I tried to add a Docker integration test (which 
failed) and tried out the following:
   * Set up an Active Directory Docker container => not yet supported; please see 
[this](https://github.com/microsoft/mssql-docker/issues/165) link for details.
   * Use Hadoop `MiniKdc` => it simply never authenticated; furthermore, MS never 
claimed that it works.
   
   If a working Docker image with an Active Directory instance becomes available, 
we can try it again. In the meantime, if somebody has an idea how to overcome 
this, feel free to add it.
   






[GitHub] [spark] AmplabJenkins commented on pull request #28635: [SPARK-31337][SQL]Support MS SQL Kerberos login in JDBC connector

2020-05-25 Thread GitBox


AmplabJenkins commented on pull request #28635:
URL: https://github.com/apache/spark/pull/28635#issuecomment-633511059


   Can one of the admins verify this patch?






[GitHub] [spark] SparkQA commented on pull request #28635: [SPARK-31337][SQL]Support MS SQL Kerberos login in JDBC connector

2020-05-25 Thread GitBox


SparkQA commented on pull request #28635:
URL: https://github.com/apache/spark/pull/28635#issuecomment-633510744


   **[Test build #123081 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123081/testReport)**
 for PR 28635 at commit 
[`3c30c4a`](https://github.com/apache/spark/commit/3c30c4aae5ea5c210967ae30c52a68544db2d048).






[GitHub] [spark] gaborgsomogyi opened a new pull request #28635: [SPARK-31337][SQL]Support MS SQL Kerberos login in JDBC connector

2020-05-25 Thread GitBox


gaborgsomogyi opened a new pull request #28635:
URL: https://github.com/apache/spark/pull/28635


   ### What changes were proposed in this pull request?
   When loading DataFrames from a JDBC datasource with Kerberos authentication, 
remote executors (yarn-client/cluster etc. modes) fail to establish a 
connection due to the lack of a Kerberos ticket or the ability to generate one.
   
   This is a real issue when trying to ingest data from kerberized data sources 
(SQL Server, Oracle) in enterprise environments where exposing simple 
authentication access is not an option due to IT policy issues.
   
   In this PR I've added MS SQL support.
   
   What this PR contains:
   * Added `MSSQLConnectionProvider`
   * Added `MSSQLConnectionProviderSuite`
   * Changed MS SQL JDBC driver to use the latest (test scope only)
   * Changed `MsSqlServerIntegrationSuite` docker image to use the latest
   * Added a version comment to `MariaDBConnectionProvider` to increase 
trackability
   
   ### Why are the changes needed?
   Missing JDBC Kerberos support.
   
   ### Does this PR introduce _any_ user-facing change?
   Yes, users can now connect to MS SQL using Kerberos.
   
   ### How was this patch tested?
   * Additional + existing unit tests
   * Existing integration tests
   * Test on cluster manually
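
As a rough illustration of the user-facing side of the description above, the following sketch builds the option map a Kerberos-authenticated JDBC read might use. It assumes the `keytab` and `principal` JDBC data source options and the MS SQL driver's `integratedSecurity` and `authenticationScheme` URL properties; the object name, host, database, table, keytab path, and principal are hypothetical, and whether this PR needs exactly these URL properties is an assumption rather than something the description states.

```scala
// Hypothetical helper; not part of the PR.
object MsSqlKerberosOptions {
  /** Builds the option map for a Kerberos-authenticated JDBC read (illustrative values). */
  def build(
      host: String,
      database: String,
      table: String,
      keytab: String,
      principal: String): Map[String, String] = {
    // Assumed URL shape for Kerberos login with the MS SQL JDBC driver.
    val url = s"jdbc:sqlserver://$host:1433;databaseName=$database;" +
      "integratedSecurity=true;authenticationScheme=JavaKerberos"
    Map(
      "url" -> url,
      "dbtable" -> table,
      // Keytab and principal let each executor obtain its own ticket.
      "keytab" -> keytab,
      "principal" -> principal
    )
  }
}
```

With a running session, such a map would typically be passed as `spark.read.format("jdbc").options(opts).load()`.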
   






[GitHub] [spark] beliefer commented on pull request #28551: [SPARK-31393][SQL][FOLLOW-UP] Show the correct alias in schema for expression

2020-05-25 Thread GitBox


beliefer commented on pull request #28551:
URL: https://github.com/apache/spark/pull/28551#issuecomment-633493077


   cc @HyukjinKwon 






[GitHub] [spark] prakharjain09 edited a comment on pull request #28634: [SPARK-31810][TEST] Fix AlterTableRecoverPartitions test using incorrect api to modify RDD_PARALLEL_LISTING_THRESHOLD

2020-05-25 Thread GitBox


prakharjain09 edited a comment on pull request #28634:
URL: https://github.com/apache/spark/pull/28634#issuecomment-633481802


   cc - @srowen @cloud-fan @Dooyoung-Hwang 






[GitHub] [spark] prakharjain09 commented on pull request #28634: [SPARK-31810][TEST] Fix AlterTableRecoverPartitions test using incorrect api to modify RDD_PARALLEL_LISTING_THRESHOLD

2020-05-25 Thread GitBox


prakharjain09 commented on pull request #28634:
URL: https://github.com/apache/spark/pull/28634#issuecomment-633481802


   cc - @srowen @cloud-fan 






[GitHub] [spark] AmplabJenkins commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


AmplabJenkins commented on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633462469










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


AmplabJenkins removed a comment on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633462469










[GitHub] [spark] SparkQA commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


SparkQA commented on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633461749


   **[Test build #123080 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123080/testReport)**
 for PR 28633 at commit 
[`7eeea97`](https://github.com/apache/spark/commit/7eeea97ef6dba0638652ec94a07a792d08ade8f4).






[GitHub] [spark] maropu commented on a change in pull request #28626: [SPARK-28481][SQL] More expressions should extend NullIntolerant

2020-05-25 Thread GitBox


maropu commented on a change in pull request #28626:
URL: https://github.com/apache/spark/pull/28626#discussion_r429811184



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/expressions/ExpressionInfoSuite.scala
##
@@ -156,4 +158,74 @@ class ExpressionInfoSuite extends SparkFunSuite with 
SharedSparkSession {
   }
 }
   }
+
+  test("Check whether should extend NullIntolerant") {
+// Only check expressions extended from these expressions
+val parentExpressionNames = Seq(classOf[UnaryExpression], 
classOf[BinaryExpression],
+  classOf[TernaryExpression], classOf[QuaternaryExpression],
+  classOf[SeptenaryExpression]).map(_.getName)

Review comment:
   Ah, I got it. Okay, and please leave some comments about that, too.
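
For readers unfamiliar with this kind of check, here is one plausible shape for it, written against a toy class hierarchy. It is not the PR's actual test body: only the `parentExpressionNames` idea comes from the diff above, and every class, object, and method name below is a hypothetical stand-in.

```scala
// Toy hierarchy standing in for Catalyst expressions (hypothetical).
trait NullIntolerant
abstract class UnaryExpression
abstract class BinaryExpression
class Upper extends UnaryExpression with NullIntolerant
class IsNull extends UnaryExpression
class Add extends BinaryExpression with NullIntolerant

object NullIntolerantCheck {
  // Only classes whose direct superclass is one of these are inspected.
  val parentExpressionNames: Set[String] =
    Seq(classOf[UnaryExpression], classOf[BinaryExpression]).map(_.getName).toSet

  /** Returns the inspected classes that do not mix in the marker trait. */
  def missingMarker(candidates: Seq[Class[_]]): Seq[Class[_]] =
    candidates
      .filter(c => Option(c.getSuperclass).exists(s => parentExpressionNames(s.getName)))
      .filterNot(c => classOf[NullIntolerant].isAssignableFrom(c))
}
```

A real suite would then compare the returned classes against an explicit ignore list, which is presumably where explanatory comments are most useful.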








[GitHub] [spark] AmplabJenkins removed a comment on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


AmplabJenkins removed a comment on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633455287










[GitHub] [spark] AmplabJenkins commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


AmplabJenkins commented on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633455287










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28621: [SPARK-31803][ML] Make sure instance weight is not negative

2020-05-25 Thread GitBox


AmplabJenkins removed a comment on pull request #28621:
URL: https://github.com/apache/spark/pull/28621#issuecomment-633454104


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/123078/
   Test FAILed.






[GitHub] [spark] SparkQA commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


SparkQA commented on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633454615


   **[Test build #123079 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123079/testReport)**
 for PR 28633 at commit 
[`bc6a2c0`](https://github.com/apache/spark/commit/bc6a2c0afe3eeb20c1b60849a9d986c658e08c35).






[GitHub] [spark] SparkQA removed a comment on pull request #28621: [SPARK-31803][ML] Make sure instance weight is not negative

2020-05-25 Thread GitBox


SparkQA removed a comment on pull request #28621:
URL: https://github.com/apache/spark/pull/28621#issuecomment-633437134


   **[Test build #123078 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123078/testReport)**
 for PR 28621 at commit 
[`e1b4422`](https://github.com/apache/spark/commit/e1b44228646fdffcba27684b46fdd2ec79d549ea).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28621: [SPARK-31803][ML] Make sure instance weight is not negative

2020-05-25 Thread GitBox


AmplabJenkins removed a comment on pull request #28621:
URL: https://github.com/apache/spark/pull/28621#issuecomment-633454093


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #28621: [SPARK-31803][ML] Make sure instance weight is not negative

2020-05-25 Thread GitBox


AmplabJenkins commented on pull request #28621:
URL: https://github.com/apache/spark/pull/28621#issuecomment-633454093










[GitHub] [spark] SparkQA commented on pull request #28621: [SPARK-31803][ML] Make sure instance weight is not negative

2020-05-25 Thread GitBox


SparkQA commented on pull request #28621:
URL: https://github.com/apache/spark/pull/28621#issuecomment-633453976


   **[Test build #123078 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123078/testReport)**
 for PR 28621 at commit 
[`e1b4422`](https://github.com/apache/spark/commit/e1b44228646fdffcba27684b46fdd2ec79d549ea).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] HyukjinKwon commented on a change in pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


HyukjinKwon commented on a change in pull request #28633:
URL: https://github.com/apache/spark/pull/28633#discussion_r429808596



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
##
@@ -312,6 +312,10 @@ case object NamePlaceholder extends LeafExpression with 
Unevaluable {
  * Returns a Row containing the evaluation of all children expressions.
  */
 object CreateStruct extends FunctionBuilder {
+  /**
+   * Returns a named struct with generating names or using the names when 
available.
+   * It should be used only for an internal purpose.

Review comment:
   I pushed some changes but let me know if you prefer this way. I don't 
mind changing it.








[GitHub] [spark] HyukjinKwon commented on a change in pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


HyukjinKwon commented on a change in pull request #28633:
URL: https://github.com/apache/spark/pull/28633#discussion_r429805322



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
##
@@ -434,6 +449,11 @@ case class CreateNamedStruct(children: Seq[Expression]) 
extends Expression {
   }
 
   override def prettyName: String = "named_struct"
+
+  override def sql: String = getTagValue(FUNC_ALIAS).map { alias =>

Review comment:
   Yup, should be better to do that.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


HyukjinKwon commented on a change in pull request #28633:
URL: https://github.com/apache/spark/pull/28633#discussion_r429805059



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
##
@@ -312,6 +312,10 @@ case object NamePlaceholder extends LeafExpression with 
Unevaluable {
  * Returns a Row containing the evaluation of all children expressions.
  */
 object CreateStruct extends FunctionBuilder {
+  /**
+   * Returns a named struct with generating names or using the names when 
available.
+   * It should be used only for an internal purpose.

Review comment:
   There are some cases where `CreateNamedStruct` is inserted, e.g. 
https://github.com/apache/spark/pull/28633#discussion_r429756482
   
   I was thinking that it's fine to treat those cases as just using 
`named_struct` internally. Also, changing them would make many diffs in SQL 
tests etc., so I thought it's better to minimise the diff.
   
   Actually, moving to `CreateNamedStruct` is the first way I tried. One issue 
is that the `CreateNamedStruct` case class has the same signature, so the 
`CreateNamedStruct` companion object can't have the same signature at `apply`.
   
   I could consistently have `CreateNamedStruct.create` and 
`CreateStruct.create` if you prefer this way.
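
The naming question can be pictured with a toy case class. This is a minimal sketch under stated assumptions, not Spark's code: `NamedStruct`, its `alias` field, and the string-based children are hypothetical simplifications. Because the compiler already synthesizes `apply(children)` for a case class, the alternative construction path gets a distinct name, `create`, so the two behaviors stay easy to tell apart at call sites.

```scala
// Toy stand-in for a Catalyst expression (hypothetical).
final case class NamedStruct(children: Seq[String]) {
  // Mutable slot standing in for a tag; None means "use the internal name".
  var alias: Option[String] = None

  def sql: String =
    s"${alias.getOrElse("named_struct")}(${children.mkString(", ")})"
}

object NamedStruct {
  // A separately named factory avoids clashing with the synthesized apply
  // and marks the expression as coming from an explicit struct(...) call.
  def create(children: Seq[String]): NamedStruct = {
    val expr = NamedStruct(children)
    expr.alias = Some("struct")
    expr
  }
}
```

In this sketch `NamedStruct(Seq("a", "b"))` keeps the internal name, while `NamedStruct.create(Seq("a", "b"))` renders under the alias.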








[GitHub] [spark] hvanhovell commented on a change in pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


hvanhovell commented on a change in pull request #28633:
URL: https://github.com/apache/spark/pull/28633#discussion_r429802771



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
##
@@ -434,6 +449,11 @@ case class CreateNamedStruct(children: Seq[Expression]) 
extends Expression {
   }
 
   override def prettyName: String = "named_struct"
+
+  override def sql: String = getTagValue(FUNC_ALIAS).map { alias =>

Review comment:
   Shouldn't you override prettyName as well? In general this seems to have 
been solved with either a mixin or at the Expression level.
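
A mixin along the lines suggested here could look roughly like the sketch below. All names (`Tag`, `Tags.FuncAlias`, `Tagged`, `AliasAware`, `NamedStructExpr`) are hypothetical stand-ins rather than Catalyst's actual API; the point is only that resolving the alias once in `prettyName` keeps `prettyName` and `sql` from disagreeing.

```scala
import scala.collection.mutable

// Hypothetical stand-ins for a tag mechanism; names are illustrative only.
final case class Tag[T](name: String)

object Tags {
  val FuncAlias: Tag[String] = Tag[String]("functionAliasName")
}

trait Tagged {
  private val tags = mutable.Map.empty[Tag[_], Any]
  def setTagValue[T](tag: Tag[T], value: T): Unit = tags(tag) = value
  def getTagValue[T](tag: Tag[T]): Option[T] =
    tags.get(tag).map(_.asInstanceOf[T])
}

// Mixin: the alias is consulted in one place, and sql is derived from prettyName.
trait AliasAware extends Tagged {
  protected def defaultName: String
  def children: Seq[String]
  def prettyName: String = getTagValue(Tags.FuncAlias).getOrElse(defaultName)
  def sql: String = s"$prettyName(${children.mkString(", ")})"
}

final class NamedStructExpr(val children: Seq[String]) extends AliasAware {
  protected val defaultName: String = "named_struct"
}
```

Any expression that mixes in `AliasAware` then picks up the alias handling without overriding `sql` itself.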








[GitHub] [spark] hvanhovell commented on a change in pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


hvanhovell commented on a change in pull request #28633:
URL: https://github.com/apache/spark/pull/28633#discussion_r429802509



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
##
@@ -312,6 +312,10 @@ case object NamePlaceholder extends LeafExpression with 
Unevaluable {
  * Returns a Row containing the evaluation of all children expressions.
  */
 object CreateStruct extends FunctionBuilder {
+  /**
+   * Returns a named struct with generating names or using the names when 
available.
+   * It should be used only for an internal purpose.

Review comment:
   You could also consider moving this into the `CreateNamedStruct` 
companion to avoid confusion. BTW in which places do we call this method and 
still want to retain the old name?








[GitHub] [spark] HyukjinKwon commented on a change in pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


HyukjinKwon commented on a change in pull request #28633:
URL: https://github.com/apache/spark/pull/28633#discussion_r429800396



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
##
@@ -320,12 +324,23 @@ object CreateStruct extends FunctionBuilder {
 })
   }
 
+  /**
+   * Returns a named struct with a pretty SQL name. It will show the pretty SQL string
+   * in its output column name as if `struct(...)` was called. Should be used for an
+   * external purpose.
+   */
+  def create(children: Seq[Expression]): CreateNamedStruct = {
+    val expr = CreateStruct(children)
+    expr.setTagValue(FUNC_ALIAS, "struct")
+    expr
+  }
+
   /**
    * Entry to use in the function registry.
    */
   val registryEntry: (String, (ExpressionInfo, FunctionBuilder)) = {
     val info: ExpressionInfo = new ExpressionInfo(
-      "org.apache.spark.sql.catalyst.expressions.NamedStruct",
+      classOf[CreateNamedStruct].getCanonicalName,

Review comment:
   Seems like it will need more changes than the current changes to reuse 
it because we should have a different description for `struct` specifically.








[GitHub] [spark] HyukjinKwon commented on a change in pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


HyukjinKwon commented on a change in pull request #28633:
URL: https://github.com/apache/spark/pull/28633#discussion_r429796543



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
##
@@ -312,6 +312,10 @@ case object NamePlaceholder extends LeafExpression with Unevaluable {
  * Returns a Row containing the evaluation of all children expressions.
  */
 object CreateStruct extends FunctionBuilder {
+  /**
+   * Returns a named struct with generating names or using the names when available.
+   * It should be used only for an internal purpose.

Review comment:
   Hm, I will rephrase it to clarify that it shouldn't be used when 
`struct` is explicitly specified by a user.
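
   To make the distinction concrete, here is a minimal sketch of the two entry points. It is a toy model and not the Catalyst classes: `NamedStructExpr`, `StructBuilder`, `setFuncAlias` and the default name `"named_struct"` are hypothetical stand-ins for `CreateNamedStruct`, `CreateStruct`, `setTagValue(FUNC_ALIAS, ...)` and the class-derived name.

```scala
// Toy model only: every name below is hypothetical, not Spark's actual API.
final case class NamedStructExpr(fieldNames: Seq[String]) {
  // Side-channel metadata, loosely analogous to a tag attached to a tree node.
  private var funcAlias: Option[String] = None

  def setFuncAlias(alias: String): Unit = funcAlias = Some(alias)

  // Falls back to the default name when no alias was attached.
  def prettyName: String = funcAlias.getOrElse("named_struct")
}

object StructBuilder {
  // Internal path: the user never wrote `struct`, so the default name is kept.
  def apply(fieldNames: Seq[String]): NamedStructExpr = NamedStructExpr(fieldNames)

  // User-facing path: the user wrote `struct(...)`, so record that alias.
  def create(fieldNames: Seq[String]): NamedStructExpr = {
    val expr = StructBuilder(fieldNames)
    expr.setFuncAlias("struct")
    expr
  }
}
```

   Under these assumptions, `StructBuilder(Seq("a")).prettyName` keeps the default name, while `StructBuilder.create(Seq("a")).prettyName` reports `struct`.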








[GitHub] [spark] attilapiros commented on a change in pull request #28606: [MINOR][YARN]False report isAllNodeBlacklisted when RM is having issue

2020-05-25 Thread GitBox


attilapiros commented on a change in pull request #28606:
URL: https://github.com/apache/spark/pull/28606#discussion_r429795946



##
File path: 
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocatorBlacklistTracker.scala
##
@@ -103,7 +103,14 @@ private[spark] class YarnAllocatorBlacklistTracker(
     refreshBlacklistedNodes()
   }
 
-  def isAllNodeBlacklisted: Boolean = currentBlacklistedYarnNodes.size >= numClusterNodes
+  def isAllNodeBlacklisted: Boolean = {
+    if (numClusterNodes <= 0) {

Review comment:
   `numClusterNodes == 0` would be better
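
   A minimal sketch of the guard with the suggested comparison. The rest of the hunk is not quoted above, so the `false` result for an empty cluster is an assumption based on the PR title, and `BlacklistView` is a hypothetical stand-in for the tracker class.

```scala
// Hypothetical stand-in for the tracker; only the guard logic is illustrated.
final class BlacklistView(numClusterNodes: Int, blacklistedNodes: Set[String]) {
  def isAllNodeBlacklisted: Boolean = {
    if (numClusterNodes == 0) {
      // Assumption: a cluster size of 0 means the RM has not reported any nodes,
      // so we must not claim that every node is blacklisted.
      false
    } else {
      blacklistedNodes.size >= numClusterNodes
    }
  }
}
```

   Without the guard, an empty blacklist compared against a node count of 0 would satisfy `size >= numClusterNodes` and report all nodes as blacklisted.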








[GitHub] [spark] AmplabJenkins commented on pull request #28553: [SPARK-31734][ML][PySpark] Add weight support in ClusteringEvaluator

2020-05-25 Thread GitBox


AmplabJenkins commented on pull request #28553:
URL: https://github.com/apache/spark/pull/28553#issuecomment-633443120










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28553: [SPARK-31734][ML][PySpark] Add weight support in ClusteringEvaluator

2020-05-25 Thread GitBox


AmplabJenkins removed a comment on pull request #28553:
URL: https://github.com/apache/spark/pull/28553#issuecomment-633443120










[GitHub] [spark] SparkQA removed a comment on pull request #28553: [SPARK-31734][ML][PySpark] Add weight support in ClusteringEvaluator

2020-05-25 Thread GitBox


SparkQA removed a comment on pull request #28553:
URL: https://github.com/apache/spark/pull/28553#issuecomment-633413387


   **[Test build #123073 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123073/testReport)**
 for PR 28553 at commit 
[`cab88cc`](https://github.com/apache/spark/commit/cab88cc940bc9fba7f66a4d385986a7f603692a8).






[GitHub] [spark] SparkQA commented on pull request #28553: [SPARK-31734][ML][PySpark] Add weight support in ClusteringEvaluator

2020-05-25 Thread GitBox


SparkQA commented on pull request #28553:
URL: https://github.com/apache/spark/pull/28553#issuecomment-633442641


   **[Test build #123073 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123073/testReport)**
 for PR 28553 at commit 
[`cab88cc`](https://github.com/apache/spark/commit/cab88cc940bc9fba7f66a4d385986a7f603692a8).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
 * `case class ClusterStats(featureSum: Vector, squaredNormSum: Double, weightSum: Double)`






[GitHub] [spark] hvanhovell commented on a change in pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


hvanhovell commented on a change in pull request #28633:
URL: https://github.com/apache/spark/pull/28633#discussion_r429792739



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
##
@@ -312,6 +312,10 @@ case object NamePlaceholder extends LeafExpression with Unevaluable {
  * Returns a Row containing the evaluation of all children expressions.
  */
 object CreateStruct extends FunctionBuilder {
+  /**
+   * Returns a named struct with generating names or using the names when available.
+   * It should be used only for an internal purpose.

Review comment:
   Everything in catalyst is private, right?








[GitHub] [spark] AmplabJenkins removed a comment on pull request #28621: [SPARK-31803][ML] Make sure instance weight is not negative

2020-05-25 Thread GitBox


AmplabJenkins removed a comment on pull request #28621:
URL: https://github.com/apache/spark/pull/28621#issuecomment-633437651










[GitHub] [spark] AmplabJenkins commented on pull request #28621: [SPARK-31803][ML] Make sure instance weight is not negative

2020-05-25 Thread GitBox


AmplabJenkins commented on pull request #28621:
URL: https://github.com/apache/spark/pull/28621#issuecomment-633437651










[GitHub] [spark] SparkQA commented on pull request #28621: [SPARK-31803][ML] Make sure instance weight is not negative

2020-05-25 Thread GitBox


SparkQA commented on pull request #28621:
URL: https://github.com/apache/spark/pull/28621#issuecomment-633437134


   **[Test build #123078 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123078/testReport)**
 for PR 28621 at commit 
[`e1b4422`](https://github.com/apache/spark/commit/e1b44228646fdffcba27684b46fdd2ec79d549ea).






[GitHub] [spark] maropu commented on a change in pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


maropu commented on a change in pull request #28633:
URL: https://github.com/apache/spark/pull/28633#discussion_r429787806



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
##
@@ -320,12 +324,23 @@ object CreateStruct extends FunctionBuilder {
     })
   }
 
+  /**
+   * Returns a named struct with a pretty SQL name. It will show the pretty SQL string
+   * in its output column name as if `struct(...)` was called. Should be used for an
+   * external purpose.
+   */
+  def create(children: Seq[Expression]): CreateNamedStruct = {
+    val expr = CreateStruct(children)
+    expr.setTagValue(FUNC_ALIAS, "struct")
+    expr
+  }
+
   /**
    * Entry to use in the function registry.
    */
   val registryEntry: (String, (ExpressionInfo, FunctionBuilder)) = {
     val info: ExpressionInfo = new ExpressionInfo(
-      "org.apache.spark.sql.catalyst.expressions.NamedStruct",
+      classOf[CreateNamedStruct].getCanonicalName,

Review comment:
   nit though: can't we reuse the `ExpressionInfo` of `CreateNamedStruct` 
via reflection here?








[GitHub] [spark] AmplabJenkins removed a comment on pull request #28634: [SPARK-31810][TEST] Fix test where spark conf was modified using incorrect api

2020-05-25 Thread GitBox


AmplabJenkins removed a comment on pull request #28634:
URL: https://github.com/apache/spark/pull/28634#issuecomment-633429669


   Can one of the admins verify this patch?






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


AmplabJenkins removed a comment on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633430366










[GitHub] [spark] AmplabJenkins commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


AmplabJenkins commented on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633430366










[GitHub] [spark] SparkQA commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


SparkQA commented on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633429823


   **[Test build #123077 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123077/testReport)**
 for PR 28633 at commit 
[`c5ffc27`](https://github.com/apache/spark/commit/c5ffc27b6064190c6e7e4cf85765abe304d637c7).






[GitHub] [spark] AmplabJenkins commented on pull request #28634: [SPARK-31810][TEST] Fix test where spark conf was modified using incorrect api

2020-05-25 Thread GitBox


AmplabJenkins commented on pull request #28634:
URL: https://github.com/apache/spark/pull/28634#issuecomment-633430139


   Can one of the admins verify this patch?






[GitHub] [spark] AmplabJenkins commented on pull request #28634: [SPARK-31810][TEST] Fix test where spark conf was modified using incorrect api

2020-05-25 Thread GitBox


AmplabJenkins commented on pull request #28634:
URL: https://github.com/apache/spark/pull/28634#issuecomment-633429669


   Can one of the admins verify this patch?






[GitHub] [spark] prakharjain09 opened a new pull request #28634: [SPARK-31810][TEST] Fix test where spark conf was modified using incorrect api

2020-05-25 Thread GitBox


prakharjain09 opened a new pull request #28634:
URL: https://github.com/apache/spark/pull/28634


   ### What changes were proposed in this pull request?
   Use the correct API in the AlterTableRecoverPartitions tests to modify the 
`RDD_PARALLEL_LISTING_THRESHOLD` conf.
   
   ### Why are the changes needed?
   The existing AlterTableRecoverPartitions tests modify 
`RDD_PARALLEL_LISTING_THRESHOLD` as a SQLConf using the `withSQLConf` API. But 
since this is not a SQLConf, it is not overridden, so the tests don't end 
up testing the required behaviour.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   This is a unit test (UT) fix. The UTs still pass after the fix.
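
   The failure mode described above can be sketched with two separate config stores. This is a simplified, hypothetical model and not Spark's test utilities: `ConfDemo`, `withSqlConf`, the key string and the default value `10` are all made up for illustration.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-ins for a core conf and a SQL conf.
object ConfDemo {
  // Made-up key and default value, used only for this illustration.
  val ThresholdKey = "demo.rdd.parallelListingThreshold"

  val coreConf: mutable.Map[String, String] = mutable.Map(ThresholdKey -> "10")
  val sqlConf: mutable.Map[String, String] = mutable.Map.empty

  // Mimics a withSQLConf-style helper: sets SQL conf entries, runs the body,
  // then restores the previous SQL conf state.
  def withSqlConf[T](pairs: (String, String)*)(body: => T): T = {
    val saved = pairs.map { case (k, _) => k -> sqlConf.get(k) }
    pairs.foreach { case (k, v) => sqlConf(k) = v }
    try body
    finally saved.foreach {
      case (k, Some(old)) => sqlConf(k) = old
      case (k, None) => sqlConf.remove(k)
    }
  }

  // The code under test reads the threshold from the core conf only.
  def listingThreshold: Int = coreConf(ThresholdKey).toInt
}
```

   In this model, `ConfDemo.withSqlConf(ConfDemo.ThresholdKey -> "0")(ConfDemo.listingThreshold)` still sees the core default, because the override lands in the store that the reader never consults. Setting the value on `coreConf` directly is what actually changes the behaviour being tested.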






[GitHub] [spark] maropu commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


maropu commented on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633426646


   retest this please






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test

2020-05-25 Thread GitBox


AmplabJenkins removed a comment on pull request #28627:
URL: https://github.com/apache/spark/pull/28627#issuecomment-633420266










[GitHub] [spark] AmplabJenkins commented on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test

2020-05-25 Thread GitBox


AmplabJenkins commented on pull request #28627:
URL: https://github.com/apache/spark/pull/28627#issuecomment-633420266










[GitHub] [spark] SparkQA commented on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test

2020-05-25 Thread GitBox


SparkQA commented on pull request #28627:
URL: https://github.com/apache/spark/pull/28627#issuecomment-633419679


   **[Test build #123076 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123076/testReport)**
 for PR 28627 at commit 
[`70d07a3`](https://github.com/apache/spark/commit/70d07a3785799e5f44e21cb9c541faaa650ad6c0).






[GitHub] [spark] sarutak commented on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test

2020-05-25 Thread GitBox


sarutak commented on pull request #28627:
URL: https://github.com/apache/spark/pull/28627#issuecomment-633417531


   retest this please.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test

2020-05-25 Thread GitBox


AmplabJenkins removed a comment on pull request #28627:
URL: https://github.com/apache/spark/pull/28627#issuecomment-633414769


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/123071/
   Test FAILed.






[GitHub] [spark] huaxingao commented on a change in pull request #28595: [SPARK-31781][ML][PySpark] Move param k (number of clusters) to shared params

2020-05-25 Thread GitBox


huaxingao commented on a change in pull request #28595:
URL: https://github.com/apache/spark/pull/28595#discussion_r429765422



##
File path: 
mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala
##
@@ -562,4 +562,20 @@ trait HasBlockSize extends Params {
   /** @group expertGetParam */
   final def getBlockSize: Int = $(blockSize)
 }
+
+/**
+ * Trait for shared param k. This trait may be changed or
+ * removed between minor versions.
+ */
+trait HasK extends Params {
+
+  /**
+   * Param for The number of clusters to create. Must be (> 1). Note that it is possible for fewer than k clusters to be returned.
+   * @group param
+   */
+  final val k: IntParam = new IntParam(this, "k", "The number of clusters to create. Must be (> 1). Note that it is possible for fewer than k clusters to be returned", ParamValidators.gt(1))

Review comment:
   Yes, we lost the ```@Since``` annotations for ```k``` and ```getK```. We 
still have since 2.4.0 for ```setK```, though. 
   None of the shared params have ```@Since``` annotations on the param or on 
getXXX; only setXXX has them.

##
File path: 
mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala
##
@@ -562,4 +562,20 @@ trait HasBlockSize extends Params {
   /** @group expertGetParam */
   final def getBlockSize: Int = $(blockSize)
 }
+
+/**
+ * Trait for shared param k. This trait may be changed or
+ * removed between minor versions.
+ */
+trait HasK extends Params {
+
+  /**
+   * Param for The number of clusters to create. Must be (> 1). Note that it is possible for fewer than k clusters to be returned.
+   * @group param
+   */
+  final val k: IntParam = new IntParam(this, "k", "The number of clusters to create. Must be (> 1). Note that it is possible for fewer than k clusters to be returned", ParamValidators.gt(1))
+
+  /** @group getParam */
+  final def getK: Int = $(k)

Review comment:
   Right, no MiMa problems if this isn't final, but we can't change it 
because it's generated code.
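
   For readers unfamiliar with the shared-param pattern, a rough self-contained sketch follows. It does not use Spark ML: `IntParamSketch`, `HasKSketch` and `KMeansSketch` are hypothetical names, and placing `setK` in the trait is a simplification for brevity.

```scala
// Toy model of a shared param trait; all names are hypothetical, not Spark ML's API.
final case class IntParamSketch(name: String, doc: String, isValid: Int => Boolean)

trait HasKSketch {
  // Validator mirrors the "must be > 1" constraint quoted in the diff above.
  final val k: IntParamSketch =
    IntParamSketch("k", "The number of clusters to create. Must be (> 1).", _ > 1)

  private var kValue: Option[Int] = None

  // Simplification: the setter is kept in the trait so the sketch stays short.
  def setK(value: Int): this.type = {
    require(k.isValid(value), s"${k.name} must be > 1 but got $value")
    kValue = Some(value)
    this
  }

  // Marked final, like the generated getter discussed above.
  final def getK: Int =
    kValue.getOrElse(throw new NoSuchElementException("k is not set"))
}

final class KMeansSketch extends HasKSketch
```

   With this model, `new KMeansSketch().setK(3).getK` returns the stored value, and `setK(1)` is rejected by the validator.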








[GitHub] [spark] AmplabJenkins removed a comment on pull request #28553: [SPARK-31734][ML][PySpark] Add weight support in ClusteringEvaluator

2020-05-25 Thread GitBox


AmplabJenkins removed a comment on pull request #28553:
URL: https://github.com/apache/spark/pull/28553#issuecomment-633414094










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28630: [SPARK-31806][SQL][TESTS] Check reading date/timestamp from legacy parquet: dictionary encoding, w/o Spark version

2020-05-25 Thread GitBox


AmplabJenkins removed a comment on pull request #28630:
URL: https://github.com/apache/spark/pull/28630#issuecomment-633414056










[GitHub] [spark] SparkQA removed a comment on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


SparkQA removed a comment on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633407168


   **[Test build #123072 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123072/testReport)**
 for PR 28633 at commit 
[`59acc36`](https://github.com/apache/spark/commit/59acc368427fb0462c7b1fb9efa7e77cb3f0b9de).






[GitHub] [spark] SparkQA commented on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test

2020-05-25 Thread GitBox


SparkQA commented on pull request #28627:
URL: https://github.com/apache/spark/pull/28627#issuecomment-633414635


   **[Test build #123071 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123071/testReport)**
 for PR 28627 at commit 
[`70d07a3`](https://github.com/apache/spark/commit/70d07a3785799e5f44e21cb9c541faaa650ad6c0).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test

2020-05-25 Thread GitBox


AmplabJenkins removed a comment on pull request #28627:
URL: https://github.com/apache/spark/pull/28627#issuecomment-633414764


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test

2020-05-25 Thread GitBox


AmplabJenkins commented on pull request #28627:
URL: https://github.com/apache/spark/pull/28627#issuecomment-633414764










[GitHub] [spark] SparkQA commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


SparkQA commented on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633414636


   **[Test build #123072 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123072/testReport)**
 for PR 28633 at commit 
[`59acc36`](https://github.com/apache/spark/commit/59acc368427fb0462c7b1fb9efa7e77cb3f0b9de).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


AmplabJenkins commented on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633414667










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty

2020-05-25 Thread GitBox


AmplabJenkins removed a comment on pull request #28633:
URL: https://github.com/apache/spark/pull/28633#issuecomment-633414667


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test

2020-05-25 Thread GitBox


SparkQA removed a comment on pull request #28627:
URL: https://github.com/apache/spark/pull/28627#issuecomment-633357514


   **[Test build #123071 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123071/testReport)**
 for PR 28627 at commit 
[`70d07a3`](https://github.com/apache/spark/commit/70d07a3785799e5f44e21cb9c541faaa650ad6c0).





