[GitHub] [spark] AmplabJenkins commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
AmplabJenkins commented on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633577959 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
AmplabJenkins removed a comment on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633577959
[GitHub] [spark] SparkQA commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
SparkQA commented on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633577087

**[Test build #123080 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123080/testReport)** for PR 28633 at commit [`7eeea97`](https://github.com/apache/spark/commit/7eeea97ef6dba0638652ec94a07a792d08ade8f4).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
SparkQA removed a comment on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633461749 **[Test build #123080 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123080/testReport)** for PR 28633 at commit [`7eeea97`](https://github.com/apache/spark/commit/7eeea97ef6dba0638652ec94a07a792d08ade8f4).
[GitHub] [spark] AmplabJenkins commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
AmplabJenkins commented on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633575285
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
AmplabJenkins removed a comment on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633575285
[GitHub] [spark] SparkQA commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
SparkQA commented on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633574333

**[Test build #123079 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123079/testReport)** for PR 28633 at commit [`bc6a2c0`](https://github.com/apache/spark/commit/bc6a2c0afe3eeb20c1b60849a9d986c658e08c35).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
SparkQA removed a comment on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633454615 **[Test build #123079 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123079/testReport)** for PR 28633 at commit [`bc6a2c0`](https://github.com/apache/spark/commit/bc6a2c0afe3eeb20c1b60849a9d986c658e08c35).
[GitHub] [spark] maropu commented on a change in pull request #28540: [SPARK-31719][SQL] Refactor JoinSelection
maropu commented on a change in pull request #28540: URL: https://github.com/apache/spark/pull/28540#discussion_r429928063

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala ##

@@ -208,3 +208,161 @@ object ExtractPythonUDFFromJoinCondition extends Rule[LogicalPlan] with Predicat
     }
   }
 }
+
+sealed abstract class BuildSide
+
+case object BuildRight extends BuildSide
+
+case object BuildLeft extends BuildSide
+
+trait JoinSelectionHelper {
+
+  def getBroadcastBuildSide(
+      left: LogicalPlan,
+      right: LogicalPlan,
+      joinType: JoinType,
+      hint: JoinHint,
+      onlyLookingAtHint: Boolean,

Review comment:
Ah, `hintOnly` looks nice to me.
[GitHub] [spark] cloud-fan commented on a change in pull request #28540: [SPARK-31719][SQL] Refactor JoinSelection
cloud-fan commented on a change in pull request #28540: URL: https://github.com/apache/spark/pull/28540#discussion_r429927305

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala ##

@@ -208,3 +208,161 @@ object ExtractPythonUDFFromJoinCondition extends Rule[LogicalPlan] with Predicat
     }
   }
 }
+
+sealed abstract class BuildSide
+
+case object BuildRight extends BuildSide
+
+case object BuildLeft extends BuildSide
+
+trait JoinSelectionHelper {
+
+  def getBroadcastBuildSide(
+      left: LogicalPlan,
+      right: LogicalPlan,
+      joinType: JoinType,
+      hint: JoinHint,
+      onlyLookingAtHint: Boolean,

Review comment:
If we want to be super clear, maybe "pickJoinSideByHintOnly"? Or simply "hintOnly".
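For readers following the naming discussion, the flag under review decides whether the build side is picked from the join hint alone. The sketch below illustrates that idea only. `Hint`, `pickBuildSide`, and the size threshold are hypothetical stand-ins and do not reproduce Spark's `JoinHint` or `getBroadcastBuildSide` logic.

```scala
object BuildSideSketch {
  // Same shape as the ADT quoted in the diff above.
  sealed abstract class BuildSide
  case object BuildRight extends BuildSide
  case object BuildLeft extends BuildSide

  // Hypothetical, simplified stand-in for a join hint.
  final case class Hint(broadcastLeft: Boolean, broadcastRight: Boolean)

  // Hypothetical helper: with hintOnly = true only the hint is consulted,
  // otherwise only the (assumed) size threshold is consulted.
  def pickBuildSide(
      leftSize: Long,
      rightSize: Long,
      hint: Hint,
      hintOnly: Boolean,
      threshold: Long = 10L * 1024 * 1024): Option[BuildSide] = {
    val canLeft = if (hintOnly) hint.broadcastLeft else leftSize <= threshold
    val canRight = if (hintOnly) hint.broadcastRight else rightSize <= threshold
    (canLeft, canRight) match {
      // Both sides qualify: prefer the smaller one.
      case (true, true) => Some(if (rightSize <= leftSize) BuildRight else BuildLeft)
      case (false, true) => Some(BuildRight)
      case (true, false) => Some(BuildLeft)
      case _ => None
    }
  }
}
```

A short name such as `hintOnly` keeps call sites readable because the parameter is passed as a bare boolean.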
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28636: [SPARK-31818][SQL][test-hive1.2] Fix pushing down filters with `java.time.Instant` values in ORC
AmplabJenkins removed a comment on pull request #28636: URL: https://github.com/apache/spark/pull/28636#issuecomment-633556652
[GitHub] [spark] SparkQA commented on pull request #28636: [SPARK-31818][SQL][test-hive1.2] Fix pushing down filters with `java.time.Instant` values in ORC
SparkQA commented on pull request #28636: URL: https://github.com/apache/spark/pull/28636#issuecomment-633559538 **[Test build #123084 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123084/testReport)** for PR 28636 at commit [`5290d96`](https://github.com/apache/spark/commit/5290d9642cd5d25bf26c0087753d053194f55de1).
[GitHub] [spark] AmplabJenkins commented on pull request #28636: [SPARK-31818][SQL][test-hive1.2] Fix pushing down filters with `java.time.Instant` values in ORC
AmplabJenkins commented on pull request #28636: URL: https://github.com/apache/spark/pull/28636#issuecomment-633556652
[GitHub] [spark] MaxGekk commented on pull request #28636: [SPARK-31818][SQL][test-hive1.2] Fix pushing down filters with `java.time.Instant` values in ORC
MaxGekk commented on pull request #28636: URL: https://github.com/apache/spark/pull/28636#issuecomment-633556551 @cloud-fan @HyukjinKwon @dongjoon-hyun Please, review this PR. It is similar to https://github.com/apache/spark/pull/28261 but for timestamps.
[GitHub] [spark] MaxGekk opened a new pull request #28636: [SPARK-31818][SQL][test-hive1.2] Fix pushing down filters with `java.time.Instant` values in ORC
MaxGekk opened a new pull request #28636: URL: https://github.com/apache/spark/pull/28636

### What changes were proposed in this pull request?
Convert `java.time.Instant` to `java.sql.Timestamp` in filters pushed down to the ORC datasource when the Java 8 time API is enabled.

### Why are the changes needed?
The changes fix the exception raised while pushing down timestamp filters when `spark.sql.datetime.java8API.enabled` is set to `true`:
```
java.lang.IllegalArgumentException: Wrong value class java.time.Instant for TIMESTAMP.EQUALS leaf
	at org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$PredicateLeafImpl.checkLiteralType(SearchArgumentImpl.java:192)
	at org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$PredicateLeafImpl.<init>(SearchArgumentImpl.java:75)
```

### Does this PR introduce any user-facing change?
Yes

### How was this patch tested?
Added tests to `OrcFilterSuite`.
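The kind of conversion described in this PR can be sketched with the JDK alone. The object and method names below (`OrcLiteralSketch`, `toOrcLiteral`) are hypothetical and are not the code in the patch, which may use Spark's own date-time utilities instead. The sketch only shows the idea of normalizing a filter literal before it reaches an API that accepts `java.sql.Timestamp` but rejects `java.time.Instant`.

```scala
import java.sql.Timestamp
import java.time.Instant

object OrcLiteralSketch {
  // Hypothetical helper, not Spark's actual implementation:
  // replace an Instant literal with an equivalent Timestamp and
  // pass every other value through unchanged.
  def toOrcLiteral(value: Any): Any = value match {
    case i: Instant => Timestamp.from(i) // JDK conversion, keeps nanoseconds
    case other => other
  }
}
```

With such a normalization step in place, a predicate built from an `Instant` value would carry a `Timestamp` literal, which is the class the type check in the stack trace above expects for a TIMESTAMP leaf.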
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
AmplabJenkins removed a comment on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633549317
[GitHub] [spark] AmplabJenkins commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
AmplabJenkins commented on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633549317
[GitHub] [spark] SparkQA removed a comment on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
SparkQA removed a comment on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633429823 **[Test build #123077 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123077/testReport)** for PR 28633 at commit [`c5ffc27`](https://github.com/apache/spark/commit/c5ffc27b6064190c6e7e4cf85765abe304d637c7).
[GitHub] [spark] SparkQA commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
SparkQA commented on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633548522

**[Test build #123077 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123077/testReport)** for PR 28633 at commit [`c5ffc27`](https://github.com/apache/spark/commit/c5ffc27b6064190c6e7e4cf85765abe304d637c7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28551: [SPARK-31393][SQL][FOLLOW-UP] Show the correct alias in schema for expression
AmplabJenkins removed a comment on pull request #28551: URL: https://github.com/apache/spark/pull/28551#issuecomment-633543877
[GitHub] [spark] AmplabJenkins commented on pull request #28551: [SPARK-31393][SQL][FOLLOW-UP] Show the correct alias in schema for expression
AmplabJenkins commented on pull request #28551: URL: https://github.com/apache/spark/pull/28551#issuecomment-633543877
[GitHub] [spark] SparkQA commented on pull request #28551: [SPARK-31393][SQL][FOLLOW-UP] Show the correct alias in schema for expression
SparkQA commented on pull request #28551: URL: https://github.com/apache/spark/pull/28551#issuecomment-633543378 **[Test build #123083 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123083/testReport)** for PR 28551 at commit [`53cdbb7`](https://github.com/apache/spark/commit/53cdbb76f81d250bebb8e4b6ed43afa6db25b8c0).
[GitHub] [spark] gaborgsomogyi commented on pull request #28635: [SPARK-31337][SQL]Support MS SQL Kerberos login in JDBC connector
gaborgsomogyi commented on pull request #28635: URL: https://github.com/apache/spark/pull/28635#issuecomment-633543073 cc @HeartSaVioR
[GitHub] [spark] gaborgsomogyi edited a comment on pull request #28635: [SPARK-31337][SQL]Support MS SQL Kerberos login in JDBC connector
gaborgsomogyi edited a comment on pull request #28635: URL: https://github.com/apache/spark/pull/28635#issuecomment-633511928

Apart from manual testing, I tried to add a Docker integration test (which failed) and tried out the following:
* Set up an Active Directory Docker image => not yet supported, please see [this](https://github.com/microsoft/mssql-docker/issues/165) link for details.
* Use Hadoop `MiniKdc` => it simply never authenticated; furthermore, MS never claimed that it works.

If a working Docker image with an Active Directory instance becomes available, we can try again. In the meantime, if somebody has an idea how to overcome this, feel free to share it.
[GitHub] [spark] HyukjinKwon commented on pull request #28551: [SPARK-31393][SQL][FOLLOW-UP] Show the correct alias in schema for expression
HyukjinKwon commented on pull request #28551: URL: https://github.com/apache/spark/pull/28551#issuecomment-633541832
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
AmplabJenkins removed a comment on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633536412
[GitHub] [spark] AmplabJenkins commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
AmplabJenkins commented on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633536412
[GitHub] [spark] SparkQA removed a comment on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
SparkQA removed a comment on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633413383 **[Test build #123074 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123074/testReport)** for PR 28633 at commit [`c5ffc27`](https://github.com/apache/spark/commit/c5ffc27b6064190c6e7e4cf85765abe304d637c7).
[GitHub] [spark] gaborgsomogyi commented on a change in pull request #28635: [SPARK-31337][SQL]Support MS SQL Kerberos login in JDBC connector
gaborgsomogyi commented on a change in pull request #28635: URL: https://github.com/apache/spark/pull/28635#discussion_r429886742

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/MSSQLConnectionProvider.scala ##

@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources.jdbc.connection
+
+import java.security.PrivilegedExceptionAction
+import java.sql.{Connection, Driver}
+import java.util.Properties
+
+import org.apache.hadoop.security.UserGroupInformation
+
+import org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions
+
+private[sql] class MSSQLConnectionProvider(
+    driver: Driver,
+    options: JDBCOptions,
+    parserMethod: String = "parseAndMergeProperties"
+  ) extends SecureConnectionProvider(driver, options) {
+  override val appEntry: String = {
+    val configName = "jaasConfigurationName"
+    val appEntryDefault = "SQLJDBCDriver"
+
+    val parseURL = try {

Review comment:
There are basically 2 approaches to parse the URL to get `jaasConfigurationName`:
* Try to call the private [parseAndMergeProperties](https://github.com/microsoft/mssql-jdbc/blob/0d4e97f401dc0e55779460d9709dd7ee399246be/src/main/java/com/microsoft/sqlserver/jdbc/SQLServerDriver.java#L831-L854) => tried first
* Parse the URL manually => used as fallback

Both ways have been tested in `MSSQLConnectionProviderSuite`.

## File path: external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/MsSqlServerIntegrationSuite.scala ##

@@ -27,7 +27,7 @@ import org.apache.spark.tags.DockerTest
 @DockerTest
 class MsSqlServerIntegrationSuite extends DockerJDBCIntegrationSuite {
   override val db = new DatabaseOnDocker {
-    override val imageName = "mcr.microsoft.com/mssql/server:2017-GA-ubuntu"
+    override val imageName = "mcr.microsoft.com/mssql/server:2019-GA-ubuntu-16.04"

Review comment:
This is not strictly necessary; if we prefer, we can extract it into a new PR. I thought that would be overkill.

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/MSSQLConnectionProvider.scala ##

@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources.jdbc.connection
+
+import java.security.PrivilegedExceptionAction
+import java.sql.{Connection, Driver}
+import java.util.Properties
+
+import org.apache.hadoop.security.UserGroupInformation
+
+import org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions
+
+private[sql] class MSSQLConnectionProvider(

Review comment:
The implementation is based on [this](https://docs.microsoft.com/en-us/sql/connect/jdbc/using-kerberos-integrated-authentication-to-connect-to-sql-server?view=sql-server-ver15).

## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/MSSQLConnectionProviderSuite.scala ##

@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a
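The manual fallback mentioned in the first review comment (parsing the URL to find `jaasConfigurationName`) might look roughly like the sketch below. This is an illustration, not the code in the PR. It assumes the `jdbc:sqlserver://host:port;key=value;...` URL shape, matches the property name exactly, and does not handle brace-escaped values. `JaasNameSketch` and `appEntryFromUrl` are hypothetical names, and only the property name and the `SQLJDBCDriver` default are taken from the quoted diff.

```scala
object JaasNameSketch {
  // Both values appear in the diff quoted in the review comment.
  val ConfigName = "jaasConfigurationName"
  val AppEntryDefault = "SQLJDBCDriver"

  // Hypothetical helper: scan the ';'-separated properties that follow
  // the server part of the URL and return the configured JAAS entry name,
  // falling back to the default when the property is absent or empty.
  def appEntryFromUrl(url: String): String =
    url.split(';')
      .drop(1) // skip "jdbc:sqlserver://host:port"
      .map(_.split("=", 2).map(_.trim))
      .collectFirst { case Array(key, value) if key == ConfigName && value.nonEmpty => value }
      .getOrElse(AppEntryDefault)
}
```

In this sketch the default is returned whenever the property is missing, which mirrors the role of `appEntryDefault` in the diff.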
[GitHub] [spark] SparkQA commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
SparkQA commented on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633535658

**[Test build #123074 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123074/testReport)** for PR 28633 at commit [`c5ffc27`](https://github.com/apache/spark/commit/c5ffc27b6064190c6e7e4cf85765abe304d637c7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28630: [SPARK-31806][SQL][TESTS] Check reading date/timestamp from legacy parquet: dictionary encoding, w/o Spark version
AmplabJenkins removed a comment on pull request #28630: URL: https://github.com/apache/spark/pull/28630#issuecomment-633530671
[GitHub] [spark] AmplabJenkins commented on pull request #28630: [SPARK-31806][SQL][TESTS] Check reading date/timestamp from legacy parquet: dictionary encoding, w/o Spark version
AmplabJenkins commented on pull request #28630: URL: https://github.com/apache/spark/pull/28630#issuecomment-633530671
[GitHub] [spark] SparkQA removed a comment on pull request #28630: [SPARK-31806][SQL][TESTS] Check reading date/timestamp from legacy parquet: dictionary encoding, w/o Spark version
SparkQA removed a comment on pull request #28630: URL: https://github.com/apache/spark/pull/28630#issuecomment-633413390 **[Test build #123075 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123075/testReport)** for PR 28630 at commit [`0add1a2`](https://github.com/apache/spark/commit/0add1a22dfff851bae71d9330f984ace41e5663f).
[GitHub] [spark] SparkQA commented on pull request #28630: [SPARK-31806][SQL][TESTS] Check reading date/timestamp from legacy parquet: dictionary encoding, w/o Spark version
SparkQA commented on pull request #28630: URL: https://github.com/apache/spark/pull/28630#issuecomment-633529870

**[Test build #123075 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123075/testReport)** for PR 28630 at commit [`0add1a2`](https://github.com/apache/spark/commit/0add1a22dfff851bae71d9330f984ace41e5663f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #28635: [SPARK-31337][SQL]Support MS SQL Kerberos login in JDBC connector
SparkQA commented on pull request #28635: URL: https://github.com/apache/spark/pull/28635#issuecomment-633525575 **[Test build #123082 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123082/testReport)** for PR 28635 at commit [`88306f0`](https://github.com/apache/spark/commit/88306f00cf6aea7636f432e3c9a04c0f44137770).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28635: [SPARK-31337][SQL]Support MS SQL Kerberos login in JDBC connector
AmplabJenkins removed a comment on pull request #28635: URL: https://github.com/apache/spark/pull/28635#issuecomment-633512198 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/123081/ Test FAILed.
[GitHub] [spark] SparkQA commented on pull request #28635: [SPARK-31337][SQL]Support MS SQL Kerberos login in JDBC connector
SparkQA commented on pull request #28635: URL: https://github.com/apache/spark/pull/28635#issuecomment-633512184 **[Test build #123081 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123081/testReport)** for PR 28635 at commit [`3c30c4a`](https://github.com/apache/spark/commit/3c30c4aae5ea5c210967ae30c52a68544db2d048). * This patch **fails build dependency tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #28635: [SPARK-31337][SQL]Support MS SQL Kerberos login in JDBC connector
AmplabJenkins commented on pull request #28635: URL: https://github.com/apache/spark/pull/28635#issuecomment-633512192
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28635: [SPARK-31337][SQL]Support MS SQL Kerberos login in JDBC connector
AmplabJenkins removed a comment on pull request #28635: URL: https://github.com/apache/spark/pull/28635#issuecomment-633511059
[GitHub] [spark] SparkQA removed a comment on pull request #28635: [SPARK-31337][SQL]Support MS SQL Kerberos login in JDBC connector
SparkQA removed a comment on pull request #28635: URL: https://github.com/apache/spark/pull/28635#issuecomment-633510744 **[Test build #123081 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123081/testReport)** for PR 28635 at commit [`3c30c4a`](https://github.com/apache/spark/commit/3c30c4aae5ea5c210967ae30c52a68544db2d048).
[GitHub] [spark] gaborgsomogyi commented on pull request #28635: [SPARK-31337][SQL]Support MS SQL Kerberos login in JDBC connector
gaborgsomogyi commented on pull request #28635: URL: https://github.com/apache/spark/pull/28635#issuecomment-633511928

Apart from manual testing, I tried to add a Docker integration test (which failed) and tried out the following:
* Set up an Active Directory Docker container => not yet supported; please see [this](https://github.com/microsoft/mssql-docker/issues/165) issue for details.
* Use Hadoop `MiniKdc` => it simply never authenticated; furthermore, MS never claimed that it works.

Once there is a working Docker image with an Active Directory instance, we can try it again. In the meantime, if somebody has an idea how to overcome this, feel free to share it.
[GitHub] [spark] AmplabJenkins commented on pull request #28635: [SPARK-31337][SQL]Support MS SQL Kerberos login in JDBC connector
AmplabJenkins commented on pull request #28635: URL: https://github.com/apache/spark/pull/28635#issuecomment-633511059 Can one of the admins verify this patch?
[GitHub] [spark] SparkQA commented on pull request #28635: [SPARK-31337][SQL]Support MS SQL Kerberos login in JDBC connector
SparkQA commented on pull request #28635: URL: https://github.com/apache/spark/pull/28635#issuecomment-633510744 **[Test build #123081 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123081/testReport)** for PR 28635 at commit [`3c30c4a`](https://github.com/apache/spark/commit/3c30c4aae5ea5c210967ae30c52a68544db2d048).
[GitHub] [spark] gaborgsomogyi opened a new pull request #28635: [SPARK-31337][SQL]Support MS SQL Kerberos login in JDBC connector
gaborgsomogyi opened a new pull request #28635: URL: https://github.com/apache/spark/pull/28635

### What changes were proposed in this pull request?
When loading DataFrames from a JDBC data source with Kerberos authentication, remote executors (yarn-client/cluster etc. modes) fail to establish a connection because they lack a Kerberos ticket or the ability to generate one. This is a real issue when trying to ingest data from kerberized data sources (SQL Server, Oracle) in enterprise environments where exposing simple authentication access is not an option due to IT policy.

In this PR I've added MS SQL support.

What this PR contains:
* Added `MSSQLConnectionProvider`
* Added `MSSQLConnectionProviderSuite`
* Changed the MS SQL JDBC driver to use the latest version (test scope only)
* Changed the `MsSqlServerIntegrationSuite` docker image to use the latest version
* Added a version comment to `MariaDBConnectionProvider` to increase trackability

### Why are the changes needed?
Missing JDBC Kerberos support.

### Does this PR introduce _any_ user-facing change?
Yes, users are now able to connect to MS SQL using Kerberos.

### How was this patch tested?
* Additional + existing unit tests
* Existing integration tests
* Manual test on a cluster
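To make the user-facing change above more concrete, here is a minimal sketch of the kind of options a user might pass to the JDBC data source for a kerberized SQL Server. It is not taken from the PR: the host, database, table, keytab path and principal are placeholders, and it assumes a Spark version whose JDBC source accepts the `keytab` and `principal` options and a Microsoft JDBC driver that honours `integratedSecurity=true;authenticationScheme=JavaKerberos` in the URL.

```scala
object MsSqlKerberosOptionsSketch {
  // Every value below is a placeholder chosen for illustration only.
  val options: Map[String, String] = Map(
    // Assumed driver URL properties for Kerberos-based integrated authentication.
    "url" -> ("jdbc:sqlserver://sqlserver.example.com:1433;databaseName=testdb;" +
      "integratedSecurity=true;authenticationScheme=JavaKerberos"),
    "dbtable" -> "dbo.events",
    // Assumed Spark JDBC options that let executors log in on their own.
    "keytab" -> "/path/to/user.keytab",
    "principal" -> "user@EXAMPLE.COM"
  )

  // With a SparkSession named `spark` in scope, the map would be used as:
  //   spark.read.format("jdbc").options(options).load()
}
```

The keytab and principal travel with the read options so that each executor could obtain its own ticket instead of relying on one that exists only on the driver.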
[GitHub] [spark] beliefer commented on pull request #28551: [SPARK-31393][SQL][FOLLOW-UP] Show the correct alias in schema for expression
beliefer commented on pull request #28551: URL: https://github.com/apache/spark/pull/28551#issuecomment-633493077 cc @HyukjinKwon
[GitHub] [spark] prakharjain09 edited a comment on pull request #28634: [SPARK-31810][TEST] Fix AlterTableRecoverPartitions test using incorrect api to modify RDD_PARALLEL_LISTING_THRESHOLD
prakharjain09 edited a comment on pull request #28634: URL: https://github.com/apache/spark/pull/28634#issuecomment-633481802 cc - @srowen @cloud-fan @Dooyoung-Hwang
[GitHub] [spark] prakharjain09 commented on pull request #28634: [SPARK-31810][TEST] Fix AlterTableRecoverPartitions test using incorrect api to modify RDD_PARALLEL_LISTING_THRESHOLD
prakharjain09 commented on pull request #28634: URL: https://github.com/apache/spark/pull/28634#issuecomment-633481802 cc - @srowen @cloud-fan
[GitHub] [spark] AmplabJenkins commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
AmplabJenkins commented on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633462469
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
AmplabJenkins removed a comment on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633462469
[GitHub] [spark] SparkQA commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
SparkQA commented on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633461749 **[Test build #123080 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123080/testReport)** for PR 28633 at commit [`7eeea97`](https://github.com/apache/spark/commit/7eeea97ef6dba0638652ec94a07a792d08ade8f4).
[GitHub] [spark] maropu commented on a change in pull request #28626: [SPARK-28481][SQL] More expressions should extend NullIntolerant
maropu commented on a change in pull request #28626: URL: https://github.com/apache/spark/pull/28626#discussion_r429811184

## File path: sql/core/src/test/scala/org/apache/spark/sql/expressions/ExpressionInfoSuite.scala
@@ -156,4 +158,74 @@ class ExpressionInfoSuite extends SparkFunSuite with SharedSparkSession {
       }
     }
   }
+
+  test("Check whether should extend NullIntolerant") {
+    // Only check expressions extended from these expressions
+    val parentExpressionNames = Seq(classOf[UnaryExpression], classOf[BinaryExpression],
+      classOf[TernaryExpression], classOf[QuaternaryExpression],
+      classOf[SeptenaryExpression]).map(_.getName)

Review comment: Ah, I got it. Okay, and please leave some comments about that, too.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
AmplabJenkins removed a comment on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633455287
[GitHub] [spark] AmplabJenkins commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
AmplabJenkins commented on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633455287
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28621: [SPARK-31803][ML] Make sure instance weight is not negative
AmplabJenkins removed a comment on pull request #28621: URL: https://github.com/apache/spark/pull/28621#issuecomment-633454104 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/123078/ Test FAILed.
[GitHub] [spark] SparkQA commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
SparkQA commented on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633454615 **[Test build #123079 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123079/testReport)** for PR 28633 at commit [`bc6a2c0`](https://github.com/apache/spark/commit/bc6a2c0afe3eeb20c1b60849a9d986c658e08c35).
[GitHub] [spark] SparkQA removed a comment on pull request #28621: [SPARK-31803][ML] Make sure instance weight is not negative
SparkQA removed a comment on pull request #28621: URL: https://github.com/apache/spark/pull/28621#issuecomment-633437134 **[Test build #123078 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123078/testReport)** for PR 28621 at commit [`e1b4422`](https://github.com/apache/spark/commit/e1b44228646fdffcba27684b46fdd2ec79d549ea).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28621: [SPARK-31803][ML] Make sure instance weight is not negative
AmplabJenkins removed a comment on pull request #28621: URL: https://github.com/apache/spark/pull/28621#issuecomment-633454093 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on pull request #28621: [SPARK-31803][ML] Make sure instance weight is not negative
AmplabJenkins commented on pull request #28621: URL: https://github.com/apache/spark/pull/28621#issuecomment-633454093
[GitHub] [spark] SparkQA commented on pull request #28621: [SPARK-31803][ML] Make sure instance weight is not negative
SparkQA commented on pull request #28621: URL: https://github.com/apache/spark/pull/28621#issuecomment-633453976 **[Test build #123078 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123078/testReport)** for PR 28621 at commit [`e1b4422`](https://github.com/apache/spark/commit/e1b44228646fdffcba27684b46fdd2ec79d549ea). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] HyukjinKwon commented on a change in pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
HyukjinKwon commented on a change in pull request #28633: URL: https://github.com/apache/spark/pull/28633#discussion_r429808596

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
@@ -312,6 +312,10 @@ case object NamePlaceholder extends LeafExpression with Unevaluable {
  * Returns a Row containing the evaluation of all children expressions.
  */
 object CreateStruct extends FunctionBuilder {
+  /**
+   * Returns a named struct with generating names or using the names when available.
+   * It should be used only for an internal purpose.

Review comment: I pushed some changes but let me know if you prefer this way. I don't mind changing it.
[GitHub] [spark] HyukjinKwon commented on a change in pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
HyukjinKwon commented on a change in pull request #28633: URL: https://github.com/apache/spark/pull/28633#discussion_r429805322

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
@@ -434,6 +449,11 @@ case class CreateNamedStruct(children: Seq[Expression]) extends Expression {
   }

   override def prettyName: String = "named_struct"
+
+  override def sql: String = getTagValue(FUNC_ALIAS).map { alias =>

Review comment: Yup, it would be better to do that.
[GitHub] [spark] HyukjinKwon commented on a change in pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
HyukjinKwon commented on a change in pull request #28633: URL: https://github.com/apache/spark/pull/28633#discussion_r429805059

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
@@ -312,6 +312,10 @@ case object NamePlaceholder extends LeafExpression with Unevaluable {
  * Returns a Row containing the evaluation of all children expressions.
  */
 object CreateStruct extends FunctionBuilder {
+  /**
+   * Returns a named struct with generating names or using the names when available.
+   * It should be used only for an internal purpose.

Review comment: There are some cases where `CreateNamedStruct` is inserted, e.g. https://github.com/apache/spark/pull/28633#discussion_r429756482. I was thinking that it's fine to treat those cases as just using `named_struct` internally. Changing them would also produce a large diff in the SQL tests and elsewhere, so I thought it better to minimise the diff.

Actually, moving this to `CreateNamedStruct` was the first approach I tried. One problem is that the `CreateNamedStruct` case class already has an `apply` with the same signature, so the `CreateNamedStruct` companion object can't define another `apply` with that signature. I could consistently have `CreateNamedStruct.create` and `CreateStruct.create` if you prefer it that way.
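The `apply` clash mentioned in the comment above can be sketched in miniature. This is a hypothetical, self-contained model and not the Spark code: `Expr` and `NamedStruct` are stand-ins for `Expression` and `CreateNamedStruct`, and the generated `colN` names only mimic the naming idea. Depending on the Scala version, a second `apply(children: Seq[Expr])` in the companion may be rejected as a duplicate or may replace the compiler-generated one, so a separately named factory avoids the ambiguity either way.

```scala
object CompanionFactorySketch {
  // Stand-in for a Catalyst expression; only carries its SQL text.
  final case class Expr(sql: String)

  // The case class already gets a generated apply(children: Seq[Expr]),
  // which here expects alternating name and value expressions.
  final case class NamedStruct(children: Seq[Expr])

  object NamedStruct {
    // A factory that takes only the values has the same parameter list as the
    // generated apply, so it is given a different name instead.
    def create(values: Seq[Expr]): NamedStruct = {
      val withNames = values.zipWithIndex.flatMap { case (value, index) =>
        Seq(Expr(s"'col${index + 1}'"), value)
      }
      NamedStruct(withNames)
    }
  }
}
```

Calling `NamedStruct.create(Seq(Expr("a"), Expr("b")))` builds a struct whose children alternate generated names and the given values, while `NamedStruct(...)` keeps its original meaning.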
[GitHub] [spark] hvanhovell commented on a change in pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
hvanhovell commented on a change in pull request #28633: URL: https://github.com/apache/spark/pull/28633#discussion_r429802771

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
@@ -434,6 +449,11 @@ case class CreateNamedStruct(children: Seq[Expression]) extends Expression {
   }

   override def prettyName: String = "named_struct"
+
+  override def sql: String = getTagValue(FUNC_ALIAS).map { alias =>

Review comment: Shouldn't you override prettyName as well? In general this seems to have been solved with either a mixin or at the Expression level.
[GitHub] [spark] hvanhovell commented on a change in pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
hvanhovell commented on a change in pull request #28633: URL: https://github.com/apache/spark/pull/28633#discussion_r429802509

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
@@ -312,6 +312,10 @@ case object NamePlaceholder extends LeafExpression with Unevaluable {
  * Returns a Row containing the evaluation of all children expressions.
  */
 object CreateStruct extends FunctionBuilder {
+  /**
+   * Returns a named struct with generating names or using the names when available.
+   * It should be used only for an internal purpose.

Review comment: You could also consider moving this into the `CreateNamedStruct` companion to avoid confusion. BTW in which places do we call this method and still want to retain the old name?
[GitHub] [spark] HyukjinKwon commented on a change in pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
HyukjinKwon commented on a change in pull request #28633: URL: https://github.com/apache/spark/pull/28633#discussion_r429800396

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
@@ -320,12 +324,23 @@ object CreateStruct extends FunctionBuilder {
     })
   }

+  /**
+   * Returns a named struct with a pretty SQL name. It will show the pretty SQL string
+   * in its output column name as if `struct(...)` was called. Should be used for an
+   * external purpose.
+   */
+  def create(children: Seq[Expression]): CreateNamedStruct = {
+    val expr = CreateStruct(children)
+    expr.setTagValue(FUNC_ALIAS, "struct")
+    expr
+  }
+
   /**
    * Entry to use in the function registry.
    */
   val registryEntry: (String, (ExpressionInfo, FunctionBuilder)) = {
     val info: ExpressionInfo = new ExpressionInfo(
-      "org.apache.spark.sql.catalyst.expressions.NamedStruct",
+      classOf[CreateNamedStruct].getCanonicalName,

Review comment: It seems that reusing it would need more changes than the current ones, because we should have a separate description specifically for `struct`.
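The tagging approach in the hunk above (set an alias tag in `create`, consult it when rendering SQL) can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `NamedStruct` here is a hypothetical class with a plain string-keyed tag map rather than Catalyst's tree-node tags, `FuncAlias` is an invented key, and the rendering is simplified to swapping the function name, which may differ from what the PR's `sql` override actually produces.

```scala
import scala.collection.mutable

object FuncAliasSketch {
  // Hypothetical tag key standing in for FUNC_ALIAS.
  val FuncAlias = "functionAliasName"

  final class NamedStruct(val args: Seq[String]) {
    private val tags = mutable.Map.empty[String, String]

    def setTagValue(key: String, value: String): Unit = tags.update(key, value)
    def getTagValue(key: String): Option[String] = tags.get(key)

    def prettyName: String = "named_struct"

    // Use the alias when one was tagged, otherwise fall back to prettyName.
    def sql: String = {
      val name = getTagValue(FuncAlias).getOrElse(prettyName)
      s"$name(${args.mkString(", ")})"
    }
  }

  // Mirrors the shape of `create` in the diff: build the expression, then tag it.
  def create(args: Seq[String]): NamedStruct = {
    val expr = new NamedStruct(args)
    expr.setTagValue(FuncAlias, "struct")
    expr
  }
}
```

An untagged instance still renders under its internal name, which matches the intent discussed in the thread of keeping `named_struct` for internally inserted expressions.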
[GitHub] [spark] HyukjinKwon commented on a change in pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
HyukjinKwon commented on a change in pull request #28633: URL: https://github.com/apache/spark/pull/28633#discussion_r429796543

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
@@ -312,6 +312,10 @@ case object NamePlaceholder extends LeafExpression with Unevaluable {
  * Returns a Row containing the evaluation of all children expressions.
  */
 object CreateStruct extends FunctionBuilder {
+  /**
+   * Returns a named struct with generating names or using the names when available.
+   * It should be used only for an internal purpose.

Review comment: Hm, I will rephrase it to clarify that it shouldn't be used when `struct` is explicitly specified by a user.
[GitHub] [spark] attilapiros commented on a change in pull request #28606: [MINOR][YARN]False report isAllNodeBlacklisted when RM is having issue
attilapiros commented on a change in pull request #28606: URL: https://github.com/apache/spark/pull/28606#discussion_r429795946

## File path: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocatorBlacklistTracker.scala
@@ -103,7 +103,14 @@ private[spark] class YarnAllocatorBlacklistTracker(
     refreshBlacklistedNodes()
   }

-  def isAllNodeBlacklisted: Boolean = currentBlacklistedYarnNodes.size >= numClusterNodes
+  def isAllNodeBlacklisted: Boolean = {
+    if (numClusterNodes <= 0) {

Review comment: `numClusterNodes == 0` would be better
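The reason for the guard in the hunk above is easy to see in isolation: with zero known cluster nodes, the old comparison `size >= numClusterNodes` is trivially true. The following is a simplified, hypothetical stand-in for the tracker, not the Spark class. The diff is truncated after the `if`, so returning `false` in the guarded branch is an assumption based on the PR title.

```scala
// Hypothetical miniature of the tracker; field names follow the diff loosely.
final class BlacklistTrackerSketch(blacklistedNodes: Set[String], numClusterNodes: Int) {

  // Old behaviour: reports true when the node count is 0, even with nothing blacklisted.
  def isAllNodeBlacklistedOld: Boolean = blacklistedNodes.size >= numClusterNodes

  // Guarded behaviour (assumed): an unknown or zero node count is not treated as
  // "everything is blacklisted". The reviewer suggests `== 0` for the condition.
  def isAllNodeBlacklisted: Boolean =
    if (numClusterNodes <= 0) false
    else blacklistedNodes.size >= numClusterNodes
}
```

For example, `new BlacklistTrackerSketch(Set.empty, 0)` reports `true` through the old check but `false` through the guarded one, while a tracker with two blacklisted nodes out of two still reports `true`.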
[GitHub] [spark] AmplabJenkins commented on pull request #28553: [SPARK-31734][ML][PySpark] Add weight support in ClusteringEvaluator
AmplabJenkins commented on pull request #28553: URL: https://github.com/apache/spark/pull/28553#issuecomment-633443120
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28553: [SPARK-31734][ML][PySpark] Add weight support in ClusteringEvaluator
AmplabJenkins removed a comment on pull request #28553: URL: https://github.com/apache/spark/pull/28553#issuecomment-633443120
[GitHub] [spark] SparkQA removed a comment on pull request #28553: [SPARK-31734][ML][PySpark] Add weight support in ClusteringEvaluator
SparkQA removed a comment on pull request #28553: URL: https://github.com/apache/spark/pull/28553#issuecomment-633413387 **[Test build #123073 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123073/testReport)** for PR 28553 at commit [`cab88cc`](https://github.com/apache/spark/commit/cab88cc940bc9fba7f66a4d385986a7f603692a8). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #28553: [SPARK-31734][ML][PySpark] Add weight support in ClusteringEvaluator
SparkQA commented on pull request #28553: URL: https://github.com/apache/spark/pull/28553#issuecomment-633442641 **[Test build #123073 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123073/testReport)** for PR 28553 at commit [`cab88cc`](https://github.com/apache/spark/commit/cab88cc940bc9fba7f66a4d385986a7f603692a8). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * ` case class ClusterStats(featureSum: Vector, squaredNormSum: Double, weightSum: Double)` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] hvanhovell commented on a change in pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
hvanhovell commented on a change in pull request #28633: URL: https://github.com/apache/spark/pull/28633#discussion_r429792739 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala ## @@ -312,6 +312,10 @@ case object NamePlaceholder extends LeafExpression with Unevaluable { * Returns a Row containing the evaluation of all children expressions. */ object CreateStruct extends FunctionBuilder { + /** + * Returns a named struct with generating names or using the names when available. + * It should be used only for an internal purpose. Review comment: Everything in catalyst is private, right? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28621: [SPARK-31803][ML] Make sure instance weight is not negative
AmplabJenkins removed a comment on pull request #28621: URL: https://github.com/apache/spark/pull/28621#issuecomment-633437651 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #28621: [SPARK-31803][ML] Make sure instance weight is not negative
AmplabJenkins commented on pull request #28621: URL: https://github.com/apache/spark/pull/28621#issuecomment-633437651 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #28621: [SPARK-31803][ML] Make sure instance weight is not negative
SparkQA commented on pull request #28621: URL: https://github.com/apache/spark/pull/28621#issuecomment-633437134 **[Test build #123078 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123078/testReport)** for PR 28621 at commit [`e1b4422`](https://github.com/apache/spark/commit/e1b44228646fdffcba27684b46fdd2ec79d549ea). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] maropu commented on a change in pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
maropu commented on a change in pull request #28633: URL: https://github.com/apache/spark/pull/28633#discussion_r429787806 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala ## @@ -320,12 +324,23 @@ object CreateStruct extends FunctionBuilder { }) } + /** + * Returns a named struct with a pretty SQL name. It will show the pretty SQL string + * in its output column name as if `struct(...)` was called. Should be used for an + * external purpose. + */ + def create(children: Seq[Expression]): CreateNamedStruct = { +val expr = CreateStruct(children) +expr.setTagValue(FUNC_ALIAS, "struct") +expr + } + /** * Entry to use in the function registry. */ val registryEntry: (String, (ExpressionInfo, FunctionBuilder)) = { val info: ExpressionInfo = new ExpressionInfo( - "org.apache.spark.sql.catalyst.expressions.NamedStruct", + classOf[CreateNamedStruct].getCanonicalName, Review comment: nit though: can't we reuse the `ExpressionInfo` of `CreateNamedStruct` via reflection here? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
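The last hunk of the diff above swaps a hard-coded class-name string for `classOf[...].getCanonicalName`. The general benefit can be sketched without Spark on the classpath: a string literal naming a class that does not exist still compiles, while a `classOf` reference is checked by the compiler. The example below uses a JDK class as a stand-in; the object name, the misspelled literal and the `resolves` helper are all hypothetical.

```scala
import scala.util.Try

object ClassNameSketch {
  // A hard-coded name with a typo: nothing complains at compile time.
  val hardCoded: String = "java.util.ArrayLst"

  // Derived from a class reference: the compiler verifies the class exists,
  // and the string follows the class if it is ever renamed or moved.
  val derived: String = classOf[java.util.ArrayList[_]].getCanonicalName

  // Hypothetical helper: true when the name can be loaded as a class.
  def resolves(name: String): Boolean = Try(Class.forName(name)).isSuccess
}
```

Here `ClassNameSketch.resolves(ClassNameSketch.derived)` holds, while the misspelled literal only fails once something tries to load it at runtime.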
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28634: [SPARK-31810][TEST] Fix test where spark conf was modified using incorrect api
AmplabJenkins removed a comment on pull request #28634: URL: https://github.com/apache/spark/pull/28634#issuecomment-633429669 Can one of the admins verify this patch? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
AmplabJenkins removed a comment on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633430366 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
AmplabJenkins commented on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633430366 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
SparkQA commented on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633429823 **[Test build #123077 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123077/testReport)** for PR 28633 at commit [`c5ffc27`](https://github.com/apache/spark/commit/c5ffc27b6064190c6e7e4cf85765abe304d637c7). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #28634: [SPARK-31810][TEST] Fix test where spark conf was modified using incorrect api
AmplabJenkins commented on pull request #28634: URL: https://github.com/apache/spark/pull/28634#issuecomment-633430139 Can one of the admins verify this patch? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #28634: [SPARK-31810][TEST] Fix test where spark conf was modified using incorrect api
AmplabJenkins commented on pull request #28634: URL: https://github.com/apache/spark/pull/28634#issuecomment-633429669 Can one of the admins verify this patch? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] prakharjain09 opened a new pull request #28634: [SPARK-31810][TEST] Fix test where spark conf was modified using incorrect api
prakharjain09 opened a new pull request #28634: URL: https://github.com/apache/spark/pull/28634 ### What changes were proposed in this pull request? Use the correct API in the AlterTableRecoverPartitions tests to modify the `RDD_PARALLEL_LISTING_THRESHOLD` conf. ### Why are the changes needed? The existing AlterTableRecoverPartitions tests modify `RDD_PARALLEL_LISTING_THRESHOLD` as a SQLConf using the `withSQLConf` API. But since it is not a SQLConf, it is not overridden, so the tests do not end up testing the required behaviour. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? This is a UT fix. The UTs still pass after the fix. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
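The failure mode described in the PR above, an override written to one configuration store while the code under test reads another, can be modelled with a toy sketch. Nothing below is Spark's API: the two mutable maps, the `withConf` helper and the `listing.threshold` key are hypothetical stand-ins for the core conf, the SQL conf and `RDD_PARALLEL_LISTING_THRESHOLD`.

```scala
import scala.collection.mutable

object ConfScopeSketch {
  // Toy stand-ins for two independent configuration stores.
  val coreConf: mutable.Map[String, String] = mutable.Map("listing.threshold" -> "10")
  val sqlConf: mutable.Map[String, String] = mutable.Map.empty

  // Temporarily set keys in ONE store and restore the previous state afterwards.
  def withConf[T](store: mutable.Map[String, String])(pairs: (String, String)*)(body: => T): T = {
    val previous = pairs.map { case (k, _) => k -> store.get(k) }
    pairs.foreach { case (k, v) => store(k) = v }
    try body
    finally previous.foreach {
      case (k, Some(v)) => store(k) = v
      case (k, None)    => store.remove(k)
    }
  }

  // The code under test reads the threshold from the core store only.
  def listingThreshold: Int = coreConf("listing.threshold").toInt
}
```

In this model, `withConf(sqlConf)("listing.threshold" -> "0")(listingThreshold)` still yields the default, because the override never reaches the store that is read; passing `coreConf` instead makes the override visible, which is the kind of change the PR describes.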
[GitHub] [spark] maropu commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
maropu commented on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633426646 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test
AmplabJenkins removed a comment on pull request #28627: URL: https://github.com/apache/spark/pull/28627#issuecomment-633420266 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test
AmplabJenkins commented on pull request #28627: URL: https://github.com/apache/spark/pull/28627#issuecomment-633420266 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test
SparkQA commented on pull request #28627: URL: https://github.com/apache/spark/pull/28627#issuecomment-633419679 **[Test build #123076 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123076/testReport)** for PR 28627 at commit [`70d07a3`](https://github.com/apache/spark/commit/70d07a3785799e5f44e21cb9c541faaa650ad6c0). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] sarutak commented on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test
sarutak commented on pull request #28627: URL: https://github.com/apache/spark/pull/28627#issuecomment-633417531 retest this please. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test
AmplabJenkins removed a comment on pull request #28627: URL: https://github.com/apache/spark/pull/28627#issuecomment-633414769 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/123071/ Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] huaxingao commented on a change in pull request #28595: [SPARK-31781][ML][PySpark] Move param k (number of clusters) to shared params
huaxingao commented on a change in pull request #28595: URL: https://github.com/apache/spark/pull/28595#discussion_r429765422 ## File path: mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala ## @@ -562,4 +562,20 @@ trait HasBlockSize extends Params { /** @group expertGetParam */ final def getBlockSize: Int = $(blockSize) } + +/** + * Trait for shared param k. This trait may be changed or + * removed between minor versions. + */ +trait HasK extends Params { + + /** + * Param for The number of clusters to create. Must be (> 1). Note that it is possible for fewer than k clusters to be returned. + * @group param + */ + final val k: IntParam = new IntParam(this, "k", "The number of clusters to create. Must be (> 1). Note that it is possible for fewer than k clusters to be returned", ParamValidators.gt(1)) Review comment: Yes, we lost the ```@Since``` annotations for ```k``` and ```getK```. We still have since 2.4.0 for ```setK```, though. For all the shared params, we don't have ```@Since``` annotations for the param and getXXX; we only have ```@Since``` annotations for setXXX. ## File path: mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala ## @@ -562,4 +562,20 @@ trait HasBlockSize extends Params { /** @group expertGetParam */ final def getBlockSize: Int = $(blockSize) } + +/** + * Trait for shared param k. This trait may be changed or + * removed between minor versions. + */ +trait HasK extends Params { + + /** + * Param for The number of clusters to create. Must be (> 1). Note that it is possible for fewer than k clusters to be returned. + * @group param + */ + final val k: IntParam = new IntParam(this, "k", "The number of clusters to create. Must be (> 1). Note that it is possible for fewer than k clusters to be returned", ParamValidators.gt(1)) + + /** @group getParam */ + final def getK: Int = $(k) Review comment: Right, no MiMa problems if this isn't final, but we can't change it because it's generated code.
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
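The shared-param pattern discussed in the review above (one trait that owns the param definition, its validator and its getter, mixed into every estimator that needs `k`) can be sketched in plain Scala. This is a simplified model, not Spark ML's `Params` machinery: `IntParamLike`, `HasKLike`, `KMeansLike` and the local `gt` helper are hypothetical names, and placing `setK` in the trait is a simplification made for the sketch.

```scala
object SharedParamSketch {
  // Toy validator factory: accepts only values strictly above the bound.
  def gt(lowerBound: Int): Int => Boolean = _ > lowerBound

  // Toy param descriptor: a name, a doc string and a validity check.
  final case class IntParamLike(name: String, doc: String, isValid: Int => Boolean)

  trait HasKLike {
    final val k: IntParamLike =
      IntParamLike("k", "The number of clusters to create. Must be > 1.", gt(1))

    private var value: Option[Int] = None

    // Rejects invalid values up front, as a validated param would.
    def setK(v: Int): this.type = {
      require(k.isValid(v), s"${k.name} given invalid value $v")
      value = Some(v)
      this
    }

    final def getK: Int =
      value.getOrElse(throw new NoSuchElementException("k is not set"))
  }

  // Any estimator-like class picks up the param by mixing in the trait.
  class KMeansLike extends HasKLike
}
```

For example, `new SharedParamSketch.KMeansLike().setK(3).getK` returns 3, while `setK(1)` is rejected by the validator.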
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28553: [SPARK-31734][ML][PySpark] Add weight support in ClusteringEvaluator
AmplabJenkins removed a comment on pull request #28553: URL: https://github.com/apache/spark/pull/28553#issuecomment-633414094 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28630: [SPARK-31806][SQL][TESTS] Check reading date/timestamp from legacy parquet: dictionary encoding, w/o Spark version
AmplabJenkins removed a comment on pull request #28630: URL: https://github.com/apache/spark/pull/28630#issuecomment-633414056 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
SparkQA removed a comment on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633407168 **[Test build #123072 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123072/testReport)** for PR 28633 at commit [`59acc36`](https://github.com/apache/spark/commit/59acc368427fb0462c7b1fb9efa7e77cb3f0b9de). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test
SparkQA commented on pull request #28627: URL: https://github.com/apache/spark/pull/28627#issuecomment-633414635 **[Test build #123071 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123071/testReport)** for PR 28627 at commit [`70d07a3`](https://github.com/apache/spark/commit/70d07a3785799e5f44e21cb9c541faaa650ad6c0). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test
AmplabJenkins removed a comment on pull request #28627: URL: https://github.com/apache/spark/pull/28627#issuecomment-633414764 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test
AmplabJenkins commented on pull request #28627: URL: https://github.com/apache/spark/pull/28627#issuecomment-633414764 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
SparkQA commented on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633414636 **[Test build #123072 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123072/testReport)** for PR 28633 at commit [`59acc36`](https://github.com/apache/spark/commit/59acc368427fb0462c7b1fb9efa7e77cb3f0b9de). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
AmplabJenkins commented on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633414667 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28633: [SPARK-31808][SQL] Makes struct function's output name and class name pretty
AmplabJenkins removed a comment on pull request #28633: URL: https://github.com/apache/spark/pull/28633#issuecomment-633414667 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test
SparkQA removed a comment on pull request #28627: URL: https://github.com/apache/spark/pull/28627#issuecomment-633357514 **[Test build #123071 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123071/testReport)** for PR 28627 at commit [`70d07a3`](https://github.com/apache/spark/commit/70d07a3785799e5f44e21cb9c541faaa650ad6c0). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org