[GitHub] [spark] dchvn opened a new pull request #34750: [SPARK-37495][PYTHON] Skip identical index checking of Series.compare when config 'compute.eager_check' is disabled
dchvn opened a new pull request #34750:
URL: https://github.com/apache/spark/pull/34750

### What changes were proposed in this pull request?
Skip the identical-index check in `Series.compare` when the config 'compute.eager_check' is disabled.

### Why are the changes needed?
The identical-index check is expensive, so the config 'compute.eager_check' should allow skipping it.

### Does this PR introduce _any_ user-facing change?
Yes.

Before this PR:
```python
>>> psser1 = ps.Series([1, 2, 3, 4, 5], index=pd.Index([1, 2, 3, 4, 5]))
>>> psser2 = ps.Series([1, 2, 3, 4, 5], index=pd.Index([1, 2, 4, 3, 5]))
>>> psser1.compare(psser2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/u02/spark/python/pyspark/pandas/series.py", line 5851, in compare
    raise ValueError("Can only compare identically-labeled Series objects")
ValueError: Can only compare identically-labeled Series objects
```

After this PR, when the config 'compute.eager_check' is False, pandas-on-Spark skips the identical-index check and proceeds with the comparison.
```python
>>> with ps.option_context("compute.eager_check", False):
...     psser1.compare(psser2)
...
   self  other
3     3      4
4     4      3
```

### How was this patch tested?
Unit tests

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
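The behaviour described in the PR above can be illustrated with a small pure-Python sketch. This is not the pyspark implementation: `compare_sketch`, the `Pairs` alias, and the list-of-pairs representation of a Series are all hypothetical stand-ins, chosen only to show how an optional up-front check changes the outcome for the two inputs in the PR description.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for a Series: ordered (index label, value) pairs.
Pairs = List[Tuple[int, int]]


def compare_sketch(
    left: Pairs, right: Pairs, eager_check: bool = True
) -> Dict[int, Tuple[int, int]]:
    """Illustrative only; assumed names, not pyspark's Series.compare."""
    if eager_check:
        # Up-front check: both inputs must carry the same labels in the same order.
        if [label for label, _ in left] != [label for label, _ in right]:
            raise ValueError("Can only compare identically-labeled Series objects")
    # Align on label and keep only the labels whose values differ.
    other = dict(right)
    return {
        label: (value, other[label])
        for label, value in left
        if label in other and other[label] != value
    }


# Same data as in the PR description: identical values, reordered index.
psser1 = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5)]
psser2 = [(1, 1), (2, 2), (4, 3), (3, 4), (5, 5)]
result = compare_sketch(psser1, psser2, eager_check=False)
print(result)
```

With the check disabled, the sketch reports labels 3 and 4 as differing, which mirrors the `self`/`other` table in the PR; with the check enabled, it raises the same `ValueError` message quoted there.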
[GitHub] [spark] LuciferYang commented on a change in pull request #34719: [SPARK-37381][SQL] Unify v1 and v2 SHOW CREATE TABLE tests
LuciferYang commented on a change in pull request #34719: URL: https://github.com/apache/spark/pull/34719#discussion_r758953291 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowCreateTableSuite.scala ## @@ -0,0 +1,140 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.execution.command.v1 + +import org.apache.spark.sql.catalyst.TableIdentifier +import org.apache.spark.sql.catalyst.catalog.CatalogTable +import org.apache.spark.sql.execution.command + +/** + * This base suite contains unified tests for the `SHOW CREATE TABLE` command that checks V1 + * table catalogs. The tests that cannot run for all V1 catalogs are located in more + * specific test suites: + * + * - V1 In-Memory catalog: `org.apache.spark.sql.execution.command.v1.ShowCreateTableSuite` + * - V1 Hive External catalog: + * `org.apache.spark.sql.hive.execution.command.ShowCreateTableSuite` + */ +trait ShowCreateTableSuiteBase extends command.ShowCreateTableSuiteBase +with command.TestsV1AndV2Commands { + + test("CATS") { Review comment: I guess `CATS` is testing `Create Table As Select`? -- This is an automated message from the Apache Git Service. 
[GitHub] [spark] SparkQA commented on pull request #34731: [SPARK-37153][PYTHON] Inline type hints for python/pyspark/profiler.py
SparkQA commented on pull request #34731: URL: https://github.com/apache/spark/pull/34731#issuecomment-982309084 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50216/
[GitHub] [spark] SparkQA commented on pull request #34746: [SPARK-37489][PYTHON] Skip hasnans check in numops if eager_check disable
SparkQA commented on pull request #34746: URL: https://github.com/apache/spark/pull/34746#issuecomment-982313723 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50217/
[GitHub] [spark] cloud-fan commented on a change in pull request #34684: [SPARK-37442][SQL] InMemoryRelation statistics bug causing broadcast join failures with AQE enabled
cloud-fan commented on a change in pull request #34684:
URL: https://github.com/apache/spark/pull/34684#discussion_r758962344

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala ##
@@ -237,7 +238,23 @@ case class CachedRDDBuilder(
   }

   def isCachedColumnBuffersLoaded: Boolean = {
-    _cachedColumnBuffers != null
+    _cachedColumnBuffers != null && isCachedRDDLoaded
+  }
+
+  def isCachedRDDLoaded: Boolean = {

Review comment:
OK, shall we at least mark `_cachedColumnBuffersAreLoaded` as `volatile`, in case two threads call `isCachedRDDLoaded` at the same time?
[GitHub] [spark] Yikun commented on a change in pull request #34717: [SPARK-37465][PYTHON] Bump minimum pandas version to 1.0.5
Yikun commented on a change in pull request #34717:
URL: https://github.com/apache/spark/pull/34717#discussion_r758898358

## File path: python/docs/source/user_guide/sql/arrow_pandas.rst ##
@@ -387,7 +387,7 @@ working with timestamps in ``pandas_udf``\s to get the best performance, see
 Recommended Pandas and PyArrow Versions
 ~~~

-For usage with pyspark.sql, the minimum supported versions of Pandas is 0.23.2 and PyArrow is 1.0.0.
+For usage with pyspark.sql, the minimum supported versions of Pandas is 1.0.5 and PyArrow is 1.0.0.

Review comment:
How about:

For usage with pyspark.sql, the minimum supported versions of Pandas is **1.0.5** and PyArrow is 1.0.0. **Lower versions (there are some known issues with v1.0.0 and v1.0.1; see [link](https://github.com/apache/spark/pull/34717)) or** higher versions may be used; however, compatibility and data correctness cannot be guaranteed and should be verified by the user.

This may need more suggestions from a native speaker. T_T If necessary, we could do it in a later commit in this PR or in a follow-up.
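As a rough illustration of what a minimum-version requirement like the one discussed above means in practice, here is a minimal sketch using only the standard library. The names `parse_version` and `check_minimum_version` are hypothetical and are not pyspark functions; the sketch also assumes purely numeric dotted version strings. Only the `1.0.5` value comes from the diff.

```python
from typing import Tuple

# Minimum version taken from the diff above; everything else is illustrative.
MINIMUM_PANDAS_VERSION = "1.0.5"


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '1.0.5' into (1, 0, 5). Assumes purely numeric dotted versions."""
    return tuple(int(part) for part in version.split("."))


def check_minimum_version(
    installed: str, minimum: str = MINIMUM_PANDAS_VERSION
) -> None:
    """Raise ImportError if `installed` is older than `minimum` (hypothetical helper)."""
    if parse_version(installed) < parse_version(minimum):
        raise ImportError(
            "pandas >= %s is required; found %s" % (minimum, installed)
        )


# Versions at or above the minimum pass silently.
check_minimum_version("1.0.5")
check_minimum_version("1.3.4")
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as `"1.10.0" < "1.9.0"`. Versions such as 1.0.1 or 0.23.2 would be rejected by this sketch.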
[GitHub] [spark] SparkQA commented on pull request #34213: [SPARK-36396][PYTHON] Implement DataFrame.cov
SparkQA commented on pull request #34213: URL: https://github.com/apache/spark/pull/34213#issuecomment-982319967 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50219/
[GitHub] [spark] SparkQA commented on pull request #34657: [WIP] Support TimedeltaIndex in pandas API on Spark
SparkQA commented on pull request #34657: URL: https://github.com/apache/spark/pull/34657#issuecomment-982322524 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50218/
[GitHub] [spark] ChenMichael commented on a change in pull request #34684: [SPARK-37442][SQL] InMemoryRelation statistics bug causing broadcast join failures with AQE enabled
ChenMichael commented on a change in pull request #34684:
URL: https://github.com/apache/spark/pull/34684#discussion_r758965590

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala ##
@@ -237,7 +238,23 @@ case class CachedRDDBuilder(
   }

   def isCachedColumnBuffersLoaded: Boolean = {
-    _cachedColumnBuffers != null
+    _cachedColumnBuffers != null && isCachedRDDLoaded
+  }
+
+  def isCachedRDDLoaded: Boolean = {

Review comment:
Makes sense. Added.
[GitHub] [spark] SparkQA commented on pull request #34741: [SPARK-37463][SQL] Read/Write Timestamp ntz from/to Orc uses UTC time zone
SparkQA commented on pull request #34741:
URL: https://github.com/apache/spark/pull/34741#issuecomment-982322853

**[Test build #145735 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145735/testReport)** for PR 34741 at commit [`ee11bf4`](https://github.com/apache/spark/commit/ee11bf4990bdc7bac5202d164ca144e9cbed6fa8).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `class UDFBasicProfiler(BasicProfiler):`
   * `case class PrettyPythonUDF(`
[GitHub] [spark] SparkQA removed a comment on pull request #34741: [SPARK-37463][SQL] Read/Write Timestamp ntz from/to Orc uses UTC time zone
SparkQA removed a comment on pull request #34741: URL: https://github.com/apache/spark/pull/34741#issuecomment-982193103 **[Test build #145735 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145735/testReport)** for PR 34741 at commit [`ee11bf4`](https://github.com/apache/spark/commit/ee11bf4990bdc7bac5202d164ca144e9cbed6fa8).
[GitHub] [spark] dongjoon-hyun opened a new pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi
dongjoon-hyun opened a new pull request #34751: URL: https://github.com/apache/spark/pull/34751 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested?
[GitHub] [spark] AmplabJenkins commented on pull request #34731: [SPARK-37153][PYTHON] Inline type hints for python/pyspark/profiler.py
AmplabJenkins commented on pull request #34731: URL: https://github.com/apache/spark/pull/34731#issuecomment-982324056 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50216/
[GitHub] [spark] AmplabJenkins commented on pull request #34746: [SPARK-37489][PYTHON] Skip hasnans check in numops if eager_check disable
AmplabJenkins commented on pull request #34746: URL: https://github.com/apache/spark/pull/34746#issuecomment-982324060 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50217/
[GitHub] [spark] AmplabJenkins commented on pull request #34657: [WIP] Support TimedeltaIndex in pandas API on Spark
AmplabJenkins commented on pull request #34657: URL: https://github.com/apache/spark/pull/34657#issuecomment-982324057 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50218/
[GitHub] [spark] AmplabJenkins commented on pull request #34741: [SPARK-37463][SQL] Read/Write Timestamp ntz from/to Orc uses UTC time zone
AmplabJenkins commented on pull request #34741: URL: https://github.com/apache/spark/pull/34741#issuecomment-982324154 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145735/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34746: [SPARK-37489][PYTHON] Skip hasnans check in numops if eager_check disable
AmplabJenkins removed a comment on pull request #34746: URL: https://github.com/apache/spark/pull/34746#issuecomment-982324060 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50217/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34741: [SPARK-37463][SQL] Read/Write Timestamp ntz from/to Orc uses UTC time zone
AmplabJenkins removed a comment on pull request #34741: URL: https://github.com/apache/spark/pull/34741#issuecomment-982324154 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145735/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34657: [WIP] Support TimedeltaIndex in pandas API on Spark
AmplabJenkins removed a comment on pull request #34657: URL: https://github.com/apache/spark/pull/34657#issuecomment-982324057 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50218/
[GitHub] [spark] Peng-Lei commented on a change in pull request #34719: [SPARK-37381][SQL] Unify v1 and v2 SHOW CREATE TABLE tests
Peng-Lei commented on a change in pull request #34719: URL: https://github.com/apache/spark/pull/34719#discussion_r758966706 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowCreateTableSuite.scala ## @@ -0,0 +1,140 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.execution.command.v1 + +import org.apache.spark.sql.catalyst.TableIdentifier +import org.apache.spark.sql.catalyst.catalog.CatalogTable +import org.apache.spark.sql.execution.command + +/** + * This base suite contains unified tests for the `SHOW CREATE TABLE` command that checks V1 + * table catalogs. The tests that cannot run for all V1 catalogs are located in more + * specific test suites: + * + * - V1 In-Memory catalog: `org.apache.spark.sql.execution.command.v1.ShowCreateTableSuite` + * - V1 Hive External catalog: + * `org.apache.spark.sql.hive.execution.command.ShowCreateTableSuite` + */ +trait ShowCreateTableSuiteBase extends command.ShowCreateTableSuiteBase +with command.TestsV1AndV2Commands { + + test("CATS") { Review comment: Sorry. Wrong title. I did mean to express `Create Table As Select`. It should be `CTAS` instead of `CATS`. 
`CATS` is the short name of `Contention-Aware Transaction Scheduling` in MySQL.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34731: [SPARK-37153][PYTHON] Inline type hints for python/pyspark/profiler.py
AmplabJenkins removed a comment on pull request #34731: URL: https://github.com/apache/spark/pull/34731#issuecomment-982324056 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50216/
[GitHub] [spark] Peng-Lei commented on a change in pull request #34719: [SPARK-37381][SQL] Unify v1 and v2 SHOW CREATE TABLE tests
Peng-Lei commented on a change in pull request #34719: URL: https://github.com/apache/spark/pull/34719#discussion_r758966959 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowCreateTableSuite.scala ## @@ -0,0 +1,140 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.execution.command.v1 + +import org.apache.spark.sql.catalyst.TableIdentifier +import org.apache.spark.sql.catalyst.catalog.CatalogTable +import org.apache.spark.sql.execution.command + +/** + * This base suite contains unified tests for the `SHOW CREATE TABLE` command that checks V1 + * table catalogs. The tests that cannot run for all V1 catalogs are located in more + * specific test suites: + * + * - V1 In-Memory catalog: `org.apache.spark.sql.execution.command.v1.ShowCreateTableSuite` + * - V1 Hive External catalog: + * `org.apache.spark.sql.hive.execution.command.ShowCreateTableSuite` + */ +trait ShowCreateTableSuiteBase extends command.ShowCreateTableSuiteBase +with command.TestsV1AndV2Commands { + + test("CATS") { Review comment: I will address the comments together later. -- This is an automated message from the Apache Git Service. 
[GitHub] [spark] SparkQA commented on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi
SparkQA commented on pull request #34751: URL: https://github.com/apache/spark/pull/34751#issuecomment-982324903 **[Test build #145749 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145749/testReport)** for PR 34751 at commit [`f845a10`](https://github.com/apache/spark/commit/f845a10f24f23a18ec230b3b8581e3bce866cc61).
[GitHub] [spark] SparkQA commented on pull request #34750: [SPARK-37495][PYTHON] Skip identical index checking of Series.compare when config 'compute.eager_check' is disabled
SparkQA commented on pull request #34750: URL: https://github.com/apache/spark/pull/34750#issuecomment-982324974 **[Test build #145750 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145750/testReport)** for PR 34750 at commit [`2bbb84d`](https://github.com/apache/spark/commit/2bbb84d30157e9b855f81c5448c326b34b302937).
[GitHub] [spark] ChenMichael commented on a change in pull request #34684: [SPARK-37442][SQL] InMemoryRelation statistics bug causing broadcast join failures with AQE enabled
ChenMichael commented on a change in pull request #34684:
URL: https://github.com/apache/spark/pull/34684#discussion_r758965590

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala ##
@@ -237,7 +238,23 @@ case class CachedRDDBuilder(
   }

   def isCachedColumnBuffersLoaded: Boolean = {
-    _cachedColumnBuffers != null
+    _cachedColumnBuffers != null && isCachedRDDLoaded
+  }
+
+  def isCachedRDDLoaded: Boolean = {

Review comment:
Sure. Added.
[GitHub] [spark] Yikun commented on pull request #34746: [SPARK-37489][PYTHON] Skip hasnans check in numops if eager_check disable
Yikun commented on pull request #34746:
URL: https://github.com/apache/spark/pull/34746#issuecomment-982325271

```
ERROR [2.132s]: test_reuse_worker_of_parallelize_range (pyspark.tests.test_worker.WorkerReuseTest)
--
Traceback (most recent call last):
  File "/__w/spark/spark/python/pyspark/tests/test_worker.py", line 195, in test_reuse_worker_of_parallelize_range
    self.assertTrue(pid in previous_pids)
AssertionError: False is not true
```

This is a flaky test failure and is unrelated to this PR. It looks like it has failed many times before [1][2][3]. I filed a JIRA, https://issues.apache.org/jira/browse/SPARK-37498 , and will take a look when I get time.

[1] https://github.com/apache/spark/runs/1182154542?check_suite_focus=true
[2] https://github.com/apache/spark/pull/33657#issuecomment-893969310
[3] https://github.com/Yikun/spark/runs/4362783540?check_suite_focus=true
[GitHub] [spark] dongjoon-hyun commented on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi
dongjoon-hyun commented on pull request #34751: URL: https://github.com/apache/spark/pull/34751#issuecomment-982326438 cc @holdenk , @shrutig , @viirya
[GitHub] [spark] cloud-fan commented on a change in pull request #34719: [SPARK-37381][SQL] Unify v1 and v2 SHOW CREATE TABLE tests
cloud-fan commented on a change in pull request #34719: URL: https://github.com/apache/spark/pull/34719#discussion_r758970183 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/v2/ShowCreateTableSuite.scala ## @@ -0,0 +1,141 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.execution.command.v2 + +import org.apache.spark.sql.AnalysisException +import org.apache.spark.sql.execution.command + +/** + * The class contains tests for the `SHOW CREATE TABLE` command to check V2 table catalogs. 
+ */ +class ShowCreateTableSuite extends command.ShowCreateTableSuiteBase with CommandSuiteBase { + test("SPARK-33898: SHOW CREATE TABLE AS SERDE") { +val db = "ns1" +val table = "tbl" +withNamespaceAndTable(db, table) { t => + spark.sql(s"CREATE TABLE $t (id bigint, data string) $defaultUsing") + val e = intercept[AnalysisException] { +sql(s"SHOW CREATE TABLE $t AS SERDE") + } + assert(e.message.contains(s"SHOW CREATE TABLE AS SERDE is not supported for v2 tables.")) +} + } + + test("CTAS") { +val db = "ns1" +val table = "ddl_test" +withNamespaceAndTable(db, table) { t => + sql( +s"""CREATE TABLE $t + |$defaultUsing + |PARTITIONED BY (a) + |COMMENT 'This is a comment' + |TBLPROPERTIES ('a' = '1') + |AS SELECT 1 AS a, "foo" AS b + """.stripMargin + ) + val showDDL = getShowCreateDDL(s"SHOW CREATE TABLE $t") + assert(showDDL === Array( +s"CREATE TABLE $t (", +"`a` INT,", +"`b` STRING)", +defaultUsing, +"PARTITIONED BY (a)", +"COMMENT 'This is a comment'", +"TBLPROPERTIES(", +"'a' = '1')" + )) +} + } + + test("SPARK-33898: SHOW CREATE TABLE") { +val db = "ns1" +val table = "tbl" +withNamespaceAndTable(db, table) { t => + sql( +s""" + |CREATE TABLE $t ( + | a bigint NOT NULL, + | b bigint, + | c bigint, + | `extraCol` ARRAY, + | `` STRUCT> + |) + |$defaultUsing + |OPTIONS ( + | from = 0, + | to = 1, + | via = 2) + |COMMENT 'This is a comment' + |TBLPROPERTIES ('prop1' = '1', 'prop2' = '2', 'prop3' = 3, 'prop4' = 4) + |PARTITIONED BY (a) + |LOCATION '/tmp' Review comment: yes please. Let's fix all the issues found in this PR first, then we can come back and merge this.
[GitHub] [spark] sunchao commented on a change in pull request #34635: [SPARK-37205][YARN] Introduce a new config 'spark.yarn.am.tokenConfRegex' to support renewing delegation tokens in a multi-cluste
sunchao commented on a change in pull request #34635: URL: https://github.com/apache/spark/pull/34635#discussion_r758970173 ## File path: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ## @@ -340,6 +344,40 @@ private[spark] class Client( amContainer.setTokens(ByteBuffer.wrap(serializedCreds)) } + /** + * Set configurations sent from AM to RM for renewing delegation tokens. + */ + private def setTokenConf(amContainer: ContainerLaunchContext): Unit = { +// SPARK-37205: this regex is used to grep a list of configurations and send them to YARN RM +// for fetching delegation tokens. See YARN-5910 for more details. +// The feature is only supported in Hadoop 3.x and up, hence the check below. +val regex = sparkConf.get(config.AM_SEND_TOKEN_CONF) +if (regex.nonEmpty && VersionUtils.isHadoop3) { + logInfo(s"Processing token conf (spark.yarn.am.sendTokenConf) with regex $regex") + val dob = new DataOutputBuffer(); + val copy = new Configuration(false); + copy.clear(); + hadoopConf.asScala.foreach { entry => +if (entry.getKey.matches(regex.get)) { + copy.set(entry.getKey, entry.getValue) + logInfo(s"Captured key: ${entry.getKey} -> value: ${entry.getValue}") +} + } + copy.write(dob); + + // since this method was added in Hadoop 2.9 and 3.0, we use reflection here to avoid Review comment: Got it. I added the change.
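The key-filtering step in the diff above can be illustrated outside Spark. The following is only a sketch under stated assumptions: a plain `dict` stands in for the Hadoop `Configuration`, and `capture_token_conf` is a hypothetical helper name, not part of Spark or Hadoop. It uses `re.fullmatch` because Java's `String.matches`, which the diff calls, only succeeds when the whole key matches the regex.

```python
import re
from typing import Dict, Optional


def capture_token_conf(hadoop_conf: Dict[str, str], regex: Optional[str]) -> Dict[str, str]:
    """Return the entries whose key fully matches `regex` (hypothetical helper).

    Mirrors the loop in the diff: when no regex is configured nothing is
    captured; otherwise every matching key/value pair is copied.
    """
    if not regex:
        return {}
    pattern = re.compile(regex)
    # fullmatch approximates Java's String.matches, which anchors at both ends.
    return {key: value for key, value in hadoop_conf.items() if pattern.fullmatch(key)}


conf = {
    "dfs.nameservices": "ns1",
    "dfs.ha.namenodes.ns1": "nn1,nn2",
    "yarn.resourcemanager.address": "rm:8032",
}
captured = capture_token_conf(conf, r"dfs\.(nameservices|ha\.namenodes\..*)")
```

With these made-up keys, `captured` holds only the two `dfs.*` entries; a regex such as `dfs` alone would capture nothing, since it does not match any key in full.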
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #34264: [SPARK-36462][K8S] Add the ability to selectively disable watching or polling
dongjoon-hyun commented on a change in pull request #34264: URL: https://github.com/apache/spark/pull/34264#discussion_r758971094 ## File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsWatchSnapshotSource.scala ## @@ -22,24 +22,30 @@ import io.fabric8.kubernetes.api.model.Pod import io.fabric8.kubernetes.client.{KubernetesClient, Watcher, WatcherException} import io.fabric8.kubernetes.client.Watcher.Action +import org.apache.spark.SparkConf +import org.apache.spark.deploy.k8s.Config.KUBERNETES_EXECUTOR_ENABLE_API_WATCHER import org.apache.spark.deploy.k8s.Constants._ import org.apache.spark.internal.Logging import org.apache.spark.util.Utils private[spark] class ExecutorPodsWatchSnapshotSource( snapshotsStore: ExecutorPodsSnapshotsStore, -kubernetesClient: KubernetesClient) extends Logging { +kubernetesClient: KubernetesClient, +conf: SparkConf) extends Logging { Review comment: For this one, I made a PR to provide a better backward compatibility and to help the downstreams' ExternalClusterManager. Since these classes are unchanged since 2.4.0, I believe we can declare it `stable developer API` and maintain it more carefully. - https://github.com/apache/spark/pull/34751
[GitHub] [spark] SparkQA commented on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi
SparkQA commented on pull request #34751: URL: https://github.com/apache/spark/pull/34751#issuecomment-982329574 **[Test build #145749 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145749/testReport)** for PR 34751 at commit [`f845a10`](https://github.com/apache/spark/commit/f845a10f24f23a18ec230b3b8581e3bce866cc61). * This patch **fails to build**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class ExecutorPodsPollingSnapshotSource(` * `class ExecutorPodsWatchSnapshotSource(`
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi
AmplabJenkins removed a comment on pull request #34751: URL: https://github.com/apache/spark/pull/34751#issuecomment-982329599 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145749/
[GitHub] [spark] SparkQA removed a comment on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi
SparkQA removed a comment on pull request #34751: URL: https://github.com/apache/spark/pull/34751#issuecomment-982324903 **[Test build #145749 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145749/testReport)** for PR 34751 at commit [`f845a10`](https://github.com/apache/spark/commit/f845a10f24f23a18ec230b3b8581e3bce866cc61).
[GitHub] [spark] AmplabJenkins commented on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi
AmplabJenkins commented on pull request #34751: URL: https://github.com/apache/spark/pull/34751#issuecomment-982329599 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145749/
[GitHub] [spark] SparkQA commented on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi
SparkQA commented on pull request #34751: URL: https://github.com/apache/spark/pull/34751#issuecomment-982329931 **[Test build #145751 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145751/testReport)** for PR 34751 at commit [`8129ab1`](https://github.com/apache/spark/commit/8129ab1dd1e2e384caf5851a616e3b255953e9b0).
[GitHub] [spark] SparkQA commented on pull request #34635: [SPARK-37205][YARN] Introduce a new config 'spark.yarn.am.tokenConfRegex' to support renewing delegation tokens in a multi-cluster environmen
SparkQA commented on pull request #34635: URL: https://github.com/apache/spark/pull/34635#issuecomment-982330102 **[Test build #145752 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145752/testReport)** for PR 34635 at commit [`c69cb6e`](https://github.com/apache/spark/commit/c69cb6e9720e8e1f443aa2c52492c5639928435a).
[GitHub] [spark] HyukjinKwon commented on pull request #34746: [SPARK-37489][PYTHON] Skip hasnans check in numops if eager_check disable
HyukjinKwon commented on pull request #34746: URL: https://github.com/apache/spark/pull/34746#issuecomment-982330232 Merged to master.
[GitHub] [spark] HyukjinKwon commented on pull request #34732: [SPARK-37291][PYSPARK][FOLLOWUP] PySpark create SparkSession should pass initialSessionOptions
HyukjinKwon commented on pull request #34732: URL: https://github.com/apache/spark/pull/34732#issuecomment-982330696 Merged to master!
[GitHub] [spark] HyukjinKwon closed pull request #34746: [SPARK-37489][PYTHON] Skip hasnans check in numops if eager_check disable
HyukjinKwon closed pull request #34746: URL: https://github.com/apache/spark/pull/34746
[GitHub] [spark] HyukjinKwon closed pull request #34732: [SPARK-37291][PYSPARK][FOLLOWUP] PySpark create SparkSession should pass initialSessionOptions
HyukjinKwon closed pull request #34732: URL: https://github.com/apache/spark/pull/34732
[GitHub] [spark] sunchao commented on pull request #34656: [SPARK-37376][SQL] Introduce a new DataSource V2 interface HasPartitionKey
sunchao commented on pull request #34656: URL: https://github.com/apache/spark/pull/34656#issuecomment-982331198 @dongjoon-hyun done
[GitHub] [spark] SparkQA commented on pull request #34656: [SPARK-37376][SQL] Introduce a new DataSource V2 interface HasPartitionKey
SparkQA commented on pull request #34656: URL: https://github.com/apache/spark/pull/34656#issuecomment-982331806 **[Test build #145753 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145753/testReport)** for PR 34656 at commit [`6b920e1`](https://github.com/apache/spark/commit/6b920e1e3109d6f2c37150cb2ffd168790a35d3e).
[GitHub] [spark] dongjoon-hyun commented on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi
dongjoon-hyun commented on pull request #34751: URL: https://github.com/apache/spark/pull/34751#issuecomment-982332947 Thank you for review, @viirya .
[GitHub] [spark] sungpeo opened a new pull request #34752: [SPARK][STREAMING] minRatePerPartition should be multiplied with secsPerBatch
sungpeo opened a new pull request #34752: URL: https://github.com/apache/spark/pull/34752 ### What changes were proposed in this pull request? `maxRatePerPartition` means "max messages per partition per second", but `minRatePerPartition` does not: it is currently treated as a per-batch value ("messages per partition per batch"). `minRatePerPartition` should be multiplied by `secsPerBatch` so that both settings are expressed per second. ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested?
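The arithmetic behind this proposal can be sketched as follows. This is an illustration only, not Spark's implementation: `per_batch_limit` and its parameter names are hypothetical, and the numbers are made up. It shows how a per-second rate turns into a per-batch message count once it is multiplied by the batch length in seconds.

```python
def per_batch_limit(rate_per_partition_per_sec: int, batch_duration_ms: int) -> int:
    """Convert a per-second, per-partition rate into a per-batch message count.

    Hypothetical helper: it only demonstrates the multiplication by
    secsPerBatch that the PR description asks for.
    """
    secs_per_batch = batch_duration_ms / 1000.0
    return int(rate_per_partition_per_sec * secs_per_batch)


# A rate of 10 messages per second with 5-second batches gives 50 messages
# per partition per batch; without the multiplication the limit would stay
# at 10 regardless of the batch length.
limit_with_scaling = per_batch_limit(10, 5000)
```

Under this reading, the minimum and maximum rates scale the same way with the batch interval, which is the consistency the PR description argues for.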
[GitHub] [spark] dongjoon-hyun commented on pull request #34264: [SPARK-36462][K8S] Add the ability to selectively disable watching or polling
dongjoon-hyun commented on pull request #34264: URL: https://github.com/apache/spark/pull/34264#issuecomment-982333916 BTW, in general, I agree with your demands and requirements in this PR. The only concerns are - the better backward compatibility - the visibility of these configurations
[GitHub] [spark] dongjoon-hyun commented on pull request #34656: [SPARK-37376][SQL] Introduce a new DataSource V2 interface HasPartitionKey
dongjoon-hyun commented on pull request #34656: URL: https://github.com/apache/spark/pull/34656#issuecomment-982334173 Thank you, @sunchao !
[GitHub] [spark] HyukjinKwon commented on pull request #34213: [SPARK-36396][PYTHON] Implement DataFrame.cov
HyukjinKwon commented on pull request #34213: URL: https://github.com/apache/spark/pull/34213#issuecomment-982334604 Merged to master.
[GitHub] [spark] HyukjinKwon closed pull request #34213: [SPARK-36396][PYTHON] Implement DataFrame.cov
HyukjinKwon closed pull request #34213: URL: https://github.com/apache/spark/pull/34213
[GitHub] [spark] SparkQA commented on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi
SparkQA commented on pull request #34751: URL: https://github.com/apache/spark/pull/34751#issuecomment-982334949 **[Test build #145751 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145751/testReport)** for PR 34751 at commit [`8129ab1`](https://github.com/apache/spark/commit/8129ab1dd1e2e384caf5851a616e3b255953e9b0). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi
SparkQA commented on pull request #34751: URL: https://github.com/apache/spark/pull/34751#issuecomment-982335210 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50220/
[GitHub] [spark] dongjoon-hyun commented on pull request #34656: [SPARK-37376][SQL] Introduce a new DataSource V2 interface HasPartitionKey
dongjoon-hyun commented on pull request #34656: URL: https://github.com/apache/spark/pull/34656#issuecomment-982335508 Although it looks good to me, gentle ping once more, @cloud-fan @rdblue @viirya @huaxingao .
[GitHub] [spark] SparkQA commented on pull request #34750: [SPARK-37495][PYTHON] Skip identical index checking of Series.compare when config 'compute.eager_check' is disabled
SparkQA commented on pull request #34750: URL: https://github.com/apache/spark/pull/34750#issuecomment-982337997 **[Test build #145750 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145750/testReport)** for PR 34750 at commit [`2bbb84d`](https://github.com/apache/spark/commit/2bbb84d30157e9b855f81c5448c326b34b302937). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] HyukjinKwon commented on a change in pull request #34596: [SPARK-37326][SQL] Support TimestampNTZ in CSV data source
HyukjinKwon commented on a change in pull request #34596: URL: https://github.com/apache/spark/pull/34596#discussion_r758979898 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVInferSchema.scala ## @@ -160,6 +169,17 @@ class CSVInferSchema(val options: CSVOptions) extends Serializable { private def tryParseDouble(field: String): DataType = { if ((allCatch opt field.toDouble).isDefined || isInfOrNan(field)) { DoubleType +} else { + tryParseTimestampNTZ(field) +} + } + + private def tryParseTimestampNTZ(field: String): DataType = { +// We can only parse the value as TimestampNTZType if it does not have zone-offset or +// time-zone component and can be parsed with the timestamp formatter. +// Otherwise, it is likely to be a timestamp with timezone. +if ((allCatch opt timestampNTZFormatter.parseWithoutTimeZone(field, true)).isDefined) { Review comment: Should we maybe skip the parsing if `SQLConf.get.timestampType` is set to `TIMESTAMP_LTZ`, since parsing is a non-trivial op?
[GitHub] [spark] SparkQA commented on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi
SparkQA commented on pull request #34751: URL: https://github.com/apache/spark/pull/34751#issuecomment-982338326 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50222/
[GitHub] [spark] HyukjinKwon commented on a change in pull request #34596: [SPARK-37326][SQL] Support TimestampNTZ in CSV data source
HyukjinKwon commented on a change in pull request #34596: URL: https://github.com/apache/spark/pull/34596#discussion_r758980984 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVInferSchema.scala ## @@ -38,6 +39,13 @@ class CSVInferSchema(val options: CSVOptions) extends Serializable { legacyFormat = FAST_DATE_FORMAT, isParsing = true) + private val timestampNTZFormatter = TimestampFormatter( +options.timestampNTZFormatInRead, +options.zoneId, +legacyFormat = FAST_DATE_FORMAT, +isParsing = true, +forTimestampNTZ = true) Review comment: this part I'd defer to @MaxGekk to review.
[GitHub] [spark] SparkQA removed a comment on pull request #34750: [SPARK-37495][PYTHON] Skip identical index checking of Series.compare when config 'compute.eager_check' is disabled
SparkQA removed a comment on pull request #34750: URL: https://github.com/apache/spark/pull/34750#issuecomment-982324974 **[Test build #145750 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145750/testReport)** for PR 34750 at commit [`2bbb84d`](https://github.com/apache/spark/commit/2bbb84d30157e9b855f81c5448c326b34b302937).
[GitHub] [spark] SparkQA removed a comment on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi
SparkQA removed a comment on pull request #34751: URL: https://github.com/apache/spark/pull/34751#issuecomment-982329931 **[Test build #145751 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145751/testReport)** for PR 34751 at commit [`8129ab1`](https://github.com/apache/spark/commit/8129ab1dd1e2e384caf5851a616e3b255953e9b0).
[GitHub] [spark] Peng-Lei opened a new pull request #34753: [SPARK-37494][SQL] Unify v1 and v2 options output of `SHOW CREATE TABLE` command
Peng-Lei opened a new pull request #34753: URL: https://github.com/apache/spark/pull/34753 ### What changes were proposed in this pull request? 1. Change the v1 `SHOW CREATE TABLE` command so that its options output matches v2, e.g. `'key' = 'value'`. 2. Sort the options output. 3. Sort the properties output. ### Why are the changes needed? To match the v2 behavior, as discussed at [#comments](https://github.com/apache/spark/pull/34719#discussion_r758156350) ### Does this PR introduce _any_ user-facing change? Yes. In the `SHOW CREATE TABLE` output, properties and options are now sorted, and options are printed like `'key' = 'value'`. ### How was this patch tested? Added a test case.
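The output convention described in this PR (sorted entries, each rendered as `'key' = 'value'`) can be sketched as below. This is a minimal illustration, not Spark's code: `format_options` is a hypothetical helper, the exact line layout of the real `SHOW CREATE TABLE` output may differ, and quote escaping is deliberately left out.

```python
from typing import Dict


def format_options(options: Dict[str, str]) -> str:
    """Render options sorted by key, each as 'key' = 'value' (hypothetical helper).

    Assumption: keys and values contain no single quotes, so no escaping
    is performed here.
    """
    entries = [f"  '{key}' = '{value}'" for key, value in sorted(options.items())]
    return "OPTIONS (\n" + ",\n".join(entries) + ")"


# Insertion order is deliberately unsorted; the rendered text is sorted by key.
rendered = format_options({"via": "2", "to": "1", "from": "0"})
```

Sorting makes the generated DDL deterministic, so tests can compare it against a fixed expected string regardless of the order in which options were supplied.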
[GitHub] [spark] SparkQA commented on pull request #34635: [SPARK-37205][YARN] Introduce a new config 'spark.yarn.am.tokenConfRegex' to support renewing delegation tokens in a multi-cluster environmen
SparkQA commented on pull request #34635: URL: https://github.com/apache/spark/pull/34635#issuecomment-982342499 **[Test build #145752 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145752/testReport)** for PR 34635 at commit [`c69cb6e`](https://github.com/apache/spark/commit/c69cb6e9720e8e1f443aa2c52492c5639928435a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on pull request #34635: [SPARK-37205][YARN] Introduce a new config 'spark.yarn.am.tokenConfRegex' to support renewing delegation tokens in a multi-cluster en
SparkQA removed a comment on pull request #34635: URL: https://github.com/apache/spark/pull/34635#issuecomment-982330102 **[Test build #145752 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145752/testReport)** for PR 34635 at commit [`c69cb6e`](https://github.com/apache/spark/commit/c69cb6e9720e8e1f443aa2c52492c5639928435a).
[GitHub] [spark] SparkQA commented on pull request #34723: [SPARK-37492][SQL] Optimize Orc test code with withAllNativeOrcReaders
SparkQA commented on pull request #34723: URL: https://github.com/apache/spark/pull/34723#issuecomment-982344119 **[Test build #145738 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145738/testReport)** for PR 34723 at commit [`96bb4f5`](https://github.com/apache/spark/commit/96bb4f5a2c4e06d902062369bb26c05b47b4d8a5). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on pull request #34723: [SPARK-37492][SQL] Optimize Orc test code with withAllNativeOrcReaders
SparkQA removed a comment on pull request #34723: URL: https://github.com/apache/spark/pull/34723#issuecomment-982219652 **[Test build #145738 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145738/testReport)** for PR 34723 at commit [`96bb4f5`](https://github.com/apache/spark/commit/96bb4f5a2c4e06d902062369bb26c05b47b4d8a5).
[GitHub] [spark] huaxingao commented on pull request #34060: [SPARK-36850][SQL] Migrate CreateTableStatement to v2 command framework
huaxingao commented on pull request #34060: URL: https://github.com/apache/spark/pull/34060#issuecomment-982348104 Thanks!
[GitHub] [spark] SparkQA commented on pull request #34750: [SPARK-37495][PYTHON] Skip identical index checking of Series.compare when config 'compute.eager_check' is disabled
SparkQA commented on pull request #34750: URL: https://github.com/apache/spark/pull/34750#issuecomment-982348283 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50221/
[GitHub] [spark] MaxGekk commented on pull request #31847: [SPARK-34755][SQL] Support the utils for transform number format
MaxGekk commented on pull request #31847: URL: https://github.com/apache/spark/pull/31847#issuecomment-982348272 Since these functions are widely used in other systems, I believe it makes sense to support them in Spark, which would make migration to Spark easier. @beliefer Could you re-open this PR, please? @cloud-fan Do you agree?
[GitHub] [spark] SparkQA commented on pull request #34213: [SPARK-36396][PYTHON] Implement DataFrame.cov
SparkQA commented on pull request #34213: URL: https://github.com/apache/spark/pull/34213#issuecomment-982349531 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50219/
[GitHub] [spark] dchvn commented on pull request #34213: [SPARK-36396][PYTHON] Implement DataFrame.cov
dchvn commented on pull request #34213: URL: https://github.com/apache/spark/pull/34213#issuecomment-982350720 Thanks! @itholic @Yikun @HyukjinKwon
[GitHub] [spark] LuciferYang commented on a change in pull request #34745: [WIP][SPARK-37391][SQL] JdbcConnectionProvider must indicate if it needs lock
LuciferYang commented on a change in pull request #34745: URL: https://github.com/apache/spark/pull/34745#discussion_r758992879 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/ConnectionProviderSuite.scala ## @@ -68,12 +69,20 @@ class ConnectionProviderSuite override def canHandle(driver: Driver, options: Map[String, String]): Boolean = true override def getConnection(driver: Driver, options: Map[String, String]): Connection = throw new RuntimeException() + override def needsModifySecurityConfiguration( Review comment: Should we write the default method in `JdbcConnectionProvider` (returning false) and override it only where necessary?
[GitHub] [spark] summaryzb commented on a change in pull request #34749: [SPARK-37493][CORE] show driver's gc time and duration time in executors page
summaryzb commented on a change in pull request #34749: URL: https://github.com/apache/spark/pull/34749#discussion_r758993284 ## File path: core/src/main/scala/org/apache/spark/metrics/ExecutorMetricType.scala ## @@ -137,7 +138,9 @@ case object GarbageCollectionMetrics extends ExecutorMetricType with Logging { override private[spark] def getMetricValues(memoryManager: MemoryManager): Array[Long] = { val gcMetrics = new Array[Long](names.length) // minorCount, minorTime, majorCount, majorTime -ManagementFactory.getGarbageCollectorMXBeans.asScala.foreach { mxBean => +val mxBeans = ManagementFactory.getGarbageCollectorMXBeans.asScala +gcMetrics(4) = mxBeans.map(_.getCollectionTime).sum Review comment: In the common case they are the same, but with non-built-in collectors `gcMetrics(1) + gcMetrics(3)` is zero while the actual total GC time is not.
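The point of this review comment can be sketched with a small simulation. This is a hedged Python illustration, not Spark code: `GcBean`, `gc_metrics`, and the collector name sets are hypothetical stand-ins for the JVM's garbage collector MXBeans and Spark's built-in collector lists; only the slot layout (minorCount, minorTime, majorCount, majorTime, then a total in slot 4) follows the diff.

```python
from collections import namedtuple

# Hypothetical stand-in for a JVM GarbageCollectorMXBean.
GcBean = namedtuple("GcBean", ["name", "collection_count", "collection_time_ms"])

# Assumed sets of recognised collector names, for illustration only.
YOUNG_COLLECTORS = {"PS Scavenge", "G1 Young Generation"}
OLD_COLLECTORS = {"PS MarkSweep", "G1 Old Generation"}


def gc_metrics(beans):
    """Return [minorCount, minorTime, majorCount, majorTime, totalTime]."""
    metrics = [0, 0, 0, 0, 0]
    # The total is summed over every collector, recognised or not.
    metrics[4] = sum(b.collection_time_ms for b in beans)
    for b in beans:
        if b.name in YOUNG_COLLECTORS:
            metrics[0] = b.collection_count
            metrics[1] = b.collection_time_ms
        elif b.name in OLD_COLLECTORS:
            metrics[2] = b.collection_count
            metrics[3] = b.collection_time_ms
        # Unrecognised collectors contribute only to the total in slot 4.
    return metrics
```

With two recognised collectors, `metrics[1] + metrics[3]` equals `metrics[4]`; with a collector whose name is in neither set, the first two stay at zero while the total does not, which is why a separate total slot is useful.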
[GitHub] [spark] AmplabJenkins commented on pull request #34750: [SPARK-37495][PYTHON] Skip identical index checking of Series.compare when config 'compute.eager_check' is disabled
AmplabJenkins commented on pull request #34750: URL: https://github.com/apache/spark/pull/34750#issuecomment-982353320 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145750/
[GitHub] [spark] AmplabJenkins commented on pull request #34213: [SPARK-36396][PYTHON] Implement DataFrame.cov
AmplabJenkins commented on pull request #34213: URL: https://github.com/apache/spark/pull/34213#issuecomment-982353317 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50219/
[GitHub] [spark] AmplabJenkins commented on pull request #34723: [SPARK-37492][SQL] Optimize Orc test code with withAllNativeOrcReaders
AmplabJenkins commented on pull request #34723: URL: https://github.com/apache/spark/pull/34723#issuecomment-982353318 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145738/
[GitHub] [spark] AmplabJenkins commented on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi
AmplabJenkins commented on pull request #34751: URL: https://github.com/apache/spark/pull/34751#issuecomment-982353203
[GitHub] [spark] AmplabJenkins commented on pull request #34635: [SPARK-37205][YARN] Introduce a new config 'spark.yarn.am.tokenConfRegex' to support renewing delegation tokens in a multi-cluster envi
AmplabJenkins commented on pull request #34635: URL: https://github.com/apache/spark/pull/34635#issuecomment-982353319 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145752/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34213: [SPARK-36396][PYTHON] Implement DataFrame.cov
AmplabJenkins removed a comment on pull request #34213: URL: https://github.com/apache/spark/pull/34213#issuecomment-982353317 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50219/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi
AmplabJenkins removed a comment on pull request #34751: URL: https://github.com/apache/spark/pull/34751#issuecomment-982353202
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34723: [SPARK-37492][SQL] Optimize Orc test code with withAllNativeOrcReaders
AmplabJenkins removed a comment on pull request #34723: URL: https://github.com/apache/spark/pull/34723#issuecomment-982353318 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145738/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34635: [SPARK-37205][YARN] Introduce a new config 'spark.yarn.am.tokenConfRegex' to support renewing delegation tokens in a multi-clus
AmplabJenkins removed a comment on pull request #34635: URL: https://github.com/apache/spark/pull/34635#issuecomment-982353319 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145752/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34750: [SPARK-37495][PYTHON] Skip identical index checking of Series.compare when config 'compute.eager_check' is disabled
AmplabJenkins removed a comment on pull request #34750: URL: https://github.com/apache/spark/pull/34750#issuecomment-982353320 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145750/
[GitHub] [spark] SparkQA commented on pull request #34568: [SPARK-37287][SQL] Pull out dynamic partition and bucket sort from FileFormatWriter
SparkQA commented on pull request #34568: URL: https://github.com/apache/spark/pull/34568#issuecomment-982353646 **[Test build #145740 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145740/testReport)** for PR 34568 at commit [`7fddb62`](https://github.com/apache/spark/commit/7fddb62cc5d6f93b9525162fdf4bd4602903e248). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #34752: [SPARK][STREAMING] minRatePerPartition should be multiplied with secsPerBatch
AmplabJenkins commented on pull request #34752: URL: https://github.com/apache/spark/pull/34752#issuecomment-982353991 Can one of the admins verify this patch?
[GitHub] [spark] SparkQA removed a comment on pull request #34568: [SPARK-37287][SQL] Pull out dynamic partition and bucket sort from FileFormatWriter
SparkQA removed a comment on pull request #34568: URL: https://github.com/apache/spark/pull/34568#issuecomment-982226440 **[Test build #145740 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145740/testReport)** for PR 34568 at commit [`7fddb62`](https://github.com/apache/spark/commit/7fddb62cc5d6f93b9525162fdf4bd4602903e248).
[GitHub] [spark] SparkQA commented on pull request #34750: [SPARK-37495][PYTHON] Skip identical index checking of Series.compare when config 'compute.eager_check' is disabled
SparkQA commented on pull request #34750: URL: https://github.com/apache/spark/pull/34750#issuecomment-982354345 **[Test build #145756 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145756/testReport)** for PR 34750 at commit [`dabc3c5`](https://github.com/apache/spark/commit/dabc3c5d854b0e7f22eb0f65005e3bc4a3b83016).
[GitHub] [spark] SparkQA commented on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi
SparkQA commented on pull request #34751: URL: https://github.com/apache/spark/pull/34751#issuecomment-982354267 **[Test build #145755 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145755/testReport)** for PR 34751 at commit [`fb1d62a`](https://github.com/apache/spark/commit/fb1d62af938107a43c6140c06c1681117ba965dd).
[GitHub] [spark] SparkQA commented on pull request #34753: [SPARK-37494][SQL] Unify v1 and v2 options output of `SHOW CREATE TABLE` command
SparkQA commented on pull request #34753: URL: https://github.com/apache/spark/pull/34753#issuecomment-982354219 **[Test build #145754 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145754/testReport)** for PR 34753 at commit [`97e52cb`](https://github.com/apache/spark/commit/97e52cb1856fda7148fd05f553ec090a256527b5).
[GitHub] [spark] AmplabJenkins commented on pull request #34568: [SPARK-37287][SQL] Pull out dynamic partition and bucket sort from FileFormatWriter
AmplabJenkins commented on pull request #34568: URL: https://github.com/apache/spark/pull/34568#issuecomment-982354923 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145740/
[GitHub] [spark] SparkQA commented on pull request #34635: [SPARK-37205][YARN] Introduce a new config 'spark.yarn.am.tokenConfRegex' to support renewing delegation tokens in a multi-cluster environmen
SparkQA commented on pull request #34635: URL: https://github.com/apache/spark/pull/34635#issuecomment-982355152 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50224/
[GitHub] [spark] summaryzb commented on a change in pull request #34749: [SPARK-37493][CORE] show driver's gc time and duration time in executors page
summaryzb commented on a change in pull request #34749: URL: https://github.com/apache/spark/pull/34749#discussion_r758995760 ## File path: core/src/main/scala/org/apache/spark/status/AppStatusStore.scala ## @@ -89,7 +89,44 @@ private[spark] class AppStatusStore( } else { base } -filtered.asScala.map(_.info).filter(_.id != FALLBACK_BLOCK_MANAGER_ID.executorId).toSeq +filtered.asScala.map(_.info) + .filter(_.id != FALLBACK_BLOCK_MANAGER_ID.executorId) + .map(replaceExec).toSeq + } + + def replaceExec(origin: v1.ExecutorSummary): v1.ExecutorSummary = { Review comment: Make it private.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34568: [SPARK-37287][SQL] Pull out dynamic partition and bucket sort from FileFormatWriter
AmplabJenkins removed a comment on pull request #34568: URL: https://github.com/apache/spark/pull/34568#issuecomment-982354923 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145740/
[GitHub] [spark] SparkQA commented on pull request #34635: [SPARK-37205][YARN] Introduce a new config 'spark.yarn.am.tokenConfRegex' to support renewing delegation tokens in a multi-cluster environmen
SparkQA commented on pull request #34635: URL: https://github.com/apache/spark/pull/34635#issuecomment-982356662 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50223/
[GitHub] [spark] LuciferYang commented on a change in pull request #34743: [SPARK-37488][CORE] When `TaskLocation` is `HDFSCacheTaskLocation` or `HostTaskLocation`, check if executor is alive on the h
LuciferYang commented on a change in pull request #34743: URL: https://github.com/apache/spark/pull/34743#discussion_r758996701 ## File path: core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala ## @@ -291,6 +291,21 @@ class TaskSetManagerSuite assert(manager.resourceOffer("execA", "host1", ANY)._1.get.index === 0) } + test("skip unsatisfiable locality levels (the case TaskLocation is HostTaskLocation)") { Review comment: Maybe we should add `SPARK-37488` to the test name as a prefix.
[GitHub] [spark] SparkQA commented on pull request #34656: [SPARK-37376][SQL] Introduce a new DataSource V2 interface HasPartitionKey
SparkQA commented on pull request #34656: URL: https://github.com/apache/spark/pull/34656#issuecomment-982357268 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50225/
[GitHub] [spark] sadikovi commented on a change in pull request #34596: [SPARK-37326][SQL] Support TimestampNTZ in CSV data source
sadikovi commented on a change in pull request #34596: URL: https://github.com/apache/spark/pull/34596#discussion_r75908 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVInferSchema.scala ## @@ -160,6 +169,17 @@ class CSVInferSchema(val options: CSVOptions) extends Serializable { private def tryParseDouble(field: String): DataType = { if ((allCatch opt field.toDouble).isDefined || isInfOrNan(field)) { DoubleType +} else { + tryParseTimestampNTZ(field) +} + } + + private def tryParseTimestampNTZ(field: String): DataType = { +// We can only parse the value as TimestampNTZType if it does not have zone-offset or +// time-zone component and can be parsed with the timestamp formatter. +// Otherwise, it is likely to be a timestamp with timezone. +if ((allCatch opt timestampNTZFormatter.parseWithoutTimeZone(field, true)).isDefined) { Review comment: Could you elaborate a bit more? Thanks. My understanding was that the config indicated whether the output of parsing should be treated as TimestampNTZ or TimestampLTZ.
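The inference order described in the diff comments above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the Spark code: the two patterns, the helper `_parses`, the function `infer_timestamp_type`, and the fallback to a string type are all hypothetical; Spark's real formatters are configurable and the chain after the NTZ attempt is not shown in the quoted hunk.

```python
from datetime import datetime

# Assumed patterns for illustration; the real formatters are configurable.
NTZ_PATTERN = "%Y-%m-%dT%H:%M:%S"    # no zone offset or time-zone component
TZ_PATTERN = "%Y-%m-%dT%H:%M:%S%z"   # requires a zone offset


def _parses(field: str, pattern: str) -> bool:
    """Return True if the whole field matches the pattern."""
    try:
        datetime.strptime(field, pattern)
        return True
    except ValueError:
        return False


def infer_timestamp_type(field: str) -> str:
    # A value is treated as timestamp-without-time-zone only when it parses
    # with no zone component at all.
    if _parses(field, NTZ_PATTERN):
        return "TimestampNTZType"
    # Otherwise it may be a timestamp that carries a zone offset.
    if _parses(field, TZ_PATTERN):
        return "TimestampType"
    return "StringType"
```

In this sketch the decision depends only on the shape of the value, which is the behavior the quoted comment describes; how a session-level config should interact with it is the open question in the review.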
[GitHub] [spark] guiyanakuang commented on a change in pull request #34743: [SPARK-37488][CORE] When `TaskLocation` is `HDFSCacheTaskLocation` or `HostTaskLocation`, check if executor is alive on the
guiyanakuang commented on a change in pull request #34743: URL: https://github.com/apache/spark/pull/34743#discussion_r759001312 ## File path: core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala ## @@ -291,6 +291,21 @@ class TaskSetManagerSuite assert(manager.resourceOffer("execA", "host1", ANY)._1.get.index === 0) } + test("skip unsatisfiable locality levels (the case TaskLocation is HostTaskLocation)") { Review comment: Thanks for the reminder, I'll add it later
[GitHub] [spark] yangwwei commented on pull request #34672: [SPARK-37394][CORE] Skip registering with ESS if a customized shuffle manager is configured
yangwwei commented on pull request #34672: URL: https://github.com/apache/spark/pull/34672#issuecomment-982363102 @mridulm , @attilapiros , @tgravescs could you pls help to review the changes again? Per @attilapiros 's suggestion, I have added a method in the ShuffleManager trait and this is allowed to be overridden when needed. The default returns true, so there is no behavior change. I have also updated the "how this was tested" with more details about the tests I've done locally. Note, this is still an "incompatible" change to the 3rd party shuffle service implementations. Adding a method with a default implementation in a trait will require a re-compile of the RSS's server/client library. Thanks!
[GitHub] [spark] yangwwei edited a comment on pull request #34672: [SPARK-37394][CORE] Skip registering with ESS if a customized shuffle manager is configured
yangwwei edited a comment on pull request #34672: URL: https://github.com/apache/spark/pull/34672#issuecomment-982363102 @mridulm , @attilapiros , @tgravescs could you pls help to review the changes again? Per @attilapiros 's suggestion, I have added a method in the ShuffleManager trait and this is allowed to be overridden when needed. The default returns true, so there is no behavior change. I have also updated the "[How this was tested](https://github.com/apache/spark/pull/34672#issue-785084737)" with more details about the tests I've done locally. Note, this is still an "incompatible" change to the 3rd party shuffle service implementations. Adding a method with a default implementation in a trait will require a re-compile of the RSS's server/client library. Thanks!
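The design described in this comment (a base method that defaults to true and can be overridden by a custom shuffle manager) can be sketched roughly as follows. This is an illustration in Python, not the Scala change in the PR: the method name `needs_external_shuffle_service`, the class `RemoteShuffleManager`, and the helper `maybe_register_with_ess` are all hypothetical names chosen for the sketch.

```python
class ShuffleManager:
    """Stand-in for the base trait; the method name below is hypothetical."""

    def needs_external_shuffle_service(self) -> bool:
        # Default is True, so existing managers keep their current behavior.
        return True


class RemoteShuffleManager(ShuffleManager):
    """Hypothetical customized shuffle manager that opts out of ESS registration."""

    def needs_external_shuffle_service(self) -> bool:
        return False


def maybe_register_with_ess(manager: ShuffleManager, ess_enabled: bool) -> str:
    """Register only when ESS is enabled and the manager asks for it."""
    if ess_enabled and manager.needs_external_shuffle_service():
        return "registered"
    return "skipped"
```

The recompilation caveat raised in the comment is specific to compiled JVM code and has no counterpart in this Python sketch.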
[GitHub] [spark] SparkQA commented on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi
SparkQA commented on pull request #34751: URL: https://github.com/apache/spark/pull/34751#issuecomment-982364378 **[Test build #145755 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145755/testReport)** for PR 34751 at commit [`fb1d62a`](https://github.com/apache/spark/commit/fb1d62af938107a43c6140c06c1681117ba965dd). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi
SparkQA removed a comment on pull request #34751: URL: https://github.com/apache/spark/pull/34751#issuecomment-982354267 **[Test build #145755 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145755/testReport)** for PR 34751 at commit [`fb1d62a`](https://github.com/apache/spark/commit/fb1d62af938107a43c6140c06c1681117ba965dd).
[GitHub] [spark] beliefer opened a new pull request #31847: [SPARK-34755][SQL] Support the utils for transform number format
beliefer opened a new pull request #31847: URL: https://github.com/apache/spark/pull/31847

### What changes were proposed in this pull request?
The data type formatting functions `to_number` and `to_char` are very useful, and several mainstream databases support this syntax: **PostgreSQL**, **Oracle**, **Vertica**, **Redshift**, **DB2**, **Teradata**, **Snowflake**, **Exasol**, **Phoenix**, **Singlestore**, **Intersystems**.

The implementations differ considerably among `Postgresql`, `Oracle` and `Phoenix`. This PR follows the implementation of `to_number` in `Oracle`, which applies strict parameter verification, and the implementation of `to_number` in `Phoenix`, which uses BigDecimal.

This PR supports the following patterns for numeric formatting:

Pattern | Description
-- | --
9 | Value with the specified number of digits
0 | Value with leading zeros
. (period) | Decimal point
, (comma) | Group (thousand) separator
S | Sign anchored to number (uses locale)
$ | Value with a leading dollar sign
D | Decimal point (uses locale)
G | Group separator (uses locale)

### Why are the changes needed?
`to_number` and `to_char` are very useful for converting formatted currency strings to numbers.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Jenkins test
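To make the pattern table in this PR description concrete, here is a small Python sketch of a strict `to_number`-style parser covering only the `9`, `0`, `.`, `,` and `$` patterns. It is a simplification under stated assumptions, not the PR's implementation: each `9` or `0` is assumed to match exactly one digit, the locale-dependent `S`, `D` and `G` patterns are rejected, and Python's `Decimal` stands in for BigDecimal. The helper name `format_to_regex` is hypothetical.

```python
import re
from decimal import Decimal


def format_to_regex(fmt: str) -> "re.Pattern[str]":
    """Translate a format string into a regex (hypothetical helper)."""
    parts = []
    for ch in fmt:
        if ch in "90":
            parts.append(r"\d")          # assumption: exactly one digit each
        elif ch in ".,$":
            parts.append(re.escape(ch))  # literal separator or currency sign
        else:
            raise ValueError(f"unsupported pattern character: {ch!r}")
    return re.compile("".join(parts))


def to_number(text: str, fmt: str) -> Decimal:
    """Parse `text` strictly against `fmt` and return an exact decimal."""
    if not format_to_regex(fmt).fullmatch(text):
        raise ValueError(f"{text!r} does not match format {fmt!r}")
    # Drop grouping separators and the currency sign before conversion.
    return Decimal(text.replace(",", "").replace("$", ""))
```

For example, `to_number("$1,234.56", "$9,999.99")` returns `Decimal("1234.56")`, while an input whose shape differs from the format raises `ValueError`, which mirrors the strict verification the description attributes to Oracle.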
[GitHub] [spark] SparkQA commented on pull request #34750: [SPARK-37495][PYTHON] Skip identical index checking of Series.compare when config 'compute.eager_check' is disabled
SparkQA commented on pull request #34750: URL: https://github.com/apache/spark/pull/34750#issuecomment-982370952 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50221/