[GitHub] [spark] SparkQA commented on pull request #32377: [SPARK-35021][SQL] Group exception messages in connector/catalog

2021-04-28 Thread GitBox
SparkQA commented on pull request #32377: URL: https://github.com/apache/spark/pull/32377#issuecomment-828903816 **[Test build #138057 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138057/testReport)** for PR 32377 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #32377: [SPARK-35021][SQL] Group exception messages in connector/catalog

2021-04-28 Thread GitBox
AmplabJenkins commented on pull request #32377: URL: https://github.com/apache/spark/pull/32377#issuecomment-828903827 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138057/ -- This

[GitHub] [spark] yaooqinn commented on a change in pull request #31960: [SPARK-34786][SQL] Read Parquet unsigned int64 logical type that stored as signed int64 physical type to decimal(20, 0)

2021-04-28 Thread GitBox
yaooqinn commented on a change in pull request #31960: URL: https://github.com/apache/spark/pull/31960#discussion_r622694764 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala ## @@ -144,7 +141,7 @@ class

[GitHub] [spark] SparkQA commented on pull request #32303: [SPARK-34382][SQL] Support LATERAL subqueries

2021-04-28 Thread GitBox
SparkQA commented on pull request #32303: URL: https://github.com/apache/spark/pull/32303#issuecomment-828902507 **[Test build #138059 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138059/testReport)** for PR 32303 at commit

[GitHub] [spark] SparkQA commented on pull request #32350: [SPARK-35231][SQL] logical.Range override maxRowsPerPartition

2021-04-28 Thread GitBox
SparkQA commented on pull request #32350: URL: https://github.com/apache/spark/pull/32350#issuecomment-828902437 **[Test build #138058 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138058/testReport)** for PR 32350 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32383: [SPARK-35255][BUILD] Automated formatting for Scala Code for Blank Lines

2021-04-28 Thread GitBox
AmplabJenkins removed a comment on pull request #32383: URL: https://github.com/apache/spark/pull/32383#issuecomment-828902192 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42575/

[GitHub] [spark] SparkQA commented on pull request #32377: [SPARK-35021][SQL] Group exception messages in connector/catalog

2021-04-28 Thread GitBox
SparkQA commented on pull request #32377: URL: https://github.com/apache/spark/pull/32377#issuecomment-828902443 **[Test build #138057 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138057/testReport)** for PR 32377 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #32383: [SPARK-35255][BUILD] Automated formatting for Scala Code for Blank Lines

2021-04-28 Thread GitBox
AmplabJenkins commented on pull request #32383: URL: https://github.com/apache/spark/pull/32383#issuecomment-828902192 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42575/ --

[GitHub] [spark] Yikun commented on a change in pull request #32386: [SPARK-34887][PYTHON] Port Koalas dependencies into PySpark

2021-04-28 Thread GitBox
Yikun commented on a change in pull request #32386: URL: https://github.com/apache/spark/pull/32386#discussion_r622692651 ## File path: python/setup.py ## @@ -250,14 +257,22 @@ def run(self): license='http://www.apache.org/licenses/LICENSE-2.0', # Don't

[GitHub] [spark] SparkQA commented on pull request #32383: [SPARK-35255][BUILD] Automated formatting for Scala Code for Blank Lines

2021-04-28 Thread GitBox
SparkQA commented on pull request #32383: URL: https://github.com/apache/spark/pull/32383#issuecomment-828901432 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42575/ -- This is an automated message from the

[GitHub] [spark] zhengruifeng commented on a change in pull request #32350: [SPARK-35231][SQL] logical.Range override maxRowsPerPartition

2021-04-28 Thread GitBox
zhengruifeng commented on a change in pull request #32350: URL: https://github.com/apache/spark/pull/32350#discussion_r622693358 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala ## @@ -161,6 +162,7 @@ case class

[GitHub] [spark] cloud-fan commented on a change in pull request #32361: [SPARK-35240][SS] Use CheckpointFileManager for checkpoint file manipulation

2021-04-28 Thread GitBox
cloud-fan commented on a change in pull request #32361: URL: https://github.com/apache/spark/pull/32361#discussion_r622693159 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/CheckpointFileManager.scala ## @@ -83,6 +83,9 @@ trait

[GitHub] [spark] zhengruifeng commented on a change in pull request #32350: [SPARK-35231][SQL] logical.Range override maxRowsPerPartition

2021-04-28 Thread GitBox
zhengruifeng commented on a change in pull request #32350: URL: https://github.com/apache/spark/pull/32350#discussion_r622693068 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala ## @@ -69,6 +69,7 @@ case class

[GitHub] [spark] cloud-fan commented on a change in pull request #32361: [SPARK-35240][SS] Use CheckpointFileManager for checkpoint file manipulation

2021-04-28 Thread GitBox
cloud-fan commented on a change in pull request #32361: URL: https://github.com/apache/spark/pull/32361#discussion_r622692757 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/CheckpointFileManager.scala ## @@ -83,6 +83,9 @@ trait

[GitHub] [spark] SparkQA commented on pull request #32383: [SPARK-35255][BUILD] Automated formatting for Scala Code for Blank Lines

2021-04-28 Thread GitBox
SparkQA commented on pull request #32383: URL: https://github.com/apache/spark/pull/32383#issuecomment-828900138 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42575/ -- This is an automated message from the Apache

[GitHub] [spark] cloud-fan commented on a change in pull request #31960: [SPARK-34786][SQL] Read Parquet unsigned int64 logical type that stored as signed int64 physical type to decimal(20, 0)

2021-04-28 Thread GitBox
cloud-fan commented on a change in pull request #31960: URL: https://github.com/apache/spark/pull/31960#discussion_r622691560 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala ## @@ -144,7 +141,7 @@ class

[GitHub] [spark] zhengruifeng commented on pull request #32350: [SPARK-35231][SQL] logical.Range override maxRowsPerPartition

2021-04-28 Thread GitBox
zhengruifeng commented on pull request #32350: URL: https://github.com/apache/spark/pull/32350#issuecomment-828898461 To add a similar test in `CombiningLimitsSuite`, some additional changes are involved. I'm not sure whether to switch to a simple test like: ``` scala>

[GitHub] [spark] yaooqinn edited a comment on pull request #32373: [SPARK-35248] Spark shall load system class first in IsolatedClientLoader

2021-04-28 Thread GitBox
yaooqinn edited a comment on pull request #32373: URL: https://github.com/apache/spark/pull/32373#issuecomment-828889996 can you describe the jar loading order in the PR description both when userClassPathFirst and !userClassPathFirst? -- This is an automated message from the Apache Git

[GitHub] [spark] beliefer commented on pull request #32377: [SPARK-35021][SQL] Group exception messages in connector/catalog

2021-04-28 Thread GitBox
beliefer commented on pull request #32377: URL: https://github.com/apache/spark/pull/32377#issuecomment-828890355 retest this please -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32373: [SPARK-35248] Spark shall load system class first in IsolatedClientLoader

2021-04-28 Thread GitBox
AmplabJenkins removed a comment on pull request #32373: URL: https://github.com/apache/spark/pull/32373#issuecomment-828228167 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] beliefer commented on pull request #32367: [SPARK-35020][SQL] Group exception messages in catalyst/util

2021-04-28 Thread GitBox
beliefer commented on pull request #32367: URL: https://github.com/apache/spark/pull/32367#issuecomment-828890043 ping @allisonwang-db -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] yaooqinn commented on pull request #32373: [SPARK-35248] Spark shall load system class first in IsolatedClientLoader

2021-04-28 Thread GitBox
yaooqinn commented on pull request #32373: URL: https://github.com/apache/spark/pull/32373#issuecomment-828889996 can you describe the jar loading order in the PR description? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32383: [SPARK-35255][BUILD] Automated formatting for Scala Code for Blank Lines

2021-04-28 Thread GitBox
AmplabJenkins removed a comment on pull request #32383: URL: https://github.com/apache/spark/pull/32383#issuecomment-828458768 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [spark] SparkQA commented on pull request #32383: [SPARK-35255][BUILD] Automated formatting for Scala Code for Blank Lines

2021-04-28 Thread GitBox
SparkQA commented on pull request #32383: URL: https://github.com/apache/spark/pull/32383#issuecomment-828886548 **[Test build #138056 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138056/testReport)** for PR 32383 at commit

[GitHub] [spark] HyukjinKwon commented on pull request #32332: [SPARK-35211][PYTHON] verify inferred schema for _create_dataframe

2021-04-28 Thread GitBox
HyukjinKwon commented on pull request #32332: URL: https://github.com/apache/spark/pull/32332#issuecomment-828886196 BTW, can we avoid doing the verification twice (https://github.com/apache/spark/pull/32332#discussion_r619915537) too? -- This is an automated message from the Apache Git

[GitHub] [spark] HyukjinKwon commented on pull request #32332: [SPARK-35211][PYTHON] verify inferred schema for _create_dataframe

2021-04-28 Thread GitBox
HyukjinKwon commented on pull request #32332: URL: https://github.com/apache/spark/pull/32332#issuecomment-828886094 If users explicitly set `verifySchema`, then it should work. Without this PR, it doesn't work right. And then, we would have to document that `verifySchema` is turned

[GitHub] [spark] HyukjinKwon commented on pull request #32332: [SPARK-35211][PYTHON] verify inferred schema for _create_dataframe

2021-04-28 Thread GitBox
HyukjinKwon commented on pull request #32332: URL: https://github.com/apache/spark/pull/32332#issuecomment-828885872 Ah, I meant to disable it only when `schema` is `None`, meaning that we keep the same default behaviour. -- This is an automated message from the Apache Git Service. To

[GitHub] [spark] yaooqinn commented on pull request #32387: [SPARK-35244][SQL][FOLLOWUP] Add null check for the exception cause

2021-04-28 Thread GitBox
yaooqinn commented on pull request #32387: URL: https://github.com/apache/spark/pull/32387#issuecomment-828885351 late LGTM -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32373: [SPARK-35248] Spark shall load system class first in IsolatedClientLoader

2021-04-28 Thread GitBox
HyukjinKwon commented on a change in pull request #32373: URL: https://github.com/apache/spark/pull/32373#discussion_r622673134 ## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveUtils.scala ## @@ -398,7 +398,7 @@ private[spark] object HiveUtils extends

[GitHub] [spark] HyukjinKwon commented on pull request #31490: [SPARK-34365][AVRO] Add support for positional Catalyst-to-Avro schema matching

2021-04-28 Thread GitBox
HyukjinKwon commented on pull request #31490: URL: https://github.com/apache/spark/pull/31490#issuecomment-828880631 cc @gengliangwang too -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] HyukjinKwon commented on pull request #32383: [SPARK-35255][BUILD] Automated formatting for Scala Code for Blank Lines

2021-04-28 Thread GitBox
HyukjinKwon commented on pull request #32383: URL: https://github.com/apache/spark/pull/32383#issuecomment-828874903 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For

[GitHub] [spark] otterc commented on pull request #32385: [WIP][SPARK-18188][CORE] Add checksum for shuffle blocks

2021-04-28 Thread GitBox
otterc commented on pull request #32385: URL: https://github.com/apache/spark/pull/32385#issuecomment-828874582 Thanks for copying me. I will take a look at it in a few days. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32384: [SPARK-35257][TESTS] Speed up `HadoopVersionInfoSuite` with `SPARK_VERSIONS_SUITE_IVY_PATH`

2021-04-28 Thread GitBox
HyukjinKwon commented on a change in pull request #32384: URL: https://github.com/apache/spark/pull/32384#discussion_r622664398 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/client/HadoopVersionInfoSuite.scala ## @@ -35,39 +34,33 @@ class

[GitHub] [spark] HyukjinKwon commented on pull request #32388: [SPARK-35258][SHUFFLE][YARN] Add new metrics to ExternalShuffleService for better monitoring

2021-04-28 Thread GitBox
HyukjinKwon commented on pull request #32388: URL: https://github.com/apache/spark/pull/32388#issuecomment-828873465 cc @Ngone51 too FYI -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] HyukjinKwon commented on pull request #32389: [SPARK-35263] [TEST] Refactor ShuffleBlockFetcherIteratorSuite to reduce duplicated code

2021-04-28 Thread GitBox
HyukjinKwon commented on pull request #32389: URL: https://github.com/apache/spark/pull/32389#issuecomment-828873256 cc @Ngone51 @mridulm FYI -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32386: [SPARK-34887][PYTHON] Port Koalas dependencies into PySpark

2021-04-28 Thread GitBox
HyukjinKwon commented on a change in pull request #32386: URL: https://github.com/apache/spark/pull/32386#discussion_r622662894 ## File path: python/setup.py ## @@ -250,14 +257,22 @@ def run(self): license='http://www.apache.org/licenses/LICENSE-2.0', # Don't

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32386: [SPARK-34887][PYTHON] Port Koalas dependencies into PySpark

2021-04-28 Thread GitBox
HyukjinKwon commented on a change in pull request #32386: URL: https://github.com/apache/spark/pull/32386#discussion_r622659546 ## File path: python/setup.py ## @@ -250,14 +257,22 @@ def run(self): license='http://www.apache.org/licenses/LICENSE-2.0', # Don't

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32386: [SPARK-34887][PYTHON] Port Koalas dependencies into PySpark

2021-04-28 Thread GitBox
HyukjinKwon commented on a change in pull request #32386: URL: https://github.com/apache/spark/pull/32386#discussion_r622658774 ## File path: dev/requirements.txt ## @@ -6,3 +6,13 @@ pydata_sphinx_theme ipython nbsphinx numpydoc + +# dependencies in pandas-on-spark. Review

[GitHub] [spark] HyukjinKwon commented on pull request #32368: [SPARK-35176][PYTHON] Standardize input validation error type

2021-04-28 Thread GitBox
HyukjinKwon commented on pull request #32368: URL: https://github.com/apache/spark/pull/32368#issuecomment-828867074 Oh yeah, that's a very good point. @Yikun feel free to create a new page for Spark 3.1 -> 3.2 at

[GitHub] [spark] maropu commented on pull request #32387: [SPARK-35244][SQL][FOLLOWUP] Add null check for the exception cause

2021-04-28 Thread GitBox
maropu commented on pull request #32387: URL: https://github.com/apache/spark/pull/32387#issuecomment-828862944 Thank you. Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] maropu closed pull request #32387: [SPARK-35244][SQL][FOLLOWUP] Add null check for the exception cause

2021-04-28 Thread GitBox
maropu closed pull request #32387: URL: https://github.com/apache/spark/pull/32387 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please

[GitHub] [spark] zero323 commented on pull request #32368: [SPARK-35176][PYTHON] Standardize input validation error type

2021-04-28 Thread GitBox
zero323 commented on pull request #32368: URL: https://github.com/apache/spark/pull/32368#issuecomment-828862489 At first glance it looks good (I'll try to do a more thorough scan if I have access to a larger screen). We might have to document this as a change of behaviour though, as it might

[GitHub] [spark] github-actions[bot] closed pull request #29650: [SPARK-32801][SQL] Make InferFiltersFromConstraints take into account EqualNullSafe

2021-04-28 Thread GitBox
github-actions[bot] closed pull request #29650: URL: https://github.com/apache/spark/pull/29650 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this

[GitHub] [spark] allisonwang-db commented on a change in pull request #32303: [SPARK-34382][SQL] Support LATERAL subqueries

2021-04-28 Thread GitBox
allisonwang-db commented on a change in pull request #32303: URL: https://github.com/apache/spark/pull/32303#discussion_r622637714 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ## @@ -2234,6 +2260,76 @@ class Analyzer(override

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32389: [SPARK-35263] [TEST] Refactor ShuffleBlockFetcherIteratorSuite to reduce duplicated code

2021-04-28 Thread GitBox
AmplabJenkins removed a comment on pull request #32389: URL: https://github.com/apache/spark/pull/32389#issuecomment-828846565 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138055/

[GitHub] [spark] AmplabJenkins commented on pull request #32389: [SPARK-35263] [TEST] Refactor ShuffleBlockFetcherIteratorSuite to reduce duplicated code

2021-04-28 Thread GitBox
AmplabJenkins commented on pull request #32389: URL: https://github.com/apache/spark/pull/32389#issuecomment-828846565 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138055/ -- This

[GitHub] [spark] HeartSaVioR commented on pull request #31944: [SPARK-34854][SQL][SS] Expose source metrics via progress report and add Kafka use-case to report delay.

2021-04-28 Thread GitBox
HeartSaVioR commented on pull request #31944: URL: https://github.com/apache/spark/pull/31944#issuecomment-828842015 Yeah, I agree about the rationale and benefits of "adding public API on custom source metrics", though it'd be even better if we could talk with a real case which is not

[GitHub] [spark] yijiacui-db commented on a change in pull request #31944: [SPARK-34854][SQL][SS] Expose source metrics via progress report and add Kafka use-case to report delay.

2021-04-28 Thread GitBox
yijiacui-db commented on a change in pull request #31944: URL: https://github.com/apache/spark/pull/31944#discussion_r622628911 ## File path: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchStream.scala ## @@ -218,3 +226,35 @@

[GitHub] [spark] yijiacui-db commented on a change in pull request #31944: [SPARK-34854][SQL][SS] Expose source metrics via progress report and add Kafka use-case to report delay.

2021-04-28 Thread GitBox
yijiacui-db commented on a change in pull request #31944: URL: https://github.com/apache/spark/pull/31944#discussion_r622628591 ## File path: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchStream.scala ## @@ -133,6 +137,10 @@

[GitHub] [spark] SparkQA removed a comment on pull request #32389: [SPARK-35263] [TEST] Refactor ShuffleBlockFetcherIteratorSuite to reduce duplicated code

2021-04-28 Thread GitBox
SparkQA removed a comment on pull request #32389: URL: https://github.com/apache/spark/pull/32389#issuecomment-828770214 **[Test build #138055 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138055/testReport)** for PR 32389 at commit

[GitHub] [spark] SparkQA commented on pull request #32389: [SPARK-35263] [TEST] Refactor ShuffleBlockFetcherIteratorSuite to reduce duplicated code

2021-04-28 Thread GitBox
SparkQA commented on pull request #32389: URL: https://github.com/apache/spark/pull/32389#issuecomment-828838284 **[Test build #138055 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138055/testReport)** for PR 32389 at commit

[GitHub] [spark] yijiacui-db commented on pull request #31944: [SPARK-34854][SQL][SS] Expose source metrics via progress report and add Kafka use-case to report delay.

2021-04-28 Thread GitBox
yijiacui-db commented on pull request #31944: URL: https://github.com/apache/spark/pull/31944#issuecomment-828836467 > > > > I've tested it on a real cluster and it works fine. > > > > Just a question. How is this intended to be used for dynamic allocation? > > > > > > > > > Users can

[GitHub] [spark] karenfeng commented on pull request #32301: [SPARK-35194][SQL] Refactor nested column aliasing for readability

2021-04-28 Thread GitBox
karenfeng commented on pull request #32301: URL: https://github.com/apache/spark/pull/32301#issuecomment-828836408 @viirya, I refactored the code that you merged in #31966 to fix conflicts. Can you take a look? -- This is an automated message from the Apache Git Service. To respond to

[GitHub] [spark] SparkQA removed a comment on pull request #32387: [SPARK-35244][SQL][FOLLOWUP] Add null check for the exception cause

2021-04-28 Thread GitBox
SparkQA removed a comment on pull request #32387: URL: https://github.com/apache/spark/pull/32387#issuecomment-828685688 **[Test build #138052 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138052/testReport)** for PR 32387 at commit

[GitHub] [spark] yijiacui-db edited a comment on pull request #31944: [SPARK-34854][SQL][SS] Expose source metrics via progress report and add Kafka use-case to report delay.

2021-04-28 Thread GitBox
yijiacui-db edited a comment on pull request #31944: URL: https://github.com/apache/spark/pull/31944#issuecomment-828832285 > > > I've tested it on a real cluster and it works fine. > > > Just a question. How is this intended to be used for dynamic allocation? > > > > > > Users can

[GitHub] [spark] yijiacui-db edited a comment on pull request #31944: [SPARK-34854][SQL][SS] Expose source metrics via progress report and add Kafka use-case to report delay.

2021-04-28 Thread GitBox
yijiacui-db edited a comment on pull request #31944: URL: https://github.com/apache/spark/pull/31944#issuecomment-828832285 > > > I've tested it on a real cluster and it works fine. > > > Just a question. How is this intended to be used for dynamic allocation? > > > > > > Users can

[GitHub] [spark] yijiacui-db commented on pull request #31944: [SPARK-34854][SQL][SS] Expose source metrics via progress report and add Kafka use-case to report delay.

2021-04-28 Thread GitBox
yijiacui-db commented on pull request #31944: URL: https://github.com/apache/spark/pull/31944#issuecomment-828832285 > > > I've tested it on a real cluster and it works fine. > > > Just a question. How is this intended to be used for dynamic allocation? > > > > > > Users can implement

[GitHub] [spark] AmplabJenkins commented on pull request #32387: [SPARK-35244][SQL][FOLLOWUP] Add null check for the exception cause

2021-04-28 Thread GitBox
AmplabJenkins commented on pull request #32387: URL: https://github.com/apache/spark/pull/32387#issuecomment-828830813 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138052/ -- This

[GitHub] [spark] SparkQA commented on pull request #32387: [SPARK-35244][SQL][FOLLOWUP] Add null check for the exception cause

2021-04-28 Thread GitBox
SparkQA commented on pull request #32387: URL: https://github.com/apache/spark/pull/32387#issuecomment-828830305 **[Test build #138052 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138052/testReport)** for PR 32387 at commit

[GitHub] [spark] viirya commented on a change in pull request #32361: [SPARK-35240][SS] Use CheckpointFileManager for checkpoint file manipulation

2021-04-28 Thread GitBox
viirya commented on a change in pull request #32361: URL: https://github.com/apache/spark/pull/32361#discussion_r622590163 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/CheckpointFileManager.scala ## @@ -83,6 +83,9 @@ trait

[GitHub] [spark] HeartSaVioR commented on pull request #31944: [SPARK-34854][SQL][SS] Expose source metrics via progress report and add Kafka use-case to report delay.

2021-04-28 Thread GitBox
HeartSaVioR commented on pull request #31944: URL: https://github.com/apache/spark/pull/31944#issuecomment-828804207 > > > I've tested it on a real cluster and it works fine. > > > Just a question. How is this intended to be used for dynamic allocation? > > > > Users can implement this

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32389: [SPARK-35263] [TEST] Refactor ShuffleBlockFetcherIteratorSuite to reduce duplicated code

2021-04-28 Thread GitBox
AmplabJenkins removed a comment on pull request #32389: URL: https://github.com/apache/spark/pull/32389#issuecomment-828795103 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42574/

[GitHub] [spark] AmplabJenkins commented on pull request #32389: [SPARK-35263] [TEST] Refactor ShuffleBlockFetcherIteratorSuite to reduce duplicated code

2021-04-28 Thread GitBox
AmplabJenkins commented on pull request #32389: URL: https://github.com/apache/spark/pull/32389#issuecomment-828795103 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42574/ --

[GitHub] [spark] SparkQA commented on pull request #32389: [SPARK-35263] [TEST] Refactor ShuffleBlockFetcherIteratorSuite to reduce duplicated code

2021-04-28 Thread GitBox
SparkQA commented on pull request #32389: URL: https://github.com/apache/spark/pull/32389#issuecomment-828795066 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32301: [SPARK-35194][SQL] Refactor nested column aliasing for readability

2021-04-28 Thread GitBox
AmplabJenkins removed a comment on pull request #32301: URL: https://github.com/apache/spark/pull/32301#issuecomment-828794503 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138050/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32386: [SPARK-34887][PYTHON] Port Koalas dependencies into PySpark

2021-04-28 Thread GitBox
AmplabJenkins removed a comment on pull request #32386: URL: https://github.com/apache/spark/pull/32386#issuecomment-828794501 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138053/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32389: [SPARK-35263] [TEST] Refactor ShuffleBlockFetcherIteratorSuite to reduce duplicated code

2021-04-28 Thread GitBox
AmplabJenkins removed a comment on pull request #32389: URL: https://github.com/apache/spark/pull/32389#issuecomment-828794499 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42573/

[GitHub] [spark] AmplabJenkins commented on pull request #32301: [SPARK-35194][SQL] Refactor nested column aliasing for readability

2021-04-28 Thread GitBox
AmplabJenkins commented on pull request #32301: URL: https://github.com/apache/spark/pull/32301#issuecomment-828794503 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138050/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #32386: [SPARK-34887][PYTHON] Port Koalas dependencies into PySpark

2021-04-28 Thread GitBox
AmplabJenkins commented on pull request #32386: URL: https://github.com/apache/spark/pull/32386#issuecomment-828794501 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138053/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #32389: [SPARK-35263] [TEST] Refactor ShuffleBlockFetcherIteratorSuite to reduce duplicated code

2021-04-28 Thread GitBox
AmplabJenkins commented on pull request #32389: URL: https://github.com/apache/spark/pull/32389#issuecomment-828794499 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42573/ --

[GitHub] [spark] SparkQA commented on pull request #32389: [SPARK-35263] [TEST] Refactor ShuffleBlockFetcherIteratorSuite to reduce duplicated code

2021-04-28 Thread GitBox
SparkQA commented on pull request #32389: URL: https://github.com/apache/spark/pull/32389#issuecomment-828785421 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For

[GitHub] [spark] SparkQA removed a comment on pull request #32301: [SPARK-35194][SQL] Refactor nested column aliasing for readability

2021-04-28 Thread GitBox
SparkQA removed a comment on pull request #32301: URL: https://github.com/apache/spark/pull/32301#issuecomment-828608152 **[Test build #138050 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138050/testReport)** for PR 32301 at commit

[GitHub] [spark] SparkQA commented on pull request #32301: [SPARK-35194][SQL] Refactor nested column aliasing for readability

2021-04-28 Thread GitBox
SparkQA commented on pull request #32301: URL: https://github.com/apache/spark/pull/32301#issuecomment-828778971 **[Test build #138050 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138050/testReport)** for PR 32301 at commit

[GitHub] [spark] viirya commented on a change in pull request #32361: [SPARK-35240][SS] Use CheckpointFileManager for checkpoint file manipulation

2021-04-28 Thread GitBox
viirya commented on a change in pull request #32361: URL: https://github.com/apache/spark/pull/32361#discussion_r622544504 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/CheckpointFileManager.scala ## @@ -83,6 +83,9 @@ trait

[GitHub] [spark] SparkQA removed a comment on pull request #32386: [SPARK-34887][PYTHON] Port Koalas dependencies into PySpark

2021-04-28 Thread GitBox
SparkQA removed a comment on pull request #32386: URL: https://github.com/apache/spark/pull/32386#issuecomment-828685784 **[Test build #138053 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138053/testReport)** for PR 32386 at commit

[GitHub] [spark] SparkQA commented on pull request #32386: [SPARK-34887][PYTHON] Port Koalas dependencies into PySpark

2021-04-28 Thread GitBox
SparkQA commented on pull request #32386: URL: https://github.com/apache/spark/pull/32386#issuecomment-828774881 **[Test build #138053 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138053/testReport)** for PR 32386 at commit

[GitHub] [spark] SparkQA commented on pull request #32389: [SPARK-35263] [TEST] Refactor ShuffleBlockFetcherIteratorSuite to reduce duplicated code

2021-04-28 Thread GitBox
SparkQA commented on pull request #32389: URL: https://github.com/apache/spark/pull/32389#issuecomment-828770214 **[Test build #138055 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138055/testReport)** for PR 32389 at commit

[GitHub] [spark] mridulm commented on pull request #32385: [WIP][SPARK-18188][CORE] Add checksum for shuffle blocks

2021-04-28 Thread GitBox
mridulm commented on pull request #32385: URL: https://github.com/apache/spark/pull/32385#issuecomment-828764868 +CC @otterc This should be of interest given recent discussions. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins commented on pull request #32389: [SPARK-35263] [TEST] Refactor ShuffleBlockFetcherIteratorSuite to reduce duplicated code

2021-04-28 Thread GitBox
AmplabJenkins commented on pull request #32389: URL: https://github.com/apache/spark/pull/32389#issuecomment-828762528 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138054/ -- This

[GitHub] [spark] SparkQA removed a comment on pull request #32389: [SPARK-35263] [TEST] Refactor ShuffleBlockFetcherIteratorSuite to reduce duplicated code

2021-04-28 Thread GitBox
SparkQA removed a comment on pull request #32389: URL: https://github.com/apache/spark/pull/32389#issuecomment-828761011 **[Test build #138054 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138054/testReport)** for PR 32389 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32389: [SPARK-35263] [TEST] Refactor ShuffleBlockFetcherIteratorSuite to reduce duplicated code

2021-04-28 Thread GitBox
AmplabJenkins removed a comment on pull request #32389: URL: https://github.com/apache/spark/pull/32389#issuecomment-828762528 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138054/

[GitHub] [spark] SparkQA commented on pull request #32389: [SPARK-35263] [TEST] Refactor ShuffleBlockFetcherIteratorSuite to reduce duplicated code

2021-04-28 Thread GitBox
SparkQA commented on pull request #32389: URL: https://github.com/apache/spark/pull/32389#issuecomment-828762506 **[Test build #138054 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138054/testReport)** for PR 32389 at commit

[GitHub] [spark] SparkQA commented on pull request #32389: [SPARK-35263] [TEST] Refactor ShuffleBlockFetcherIteratorSuite to reduce duplicated code

2021-04-28 Thread GitBox
SparkQA commented on pull request #32389: URL: https://github.com/apache/spark/pull/32389#issuecomment-828761011 **[Test build #138054 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138054/testReport)** for PR 32389 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32388: [SPARK-35258][SHUFFLE][YARN] Add new metrics to ExternalShuffleService for better monitoring

2021-04-28 Thread GitBox
AmplabJenkins removed a comment on pull request #32388: URL: https://github.com/apache/spark/pull/32388#issuecomment-828759848 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138051/

[GitHub] [spark] AmplabJenkins commented on pull request #32388: [SPARK-35258][SHUFFLE][YARN] Add new metrics to ExternalShuffleService for better monitoring

2021-04-28 Thread GitBox
AmplabJenkins commented on pull request #32388: URL: https://github.com/apache/spark/pull/32388#issuecomment-828759848 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138051/ -- This

[GitHub] [spark] SparkQA removed a comment on pull request #32388: [SPARK-35258][SHUFFLE][YARN] Add new metrics to ExternalShuffleService for better monitoring

2021-04-28 Thread GitBox
SparkQA removed a comment on pull request #32388: URL: https://github.com/apache/spark/pull/32388#issuecomment-828685634 **[Test build #138051 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138051/testReport)** for PR 32388 at commit

[GitHub] [spark] SparkQA commented on pull request #32388: [SPARK-35258][SHUFFLE][YARN] Add new metrics to ExternalShuffleService for better monitoring

2021-04-28 Thread GitBox
SparkQA commented on pull request #32388: URL: https://github.com/apache/spark/pull/32388#issuecomment-828758963 **[Test build #138051 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138051/testReport)** for PR 32388 at commit

[GitHub] [spark] xkrogen opened a new pull request #32389: [SPARK-35263] [TEST] Refactor ShuffleBlockFetcherIteratorSuite to reduce duplicated code

2021-04-28 Thread GitBox
xkrogen opened a new pull request #32389: URL: https://github.com/apache/spark/pull/32389 ### What changes were proposed in this pull request? Introduce new shared methods to `ShuffleBlockFetcherIteratorSuite` to replace copy-pasted code. Use modern, Scala-like Mockito `Answer` syntax.

[GitHub] [spark] AmplabJenkins commented on pull request #32388: [SPARK-35258][SHUFFLE][YARN] Add new metrics to ExternalShuffleService for better monitoring

2021-04-28 Thread GitBox
AmplabJenkins commented on pull request #32388: URL: https://github.com/apache/spark/pull/32388#issuecomment-828723507 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42570/ --

[GitHub] [spark] AmplabJenkins commented on pull request #32266: [SPARK-35111][SQL] Support Cast string to year-month interval

2021-04-28 Thread GitBox
AmplabJenkins commented on pull request #32266: URL: https://github.com/apache/spark/pull/32266#issuecomment-828723505 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138047/ -- This

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32388: [SPARK-35258][SHUFFLE][YARN] Add new metrics to ExternalShuffleService for better monitoring

2021-04-28 Thread GitBox
AmplabJenkins removed a comment on pull request #32388: URL: https://github.com/apache/spark/pull/32388#issuecomment-828723507 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42570/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32387: [SPARK-35244][SQL][FOLLOWUP] Add null check for the exception cause

2021-04-28 Thread GitBox
AmplabJenkins removed a comment on pull request #32387: URL: https://github.com/apache/spark/pull/32387#issuecomment-828723504 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42571/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32266: [SPARK-35111][SQL] Support Cast string to year-month interval

2021-04-28 Thread GitBox
AmplabJenkins removed a comment on pull request #32266: URL: https://github.com/apache/spark/pull/32266#issuecomment-828567902 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32386: [SPARK-34887][PYTHON] Port Koalas dependencies into PySpark

2021-04-28 Thread GitBox
AmplabJenkins removed a comment on pull request #32386: URL: https://github.com/apache/spark/pull/32386#issuecomment-828723503 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42572/

[GitHub] [spark] AmplabJenkins commented on pull request #32386: [SPARK-34887][PYTHON] Port Koalas dependencies into PySpark

2021-04-28 Thread GitBox
AmplabJenkins commented on pull request #32386: URL: https://github.com/apache/spark/pull/32386#issuecomment-828723503 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42572/ --

[GitHub] [spark] AmplabJenkins commented on pull request #32387: [SPARK-35244][SQL][FOLLOWUP] Add null check for the exception cause

2021-04-28 Thread GitBox
AmplabJenkins commented on pull request #32387: URL: https://github.com/apache/spark/pull/32387#issuecomment-828723504 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42571/ --

[GitHub] [spark] SparkQA commented on pull request #32386: [SPARK-34887][PYTHON] Port Koalas dependencies into PySpark

2021-04-28 Thread GitBox
SparkQA commented on pull request #32386: URL: https://github.com/apache/spark/pull/32386#issuecomment-828720550 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42572/ -- This is an automated message from the

[GitHub] [spark] SparkQA commented on pull request #32387: [SPARK-35244][SQL][FOLLOWUP] Add null check for the exception cause

2021-04-28 Thread GitBox
SparkQA commented on pull request #32387: URL: https://github.com/apache/spark/pull/32387#issuecomment-828719114 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42571/ -- This is an automated message from the

[GitHub] [spark] SparkQA commented on pull request #32386: [SPARK-34887][PYTHON] Port Koalas dependencies into PySpark

2021-04-28 Thread GitBox
SparkQA commented on pull request #32386: URL: https://github.com/apache/spark/pull/32386#issuecomment-828717957 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42572/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #32387: [SPARK-35244][SQL][FOLLOWUP] Add null check for the exception cause

2021-04-28 Thread GitBox
SparkQA commented on pull request #32387: URL: https://github.com/apache/spark/pull/32387#issuecomment-828716535 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42571/ -- This is an automated message from the Apache
