[GitHub] [spark] SparkQA removed a comment on pull request #32266: [SPARK-35111][SQL] Support Cast string to year-month interval

2021-04-28 Thread GitBox
SparkQA removed a comment on pull request #32266: URL: https://github.com/apache/spark/pull/32266#issuecomment-828507671 **[Test build #138047 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138047/testReport)** for PR 32266 at commit

[GitHub] [spark] SparkQA commented on pull request #32266: [SPARK-35111][SQL] Support Cast string to year-month interval

2021-04-28 Thread GitBox
SparkQA commented on pull request #32266: URL: https://github.com/apache/spark/pull/32266#issuecomment-828715750 **[Test build #138047 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138047/testReport)** for PR 32266 at commit

[GitHub] [spark] SparkQA commented on pull request #32388: [SPARK-35258][SHUFFLE][YARN] Add new metrics to ExternalShuffleService for better monitoring

2021-04-28 Thread GitBox
SparkQA commented on pull request #32388: URL: https://github.com/apache/spark/pull/32388#issuecomment-828714199 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For

[GitHub] [spark] viirya commented on pull request #31944: [SPARK-34854][SQL][SS] Expose source metrics via progress report and add Kafka use-case to report delay.

2021-04-28 Thread GitBox
viirya commented on pull request #31944: URL: https://github.com/apache/spark/pull/31944#issuecomment-828714048 > > I've tested it on a real cluster and it works fine. > > Just a question. How is this intended to be used for dynamic allocation? > > Users can implement this interface in

[GitHub] [spark] garawalid commented on a change in pull request #32386: [SPARK-34887][PYTHON] Port Koalas dependencies into PySpark

2021-04-28 Thread GitBox
garawalid commented on a change in pull request #32386: URL: https://github.com/apache/spark/pull/32386#discussion_r622457501 ## File path: dev/requirements.txt ## @@ -6,3 +6,13 @@ pydata_sphinx_theme ipython nbsphinx numpydoc + +# dependencies in pandas-on-spark. Review

[GitHub] [spark] andygrove commented on a change in pull request #32195: [SPARK-35093] [SQL] AQE now respects supportsColumnar when attempting to reuse exchanges

2021-04-28 Thread GitBox
andygrove commented on a change in pull request #32195: URL: https://github.com/apache/spark/pull/32195#discussion_r622450769 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala ## @@ -431,7 +431,8 @@ case class

[GitHub] [spark] sunchao commented on a change in pull request #32082: [SPARK-34981][SQL] Implement V2 function resolution and evaluation

2021-04-28 Thread GitBox
sunchao commented on a change in pull request #32082: URL: https://github.com/apache/spark/pull/32082#discussion_r622441347 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/functions/ScalarFunction.java ## @@ -23,17 +23,67 @@ /** * Interface

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32271: [SPARK-35112][SQL] Support Cast string to day-second interval

2021-04-28 Thread GitBox
AmplabJenkins removed a comment on pull request #32271: URL: https://github.com/apache/spark/pull/32271#issuecomment-828686325 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138046/

[GitHub] [spark] sunchao commented on pull request #32082: [SPARK-34981][SQL] Implement V2 function resolution and evaluation

2021-04-28 Thread GitBox
sunchao commented on pull request #32082: URL: https://github.com/apache/spark/pull/32082#issuecomment-828686966 Thanks all! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] xkrogen commented on pull request #32388: [SPARK-35258][SHUFFLE][YARN] Add new metrics to ExternalShuffleService for better monitoring

2021-04-28 Thread GitBox
xkrogen commented on pull request #32388: URL: https://github.com/apache/spark/pull/32388#issuecomment-828686364 cc @tgravescs @mridulm @dongjoon-hyun @gatorsmile -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [spark] AmplabJenkins commented on pull request #32271: [SPARK-35112][SQL] Support Cast string to day-second interval

2021-04-28 Thread GitBox
AmplabJenkins commented on pull request #32271: URL: https://github.com/apache/spark/pull/32271#issuecomment-828686325 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138046/ -- This

[GitHub] [spark] SparkQA commented on pull request #32388: [SPARK-35258][SHUFFLE][YARN] Add new metrics to ExternalShuffleService for better monitoring

2021-04-28 Thread GitBox
SparkQA commented on pull request #32388: URL: https://github.com/apache/spark/pull/32388#issuecomment-828685634 **[Test build #138051 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138051/testReport)** for PR 32388 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #32271: [SPARK-35112][SQL] Support Cast string to day-second interval

2021-04-28 Thread GitBox
SparkQA removed a comment on pull request #32271: URL: https://github.com/apache/spark/pull/32271#issuecomment-828507501 **[Test build #138046 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138046/testReport)** for PR 32271 at commit

[GitHub] [spark] SparkQA commented on pull request #32387: [SPARK-35244][SQL][FOLLOWUP] Add null check for the exception cause

2021-04-28 Thread GitBox
SparkQA commented on pull request #32387: URL: https://github.com/apache/spark/pull/32387#issuecomment-828685688 **[Test build #138052 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138052/testReport)** for PR 32387 at commit

[GitHub] [spark] SparkQA commented on pull request #32386: [SPARK-34887][PYTHON] Port Koalas dependencies into PySpark

2021-04-28 Thread GitBox
SparkQA commented on pull request #32386: URL: https://github.com/apache/spark/pull/32386#issuecomment-828685784 **[Test build #138053 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138053/testReport)** for PR 32386 at commit

[GitHub] [spark] SparkQA commented on pull request #32271: [SPARK-35112][SQL] Support Cast string to day-second interval

2021-04-28 Thread GitBox
SparkQA commented on pull request #32271: URL: https://github.com/apache/spark/pull/32271#issuecomment-828685144 **[Test build #138046 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138046/testReport)** for PR 32271 at commit

[GitHub] [spark] xkrogen opened a new pull request #32388: [SPARK-35258][SHUFFLE][YARN] Add new metrics to ExternalShuffleService for better monitoring

2021-04-28 Thread GitBox
xkrogen opened a new pull request #32388: URL: https://github.com/apache/spark/pull/32388 ### What changes were proposed in this pull request? This adds two new metrics to `ExternalBlockHandler`: - `blockTransferRate` -- for indicating the rate of transferring blocks, vs.

[GitHub] [spark] ueshin commented on a change in pull request #32386: [SPARK-34887][PYTHON] Port Koalas dependencies into PySpark

2021-04-28 Thread GitBox
ueshin commented on a change in pull request #32386: URL: https://github.com/apache/spark/pull/32386#discussion_r622435206 ## File path: python/setup.py ## @@ -250,14 +257,22 @@ def run(self): license='http://www.apache.org/licenses/LICENSE-2.0', # Don't

[GitHub] [spark] cloud-fan commented on pull request #32387: [SPARK-35244][SQL][FOLLOWUP] Add null check for the exception cause

2021-04-28 Thread GitBox
cloud-fan commented on pull request #32387: URL: https://github.com/apache/spark/pull/32387#issuecomment-828681962 cc @maropu @yaooqinn -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] cloud-fan opened a new pull request #32387: [SPARK-35244][SQL][FOLLOWUP] Add null check for the exception cause

2021-04-28 Thread GitBox
cloud-fan opened a new pull request #32387: URL: https://github.com/apache/spark/pull/32387 ### What changes were proposed in this pull request? Make sure we re-throw an exception that is not null. ### Why are the changes needed? to be super safe ### Does

[GitHub] [spark] allisonwang-db commented on a change in pull request #32303: [SPARK-34382][SQL] Support LATERAL subqueries

2021-04-28 Thread GitBox
allisonwang-db commented on a change in pull request #32303: URL: https://github.com/apache/spark/pull/32303#discussion_r622435322 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ## @@ -2234,6 +2260,76 @@ class Analyzer(override

[GitHub] [spark] cloud-fan commented on a change in pull request #32370: [SPARK-35244][SQL] Invoke should throw the original exception

2021-04-28 Thread GitBox
cloud-fan commented on a change in pull request #32370: URL: https://github.com/apache/spark/pull/32370#discussion_r622433479 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ## @@ -129,7 +129,12 @@ trait InvokeLike

[GitHub] [spark] otterc commented on a change in pull request #32007: [SPARK-33350][SHUFFLE] Add support to DiskBlockManager to create merge directory and to get the local shuffle merged data

2021-04-28 Thread GitBox
otterc commented on a change in pull request #32007: URL: https://github.com/apache/spark/pull/32007#discussion_r622429509 ## File path: core/src/main/scala/org/apache/spark/storage/BlockId.scala ## @@ -87,6 +87,29 @@ case class ShufflePushBlockId(shuffleId: Int, mapIndex:

[GitHub] [spark] xinrong-databricks commented on pull request #32386: [SPARK-34887][PYTHON] Port Koalas dependencies into PySpark

2021-04-28 Thread GitBox
xinrong-databricks commented on pull request #32386: URL: https://github.com/apache/spark/pull/32386#issuecomment-828670013 FYI @HyukjinKwon @ueshin @itholic -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] xinrong-databricks opened a new pull request #32386: [SPARK-34887][PYTHON] Port Koalas dependencies into PySpark

2021-04-28 Thread GitBox
xinrong-databricks opened a new pull request #32386: URL: https://github.com/apache/spark/pull/32386 ### What changes were proposed in this pull request? Port Koalas dependencies appropriately to PySpark dependencies. ### Why are the changes needed?

[GitHub] [spark] xkrogen commented on pull request #31490: [SPARK-34365][AVRO] Add support for positional Catalyst-to-Avro schema matching

2021-04-28 Thread GitBox
xkrogen commented on pull request #31490: URL: https://github.com/apache/spark/pull/31490#issuecomment-828664115 @mridulm do you have any interest in helping to review this? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32384: [SPARK-35257][TESTS] Speed up `HadoopVersionInfoSuite` with `SPARK_VERSIONS_SUITE_IVY_PATH`

2021-04-28 Thread GitBox
AmplabJenkins removed a comment on pull request #32384: URL: https://github.com/apache/spark/pull/32384#issuecomment-828642680 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138048/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32301: [SPARK-35194][SQL] Refactor nested column aliasing for readability

2021-04-28 Thread GitBox
AmplabJenkins removed a comment on pull request #32301: URL: https://github.com/apache/spark/pull/32301#issuecomment-828642681 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42569/

[GitHub] [spark] AmplabJenkins commented on pull request #32301: [SPARK-35194][SQL] Refactor nested column aliasing for readability

2021-04-28 Thread GitBox
AmplabJenkins commented on pull request #32301: URL: https://github.com/apache/spark/pull/32301#issuecomment-828642681 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42569/ --

[GitHub] [spark] AmplabJenkins commented on pull request #32384: [SPARK-35257][TESTS] Speed up `HadoopVersionInfoSuite` with `SPARK_VERSIONS_SUITE_IVY_PATH`

2021-04-28 Thread GitBox
AmplabJenkins commented on pull request #32384: URL: https://github.com/apache/spark/pull/32384#issuecomment-828642680 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138048/ -- This

[GitHub] [spark] SparkQA removed a comment on pull request #32384: [SPARK-35257][TESTS] Speed up `HadoopVersionInfoSuite` with `SPARK_VERSIONS_SUITE_IVY_PATH`

2021-04-28 Thread GitBox
SparkQA removed a comment on pull request #32384: URL: https://github.com/apache/spark/pull/32384#issuecomment-828553933 **[Test build #138048 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138048/testReport)** for PR 32384 at commit

[GitHub] [spark] SparkQA commented on pull request #32384: [SPARK-35257][TESTS] Speed up `HadoopVersionInfoSuite` with `SPARK_VERSIONS_SUITE_IVY_PATH`

2021-04-28 Thread GitBox
SparkQA commented on pull request #32384: URL: https://github.com/apache/spark/pull/32384#issuecomment-828641025 **[Test build #138048 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138048/testReport)** for PR 32384 at commit

[GitHub] [spark] SparkQA commented on pull request #32301: [SPARK-35194][SQL] Refactor nested column aliasing for readability

2021-04-28 Thread GitBox
SparkQA commented on pull request #32301: URL: https://github.com/apache/spark/pull/32301#issuecomment-828639757 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For

[GitHub] [spark] cloud-fan closed pull request #32082: [SPARK-34981][SQL] Implement V2 function resolution and evaluation

2021-04-28 Thread GitBox
cloud-fan closed pull request #32082: URL: https://github.com/apache/spark/pull/32082 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service,

[GitHub] [spark] cloud-fan commented on pull request #32082: [SPARK-34981][SQL] Implement V2 function resolution and evaluation

2021-04-28 Thread GitBox
cloud-fan commented on pull request #32082: URL: https://github.com/apache/spark/pull/32082#issuecomment-828635127 thanks, merging to master! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] cloud-fan commented on a change in pull request #32082: [SPARK-34981][SQL] Implement V2 function resolution and evaluation

2021-04-28 Thread GitBox
cloud-fan commented on a change in pull request #32082: URL: https://github.com/apache/spark/pull/32082#discussion_r622387401 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/functions/ScalarFunction.java ## @@ -23,17 +23,67 @@ /** *

[GitHub] [spark] venkata91 commented on pull request #30691: [SPARK-32920][SHUFFLE] Finalization of Shuffle push/merge with Push based shuffle and preparation step for the reduce stage

2021-04-28 Thread GitBox
venkata91 commented on pull request #30691: URL: https://github.com/apache/spark/pull/30691#issuecomment-828634232 Fixed the PR rebasing the latest master changes and also marked it as open for review. cc @mridulm @Victsm @otterc @Ngone51 -- This is an automated message from the Apache

[GitHub] [spark] viirya commented on a change in pull request #31944: [SPARK-34854][SQL][SS] Expose source metrics via progress report and add Kafka use-case to report delay.

2021-04-28 Thread GitBox
viirya commented on a change in pull request #31944: URL: https://github.com/apache/spark/pull/31944#discussion_r622381019 ## File path: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchStream.scala ## @@ -218,3 +226,35 @@ private[kafka010]

[GitHub] [spark] viirya commented on a change in pull request #31944: [SPARK-34854][SQL][SS] Expose source metrics via progress report and add Kafka use-case to report delay.

2021-04-28 Thread GitBox
viirya commented on a change in pull request #31944: URL: https://github.com/apache/spark/pull/31944#discussion_r622376735 ## File path: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchStream.scala ## @@ -133,6 +137,10 @@ private[kafka010]

[GitHub] [spark] viirya commented on a change in pull request #31944: [SPARK-34854][SQL][SS] Expose source metrics via progress report and add Kafka use-case to report delay.

2021-04-28 Thread GitBox
viirya commented on a change in pull request #31944: URL: https://github.com/apache/spark/pull/31944#discussion_r622374321 ## File path: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchStream.scala ## @@ -133,6 +137,10 @@ private[kafka010]

[GitHub] [spark] allisonwang-db commented on a change in pull request #32303: [SPARK-34382][SQL] Support LATERAL subqueries

2021-04-28 Thread GitBox
allisonwang-db commented on a change in pull request #32303: URL: https://github.com/apache/spark/pull/32303#discussion_r622372553 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala ## @@ -1037,6 +1057,10 @@ trait

[GitHub] [spark] viirya commented on a change in pull request #31944: [SPARK-34854][SQL][SS] Expose source metrics via progress report and add Kafka use-case to report delay.

2021-04-28 Thread GitBox
viirya commented on a change in pull request #31944: URL: https://github.com/apache/spark/pull/31944#discussion_r622372065 ## File path: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchStream.scala ## @@ -218,3 +226,35 @@ private[kafka010]

[GitHub] [spark] sunchao commented on a change in pull request #32082: [SPARK-34981][SQL] Implement V2 function resolution and evaluation

2021-04-28 Thread GitBox
sunchao commented on a change in pull request #32082: URL: https://github.com/apache/spark/pull/32082#discussion_r622367803 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/functions/ScalarFunction.java ## @@ -23,17 +23,67 @@ /** * Interface

[GitHub] [spark] SparkQA commented on pull request #32301: [SPARK-35194][SQL] Refactor nested column aliasing for readability

2021-04-28 Thread GitBox
SparkQA commented on pull request #32301: URL: https://github.com/apache/spark/pull/32301#issuecomment-828608152 **[Test build #138050 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138050/testReport)** for PR 32301 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32385: [WIP][SPARK-18188][CORE] Add checksum for shuffle blocks

2021-04-28 Thread GitBox
AmplabJenkins removed a comment on pull request #32385: URL: https://github.com/apache/spark/pull/32385#issuecomment-828606569 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42568/

[GitHub] [spark] advancedxy commented on pull request #32210: [SPARK-32634][SQL] Introduce sort-based fallback for shuffled hash join (non-code-gen path)

2021-04-28 Thread GitBox
advancedxy commented on pull request #32210: URL: https://github.com/apache/spark/pull/32210#issuecomment-828606822 > > Per my knowledge I don't know any obviously efficient way to do random lookup join with spilled hash map. > > How do we minimize random disk read for spilled map?

[GitHub] [spark] SparkQA commented on pull request #32385: [WIP][SPARK-18188][CORE] Add checksum for shuffle blocks

2021-04-28 Thread GitBox
SparkQA commented on pull request #32385: URL: https://github.com/apache/spark/pull/32385#issuecomment-828606525 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For

[GitHub] [spark] AmplabJenkins commented on pull request #32385: [WIP][SPARK-18188][CORE] Add checksum for shuffle blocks

2021-04-28 Thread GitBox
AmplabJenkins commented on pull request #32385: URL: https://github.com/apache/spark/pull/32385#issuecomment-828606569 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42568/ --

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32384: [SPARK-35257][TESTS] Speed up `HadoopVersionInfoSuite` with `SPARK_VERSIONS_SUITE_IVY_PATH`

2021-04-28 Thread GitBox
AmplabJenkins removed a comment on pull request #32384: URL: https://github.com/apache/spark/pull/32384#issuecomment-828598848 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42567/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32385: [WIP][SPARK-18188][CORE] Add checksum for shuffle blocks

2021-04-28 Thread GitBox
AmplabJenkins removed a comment on pull request #32385: URL: https://github.com/apache/spark/pull/32385#issuecomment-828598847 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138049/

[GitHub] [spark] AmplabJenkins commented on pull request #32385: [WIP][SPARK-18188][CORE] Add checksum for shuffle blocks

2021-04-28 Thread GitBox
AmplabJenkins commented on pull request #32385: URL: https://github.com/apache/spark/pull/32385#issuecomment-828598847 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138049/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #32384: [SPARK-35257][TESTS] Speed up `HadoopVersionInfoSuite` with `SPARK_VERSIONS_SUITE_IVY_PATH`

2021-04-28 Thread GitBox
AmplabJenkins commented on pull request #32384: URL: https://github.com/apache/spark/pull/32384#issuecomment-828598848 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42567/ --

[GitHub] [spark] SparkQA commented on pull request #32384: [SPARK-35257][TESTS] Speed up `HadoopVersionInfoSuite` with `SPARK_VERSIONS_SUITE_IVY_PATH`

2021-04-28 Thread GitBox
SparkQA commented on pull request #32384: URL: https://github.com/apache/spark/pull/32384#issuecomment-828593683 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42567/ -- This is an automated message from the

[GitHub] [spark] yaooqinn commented on a change in pull request #32370: [SPARK-35244][SQL] Invoke should throw the original exception

2021-04-28 Thread GitBox
yaooqinn commented on a change in pull request #32370: URL: https://github.com/apache/spark/pull/32370#discussion_r622343349 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ## @@ -129,7 +129,12 @@ trait InvokeLike

[GitHub] [spark] SparkQA commented on pull request #32384: [SPARK-35257][TESTS] Speed up `HadoopVersionInfoSuite` with `SPARK_VERSIONS_SUITE_IVY_PATH`

2021-04-28 Thread GitBox
SparkQA commented on pull request #32384: URL: https://github.com/apache/spark/pull/32384#issuecomment-828590228 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42567/ -- This is an automated message from the Apache

[GitHub] [spark] cloud-fan commented on a change in pull request #32266: [SPARK-35111][SQL] Support Cast string to year-month interval

2021-04-28 Thread GitBox
cloud-fan commented on a change in pull request #32266: URL: https://github.com/apache/spark/pull/32266#discussion_r622336292 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala ## @@ -92,6 +93,31 @@ object IntervalUtils { }

[GitHub] [spark] cloud-fan commented on a change in pull request #32266: [SPARK-35111][SQL] Support Cast string to year-month interval

2021-04-28 Thread GitBox
cloud-fan commented on a change in pull request #32266: URL: https://github.com/apache/spark/pull/32266#discussion_r622334501 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala ## @@ -92,6 +93,31 @@ object IntervalUtils { }

[GitHub] [spark] cloud-fan commented on a change in pull request #32266: [SPARK-35111][SQL] Support Cast string to year-month interval

2021-04-28 Thread GitBox
cloud-fan commented on a change in pull request #32266: URL: https://github.com/apache/spark/pull/32266#discussion_r622334319 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala ## @@ -92,6 +93,31 @@ object IntervalUtils { }

[GitHub] [spark] LuciferYang commented on pull request #32374: [WIP][SPARK-35253][BUILD][SQL] Upgrade Janino from 3.0.16 to 3.1.3

2021-04-28 Thread GitBox
LuciferYang commented on pull request #32374: URL: https://github.com/apache/spark/pull/32374#issuecomment-828581605 It seems that SPARK-32640 has not been solved with Janino 3.1.3 -- This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] LuciferYang commented on pull request #32374: [WIP][SPARK-35253][BUILD][SQL] Upgrade Janino from 3.0.16 to 3.1.3

2021-04-28 Thread GitBox
LuciferYang commented on pull request #32374: URL: https://github.com/apache/spark/pull/32374#issuecomment-828579206 I found that SPARK-31214 (https://github.com/apache/spark/pull/27860) was reverted. I'm not sure if we need to upgrade this lib; there are still some UT failures now.

[GitHub] [spark] SparkQA removed a comment on pull request #32385: [WIP][SPARK-18188][CORE] Add checksum for shuffle blocks

2021-04-28 Thread GitBox
SparkQA removed a comment on pull request #32385: URL: https://github.com/apache/spark/pull/32385#issuecomment-828567443 **[Test build #138049 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138049/testReport)** for PR 32385 at commit

[GitHub] [spark] SparkQA commented on pull request #32385: [WIP][SPARK-18188][CORE] Add checksum for shuffle blocks

2021-04-28 Thread GitBox
SparkQA commented on pull request #32385: URL: https://github.com/apache/spark/pull/32385#issuecomment-828577751 **[Test build #138049 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138049/testReport)** for PR 32385 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #32266: [SPARK-35111][SQL] Support Cast string to year-month interval

2021-04-28 Thread GitBox
SparkQA removed a comment on pull request #32266: URL: https://github.com/apache/spark/pull/32266#issuecomment-828348780 **[Test build #138044 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138044/testReport)** for PR 32266 at commit

[GitHub] [spark] c21 commented on pull request #32380: [SPARK-34781][SQL][FOLLOWUP] Adjust the order of AQE optimizer rules

2021-04-28 Thread GitBox
c21 commented on pull request #32380: URL: https://github.com/apache/spark/pull/32380#issuecomment-828571853 Late LGTM, thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] Ngone51 commented on pull request #32385: [WIP][SPARK-18188][CORE] Add checksum for shuffle blocks

2021-04-28 Thread GitBox
Ngone51 commented on pull request #32385: URL: https://github.com/apache/spark/pull/32385#issuecomment-828569773 cc @mridulm @otterc @attilapiros @tgravescs @cloud-fan Please take a look, thanks! -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] Ngone51 commented on pull request #32385: [WIP][SPARK-18188][CORE] Add checksum for shuffle blocks

2021-04-28 Thread GitBox
Ngone51 commented on pull request #32385: URL: https://github.com/apache/spark/pull/32385#issuecomment-828568544 I marked PR as `WIP` because I want to hear the community's feedback before working further, e.g., adding more unit tests. And there will be two following PRs (if the

[GitHub] [spark] AmplabJenkins commented on pull request #32266: [SPARK-35111][SQL] Support Cast string to year-month interval

2021-04-28 Thread GitBox
AmplabJenkins commented on pull request #32266: URL: https://github.com/apache/spark/pull/32266#issuecomment-828567902 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138044/ -- This

[GitHub] [spark] SparkQA commented on pull request #32385: [WIP][SPARK-18188][CORE] Add checksum for shuffle blocks

2021-04-28 Thread GitBox
SparkQA commented on pull request #32385: URL: https://github.com/apache/spark/pull/32385#issuecomment-828567443 **[Test build #138049 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138049/testReport)** for PR 32385 at commit

[GitHub] [spark] Ngone51 opened a new pull request #32385: [WIP][SPARK-18188][CORE] Add checksum for shuffle blocks

2021-04-28 Thread GitBox
Ngone51 opened a new pull request #32385: URL: https://github.com/apache/spark/pull/32385 ### What changes were proposed in this pull request? This PR proposes to add checksum support for shuffle blocks. The basic idea is: On the mapper side, we'll wrap a

[GitHub] [spark] SparkQA commented on pull request #32266: [SPARK-35111][SQL] Support Cast string to year-month interval

2021-04-28 Thread GitBox
SparkQA commented on pull request #32266: URL: https://github.com/apache/spark/pull/32266#issuecomment-828565578 **[Test build #138044 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138044/testReport)** for PR 32266 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32374: [WIP][SPARK-35253][BUILD][SQL] Upgrade Janino from 3.0.16 to 3.1.3

2021-04-28 Thread GitBox
AmplabJenkins removed a comment on pull request #32374: URL: https://github.com/apache/spark/pull/32374#issuecomment-828558512 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138045/

[GitHub] [spark] AmplabJenkins commented on pull request #32374: [WIP][SPARK-35253][BUILD][SQL] Upgrade Janino from 3.0.16 to 3.1.3

2021-04-28 Thread GitBox
AmplabJenkins commented on pull request #32374: URL: https://github.com/apache/spark/pull/32374#issuecomment-828558512 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138045/ -- This

[GitHub] [spark] SparkQA removed a comment on pull request #32374: [WIP][SPARK-35253][BUILD][SQL] Upgrade Janino from 3.0.16 to 3.1.3

2021-04-28 Thread GitBox
SparkQA removed a comment on pull request #32374: URL: https://github.com/apache/spark/pull/32374#issuecomment-828459208 **[Test build #138045 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138045/testReport)** for PR 32374 at commit

[GitHub] [spark] SparkQA commented on pull request #32374: [WIP][SPARK-35253][BUILD][SQL] Upgrade Janino from 3.0.16 to 3.1.3

2021-04-28 Thread GitBox
SparkQA commented on pull request #32374: URL: https://github.com/apache/spark/pull/32374#issuecomment-828557136 **[Test build #138045 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138045/testReport)** for PR 32374 at commit

[GitHub] [spark] SparkQA commented on pull request #32384: [SPARK-35257][TESTS] Speed up `HadoopVersionInfoSuite` with `SPARK_VERSIONS_SUITE_IVY_PATH`

2021-04-28 Thread GitBox
SparkQA commented on pull request #32384: URL: https://github.com/apache/spark/pull/32384#issuecomment-828553933 **[Test build #138048 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138048/testReport)** for PR 32384 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32266: [SPARK-35111][SQL] Support Cast string to year-month interval

2021-04-28 Thread GitBox
AmplabJenkins removed a comment on pull request #32266: URL: https://github.com/apache/spark/pull/32266#issuecomment-828553397 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42566/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32380: [SPARK-34781][SQL][FOLLOWUP] Adjust the order of AQE optimizer rules

2021-04-28 Thread GitBox
AmplabJenkins removed a comment on pull request #32380: URL: https://github.com/apache/spark/pull/32380#issuecomment-828553395 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138043/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32271: [SPARK-35112][SQL] Support Cast string to day-second interval

2021-04-28 Thread GitBox
AmplabJenkins removed a comment on pull request #32271: URL: https://github.com/apache/spark/pull/32271#issuecomment-828553398 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42565/

[GitHub] [spark] AmplabJenkins commented on pull request #32271: [SPARK-35112][SQL] Support Cast string to day-second interval

2021-04-28 Thread GitBox
AmplabJenkins commented on pull request #32271: URL: https://github.com/apache/spark/pull/32271#issuecomment-828553398 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42565/ --

[GitHub] [spark] AmplabJenkins commented on pull request #32380: [SPARK-34781][SQL][FOLLOWUP] Adjust the order of AQE optimizer rules

2021-04-28 Thread GitBox
AmplabJenkins commented on pull request #32380: URL: https://github.com/apache/spark/pull/32380#issuecomment-828553395 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138043/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #32266: [SPARK-35111][SQL] Support Cast string to year-month interval

2021-04-28 Thread GitBox
AmplabJenkins commented on pull request #32266: URL: https://github.com/apache/spark/pull/32266#issuecomment-828553397 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42566/ --

[GitHub] [spark] SparkQA commented on pull request #32271: [SPARK-35112][SQL] Support Cast string to day-second interval

2021-04-28 Thread GitBox
SparkQA commented on pull request #32271: URL: https://github.com/apache/spark/pull/32271#issuecomment-828550246 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For

[GitHub] [spark] SparkQA commented on pull request #32266: [SPARK-35111][SQL] Support Cast string to year-month interval

2021-04-28 Thread GitBox
SparkQA commented on pull request #32266: URL: https://github.com/apache/spark/pull/32266#issuecomment-828545361 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For

[GitHub] [spark] c21 commented on pull request #32198: [SPARK-26164][SQL] Allow concurrent writers for writing dynamic partitions and bucket table

2021-04-28 Thread GitBox
c21 commented on pull request #32198: URL: https://github.com/apache/spark/pull/32198#issuecomment-828544330 > One more thing, how much does this improve the write? Local sorts before the write are typically not too bad if you look at the cycles spent during the write. A much bigger

[GitHub] [spark] SparkQA removed a comment on pull request #32380: [SPARK-34781][SQL][FOLLOWUP] Adjust the order of AQE optimizer rules

2021-04-28 Thread GitBox
SparkQA removed a comment on pull request #32380: URL: https://github.com/apache/spark/pull/32380#issuecomment-828348617 **[Test build #138043 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138043/testReport)** for PR 32380 at commit

[GitHub] [spark] SparkQA commented on pull request #32380: [SPARK-34781][SQL][FOLLOWUP] Adjust the order of AQE optimizer rules

2021-04-28 Thread GitBox
SparkQA commented on pull request #32380: URL: https://github.com/apache/spark/pull/32380#issuecomment-828542782 **[Test build #138043 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138043/testReport)** for PR 32380 at commit

[GitHub] [spark] LuciferYang opened a new pull request #32384: [SPARK-35257][TESTS] Let `HadoopVersionInfoSuite` can use `SPARK_VERSIONS_SUITE_IVY_PATH` to speed up

2021-04-28 Thread GitBox
LuciferYang opened a new pull request #32384: URL: https://github.com/apache/spark/pull/32384 ### What changes were proposed in this pull request? `HadoopVersionInfoSuite` uses a separate `ivyPath` to download the jars when creating a new `IsolatedClientLoader`, and the ivy cache

[GitHub] [spark] c21 commented on pull request #32198: [SPARK-26164][SQL] Allow concurrent writers for writing dynamic partitions and bucket table

2021-04-28 Thread GitBox
c21 commented on pull request #32198: URL: https://github.com/apache/spark/pull/32198#issuecomment-828539579 > How do you avoid OOMs? Note the feature is designed to be disabled by default, and to be enabled case by case now. The fallback logic here is intended to avoid OOM when

[GitHub] [spark] c21 commented on pull request #32198: [SPARK-26164][SQL] Allow concurrent writers for writing dynamic partitions and bucket table

2021-04-28 Thread GitBox
c21 commented on pull request #32198: URL: https://github.com/apache/spark/pull/32198#issuecomment-828537591 > this doesn't do any sort of memory tracking right? Yes. It seems to me there's no way to track the memory usage accurately because writer is using on-heap memory. And we

[GitHub] [spark] SparkQA removed a comment on pull request #31776: [SPARK-34661][SQL] Clean up `OriginalType` and `DecimalMetadata ` usage in Parquet related code

2021-04-28 Thread GitBox
SparkQA removed a comment on pull request #31776: URL: https://github.com/apache/spark/pull/31776#issuecomment-828310099 **[Test build #138040 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138040/testReport)** for PR 31776 at commit

[GitHub] [spark] tgravescs commented on a change in pull request #31756: [SPARK-34637] [SQL] Support DPP + AQE when the broadcast exchange can be reused

2021-04-28 Thread GitBox
tgravescs commented on a change in pull request #31756: URL: https://github.com/apache/spark/pull/31756#discussion_r622256825 ## File path: sql/core/src/test/scala/org/apache/spark/sql/DynamicPartitionPruningSuite.scala ## @@ -1463,6 +1474,37 @@ abstract class

[GitHub] [spark] tgravescs commented on a change in pull request #31756: [SPARK-34637] [SQL] Support DPP + AQE when the broadcast exchange can be reused

2021-04-28 Thread GitBox
tgravescs commented on a change in pull request #31756: URL: https://github.com/apache/spark/pull/31756#discussion_r622256598 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/PlanAdaptiveDynamicPruningFilters.scala ## @@ -41,15 +42,26 @@ case

[GitHub] [spark] AmplabJenkins commented on pull request #31776: [SPARK-34661][SQL] Clean up `OriginalType` and `DecimalMetadata ` usage in Parquet related code

2021-04-28 Thread GitBox
AmplabJenkins commented on pull request #31776: URL: https://github.com/apache/spark/pull/31776#issuecomment-828515614 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138040/ -- This

[GitHub] [spark] SparkQA commented on pull request #31776: [SPARK-34661][SQL] Clean up `OriginalType` and `DecimalMetadata ` usage in Parquet related code

2021-04-28 Thread GitBox
SparkQA commented on pull request #31776: URL: https://github.com/apache/spark/pull/31776#issuecomment-828513726 **[Test build #138040 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138040/testReport)** for PR 31776 at commit

[GitHub] [spark] SparkQA commented on pull request #32266: [SPARK-35111][SQL] Support Cast string to year-month interval

2021-04-28 Thread GitBox
SparkQA commented on pull request #32266: URL: https://github.com/apache/spark/pull/32266#issuecomment-828507671 **[Test build #138047 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138047/testReport)** for PR 32266 at commit

[GitHub] [spark] SparkQA commented on pull request #32271: [SPARK-35112][SQL] Support Cast string to day-second interval

2021-04-28 Thread GitBox
SparkQA commented on pull request #32271: URL: https://github.com/apache/spark/pull/32271#issuecomment-828507501 **[Test build #138046 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138046/testReport)** for PR 32271 at commit

[GitHub] [spark] ulysses-you commented on pull request #32380: [SPARK-34781][SQL][FOLLOWUP] Adjust the order of AQE optimizer rules

2021-04-28 Thread GitBox
ulysses-you commented on pull request #32380: URL: https://github.com/apache/spark/pull/32380#issuecomment-828506252 thanks for merging! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32374: [WIP][SPARK-35253][BUILD][SQL] Upgrade Janino from 3.0.16 to 3.1.3

2021-04-28 Thread GitBox
AmplabJenkins removed a comment on pull request #32374: URL: https://github.com/apache/spark/pull/32374#issuecomment-828505015 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42564/
