[GitHub] [spark] sunchao commented on a change in pull request #32407: [SPARK-35261][SQL] Support static magic method for stateless ScalarFunction

2021-05-02 Thread GitBox
sunchao commented on a change in pull request #32407: URL: https://github.com/apache/spark/pull/32407#discussion_r624907772 ## File path: sql/core/benchmarks/FunctionBenchmark-jdk11-results.txt ## @@ -0,0 +1,32 @@ +OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Mac OS X 10.16 Revie

[GitHub] [spark] otterc commented on a change in pull request #32287: [SPARK-27991][CORE] Defer the fetch request on Netty OOM

2021-05-02 Thread GitBox
otterc commented on a change in pull request #32287: URL: https://github.com/apache/spark/pull/32287#discussion_r624904117 ## File path: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala ## @@ -708,6 +785,15 @@ final class ShuffleBlockFetcherIterat

[GitHub] [spark] byungsoo-oh commented on a change in pull request #32394: [SPARK-35266][TESTS] Fix error in BenchmarkBase.scala that occurs when creating benchmark files in non-existent directory

2021-05-02 Thread GitBox
byungsoo-oh commented on a change in pull request #32394: URL: https://github.com/apache/spark/pull/32394#discussion_r624902959 ## File path: core/src/test/scala/org/apache/spark/benchmark/BenchmarkBase.scala ## @@ -49,7 +49,11 @@ abstract class BenchmarkBase { val resul

[GitHub] [spark] byungsoo-oh commented on pull request #32394: [SPARK-35266][TESTS] Fix error in BenchmarkBase.scala that occurs when creating benchmark files in non-existent directory

2021-05-02 Thread GitBox
byungsoo-oh commented on pull request #32394: URL: https://github.com/apache/spark/pull/32394#issuecomment-831057105 > @byungsoo-oh: > > 1. would you mind checking https://github.com/apache/spark/pull/32394/checks?check_run_id=2464430892 and enable GitHub Actions in your forked repo

[GitHub] [spark] viirya edited a comment on pull request #32422: [DO-NOT-MERGE][WIP][SS] Custom stateful task scheduling

2021-05-02 Thread GitBox
viirya edited a comment on pull request #32422: URL: https://github.com/apache/spark/pull/32422#issuecomment-831053905 `StateSchedulingPlugin` is for scheduling stateful tasks. `StreamingSymmetricHashJoinHelper` specifies `StateSchedulingPlugin` for custom scheduling. Other diffs ar

[GitHub] [spark] viirya commented on a change in pull request #32422: [DO-NOT-MERGE][WIP][SS] Custom stateful task scheduling

2021-05-02 Thread GitBox
viirya commented on a change in pull request #32422: URL: https://github.com/apache/spark/pull/32422#discussion_r624900295 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateSchedulingPlugin.scala ## @@ -0,0 +1,116 @@ +/* + * Licensed to

[GitHub] [spark] viirya commented on pull request #32422: [DO-NOT-MERGE][WIP][SS] Custom stateful task scheduling

2021-05-02 Thread GitBox
viirya commented on pull request #32422: URL: https://github.com/apache/spark/pull/32422#issuecomment-831053905 `StateSchedulingPlugin` is for scheduling stateful tasks. `StreamingSymmetricHashJoinHelper` specifies `StateSchedulingPlugin` for custom scheduling. -- This is an automated m

[GitHub] [spark] Dobiasd commented on pull request #32400: [MINOR][SS][DOCS] Fix a typo in the documentation of GroupState

2021-05-02 Thread GitBox
Dobiasd commented on pull request #32400: URL: https://github.com/apache/spark/pull/32400#issuecomment-831053536 The changes are squashed into one commit now. :heavy_check_mark: -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [spark] viirya commented on pull request #32136: [SPARK-35022][CORE] Task Scheduling Plugin in Spark

2021-05-02 Thread GitBox
viirya commented on pull request #32136: URL: https://github.com/apache/spark/pull/32136#issuecomment-831053315 A reference implementation for custom stateful task scheduling is at #32422. -- This is an automated message from the Apache Git Service. To respond to the message, please log o

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32394: [SPARK-35266][TESTS] Fix error in BenchmarkBase.scala that occurs when creating benchmark files in non-existent directory

2021-05-02 Thread GitBox
AmplabJenkins removed a comment on pull request #32394: URL: https://github.com/apache/spark/pull/32394#issuecomment-829008876 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use t

[GitHub] [spark] HyukjinKwon closed pull request #32368: [SPARK-35176][PYTHON] Standardize input validation error type

2021-05-02 Thread GitBox
HyukjinKwon closed pull request #32368: URL: https://github.com/apache/spark/pull/32368 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, pl

[GitHub] [spark] HyukjinKwon commented on pull request #32368: [SPARK-35176][PYTHON] Standardize input validation error type

2021-05-02 Thread GitBox
HyukjinKwon commented on pull request #32368: URL: https://github.com/apache/spark/pull/32368#issuecomment-831053007 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

[GitHub] [spark] viirya opened a new pull request #32422: [DO-NOT-MERGE][WIP][SS] Custom stateful task scheduling

2021-05-02 Thread GitBox
viirya opened a new pull request #32422: URL: https://github.com/apache/spark/pull/32422 ### What changes were proposed in this pull request? This is a reference PR of custom stateful task scheduling for SPARK-35022: Task Scheduling Plugin in Spark. (#32136). ### W

[GitHub] [spark] Dobiasd commented on pull request #32400: [MINOR][SS][DOCS] Fix a typo in the documentation of GroupState

2021-05-02 Thread GitBox
Dobiasd commented on pull request #32400: URL: https://github.com/apache/spark/pull/32400#issuecomment-831051189 Ah, when going directly to https://github.com/Dobiasd/spark/actions, it showed this: https://i.imgur.com/NA9PgLB.png I've clicked on "I understand my workflows, go ahead a

[GitHub] [spark] HyukjinKwon commented on pull request #32400: [MINOR][SS][DOCS] Fix a typo in the documentation of GroupState

2021-05-02 Thread GitBox
HyukjinKwon commented on pull request #32400: URL: https://github.com/apache/spark/pull/32400#issuecomment-831050620 Oh, I can see now: https://github.com/Dobiasd/spark/actions. Would you mind squashing your commits and pushing them into this PR? e.g., `git rebase upstream/master -i`, mark

[GitHub] [spark] HyukjinKwon commented on pull request #32400: [MINOR][SS][DOCS] Fix a typo in the documentation of GroupState

2021-05-02 Thread GitBox
HyukjinKwon commented on pull request #32400: URL: https://github.com/apache/spark/pull/32400#issuecomment-831049831 Hm, but the tests should pass before merging it in. Do other repositories of yours run GitHub Actions workflows properly? -- This is an automated message from the A

[GitHub] [spark] HyukjinKwon edited a comment on pull request #32400: [MINOR][SS][DOCS] Fix a typo in the documentation of GroupState

2021-05-02 Thread GitBox
HyukjinKwon edited a comment on pull request #32400: URL: https://github.com/apache/spark/pull/32400#issuecomment-831048626 Thanks @Dobiasd. It's weird that your forked repository doesn't run the tests at https://github.com/Dobiasd/spark/actions .. your branch looks sync'ed with the latest

[GitHub] [spark] HyukjinKwon commented on pull request #32400: [MINOR][SS][DOCS] Fix a typo in the documentation of GroupState

2021-05-02 Thread GitBox
HyukjinKwon commented on pull request #32400: URL: https://github.com/apache/spark/pull/32400#issuecomment-831048626 Thanks @Dobiasd. Your forked repository doesn't run the tests at https://github.com/Dobiasd/spark/actions .. your branch looks sync'ed with the latest Apache Spark `master`

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32400: [MINOR][SS][DOCS] Fix a typo in the documentation of GroupState

2021-05-02 Thread GitBox
AmplabJenkins removed a comment on pull request #32400: URL: https://github.com/apache/spark/pull/32400#issuecomment-829296322 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use t

[GitHub] [spark] HyukjinKwon commented on pull request #32400: [MINOR][SS][DOCS] Fix a typo in the documentation of GroupState

2021-05-02 Thread GitBox
HyukjinKwon commented on pull request #32400: URL: https://github.com/apache/spark/pull/32400#issuecomment-831046855 ok to test -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comme

[GitHub] [spark] maropu commented on a change in pull request #32420: [WIP][SPARK-35293][SQL][TESTS] Use the newer dsdgen for TPCDSQueryTestSuite

2021-05-02 Thread GitBox
maropu commented on a change in pull request #32420: URL: https://github.com/apache/spark/pull/32420#discussion_r624890341 ## File path: sql/core/src/test/resources/tpcds-query-results/v1_4/q23a.sql.out ## @@ -3,4 +3,4 @@ -- !query schema struct -- !query output -17030.91 +N

[GitHub] [spark] maropu commented on a change in pull request #32420: [WIP][SPARK-35293][SQL][TESTS] Use the newer dsdgen for TPCDSQueryTestSuite

2021-05-02 Thread GitBox
maropu commented on a change in pull request #32420: URL: https://github.com/apache/spark/pull/32420#discussion_r624890670 ## File path: sql/core/src/test/resources/tpcds-query-results/v1_4/q23b.sql.out ## @@ -3,7 +3,4 @@ -- !query schema struct -- !query output -NULL Robe

[GitHub] [spark] maropu commented on a change in pull request #32420: [WIP][SPARK-35293][SQL][TESTS] Use the newer dsdgen for TPCDSQueryTestSuite

2021-05-02 Thread GitBox
maropu commented on a change in pull request #32420: URL: https://github.com/apache/spark/pull/32420#discussion_r624893120 ## File path: sql/core/src/test/resources/tpcds-query-results/v1_4/q91.sql.out ## @@ -3,4 +3,4 @@ -- !query schema struct -- !query output -CAA

[GitHub] [spark] Dobiasd commented on pull request #32400: [MINOR][SS][DOCS] Fix a typo in the documentation of GroupState

2021-05-02 Thread GitBox
Dobiasd commented on pull request #32400: URL: https://github.com/apache/spark/pull/32400#issuecomment-831045670 @HyukjinKwon Thanks for the help. :+1: - GitHub Actions was already enabled for my forked repo: https://i.imgur.com/CTEkw1R.png - I've rebased my branch on upstream mas

[GitHub] [spark] maropu commented on a change in pull request #32420: [WIP][SPARK-35293][SQL][TESTS] Use the newer dsdgen for TPCDSQueryTestSuite

2021-05-02 Thread GitBox
maropu commented on a change in pull request #32420: URL: https://github.com/apache/spark/pull/32420#discussion_r624892918 ## File path: sql/core/src/test/resources/tpcds-query-results/v1_4/q54.sql.out ## @@ -3,4 +3,4 @@ -- !query schema struct -- !query output -11860 1

[GitHub] [spark] maropu commented on a change in pull request #32420: [WIP][SPARK-35293][SQL][TESTS] Use the newer dsdgen for TPCDSQueryTestSuite

2021-05-02 Thread GitBox
maropu commented on a change in pull request #32420: URL: https://github.com/apache/spark/pull/32420#discussion_r624892682 ## File path: sql/core/src/test/resources/tpcds-query-results/v1_4/q25.sql.out ## @@ -3,4 +3,4 @@ -- !query schema struct -- !query output -DPM

[GitHub] [spark] maropu commented on a change in pull request #32420: [WIP][SPARK-35293][SQL][TESTS] Use the newer dsdgen for TPCDSQueryTestSuite

2021-05-02 Thread GitBox
maropu commented on a change in pull request #32420: URL: https://github.com/apache/spark/pull/32420#discussion_r624892545 ## File path: sql/core/src/test/resources/tpcds-query-results/v1_4/q24b.sql.out ## @@ -3,4 +3,4 @@ -- !query schema struct -- !query output -Griffith

[GitHub] [spark] maropu commented on a change in pull request #32420: [WIP][SPARK-35293][SQL][TESTS] Use the newer dsdgen for TPCDSQueryTestSuite

2021-05-02 Thread GitBox
maropu commented on a change in pull request #32420: URL: https://github.com/apache/spark/pull/32420#discussion_r624892369 ## File path: sql/core/src/test/resources/tpcds-query-results/v1_4/q24a.sql.out ## @@ -3,10 +3,4 @@ -- !query schema struct -- !query output -Holt

[GitHub] [spark] maropu commented on a change in pull request #32420: [WIP][SPARK-35293][SQL][TESTS] Use the newer dsdgen for TPCDSQueryTestSuite

2021-05-02 Thread GitBox
maropu commented on a change in pull request #32420: URL: https://github.com/apache/spark/pull/32420#discussion_r624890151 ## File path: sql/core/src/test/resources/tpcds-query-results/v1_4/q17.sql.out ## @@ -3,4 +3,4 @@ -- !query schema struct -- !query output -KPF

[GitHub] [spark] sigmod opened a new pull request #32421: [WIP][SPARK-35294][SQL] Add tree traversal pruning in top-level optimizer rules

2021-05-02 Thread GitBox
sigmod opened a new pull request #32421: URL: https://github.com/apache/spark/pull/32421 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was

[GitHub] [spark] HyukjinKwon closed pull request #32416: [SPARK-35281][SQL] StaticInvoke should not apply boxing if return type is primitive

2021-05-02 Thread GitBox
HyukjinKwon closed pull request #32416: URL: https://github.com/apache/spark/pull/32416 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, pl

[GitHub] [spark] HyukjinKwon commented on pull request #32416: [SPARK-35281][SQL] StaticInvoke should not apply boxing if return type is primitive

2021-05-02 Thread GitBox
HyukjinKwon commented on pull request #32416: URL: https://github.com/apache/spark/pull/32416#issuecomment-831037845 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32407: [SPARK-35261][SQL] Support static magic method for stateless ScalarFunction

2021-05-02 Thread GitBox
HyukjinKwon commented on a change in pull request #32407: URL: https://github.com/apache/spark/pull/32407#discussion_r624886807 ## File path: sql/core/benchmarks/FunctionBenchmark-jdk11-results.txt ## @@ -0,0 +1,32 @@ +OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Mac OS X 10.16 R

[GitHub] [spark] HyukjinKwon edited a comment on pull request #32394: [SPARK-35266][TESTS] Fix error in BenchmarkBase.scala that occurs when creating benchmark files in non-existent directory

2021-05-02 Thread GitBox
HyukjinKwon edited a comment on pull request #32394: URL: https://github.com/apache/spark/pull/32394#issuecomment-831035088 @srowen and @zhengruifeng FYI from https://github.com/apache/spark/commit/9244066ca69a3fb5d7fe446ce0e19d108892c49d and https://github.com/apache/spark/commit/5b77ebb

[GitHub] [spark] HyukjinKwon commented on pull request #32394: [SPARK-35266][TESTS] Fix error in BenchmarkBase.scala that occurs when creating benchmark files in non-existent directory

2021-05-02 Thread GitBox
HyukjinKwon commented on pull request #32394: URL: https://github.com/apache/spark/pull/32394#issuecomment-831035815 ok to test -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comme

[GitHub] [spark] otterc commented on a change in pull request #32007: [SPARK-33350][SHUFFLE] Add support to DiskBlockManager to create merge directory and to get the local shuffle merged data

2021-05-02 Thread GitBox
otterc commented on a change in pull request #32007: URL: https://github.com/apache/spark/pull/32007#discussion_r624885672 ## File path: core/src/main/scala/org/apache/spark/storage/BlockId.scala ## @@ -87,6 +87,29 @@ case class ShufflePushBlockId(shuffleId: Int, mapIndex: Int,

[GitHub] [spark] HyukjinKwon commented on pull request #32394: [SPARK-35266][TESTS] Fix error in BenchmarkBase.scala that occurs when creating benchmark files in non-existent directory

2021-05-02 Thread GitBox
HyukjinKwon commented on pull request #32394: URL: https://github.com/apache/spark/pull/32394#issuecomment-831035088 @srowen and @zhengruifeng FYI from https://github.com/apache/spark/commit/9244066ca69a3fb5d7fe446ce0e19d108892c49d and https://github.com/apache/spark/commit/5b77ebb57ba6ed

[GitHub] [spark] HeartSaVioR commented on pull request #31944: [SPARK-34854][SQL][SS] Expose source metrics via progress report and add Kafka use-case to report delay.

2021-05-02 Thread GitBox
HeartSaVioR commented on pull request #31944: URL: https://github.com/apache/spark/pull/31944#issuecomment-831034365 @yijiacui-db I'd appreciate it if you could rerun your GitHub Actions test in your fork. I guess rerunning the action here wouldn't retrigger the test on your fork. We don't have go

[GitHub] [spark] HyukjinKwon commented on pull request #32394: [SPARK-35266][TESTS] Fix error in BenchmarkBase.scala that occurs when creating benchmark files in non-existent directory

2021-05-02 Thread GitBox
HyukjinKwon commented on pull request #32394: URL: https://github.com/apache/spark/pull/32394#issuecomment-831032796 ok to test -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comme

[GitHub] [spark] HyukjinKwon commented on pull request #32394: [SPARK-35266][TESTS] Fix error in BenchmarkBase.scala that occurs when creating benchmark files in non-existent directory

2021-05-02 Thread GitBox
HyukjinKwon commented on pull request #32394: URL: https://github.com/apache/spark/pull/32394#issuecomment-831032690 @byungsoo-oh: 1. would you mind checking https://github.com/apache/spark/pull/32394/checks?check_run_id=2464430892 and enable GitHub Actions in your forked repository

[GitHub] [spark] HeartSaVioR commented on pull request #31944: [SPARK-34854][SQL][SS] Expose source metrics via progress report and add Kafka use-case to report delay.

2021-05-02 Thread GitBox
HeartSaVioR commented on pull request #31944: URL: https://github.com/apache/spark/pull/31944#issuecomment-831032642 retest this, please -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the speci

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32394: [SPARK-35266][TESTS] Fix error in BenchmarkBase.scala that occurs when creating benchmark files in non-existent directory

2021-05-02 Thread GitBox
HyukjinKwon commented on a change in pull request #32394: URL: https://github.com/apache/spark/pull/32394#discussion_r624883369 ## File path: core/src/test/scala/org/apache/spark/benchmark/BenchmarkBase.scala ## @@ -49,7 +49,11 @@ abstract class BenchmarkBase { val resul

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32394: [SPARK-35266][TESTS] Fix error in BenchmarkBase.scala that occurs when creating benchmark files in non-existent directory

2021-05-02 Thread GitBox
HyukjinKwon commented on a change in pull request #32394: URL: https://github.com/apache/spark/pull/32394#discussion_r624883141 ## File path: core/src/test/scala/org/apache/spark/benchmark/BenchmarkBase.scala ## @@ -49,7 +49,11 @@ abstract class BenchmarkBase { val resul

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32394: [SPARK-35266][TESTS] Fix error in BenchmarkBase.scala that occurs when creating benchmark files in non-existent directory

2021-05-02 Thread GitBox
HyukjinKwon commented on a change in pull request #32394: URL: https://github.com/apache/spark/pull/32394#discussion_r624881857 ## File path: core/src/test/scala/org/apache/spark/benchmark/BenchmarkBase.scala ## @@ -49,7 +49,11 @@ abstract class BenchmarkBase { val resul

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32394: [SPARK-35266][TESTS] Fix error in BenchmarkBase.scala that occurs when creating benchmark files in non-existent directory

2021-05-02 Thread GitBox
HyukjinKwon commented on a change in pull request #32394: URL: https://github.com/apache/spark/pull/32394#discussion_r624880366 ## File path: core/src/test/scala/org/apache/spark/benchmark/BenchmarkBase.scala ## @@ -49,7 +49,11 @@ abstract class BenchmarkBase { val resul

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32394: [SPARK-35266][TESTS] Fix error in BenchmarkBase.scala that occurs when creating benchmark files in non-existent directory

2021-05-02 Thread GitBox
HyukjinKwon commented on a change in pull request #32394: URL: https://github.com/apache/spark/pull/32394#discussion_r624879992 ## File path: core/src/test/scala/org/apache/spark/benchmark/BenchmarkBase.scala ## @@ -49,7 +49,11 @@ abstract class BenchmarkBase { val resul

[GitHub] [spark] SaurabhChawla100 commented on a change in pull request #32381: [SPARK-35229][WEBUI] Limit the maximum number of items on the timeline view.

2021-05-02 Thread GitBox
SaurabhChawla100 commented on a change in pull request #32381: URL: https://github.com/apache/spark/pull/32381#discussion_r624878008 ## File path: core/src/main/scala/org/apache/spark/ui/jobs/AllJobsPage.scala ## @@ -192,6 +201,32 @@ private[ui] class AllJobsPage(parent: JobsTa

[GitHub] [spark] HyukjinKwon commented on pull request #32395: [SPARK-35270][SQL][CORE] Remove the use of guava in order to upgrade guava version to 27

2021-05-02 Thread GitBox
HyukjinKwon commented on pull request #32395: URL: https://github.com/apache/spark/pull/32395#issuecomment-831020675 @wForget can you enable GitHub Actions in your forked repository? https://github.com/apache/spark/pull/32395/checks?check_run_id=2465058510 -- This is an automated message

[GitHub] [spark] HyukjinKwon commented on pull request #32397: [SPARK-35084][CORE] Spark 3: supporting "--packages" in k8s cluster mode

2021-05-02 Thread GitBox
HyukjinKwon commented on pull request #32397: URL: https://github.com/apache/spark/pull/32397#issuecomment-831018332 @ocworld can you take a look at https://github.com/apache/spark/pull/32397/checks?check_run_id=2484554770 and enable GitHub Actions in your forked repository? -- This is an

[GitHub] [spark] HyukjinKwon commented on pull request #32400: Fix a typo in the documentation of GroupState

2021-05-02 Thread GitBox
HyukjinKwon commented on pull request #32400: URL: https://github.com/apache/spark/pull/32400#issuecomment-831018039 This is fine but @Dobiasd: 1. would you mind taking a look to see whether there are more typos since we're here? 2. Also, please keep the [GitHub pull request template

[GitHub] [spark] HyukjinKwon closed pull request #32409: [SPARK-35285][SQL] Parse ANSI interval types in SQL schema

2021-05-02 Thread GitBox
HyukjinKwon closed pull request #32409: URL: https://github.com/apache/spark/pull/32409 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, pl

[GitHub] [spark] HyukjinKwon commented on pull request #32409: [SPARK-35285][SQL] Parse ANSI interval types in SQL schema

2021-05-02 Thread GitBox
HyukjinKwon commented on pull request #32409: URL: https://github.com/apache/spark/pull/32409#issuecomment-831017443 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32411: [SPARK-28551][SQL]In CTAS with LOCATION , should not allow to a non-empty directory.

2021-05-02 Thread GitBox
HyukjinKwon commented on a change in pull request #32411: URL: https://github.com/apache/spark/pull/32411#discussion_r624870695 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/DataWritingCommand.scala ## @@ -96,4 +98,20 @@ object DataWritingCommand

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32411: [SPARK-28551][SQL]In CTAS with LOCATION , should not allow to a non-empty directory.

2021-05-02 Thread GitBox
HyukjinKwon commented on a change in pull request #32411: URL: https://github.com/apache/spark/pull/32411#discussion_r624870218 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/DataWritingCommand.scala ## @@ -96,4 +98,20 @@ object DataWritingCommand

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32411: [SPARK-28551][SQL]In CTAS with LOCATION , should not allow to a non-empty directory.

2021-05-02 Thread GitBox
HyukjinKwon commented on a change in pull request #32411: URL: https://github.com/apache/spark/pull/32411#discussion_r624869835 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala ## @@ -166,6 +166,10 @@ case class CreateDa

[GitHub] [spark] HyukjinKwon commented on pull request #32416: [SPARK-35281][SQL] StaticInvoke should not apply boxing if return type is primitive

2021-05-02 Thread GitBox
HyukjinKwon commented on pull request #32416: URL: https://github.com/apache/spark/pull/32416#issuecomment-831012754 retest this please -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specif

[GitHub] [spark] maropu commented on pull request #32420: [WIP][SPARK-35293][SQL][TESTS] Use the newer dsdgen for TPCDSQueryTestSuite

2021-05-02 Thread GitBox
maropu commented on pull request #32420: URL: https://github.com/apache/spark/pull/32420#issuecomment-831006189 I'm checking the results query-by-query... -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [spark] maropu opened a new pull request #32420: [WIP][SPARK-35293][SQL][TESTS] Use the newer dsdgen for TPCDSQueryTestSuite

2021-05-02 Thread GitBox
maropu opened a new pull request #32420: URL: https://github.com/apache/spark/pull/32420 ### What changes were proposed in this pull request? This PR intends to replace `maropu/spark-tpcds-datagen` with `databricks/tpcds-kit` for using a newer dsdgen and update the golden fil

[GitHub] [spark] viirya commented on pull request #31944: [SPARK-34854][SQL][SS] Expose source metrics via progress report and add Kafka use-case to report delay.

2021-05-02 Thread GitBox
viirya commented on pull request #31944: URL: https://github.com/apache/spark/pull/31944#issuecomment-831004037 Okay, those metrics for FileStreamSource sound like a use case that makes sense. I'm okay with you going ahead with this PR. -- This is an automated message from the Apache Git Service. To respon

[GitHub] [spark] HyukjinKwon commented on pull request #32418: [SPARK-35292][PYTHON] Delete redundant parameter in mypy configuration

2021-05-02 Thread GitBox
HyukjinKwon commented on pull request #32418: URL: https://github.com/apache/spark/pull/32418#issuecomment-831002979 @garawalid, would you mind taking a look at https://github.com/apache/spark/pull/32418/checks?check_run_id=2487946353 and enabling GitHub Actions please? Apache Spark leverag

[GitHub] [spark] HyukjinKwon commented on pull request #32418: [SPARK-35292][PYTHON] Delete redundant parameter in mypy configuration

2021-05-02 Thread GitBox
HyukjinKwon commented on pull request #32418: URL: https://github.com/apache/spark/pull/32418#issuecomment-831002834 cc @zero323 FYI -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32332: [SPARK-35211][PYTHON] verify inferred schema for _create_dataframe

2021-05-02 Thread GitBox
HyukjinKwon commented on a change in pull request #32332: URL: https://github.com/apache/spark/pull/32332#discussion_r624859613 ## File path: python/pyspark/sql/session.py ## @@ -695,9 +704,10 @@ def prepare(obj): prepare = lambda obj: obj Review comment: Y

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32332: [SPARK-35211][PYTHON] verify inferred schema for _create_dataframe

2021-05-02 Thread GitBox
HyukjinKwon commented on a change in pull request #32332: URL: https://github.com/apache/spark/pull/32332#discussion_r624859160 ## File path: python/pyspark/sql/session.py ## @@ -695,9 +704,10 @@ def prepare(obj): prepare = lambda obj: obj Review comment: H

[GitHub] [spark] viirya commented on a change in pull request #32413: [SPARK-35288][SQL] StaticInvoke should find the method without exact argument classes match

2021-05-02 Thread GitBox
viirya commented on a change in pull request #32413: URL: https://github.com/apache/spark/pull/32413#discussion_r624858933 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ObjectExpressionsSuite.scala ## @@ -638,8 +638,22 @@ class ObjectExpre

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32161: [SPARK-35025][SQL][PYTHON][DOCS] Move Parquet data source options from Python and Scala into a single page.

2021-05-02 Thread GitBox
HyukjinKwon commented on a change in pull request #32161: URL: https://github.com/apache/spark/pull/32161#discussion_r624858210 ## File path: python/pyspark/sql/readwriter.py ## @@ -1240,7 +1197,7 @@ def json(self, path, mode=None, compression=None, dateFormat=None, timestampF

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32204: [SPARK-34494][SQL][DOCS] Move JSON data source options from Python and Scala into a single page

2021-05-02 Thread GitBox
HyukjinKwon commented on a change in pull request #32204: URL: https://github.com/apache/spark/pull/32204#discussion_r624858107 ## File path: python/pyspark/sql/readwriter.py ## @@ -209,14 +209,7 @@ def load(self, path=None, format=None, schema=None, **options): else:

[GitHub] [spark] viirya commented on a change in pull request #32413: [SPARK-35288][SQL] StaticInvoke should find the method without exact argument classes match

2021-05-02 Thread GitBox
viirya commented on a change in pull request #32413: URL: https://github.com/apache/spark/pull/32413#discussion_r624857286 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ## @@ -236,7 +236,32 @@ case class StaticInvoke(

[GitHub] [spark] maropu commented on pull request #32243: [SPARK-35192][SQL][TESTS] Port minimal TPC-DS datagen code from databricks/spark-sql-perf

2021-05-02 Thread GitBox
maropu commented on pull request #32243: URL: https://github.com/apache/spark/pull/32243#issuecomment-830997133 Thank you, @HyukjinKwon @dongjoon-hyun ~ Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and us

[GitHub] [spark] maropu closed pull request #32243: [SPARK-35192][SQL][TESTS] Port minimal TPC-DS datagen code from databricks/spark-sql-perf

2021-05-02 Thread GitBox
maropu closed pull request #32243: URL: https://github.com/apache/spark/pull/32243 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please

[GitHub] [spark] JustDoCoder closed pull request #32419: Update

2021-05-02 Thread GitBox
JustDoCoder closed pull request #32419: URL: https://github.com/apache/spark/pull/32419 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, pl

[GitHub] [spark] JustDoCoder opened a new pull request #32419: Update

2021-05-02 Thread GitBox
JustDoCoder opened a new pull request #32419: URL: https://github.com/apache/spark/pull/32419 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How

[GitHub] [spark] wangyum commented on pull request #32411: [SPARK-28551][SQL]In CTAS with LOCATION , should not allow to a non-empty directory.

2021-05-02 Thread GitBox
wangyum commented on pull request #32411: URL: https://github.com/apache/spark/pull/32411#issuecomment-830978432 cc @AngersZh -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific co

[GitHub] [spark] github-actions[bot] commented on pull request #30832: [SPARK-33489][PYSPARK][WIP]Add support for converting null from & to Arrow…

2021-05-02 Thread GitBox
github-actions[bot] commented on pull request #30832: URL: https://github.com/apache/spark/pull/30832#issuecomment-830938096 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue ma

[GitHub] [spark] github-actions[bot] commented on pull request #30716: [SPARK-33747][CORE] Avoid calling unregisterMapOutput when the map stage is being rerunning.

2021-05-02 Thread GitBox
github-actions[bot] commented on pull request #30716: URL: https://github.com/apache/spark/pull/30716#issuecomment-830938117 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue ma

[GitHub] [spark] github-actions[bot] closed pull request #31268: [SPARK-32165][SQL] Ensure Spark only initiates SharedState once across SparkSessions

2021-05-02 Thread GitBox
github-actions[bot] closed pull request #31268: URL: https://github.com/apache/spark/pull/31268 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this ser

[GitHub] [spark] viirya commented on pull request #32416: [SPARK-35281][SQL] StaticInvoke should not apply boxing if return type is primitive

2021-05-02 Thread GitBox
viirya commented on pull request #32416: URL: https://github.com/apache/spark/pull/32416#issuecomment-830874358 retest this please -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific co

[GitHub] [spark] AmplabJenkins commented on pull request #32418: [SPARK-35292][PYTHON] Delete redundant parameter in mypy configuration

2021-05-02 Thread GitBox
AmplabJenkins commented on pull request #32418: URL: https://github.com/apache/spark/pull/32418#issuecomment-830871723 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL a

[GitHub] [spark] garawalid opened a new pull request #32418: [SPARK-35292][PYTHON] Delete redundant parameter in mypy configuration

2021-05-02 Thread GitBox
garawalid opened a new pull request #32418: URL: https://github.com/apache/spark/pull/32418 ### What changes were proposed in this pull request? The parameter **no_implicit_optional** is defined twice in the mypy configuration, [line 20](https://github.com/apache/spark/blob

[GitHub] [spark] SparkQA removed a comment on pull request #32416: [SPARK-35281][SQL] StaticInvoke should not apply boxing if return type is primitive

2021-05-02 Thread GitBox
SparkQA removed a comment on pull request #32416: URL: https://github.com/apache/spark/pull/32416#issuecomment-830781271 **[Test build #138141 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138141/testReport)** for PR 32416 at commit [`7c687d0`](https://gi

[GitHub] [spark] SparkQA commented on pull request #32416: [SPARK-35281][SQL] StaticInvoke should not apply boxing if return type is primitive

2021-05-02 Thread GitBox
SparkQA commented on pull request #32416: URL: https://github.com/apache/spark/pull/32416#issuecomment-830848984 **[Test build #138141 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138141/testReport)** for PR 32416 at commit [`7c687d0`](https://github.co

[GitHub] [spark] viirya commented on a change in pull request #32416: [SPARK-35281][SQL] StaticInvoke should not apply boxing if return type is primitive

2021-05-02 Thread GitBox
viirya commented on a change in pull request #32416: URL: https://github.com/apache/spark/pull/32416#discussion_r624730201 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ## @@ -263,14 +263,18 @@ case class StaticInvoke

[GitHub] [spark] sunchao commented on a change in pull request #32416: [SPARK-35281][SQL] StaticInvoke should not apply boxing if return type is primitive

2021-05-02 Thread GitBox
sunchao commented on a change in pull request #32416: URL: https://github.com/apache/spark/pull/32416#discussion_r624728611 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ## @@ -263,14 +263,18 @@ case class StaticInvok

[GitHub] [spark] viirya commented on a change in pull request #32416: [SPARK-35281][SQL] StaticInvoke should not apply boxing if return type is primitive

2021-05-02 Thread GitBox
viirya commented on a change in pull request #32416: URL: https://github.com/apache/spark/pull/32416#discussion_r624726318 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ## @@ -263,14 +263,18 @@ case class StaticInvoke

[GitHub] [spark] viirya commented on a change in pull request #32416: [SPARK-35281][SQL] StaticInvoke should not apply boxing if return type is primitive

2021-05-02 Thread GitBox
viirya commented on a change in pull request #32416: URL: https://github.com/apache/spark/pull/32416#discussion_r624726126 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ## @@ -263,14 +263,18 @@ case class StaticInvoke

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32354: [SPARK-35232][SQL] Nested column pruning should retain column metadata

2021-05-02 Thread GitBox
AmplabJenkins removed a comment on pull request #32354: URL: https://github.com/apache/spark/pull/32354#issuecomment-830830241 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42663/

[GitHub] [spark] AmplabJenkins commented on pull request #32354: [SPARK-35232][SQL] Nested column pruning should retain column metadata

2021-05-02 Thread GitBox
AmplabJenkins commented on pull request #32354: URL: https://github.com/apache/spark/pull/32354#issuecomment-830830241 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42663/ -- T

[GitHub] [spark] AmplabJenkins commented on pull request #32417: [SPARK-35289][SQL] Make CatalogString contains nullable information

2021-05-02 Thread GitBox
AmplabJenkins commented on pull request #32417: URL: https://github.com/apache/spark/pull/32417#issuecomment-830812432 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL a

[GitHub] [spark] SemanticBeeng edited a comment on pull request #24748: [SPARK-27872][K8s] Fix executor service account inconsistency

2021-05-02 Thread GitBox
SemanticBeeng edited a comment on pull request #24748: URL: https://github.com/apache/spark/pull/24748#issuecomment-830809011 @cloudflow @RayRoestenburg @yuchaoran2011 : can latest `cloudflow` work with latest `Spark operator` for 3.1.1 ? On https://cloudflow.io/docs/dev/administrat

[GitHub] [spark] SemanticBeeng commented on pull request #24748: [SPARK-27872][K8s] Fix executor service account inconsistency

2021-05-02 Thread GitBox
SemanticBeeng commented on pull request #24748: URL: https://github.com/apache/spark/pull/24748#issuecomment-830809011 @cloudflow @RayRoestenburg @yuchaoran2011 : can latest `cloudflow` work with latest `Spark operator` for 3.1.1 ? On https://cloudflow.io/docs/dev/administration/ins

[GitHub] [spark] hezuojiao opened a new pull request #32417: [SPARK-35289][SQL] Make CatalogString contains nullable information

2021-05-02 Thread GitBox
hezuojiao opened a new pull request #32417: URL: https://github.com/apache/spark/pull/32417 ### What changes were proposed in this pull request? This patch proposes to make CatalogString contain nullable information to help understand casting failure messages. ### Why are

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32361: [SPARK-35240][SS] Use CheckpointFileManager for checkpoint file manipulation

2021-05-02 Thread GitBox
AmplabJenkins removed a comment on pull request #32361: URL: https://github.com/apache/spark/pull/32361#issuecomment-830804569 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138140/ -

[GitHub] [spark] AmplabJenkins commented on pull request #32361: [SPARK-35240][SS] Use CheckpointFileManager for checkpoint file manipulation

2021-05-02 Thread GitBox
AmplabJenkins commented on pull request #32361: URL: https://github.com/apache/spark/pull/32361#issuecomment-830804569 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138140/ -- This

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32416: [SPARK-35281][SQL] StaticInvoke should not apply boxing if return type is primitive

2021-05-02 Thread GitBox
AmplabJenkins removed a comment on pull request #32416: URL: https://github.com/apache/spark/pull/32416#issuecomment-830804457 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42662/

[GitHub] [spark] AmplabJenkins commented on pull request #32416: [SPARK-35281][SQL] StaticInvoke should not apply boxing if return type is primitive

2021-05-02 Thread GitBox
AmplabJenkins commented on pull request #32416: URL: https://github.com/apache/spark/pull/32416#issuecomment-830804457 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42662/ -- T

[GitHub] [spark] SparkQA commented on pull request #32354: [SPARK-35232][SQL] Nested column pruning should retain column metadata

2021-05-02 Thread GitBox
SparkQA commented on pull request #32354: URL: https://github.com/apache/spark/pull/32354#issuecomment-830793653 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries

[GitHub] [spark] SparkQA removed a comment on pull request #32361: [SPARK-35240][SS] Use CheckpointFileManager for checkpoint file manipulation

2021-05-02 Thread GitBox
SparkQA removed a comment on pull request #32361: URL: https://github.com/apache/spark/pull/32361#issuecomment-830760668 **[Test build #138140 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138140/testReport)** for PR 32361 at commit [`466d353`](https://gi
