[GitHub] [spark] SparkQA removed a comment on pull request #33422: [SPARK-34806][SQL] Add Observation helper for Dataset.observe

2021-07-20 Thread GitBox
SparkQA removed a comment on pull request #33422: URL: https://github.com/apache/spark/pull/33422#issuecomment-883913412 **[Test build #141389 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141389/testReport)** for PR 33422 at commit [`4d78f0c`](https://gi

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-883941090 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141361/ -

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33422: [SPARK-34806][SQL] Add Observation helper for Dataset.observe

2021-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #33422: URL: https://github.com/apache/spark/pull/33422#issuecomment-883941092 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141389/ -

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33450: [SPARK-35809][PYTHON] Add `index_col` argument for ps.sql

2021-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #33450: URL: https://github.com/apache/spark/pull/33450#issuecomment-883941091 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45895/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33385: [WIP][SPARK-36173][CORE] Support getting CPU number in TaskContext

2021-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #33385: URL: https://github.com/apache/spark/pull/33385#issuecomment-881170304 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use t

[GitHub] [spark] SparkQA commented on pull request #33450: [SPARK-35809][PYTHON] Add `index_col` argument for ps.sql

2021-07-20 Thread GitBox
SparkQA commented on pull request #33450: URL: https://github.com/apache/spark/pull/33450#issuecomment-883942059 **[Test build #141392 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141392/testReport)** for PR 33450 at commit [`1615bd6`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-20 Thread GitBox
SparkQA commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-883942091 **[Test build #141393 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141393/testReport)** for PR 33447 at commit [`0d028af`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #33436: [SPARK-35912][SQL] Fix nullability of `spark.read.json/spark.read.csv`

2021-07-20 Thread GitBox
SparkQA commented on pull request #33436: URL: https://github.com/apache/spark/pull/33436#issuecomment-883942139 **[Test build #141395 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141395/testReport)** for PR 33436 at commit [`56ceec4`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #33444: [WIP][SPARK-36227][SQL][3.2] Remove TimestampNTZ type support in Spark 3.2

2021-07-20 Thread GitBox
SparkQA commented on pull request #33444: URL: https://github.com/apache/spark/pull/33444#issuecomment-883942043 **[Test build #141394 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141394/testReport)** for PR 33444 at commit [`fcaa4e5`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #33457: [SPARK-36237][SQL] We should attach and start handler after application started

2021-07-20 Thread GitBox
SparkQA commented on pull request #33457: URL: https://github.com/apache/spark/pull/33457#issuecomment-883941965 **[Test build #141391 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141391/testReport)** for PR 33457 at commit [`7832d40`](https://github.com

[GitHub] [spark] AmplabJenkins commented on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-20 Thread GitBox
AmplabJenkins commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-883941090 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141361/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #33450: [SPARK-35809][PYTHON] Add `index_col` argument for ps.sql

2021-07-20 Thread GitBox
AmplabJenkins commented on pull request #33450: URL: https://github.com/apache/spark/pull/33450#issuecomment-883941091 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45895/ -- T

[GitHub] [spark] carsonwang commented on pull request #33385: [WIP][SPARK-36173][CORE] Support getting CPU number in TaskContext

2021-07-20 Thread GitBox
carsonwang commented on pull request #33385: URL: https://github.com/apache/spark/pull/33385#issuecomment-883936328 ok to test -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific commen

[GitHub] [spark] SparkQA commented on pull request #33450: [SPARK-35809][PYTHON] Add `index_col` argument for ps.sql

2021-07-20 Thread GitBox
SparkQA commented on pull request #33450: URL: https://github.com/apache/spark/pull/33450#issuecomment-883931941 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45895/ -- This is an automated message from the A

[GitHub] [spark] itholic commented on a change in pull request #33450: [SPARK-35809][PYTHON] Add `index_col` argument for ps.sql

2021-07-20 Thread GitBox
itholic commented on a change in pull request #33450: URL: https://github.com/apache/spark/pull/33450#discussion_r673698115 ## File path: python/pyspark/pandas/sql_processor.py ## @@ -65,6 +66,9 @@ def sql( -- query : str the SQL query +index_col:

[GitHub] [spark] AngersZhuuuu commented on pull request #33457: [SPARK-36237][SQL] We should attach and start handler after application started

2021-07-20 Thread GitBox
AngersZhuuuu commented on pull request #33457: URL: https://github.com/apache/spark/pull/33457#issuecomment-883931492 ping @srowen since I have found you handle similar prs before in 2014 -- This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AngersZhuuuu opened a new pull request #33457: [SPARK-36237][SQL] We should attach and start handler after application started

2021-07-20 Thread GitBox
AngersZhuuuu opened a new pull request #33457: URL: https://github.com/apache/spark/pull/33457 ### What changes were proposed in this pull request? When we use prometheus to fetch metrics, always pull data before application started. Then throw a lot of exception not of NoSuchElem

[GitHub] [spark] itholic commented on a change in pull request #33450: [SPARK-35809][PYTHON] Add `index_col` argument for ps.sql

2021-07-20 Thread GitBox
itholic commented on a change in pull request #33450: URL: https://github.com/apache/spark/pull/33450#discussion_r673697427 ## File path: python/pyspark/pandas/sql_processor.py ## @@ -65,6 +66,9 @@ def sql( -- query : str the SQL query +index_col:

[GitHub] [spark] SparkQA commented on pull request #31517: [SPARK-34309][BUILD][CORE][SQL][K8S]Use Caffeine instead of Guava Cache

2021-07-20 Thread GitBox
SparkQA commented on pull request #31517: URL: https://github.com/apache/spark/pull/31517#issuecomment-883928931 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45898/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #33422: [SPARK-34806][SQL] Add Observation helper for Dataset.observe

2021-07-20 Thread GitBox
SparkQA commented on pull request #33422: URL: https://github.com/apache/spark/pull/33422#issuecomment-883926493 **[Test build #141389 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141389/testReport)** for PR 33422 at commit [`4d78f0c`](https://github.co

[GitHub] [spark] SparkQA removed a comment on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-20 Thread GitBox
SparkQA removed a comment on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-883733461 **[Test build #141361 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141361/testReport)** for PR 33447 at commit [`0f890e3`](https://gi

[GitHub] [spark] SparkQA commented on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-20 Thread GitBox
SparkQA commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-883922648 **[Test build #141361 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141361/testReport)** for PR 33447 at commit [`0f890e3`](https://github.co

[GitHub] [spark] SparkQA commented on pull request #33452: [SPARK-36030][SQL][FOLLOW-UP] Avoid procedure syntax deprecated in Scala 2.13

2021-07-20 Thread GitBox
SparkQA commented on pull request #33452: URL: https://github.com/apache/spark/pull/33452#issuecomment-883922563 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45896/ -- This is an automated message from the Apache

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-883920330 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45899/

[GitHub] [spark] AmplabJenkins commented on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-20 Thread GitBox
AmplabJenkins commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-883920330 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45899/ -- T

[GitHub] [spark] SparkQA commented on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-20 Thread GitBox
SparkQA commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-883920309 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45899/ -- This

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33419: [SPARK-36208][SQL] SparkScriptTransformation should support ANSI interval types

2021-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #33419: URL: https://github.com/apache/spark/pull/33419#issuecomment-883919274 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45894/

[GitHub] [spark] SparkQA commented on pull request #33419: [SPARK-36208][SQL] SparkScriptTransformation should support ANSI interval types

2021-07-20 Thread GitBox
SparkQA commented on pull request #33419: URL: https://github.com/apache/spark/pull/33419#issuecomment-883919240 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45894/ -- This is an automated message from the A

[GitHub] [spark] AmplabJenkins commented on pull request #33419: [SPARK-36208][SQL] SparkScriptTransformation should support ANSI interval types

2021-07-20 Thread GitBox
AmplabJenkins commented on pull request #33419: URL: https://github.com/apache/spark/pull/33419#issuecomment-883919274 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45894/ -- T

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #33363: [SPARK-36156][SQL] SCRIPT TRANSFORM ROW FORMAT DELIMITED should respect `NULL DEFINED AS` and default value should be `\N`

2021-07-20 Thread GitBox
AngersZhuuuu commented on a change in pull request #33363: URL: https://github.com/apache/spark/pull/33363#discussion_r673684054 ## File path: sql/core/src/test/resources/sql-tests/results/transform.sql.out ## @@ -202,9 +202,9 @@ FROM t -- !query schema struct -- !query outp

[GitHub] [spark] cloud-fan closed pull request #33442: [SPARK-36020][SQL][FOLLOWUP] RemoveRedundantProjects should retain the LOGICAL_PLAN_TAG tag

2021-07-20 Thread GitBox
cloud-fan closed pull request #33442: URL: https://github.com/apache/spark/pull/33442 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsu

[GitHub] [spark] cloud-fan commented on pull request #33442: [SPARK-36020][SQL][FOLLOWUP] RemoveRedundantProjects should retain the LOGICAL_PLAN_TAG tag

2021-07-20 Thread GitBox
cloud-fan commented on pull request #33442: URL: https://github.com/apache/spark/pull/33442#issuecomment-883915691 thanks for the review, merging to master/3.2/3.1! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [spark] cloud-fan commented on a change in pull request #33310: [SPARK-36105][SQL] OptimizeLocalShuffleReader support reading data of multiple mappers in one task

2021-07-20 Thread GitBox
cloud-fan commented on a change in pull request #33310: URL: https://github.com/apache/spark/pull/33310#discussion_r673682713 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/ShuffledRowRDD.scala ## @@ -181,6 +187,9 @@ class ShuffledRowRDD( case Pa

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33449: [SPARK-35310][MLLIB] Update to breeze 1.2

2021-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #33449: URL: https://github.com/apache/spark/pull/33449#issuecomment-883912481 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141371/ -

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33445: [SPARK-36228][SQL] Skip splitting a skewed partition when some map outputs are removed

2021-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #33445: URL: https://github.com/apache/spark/pull/33445#issuecomment-883912477 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45889/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33310: [SPARK-36105][SQL] OptimizeLocalShuffleReader support reading data of multiple mappers in one task

2021-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #33310: URL: https://github.com/apache/spark/pull/33310#issuecomment-883912480 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141365/ -

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33436: [SPARK-35912][SQL] Fix nullability of `spark.read.json/spark.read.csv`

2021-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #33436: URL: https://github.com/apache/spark/pull/33436#issuecomment-883912478 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45890/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33422: [SPARK-34806][SQL] Add Observation helper for Dataset.observe

2021-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #33422: URL: https://github.com/apache/spark/pull/33422#issuecomment-883757362 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use t

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33352: [SPARK-34952][SQL] DSv2 Aggregate push down APIs

2021-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #33352: URL: https://github.com/apache/spark/pull/33352#issuecomment-883912474 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45891/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33453: [SPARK-36030][SQL][FOLLOW-UP] Remove duplicated test suite

2021-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #33453: URL: https://github.com/apache/spark/pull/33453#issuecomment-883913062 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

[GitHub] [spark] SparkQA removed a comment on pull request #33310: [SPARK-36105][SQL] OptimizeLocalShuffleReader support reading data of multiple mappers in one task

2021-07-20 Thread GitBox
SparkQA removed a comment on pull request #33310: URL: https://github.com/apache/spark/pull/33310#issuecomment-883758947 **[Test build #141365 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141365/testReport)** for PR 33310 at commit [`113f0c8`](https://gi

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-883912475 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

[GitHub] [spark] SparkQA removed a comment on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-20 Thread GitBox
SparkQA removed a comment on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-883707928 **[Test build #141360 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141360/testReport)** for PR 33447 at commit [`8aa69dc`](https://gi

[GitHub] [spark] SparkQA commented on pull request #33422: [SPARK-34806][SQL] Add Observation helper for Dataset.observe

2021-07-20 Thread GitBox
SparkQA commented on pull request #33422: URL: https://github.com/apache/spark/pull/33422#issuecomment-883913412 **[Test build #141389 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141389/testReport)** for PR 33422 at commit [`4d78f0c`](https://github.com

[GitHub] [spark] gengliangwang commented on pull request #33444: [WIP][SPARK-36227][SQL][3.2] Remove TimestampNTZ type support in Spark 3.2

2021-07-20 Thread GitBox
gengliangwang commented on pull request #33444: URL: https://github.com/apache/spark/pull/33444#issuecomment-883913439 @dongjoon-hyun @HyukjinKwon I think this is almost all of them. I am still double-checking. -- This is an automated message from the Apache Git Service. To respond to th

[GitHub] [spark] SparkQA commented on pull request #33352: [SPARK-34952][SQL] DSv2 Aggregate push down APIs

2021-07-20 Thread GitBox
SparkQA commented on pull request #33352: URL: https://github.com/apache/spark/pull/33352#issuecomment-883913455 **[Test build #141390 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141390/testReport)** for PR 33352 at commit [`07ce59e`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #33444: [WIP][SPARK-36227][SQL][3.2] Remove TimestampNTZ type support in Spark 3.2

2021-07-20 Thread GitBox
SparkQA commented on pull request #33444: URL: https://github.com/apache/spark/pull/33444#issuecomment-883913336 **[Test build #141388 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141388/testReport)** for PR 33444 at commit [`216687c`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #33455: [SPARK-36236] Additional metrics for RocksDB based state store implementation

2021-07-20 Thread GitBox
SparkQA commented on pull request #33455: URL: https://github.com/apache/spark/pull/33455#issuecomment-883913295 **[Test build #141387 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141387/testReport)** for PR 33455 at commit [`6fbb8e9`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #33456: [SPARK-35815][SQL] Allow delayThreshold for watermark to be represented as ANSI interval literals

2021-07-20 Thread GitBox
SparkQA commented on pull request #33456: URL: https://github.com/apache/spark/pull/33456#issuecomment-883913292 **[Test build #141386 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141386/testReport)** for PR 33456 at commit [`3a21aee`](https://github.com

[GitHub] [spark] AmplabJenkins commented on pull request #33453: [SPARK-36030][SQL][FOLLOW-UP] Remove duplicated test suite

2021-07-20 Thread GitBox
AmplabJenkins commented on pull request #33453: URL: https://github.com/apache/spark/pull/33453#issuecomment-883913062 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To un

[GitHub] [spark] AmplabJenkins commented on pull request #33454: [SPARK-36030][SQL][FOLLOW-UP][3.2] Remove duplicated test suiteRemove duplicated test suite.

2021-07-20 Thread GitBox
AmplabJenkins commented on pull request #33454: URL: https://github.com/apache/spark/pull/33454#issuecomment-883913122 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45902/ -- T

[GitHub] [spark] AmplabJenkins commented on pull request #33445: [SPARK-36228][SQL] Skip splitting a skewed partition when some map outputs are removed

2021-07-20 Thread GitBox
AmplabJenkins commented on pull request #33445: URL: https://github.com/apache/spark/pull/33445#issuecomment-883912477 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45889/ -- T

[GitHub] [spark] AmplabJenkins commented on pull request #33310: [SPARK-36105][SQL] OptimizeLocalShuffleReader support reading data of multiple mappers in one task

2021-07-20 Thread GitBox
AmplabJenkins commented on pull request #33310: URL: https://github.com/apache/spark/pull/33310#issuecomment-883912480 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141365/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #33449: [SPARK-35310][MLLIB] Update to breeze 1.2

2021-07-20 Thread GitBox
AmplabJenkins commented on pull request #33449: URL: https://github.com/apache/spark/pull/33449#issuecomment-883912481 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141371/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #33352: [SPARK-34952][SQL] DSv2 Aggregate push down APIs

2021-07-20 Thread GitBox
AmplabJenkins commented on pull request #33352: URL: https://github.com/apache/spark/pull/33352#issuecomment-883912474 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45891/ -- T

[GitHub] [spark] AmplabJenkins commented on pull request #33436: [SPARK-35912][SQL] Fix nullability of `spark.read.json/spark.read.csv`

2021-07-20 Thread GitBox
AmplabJenkins commented on pull request #33436: URL: https://github.com/apache/spark/pull/33436#issuecomment-883912478 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45890/ -- T

[GitHub] [spark] AmplabJenkins commented on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-20 Thread GitBox
AmplabJenkins commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-883912475 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To un

[GitHub] [spark] cloud-fan commented on a change in pull request #33310: [SPARK-36105][SQL] OptimizeLocalShuffleReader support reading data of multiple mappers in one task

2021-07-20 Thread GitBox
cloud-fan commented on a change in pull request #33310: URL: https://github.com/apache/spark/pull/33310#discussion_r673680582 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/ShuffledRowRDD.scala ## @@ -181,6 +187,9 @@ class ShuffledRowRDD( case Pa

[GitHub] [spark] tdas commented on pull request #33336: [SPARK-36132][SS][SQL] Support initial state for batch mode of flatMapGroupsWithState

2021-07-20 Thread GitBox
tdas commented on pull request #33336: URL: https://github.com/apache/spark/pull/33336#issuecomment-883911168 merged to master and backported to 3.2 since this is minor change but a good improvement to new feature added in 3.2 -- This is an automated message from the Apache Git S

[GitHub] [spark] cloud-fan commented on a change in pull request #33363: [SPARK-36156][SQL] SCRIPT TRANSFORM ROW FORMAT DELIMITED should respect `NULL DEFINED AS` and default value should be `\N`

2021-07-20 Thread GitBox
cloud-fan commented on a change in pull request #33363: URL: https://github.com/apache/spark/pull/33363#discussion_r673679364 ## File path: sql/core/src/test/resources/sql-tests/inputs/transform.sql ## @@ -121,6 +121,38 @@ USING 'cat' AS (d) NULL DEFINED AS 'NULL' FROM t;

[GitHub] [spark] cloud-fan commented on a change in pull request #33363: [SPARK-36156][SQL] SCRIPT TRANSFORM ROW FORMAT DELIMITED should respect `NULL DEFINED AS` and default value should be `\N`

2021-07-20 Thread GitBox
cloud-fan commented on a change in pull request #33363: URL: https://github.com/apache/spark/pull/33363#discussion_r673678993 ## File path: sql/core/src/test/resources/sql-tests/results/transform.sql.out ## @@ -202,9 +202,9 @@ FROM t -- !query schema struct -- !query output

[GitHub] [spark] SparkQA commented on pull request #33310: [SPARK-36105][SQL] OptimizeLocalShuffleReader support reading data of multiple mappers in one task

2021-07-20 Thread GitBox
SparkQA commented on pull request #33310: URL: https://github.com/apache/spark/pull/33310#issuecomment-883909935 **[Test build #141365 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141365/testReport)** for PR 33310 at commit [`113f0c8`](https://github.co

[GitHub] [spark] SparkQA commented on pull request #33450: [SPARK-35809][PYTHON] Add `index_col` argument for ps.sql

2021-07-20 Thread GitBox
SparkQA commented on pull request #33450: URL: https://github.com/apache/spark/pull/33450#issuecomment-883909896 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45895/ -- This is an automated message from the Apache

[GitHub] [spark] asfgit closed pull request #33336: [SPARK-36132][SS][SQL] Support initial state for batch mode of flatMapGroupsWithState

2021-07-20 Thread GitBox
asfgit closed pull request #33336: URL: https://github.com/apache/spark/pull/33336 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubsc

[GitHub] [spark] cloud-fan commented on a change in pull request #33442: [SPARK-36020][SQL][FOLLOWUP] RemoveRedundantProjects should retain the LOGICAL_PLAN_TAG tag

2021-07-20 Thread GitBox
cloud-fan commented on a change in pull request #33442: URL: https://github.com/apache/spark/pull/33442#discussion_r673677898 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/RemoveRedundantProjects.scala ## @@ -49,7 +49,11 @@ object RemoveRedundantProjects

[GitHub] [spark] cloud-fan commented on pull request #33362: [SPARK-36153][SQL][DOCS] Update transform doc to match the current code

2021-07-20 Thread GitBox
cloud-fan commented on pull request #33362: URL: https://github.com/apache/spark/pull/33362#issuecomment-883908351 @AngersZhuuuu can you open a backport PR? thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [spark] SparkQA commented on pull request #33436: [SPARK-35912][SQL] Fix nullability of `spark.read.json/spark.read.csv`

2021-07-20 Thread GitBox
SparkQA commented on pull request #33436: URL: https://github.com/apache/spark/pull/33436#issuecomment-883903472 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45890/ -- This is an automated message from the A

[GitHub] [spark] viirya closed pull request #33454: [SPARK-36030][SQL][FOLLOW-UP][3.2] Remove duplicated test suiteRemove duplicated test suite.

2021-07-20 Thread GitBox
viirya closed pull request #33454: URL: https://github.com/apache/spark/pull/33454 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubsc

[GitHub] [spark] SparkQA commented on pull request #33352: [SPARK-34952][SQL] DSv2 Aggregate push down APIs

2021-07-20 Thread GitBox
SparkQA commented on pull request #33352: URL: https://github.com/apache/spark/pull/33352#issuecomment-883902352 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45891/ -- This is an automated message from the A

[GitHub] [spark] viirya commented on pull request #33454: [SPARK-36030][SQL][FOLLOW-UP][3.2] Remove duplicated test suiteRemove duplicated test suite.

2021-07-20 Thread GitBox
viirya commented on pull request #33454: URL: https://github.com/apache/spark/pull/33454#issuecomment-883901990 Thanks @HyukjinKwon. Merging to branch-3.2. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [spark] sarutak opened a new pull request #33456: [SPARK-35815][SQL] Allow delayThreshold for watermark to be represented as ANSI interval literals

2021-07-20 Thread GitBox
sarutak opened a new pull request #33456: URL: https://github.com/apache/spark/pull/33456 ### What changes were proposed in this pull request? This PR extends the way to represent `delayThreshold` with ANSI interval literals for watermark. ### Why are the changes needed?

[GitHub] [spark] vkorukanti opened a new pull request #33455: [SPARK-36236] Additional metrics for RocksDB based state store implementation

2021-07-20 Thread GitBox
vkorukanti opened a new pull request #33455: URL: https://github.com/apache/spark/pull/33455 ### What changes were proposed in this pull request? Proposing adding new metrics to `customMetrics` under the `stateOperators` in `StreamingQueryProgress` event These metrics help ha

[GitHub] [spark] SparkQA commented on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-20 Thread GitBox
SparkQA commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-883900452 **[Test build #141360 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141360/testReport)** for PR 33447 at commit [`8aa69dc`](https://github.co

[GitHub] [spark] SparkQA commented on pull request #33419: [SPARK-36208][SQL] SparkScriptTransformation should support ANSI interval types

2021-07-20 Thread GitBox
SparkQA commented on pull request #33419: URL: https://github.com/apache/spark/pull/33419#issuecomment-883900279 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45894/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA removed a comment on pull request #33449: [SPARK-35310][MLLIB] Update to breeze 1.2

2021-07-20 Thread GitBox
SparkQA removed a comment on pull request #33449: URL: https://github.com/apache/spark/pull/33449#issuecomment-883828834 **[Test build #141371 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141371/testReport)** for PR 33449 at commit [`9bf2482`](https://gi

[GitHub] [spark] SparkQA removed a comment on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-20 Thread GitBox
SparkQA removed a comment on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-883703479 **[Test build #141357 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141357/testReport)** for PR 33447 at commit [`ad6dbaa`](https://gi

[GitHub] [spark] SparkQA commented on pull request #33449: [SPARK-35310][MLLIB] Update to breeze 1.2

2021-07-20 Thread GitBox
SparkQA commented on pull request #33449: URL: https://github.com/apache/spark/pull/33449#issuecomment-883897120 **[Test build #141371 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141371/testReport)** for PR 33449 at commit [`9bf2482`](https://github.co

[GitHub] [spark] SparkQA commented on pull request #33445: [SPARK-36228][SQL] Skip splitting a skewed partition when some map outputs are removed

2021-07-20 Thread GitBox
SparkQA commented on pull request #33445: URL: https://github.com/apache/spark/pull/33445#issuecomment-883897144 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45889/ -- This is an automated message from the A

[GitHub] [spark] SparkQA commented on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-20 Thread GitBox
SparkQA commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-883897114 **[Test build #141357 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141357/testReport)** for PR 33447 at commit [`ad6dbaa`](https://github.co

[GitHub] [spark] viirya commented on pull request #33454: [SPARK-36030][SQL][FOLLOW-UP][3.2] Remove duplicated test suite

2021-07-20 Thread GitBox
viirya commented on pull request #33454: URL: https://github.com/apache/spark/pull/33454#issuecomment-883896406 cc @HyukjinKwon -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comm

[GitHub] [spark] viirya opened a new pull request #33454: [SPARK-36030][SQL][FOLLOW-UP][3.2] Remove duplicated test suite

2021-07-20 Thread GitBox
viirya opened a new pull request #33454: URL: https://github.com/apache/spark/pull/33454 ### What changes were proposed in this pull request? Removes `FileFormatDataWriterMetricSuite`, which is duplicated. ### Why are the changes needed? `FileFormatDataWriter

[GitHub] [spark] SparkQA commented on pull request #33451: [SPARK-36206][CORE] Support shuffle data corruption diagnosis via shuffle checksum

2021-07-20 Thread GitBox
SparkQA commented on pull request #33451: URL: https://github.com/apache/spark/pull/33451#issuecomment-883895963 **[Test build #141385 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141385/testReport)** for PR 33451 at commit [`df30c80`](https://github.com

[GitHub] [spark] HyukjinKwon commented on pull request #33453: [SPARK-36030][SQL][FOLLOW-UP] Remove duplicated test suite

2021-07-20 Thread GitBox
HyukjinKwon commented on pull request #33453: URL: https://github.com/apache/spark/pull/33453#issuecomment-883895851 lol I think you can just directly push it to. PR is fine either way -- This is an automated message from the Apache Git Service. To respond to the message, please log on to Git

[GitHub] [spark] viirya commented on pull request #33453: [SPARK-36030][SQL][FOLLOW-UP] Remove duplicated test suite

2021-07-20 Thread GitBox
viirya commented on pull request #33453: URL: https://github.com/apache/spark/pull/33453#issuecomment-883895474 Sorry, I typed too quickly when the merge script asked whether to pick to another branch. I will open another for 3.2... -- This is an automated message from the Apache Git Service. T

[GitHub] [spark] HyukjinKwon commented on a change in pull request #33450: [SPARK-35809] Add `index_col` argument for ps.sql

2021-07-20 Thread GitBox
HyukjinKwon commented on a change in pull request #33450: URL: https://github.com/apache/spark/pull/33450#discussion_r673665623 ## File path: python/pyspark/pandas/sql_processor.py ## @@ -65,6 +66,9 @@ def sql( -- query : str the SQL query +index_

[GitHub] [spark] viirya closed pull request #33453: [SPARK-36030][SQL][FOLLOW-UP] Remove duplicated test suite

2021-07-20 Thread GitBox
viirya closed pull request #33453: URL: https://github.com/apache/spark/pull/33453 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubsc

[GitHub] [spark] HyukjinKwon commented on pull request #33453: [SPARK-36030][SQL][FOLLOW-UP] Remove duplicated test suite

2021-07-20 Thread GitBox
HyukjinKwon commented on pull request #33453: URL: https://github.com/apache/spark/pull/33453#issuecomment-883894784 Yeah +1 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [spark] viirya commented on pull request #33453: [SPARK-36030][SQL][FOLLOW-UP] Remove duplicated test suite

2021-07-20 Thread GitBox
viirya commented on pull request #33453: URL: https://github.com/apache/spark/pull/33453#issuecomment-883894502 Thanks @HyukjinKwon. This only removed duplicated test suite. I'm going to merge this. -- This is an automated message from the Apache Git Service. To respond to the message, p

[GitHub] [spark] SparkQA commented on pull request #33453: [SPARK-36030][SQL][FOLLOW-UP] Remove duplicated test suite

2021-07-20 Thread GitBox
SparkQA commented on pull request #33453: URL: https://github.com/apache/spark/pull/33453#issuecomment-883894277 **[Test build #141384 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141384/testReport)** for PR 33453 at commit [`377bec3`](https://github.com

[GitHub] [spark] HyukjinKwon commented on pull request #33452: [SPARK-36030][SQL][FOLLOW-UP] Avoid procedure syntax deprecated in Scala 2.13

2021-07-20 Thread GitBox
HyukjinKwon commented on pull request #33452: URL: https://github.com/apache/spark/pull/33452#issuecomment-883893691 Merged to master and branch-3.2. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [spark] HyukjinKwon closed pull request #33452: [SPARK-36030][SQL][FOLLOW-UP] Avoid procedure syntax deprecated in Scala 2.13

2021-07-20 Thread GitBox
HyukjinKwon closed pull request #33452: URL: https://github.com/apache/spark/pull/33452 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-un

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33451: [SPARK-36206][CORE] Support shuffle data corruption diagnosis via shuffle checksum

2021-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #33451: URL: https://github.com/apache/spark/pull/33451#issuecomment-883892948 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141382/ -

[GitHub] [spark] SparkQA removed a comment on pull request #33451: [SPARK-36206][CORE] Support shuffle data corruption diagnosis via shuffle checksum

2021-07-20 Thread GitBox
SparkQA removed a comment on pull request #33451: URL: https://github.com/apache/spark/pull/33451#issuecomment-883889093 **[Test build #141382 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141382/testReport)** for PR 33451 at commit [`7220442`](https://gi

[GitHub] [spark] viirya commented on pull request #33453: [SPARK-36030][SQL][FOLLOW-UP] Remove duplicated test suite

2021-07-20 Thread GitBox
viirya commented on pull request #33453: URL: https://github.com/apache/spark/pull/33453#issuecomment-883893149 cc @HyukjinKwon -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comm

[GitHub] [spark] AmplabJenkins commented on pull request #33451: [SPARK-36206][CORE] Support shuffle data corruption diagnosis via shuffle checksum

2021-07-20 Thread GitBox
AmplabJenkins commented on pull request #33451: URL: https://github.com/apache/spark/pull/33451#issuecomment-883892948 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141382/ -- This

[GitHub] [spark] viirya opened a new pull request #33453: [SPARK-36030][SQL][FOLLOW-UP] Remove duplicated test suite

2021-07-20 Thread GitBox
viirya opened a new pull request #33453: URL: https://github.com/apache/spark/pull/33453 ### What changes were proposed in this pull request? Removes `FileFormatDataWriterMetricSuite`, which is duplicated. ### Why are the changes needed? `FileFormatDataWriter

[GitHub] [spark] SparkQA commented on pull request #33451: [SPARK-36206][CORE] Support shuffle data corruption diagnosis via shuffle checksum

2021-07-20 Thread GitBox
SparkQA commented on pull request #33451: URL: https://github.com/apache/spark/pull/33451#issuecomment-883892909 **[Test build #141382 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141382/testReport)** for PR 33451 at commit [`7220442`](https://github.co

[GitHub] [spark] HyukjinKwon commented on pull request #33452: [SPARK-36030][SQL][FOLLOW-UP] Avoid procedure syntax deprecated in Scala 2.13

2021-07-20 Thread GitBox
HyukjinKwon commented on pull request #33452: URL: https://github.com/apache/spark/pull/33452#issuecomment-883892829 Let me merge this. I believe the compilation success should be enough to test this PR out. -- This is an automated message from the Apache Git Service. To respond to t

[GitHub] [spark] SparkQA commented on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-20 Thread GitBox
SparkQA commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-883892606 **[Test build #141383 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141383/testReport)** for PR 33447 at commit [`85e2c98`](https://github.com

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33451: [SPARK-36206][CORE] Support shuffle data corruption diagnosis via shuffle checksum

2021-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #33451: URL: https://github.com/apache/spark/pull/33451#issuecomment-883890969 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45897/
