[GitHub] [spark] SparkQA commented on pull request #33364: [SPARK-36161][PYTHON] Add type check on dropDuplicates pyspark function

2021-07-26 Thread GitBox
SparkQA commented on pull request #33364: URL: https://github.com/apache/spark/pull/33364#issuecomment-886845510 **[Test build #141648 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141648/testReport)** for PR 33364 at commit

[GitHub] [spark] SparkQA commented on pull request #33451: [SPARK-36206][CORE] Support shuffle data corruption diagnosis via shuffle checksum

2021-07-26 Thread GitBox
SparkQA commented on pull request #33451: URL: https://github.com/apache/spark/pull/33451#issuecomment-886845413 **[Test build #141647 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141647/testReport)** for PR 33451 at commit

[GitHub] [spark] SparkQA commented on pull request #33522: [SPARK-36290][SQL] Push down join condition evaluation

2021-07-26 Thread GitBox
SparkQA commented on pull request #33522: URL: https://github.com/apache/spark/pull/33522#issuecomment-886845381 **[Test build #141646 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141646/testReport)** for PR 33522 at commit

[GitHub] [spark] SparkQA commented on pull request #33523: [SPARK-35259][SHUFFLE][3.1] Update ExternalBlockHandler Timer variables to expose correct units

2021-07-26 Thread GitBox
SparkQA commented on pull request #33523: URL: https://github.com/apache/spark/pull/33523#issuecomment-886845315 **[Test build #141645 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141645/testReport)** for PR 33523 at commit

[GitHub] [spark] SparkQA commented on pull request #33524: [SPARK-35259][SHUFFLE][3.0] Update ExternalBlockHandler Timer variables to expose correct units

2021-07-26 Thread GitBox
SparkQA commented on pull request #33524: URL: https://github.com/apache/spark/pull/33524#issuecomment-886845233 **[Test build #141644 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141644/testReport)** for PR 33524 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33364: [SPARK-36161][PYTHON] Add type check on dropDuplicates pyspark function

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33364: URL: https://github.com/apache/spark/pull/33364#issuecomment-886842529 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46157/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33468: [SPARK-36247][SQL] Check string length for char/varchar and apply type coercion in UPDATE/MERGE command

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33468: URL: https://github.com/apache/spark/pull/33468#issuecomment-886842532 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46156/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33399: [SPARK-36211][PYTHON] Correct typing of `udf` return value

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33399: URL: https://github.com/apache/spark/pull/33399#issuecomment-886842528 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141642/

[GitHub] [spark] AmplabJenkins commented on pull request #33399: [SPARK-36211][PYTHON] Correct typing of `udf` return value

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33399: URL: https://github.com/apache/spark/pull/33399#issuecomment-886842528 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141642/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #33468: [SPARK-36247][SQL] Check string length for char/varchar and apply type coercion in UPDATE/MERGE command

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33468: URL: https://github.com/apache/spark/pull/33468#issuecomment-886842532 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46156/ --

[GitHub] [spark] AmplabJenkins commented on pull request #33364: [SPARK-36161][PYTHON] Add type check on dropDuplicates pyspark function

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33364: URL: https://github.com/apache/spark/pull/33364#issuecomment-886842529 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46157/ --

[GitHub] [spark] ekoifman commented on a change in pull request #32776: [SPARK-35639][SQL] Add metrics about coalesced partitions to CustomShuffleReader in AQE

2021-07-26 Thread GitBox
ekoifman commented on a change in pull request #32776: URL: https://github.com/apache/spark/pull/32776#discussion_r676750494 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/CustomShuffleReaderExec.scala ## @@ -182,6 +193,17 @@ case class

[GitHub] [spark] Ngone51 commented on pull request #33451: [SPARK-36206][CORE] Support shuffle data corruption diagnosis via shuffle checksum

2021-07-26 Thread GitBox
Ngone51 commented on pull request #33451: URL: https://github.com/apache/spark/pull/33451#issuecomment-886839493 FYI, there's a major change after addressing https://github.com/apache/spark/pull/33451#discussion_r676354691: Previously, we'd diagnose corruption when the first

[GitHub] [spark] gengliangwang commented on a change in pull request #33385: [SPARK-36173][CORE] Support getting CPU number in TaskContext

2021-07-26 Thread GitBox
gengliangwang commented on a change in pull request #33385: URL: https://github.com/apache/spark/pull/33385#discussion_r676746304 ## File path: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ## @@ -418,6 +418,8 @@ private[spark] class TaskSetManager( *

[GitHub] [spark] SparkQA commented on pull request #33399: [SPARK-36211][PYTHON] Correct typing of `udf` return value

2021-07-26 Thread GitBox
SparkQA commented on pull request #33399: URL: https://github.com/apache/spark/pull/33399#issuecomment-886834883 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46158/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #33364: [SPARK-36161][PYTHON] Add type check on dropDuplicates pyspark function

2021-07-26 Thread GitBox
SparkQA commented on pull request #33364: URL: https://github.com/apache/spark/pull/33364#issuecomment-886833822 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46159/ -- This is an automated message from the Apache

[GitHub] [spark] gengliangwang commented on a change in pull request #33385: [SPARK-36173][CORE] Support getting CPU number in TaskContext

2021-07-26 Thread GitBox
gengliangwang commented on a change in pull request #33385: URL: https://github.com/apache/spark/pull/33385#discussion_r676744954 ## File path: core/src/main/scala/org/apache/spark/TaskContext.scala ## @@ -177,6 +177,12 @@ abstract class TaskContext extends Serializable {

[GitHub] [spark] SparkQA commented on pull request #33468: [SPARK-36247][SQL] Check string length for char/varchar and apply type coercion in UPDATE/MERGE command

2021-07-26 Thread GitBox
SparkQA commented on pull request #33468: URL: https://github.com/apache/spark/pull/33468#issuecomment-886832090 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46156/ -- This is an automated message from the

[GitHub] [spark] dgd-contributor commented on a change in pull request #33317: [SPARK-36095][CORE] Grouping exception in core/rdd

2021-07-26 Thread GitBox
dgd-contributor commented on a change in pull request #33317: URL: https://github.com/apache/spark/pull/33317#discussion_r676743275 ## File path: core/src/main/scala/org/apache/spark/errors/SparkCoreErrors.scala ## @@ -0,0 +1,140 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] xkrogen commented on pull request #33524: [SPARK-35259][SHUFFLE][3.0] Update ExternalBlockHandler Timer variables to expose correct units

2021-07-26 Thread GitBox
xkrogen commented on pull request #33524: URL: https://github.com/apache/spark/pull/33524#issuecomment-886830044 fyi @Ngone51 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] xkrogen commented on pull request #33523: [SPARK-35259][SHUFFLE][3.1] Update ExternalBlockHandler Timer variables to expose correct units

2021-07-26 Thread GitBox
xkrogen commented on pull request #33523: URL: https://github.com/apache/spark/pull/33523#issuecomment-886830118 fyi @Ngone51 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] xkrogen edited a comment on pull request #33116: [SPARK-35259][SHUFFLE] Update ExternalBlockHandler Timer variables to expose correct units

2021-07-26 Thread GitBox
xkrogen edited a comment on pull request #33116: URL: https://github.com/apache/spark/pull/33116#issuecomment-886827252 Awesome, many thanks @Ngone51 and @dongjoon-hyun ! Backport PR for 3.1: #33523 Backport PR for 3.0: #33524 -- This is an automated message from the Apache Git

[GitHub] [spark] xkrogen opened a new pull request #33524: [SPARK-35259][SHUFFLE][3.0] Update ExternalBlockHandler Timer variables to expose correct units

2021-07-26 Thread GitBox
xkrogen opened a new pull request #33524: URL: https://github.com/apache/spark/pull/33524 `ExternalBlockHandler` exposes 2 metrics which are Dropwizard `Timer` metrics, and are named with a `millis` suffix: ``` private final Timer openBlockRequestLatencyMillis = new Timer();

[GitHub] [spark] xkrogen commented on pull request #33116: [SPARK-35259][SHUFFLE] Update ExternalBlockHandler Timer variables to expose correct units

2021-07-26 Thread GitBox
xkrogen commented on pull request #33116: URL: https://github.com/apache/spark/pull/33116#issuecomment-886827252 Awesome, many thanks @Ngone51 and @dongjoon-hyun ! Backport PR for 3.1: #33523 -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] xkrogen opened a new pull request #33523: [SPARK-35259][SHUFFLE][3.1] Update ExternalBlockHandler Timer variables to expose correct units

2021-07-26 Thread GitBox
xkrogen opened a new pull request #33523: URL: https://github.com/apache/spark/pull/33523 `ExternalBlockHandler` exposes 2 metrics which are Dropwizard `Timer` metrics, and are named with a `millis` suffix: ``` private final Timer openBlockRequestLatencyMillis = new Timer();

[GitHub] [spark] tgravescs commented on a change in pull request #33385: [SPARK-36173][CORE] Support getting CPU number in TaskContext

2021-07-26 Thread GitBox
tgravescs commented on a change in pull request #33385: URL: https://github.com/apache/spark/pull/33385#discussion_r676738629 ## File path: core/src/main/scala/org/apache/spark/TaskContext.scala ## @@ -177,6 +177,12 @@ abstract class TaskContext extends Serializable { */

[GitHub] [spark] gengliangwang commented on pull request #33457: [SPARK-36237][UI][SQL] Attach and start handler after application started in UI

2021-07-26 Thread GitBox
gengliangwang commented on pull request #33457: URL: https://github.com/apache/spark/pull/33457#issuecomment-886826686 > Shall we make the RESTFUL request hang and the web page loading if the spark application is not fully started? Or we can just redirect to a page saying Spark is

[GitHub] [spark] Ngone51 commented on a change in pull request #33451: [SPARK-36206][CORE] Support shuffle data corruption diagnosis via shuffle checksum

2021-07-26 Thread GitBox
Ngone51 commented on a change in pull request #33451: URL: https://github.com/apache/spark/pull/33451#discussion_r676735501 ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/checksum/ShuffleChecksumHelper.java ## @@ -0,0 +1,160 @@ +/* + *

[GitHub] [spark] tgravescs commented on pull request #33385: [SPARK-36173][CORE] Support getting CPU number in TaskContext

2021-07-26 Thread GitBox
tgravescs commented on pull request #33385: URL: https://github.com/apache/spark/pull/33385#issuecomment-886823433 @xwu99 can you rekick the tests -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [spark] Ngone51 commented on a change in pull request #33451: [SPARK-36206][CORE] Support shuffle data corruption diagnosis via shuffle checksum

2021-07-26 Thread GitBox
Ngone51 commented on a change in pull request #33451: URL: https://github.com/apache/spark/pull/33451#discussion_r676734662 ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java ## @@ -374,6 +379,27 @@ public int

[GitHub] [spark] Ngone51 commented on a change in pull request #33451: [SPARK-36206][CORE] Support shuffle data corruption diagnosis via shuffle checksum

2021-07-26 Thread GitBox
Ngone51 commented on a change in pull request #33451: URL: https://github.com/apache/spark/pull/33451#discussion_r676733053 ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java ## @@ -374,6 +379,45 @@ public int

[GitHub] [spark] Ngone51 commented on a change in pull request #33451: [SPARK-36206][CORE] Support shuffle data corruption diagnosis via shuffle checksum

2021-07-26 Thread GitBox
Ngone51 commented on a change in pull request #33451: URL: https://github.com/apache/spark/pull/33451#discussion_r676732889 ## File path: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala ## @@ -971,7 +1000,50 @@ final class

[GitHub] [spark] gengliangwang commented on pull request #33457: [SPARK-36237][UI][SQL] Attach and start handler after application started in UI

2021-07-26 Thread GitBox
gengliangwang commented on pull request #33457: URL: https://github.com/apache/spark/pull/33457#issuecomment-886819852 > With this 500 and error stack in the log makes user confused too.. they always ask me if there is something wrong. At least before the changes it shows hint

[GitHub] [spark] SparkQA removed a comment on pull request #33399: [SPARK-36211][PYTHON] Correct typing of `udf` return value

2021-07-26 Thread GitBox
SparkQA removed a comment on pull request #33399: URL: https://github.com/apache/spark/pull/33399#issuecomment-886797008 **[Test build #141642 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141642/testReport)** for PR 33399 at commit

[GitHub] [spark] SparkQA commented on pull request #33399: [SPARK-36211][PYTHON] Correct typing of `udf` return value

2021-07-26 Thread GitBox
SparkQA commented on pull request #33399: URL: https://github.com/apache/spark/pull/33399#issuecomment-886818091 **[Test build #141642 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141642/testReport)** for PR 33399 at commit

[GitHub] [spark] gengliangwang commented on pull request #33457: [SPARK-36237][UI][SQL] Attach and start handler after application started in UI

2021-07-26 Thread GitBox
gengliangwang commented on pull request #33457: URL: https://github.com/apache/spark/pull/33457#issuecomment-886817220 > When we use prometheus to fetch metrics, always pull data before application started. > we need to start server and bind port before taskScheduler started for

[GitHub] [spark] HyukjinKwon commented on pull request #33429: [SPARK-36217][SQL] Rename CustomShuffleReader and OptimizeLocalShuffleReader in AQE

2021-07-26 Thread GitBox
HyukjinKwon commented on pull request #33429: URL: https://github.com/apache/spark/pull/33429#issuecomment-886816675 Thanks Wenchen! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] SparkQA commented on pull request #33364: [SPARK-36161][PYTHON] Add type check on dropDuplicates pyspark function

2021-07-26 Thread GitBox
SparkQA commented on pull request #33364: URL: https://github.com/apache/spark/pull/33364#issuecomment-886816264 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46157/ -- This is an automated message from the

[GitHub] [spark] wangyum opened a new pull request #33522: [SPARK-36290][SQL] Push down join condition evaluation

2021-07-26 Thread GitBox
wangyum opened a new pull request #33522: URL: https://github.com/apache/spark/pull/33522 ### What changes were proposed in this pull request? The expressions in a join condition may be evaluated three times (`ShuffleExchangeExec`, `SortExec` and the join itself). This PR adds a new

[GitHub] [spark] cloud-fan commented on a change in pull request #33188: [SPARK-35989][SQL] Only remove redundant shuffle if shuffle origin is REPARTITION_BY_COL in AQE

2021-07-26 Thread GitBox
cloud-fan commented on a change in pull request #33188: URL: https://github.com/apache/spark/pull/33188#discussion_r676713243 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala ## @@ -250,7 +250,12 @@ object

[GitHub] [spark] dgd-contributor commented on a change in pull request #33459: [SPARK-36229][SQL] conv() inconsistently handles invalid strings with more than 64 invalid characters and return wrong va

2021-07-26 Thread GitBox
dgd-contributor commented on a change in pull request #33459: URL: https://github.com/apache/spark/pull/33459#discussion_r676684793 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/NumberConverter.scala ## @@ -89,6 +91,10 @@ object NumberConverter

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33364: [SPARK-36161][PYTHON] Add type check on dropDuplicates pyspark function

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33364: URL: https://github.com/apache/spark/pull/33364#issuecomment-886798481 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141643/

[GitHub] [spark] SparkQA removed a comment on pull request #33364: [SPARK-36161][PYTHON] Add type check on dropDuplicates pyspark function

2021-07-26 Thread GitBox
SparkQA removed a comment on pull request #33364: URL: https://github.com/apache/spark/pull/33364#issuecomment-886797143 **[Test build #141643 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141643/testReport)** for PR 33364 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #33364: [SPARK-36161][PYTHON] Add type check on dropDuplicates pyspark function

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33364: URL: https://github.com/apache/spark/pull/33364#issuecomment-886798481 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141643/ -- This

[GitHub] [spark] SparkQA commented on pull request #33364: [SPARK-36161][PYTHON] Add type check on dropDuplicates pyspark function

2021-07-26 Thread GitBox
SparkQA commented on pull request #33364: URL: https://github.com/apache/spark/pull/33364#issuecomment-886798445 **[Test build #141643 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141643/testReport)** for PR 33364 at commit

[GitHub] [spark] SparkQA commented on pull request #33399: [SPARK-36211][PYTHON] Correct typing of `udf` return value

2021-07-26 Thread GitBox
SparkQA commented on pull request #33399: URL: https://github.com/apache/spark/pull/33399#issuecomment-886797008 **[Test build #141642 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141642/testReport)** for PR 33399 at commit

[GitHub] [spark] SparkQA commented on pull request #33364: [SPARK-36161][PYTHON] Add type check on dropDuplicates pyspark function

2021-07-26 Thread GitBox
SparkQA commented on pull request #33364: URL: https://github.com/apache/spark/pull/33364#issuecomment-886797143 **[Test build #141643 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141643/testReport)** for PR 33364 at commit

[GitHub] [spark] dgd-contributor commented on a change in pull request #33459: [SPARK-36229][SQL] conv() inconsistently handles invalid strings with more than 64 invalid characters and return wrong va

2021-07-26 Thread GitBox
dgd-contributor commented on a change in pull request #33459: URL: https://github.com/apache/spark/pull/33459#discussion_r676704901 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/NumberConverter.scala ## @@ -49,12 +49,14 @@ object NumberConverter

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33521: [SPARK-36142][PYTHON] Follow Pandas when pow between Series with Na and bool literal

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33521: URL: https://github.com/apache/spark/pull/33521#issuecomment-886794556 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46155/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33364: [SPARK-36161][PYTHON] Add type check on dropDuplicates pyspark function

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33364: URL: https://github.com/apache/spark/pull/33364#issuecomment-886794559 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46154/

[GitHub] [spark] AmplabJenkins commented on pull request #33521: [SPARK-36142][PYTHON] Follow Pandas when pow between Series with Na and bool literal

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33521: URL: https://github.com/apache/spark/pull/33521#issuecomment-886794556 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46155/ --

[GitHub] [spark] AmplabJenkins commented on pull request #33364: [SPARK-36161][PYTHON] Add type check on dropDuplicates pyspark function

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33364: URL: https://github.com/apache/spark/pull/33364#issuecomment-886794559 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46154/ --

[GitHub] [spark] SparkQA commented on pull request #33468: [SPARK-36247][SQL] Check string length for char/varchar and apply type coercion in UPDATE/MERGE command

2021-07-26 Thread GitBox
SparkQA commented on pull request #33468: URL: https://github.com/apache/spark/pull/33468#issuecomment-886791504 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46156/ -- This is an automated message from the Apache

[GitHub] [spark] LuciferYang commented on pull request #33514: [SPARK-36242][CORE][3.0] Ensure spill file closed before set success = true in ExternalSorter.spillMemoryIteratorToDisk method

2021-07-26 Thread GitBox
LuciferYang commented on pull request #33514: URL: https://github.com/apache/spark/pull/33514#issuecomment-886790740 > Curious why the earlier PR could not have been merged to 3.1/3.0 I'm not sure, but there seems to be no code conflict -- This is an automated message from the

[GitHub] [spark] dgd-contributor commented on a change in pull request #33459: [SPARK-36229][SQL] conv() inconsistently handles invalid strings with more than 64 invalid characters and return wrong va

2021-07-26 Thread GitBox
dgd-contributor commented on a change in pull request #33459: URL: https://github.com/apache/spark/pull/33459#discussion_r676686936 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/NumberConverter.scala ## @@ -49,12 +49,14 @@ object NumberConverter

[GitHub] [spark] SparkQA commented on pull request #33364: [SPARK-36161][PYTHON] Add type check on dropDuplicates pyspark function

2021-07-26 Thread GitBox
SparkQA commented on pull request #33364: URL: https://github.com/apache/spark/pull/33364#issuecomment-886788728 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46157/ -- This is an automated message from the Apache

[GitHub] [spark] dgd-contributor commented on a change in pull request #33459: [SPARK-36229][SQL] conv() inconsistently handles invalid strings with more than 64 invalid characters and return wrong va

2021-07-26 Thread GitBox
dgd-contributor commented on a change in pull request #33459: URL: https://github.com/apache/spark/pull/33459#discussion_r676686936 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/NumberConverter.scala ## @@ -49,12 +49,14 @@ object NumberConverter

[GitHub] [spark] dgd-contributor commented on a change in pull request #33459: [SPARK-36229][SQL] conv() inconsistently handles invalid strings with more than 64 invalid characters and return wrong va

2021-07-26 Thread GitBox
dgd-contributor commented on a change in pull request #33459: URL: https://github.com/apache/spark/pull/33459#discussion_r676691828 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/NumberConverter.scala ## @@ -49,12 +49,14 @@ object NumberConverter

[GitHub] [spark] sammyjmoseley commented on pull request #33364: [SPARK-36161][PYTHON] Add type check on dropDuplicates pyspark function

2021-07-26 Thread GitBox
sammyjmoseley commented on pull request #33364: URL: https://github.com/apache/spark/pull/33364#issuecomment-886782497 Can't use an `Iterable` type check because a `string` is iterable. Therefore reverted to checking that the argument is a `tuple` or `list` -- This is an automated message
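
The point in the comment above can be illustrated with a small sketch. This is not the code from the PR: `check_subset` and its error message are hypothetical names made up for illustration, and only the standard library is used. It shows that a `str` passes an `Iterable` check, while a `list`/`tuple` check rejects it.

```python
from collections.abc import Iterable


def check_subset(subset):
    """Hypothetical helper (not the PR's code): accept None, a list, or a tuple."""
    if subset is not None and not isinstance(subset, (list, tuple)):
        # A bare string such as "id" lands here instead of being
        # silently treated as a sequence of one-character column names.
        raise TypeError("Parameter 'subset' must be a list of columns")
    return subset


# A str is itself iterable, so an Iterable check cannot reject it.
assert isinstance("id", Iterable)

# The list/tuple check accepts the intended inputs.
check_subset(["id", "name"])
check_subset(("id",))
```

Under this sketch, calling `check_subset("id")` raises `TypeError`, which is the behaviour the narrower check is meant to provide.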

[GitHub] [spark] SparkQA commented on pull request #33521: [SPARK-36142][PYTHON] Follow Pandas when pow between Series with Na and bool literal

2021-07-26 Thread GitBox
SparkQA commented on pull request #33521: URL: https://github.com/apache/spark/pull/33521#issuecomment-886780883 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46155/ -- This is an automated message from the

[GitHub] [spark] dgd-contributor commented on a change in pull request #33459: [SPARK-36229][SQL] conv() inconsistently handles invalid strings with more than 64 invalid characters and return wrong va

2021-07-26 Thread GitBox
dgd-contributor commented on a change in pull request #33459: URL: https://github.com/apache/spark/pull/33459#discussion_r676688190 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/NumberConverter.scala ## @@ -49,12 +49,14 @@ object NumberConverter

[GitHub] [spark] dgd-contributor commented on a change in pull request #33459: [SPARK-36229][SQL] conv() inconsistently handles invalid strings with more than 64 invalid characters and return wrong va

2021-07-26 Thread GitBox
dgd-contributor commented on a change in pull request #33459: URL: https://github.com/apache/spark/pull/33459#discussion_r676686936 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/NumberConverter.scala ## @@ -49,12 +49,14 @@ object NumberConverter

[GitHub] [spark] dgd-contributor commented on a change in pull request #33459: [SPARK-36229][SQL] conv() inconsistently handles invalid strings with more than 64 invalid characters and return wrong va

2021-07-26 Thread GitBox
dgd-contributor commented on a change in pull request #33459: URL: https://github.com/apache/spark/pull/33459#discussion_r676684793 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/NumberConverter.scala ## @@ -89,6 +91,10 @@ object NumberConverter

[GitHub] [spark] SparkQA commented on pull request #33364: [SPARK-36161][PYTHON] Add type check on dropDuplicates pyspark function

2021-07-26 Thread GitBox
SparkQA commented on pull request #33364: URL: https://github.com/apache/spark/pull/33364#issuecomment-886774176 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46154/ -- This is an automated message from the

[GitHub] [spark] cloud-fan closed pull request #33429: [SPARK-36217][SQL] Rename CustomShuffleReader and OptimizeLocalShuffleReader in AQE

2021-07-26 Thread GitBox
cloud-fan closed pull request #33429: URL: https://github.com/apache/spark/pull/33429 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [spark] cloud-fan commented on pull request #33429: [SPARK-36217][SQL] Rename CustomShuffleReader and OptimizeLocalShuffleReader in AQE

2021-07-26 Thread GitBox
cloud-fan commented on pull request #33429: URL: https://github.com/apache/spark/pull/33429#issuecomment-886764028 thanks, merging to master/3.2! (otherwise backporting AQE changes will be very hard) -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] SparkQA removed a comment on pull request #33364: [SPARK-36161][PYTHON] Add type check on dropDuplicates pyspark function

2021-07-26 Thread GitBox
SparkQA removed a comment on pull request #33364: URL: https://github.com/apache/spark/pull/33364#issuecomment-886745785 **[Test build #141641 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141641/testReport)** for PR 33364 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33364: [SPARK-36161][PYTHON] Add type check on dropDuplicates pyspark function

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33364: URL: https://github.com/apache/spark/pull/33364#issuecomment-886759964 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141641/

[GitHub] [spark] AmplabJenkins commented on pull request #33364: [SPARK-36161][PYTHON] Add type check on dropDuplicates pyspark function

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33364: URL: https://github.com/apache/spark/pull/33364#issuecomment-886759964 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141641/ -- This

[GitHub] [spark] SparkQA commented on pull request #33364: [SPARK-36161][PYTHON] Add type check on dropDuplicates pyspark function

2021-07-26 Thread GitBox
SparkQA commented on pull request #33364: URL: https://github.com/apache/spark/pull/33364#issuecomment-886759717 **[Test build #141641 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141641/testReport)** for PR 33364 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33317: [SPARK-36095][CORE] Grouping exception in core/rdd

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33317: URL: https://github.com/apache/spark/pull/33317#issuecomment-886759312 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141635/

[GitHub] [spark] AmplabJenkins commented on pull request #33317: [SPARK-36095][CORE] Grouping exception in core/rdd

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33317: URL: https://github.com/apache/spark/pull/33317#issuecomment-886759312 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141635/

[GitHub] [spark] SparkQA removed a comment on pull request #33317: [SPARK-36095][CORE] Grouping exception in core/rdd

2021-07-26 Thread GitBox
SparkQA removed a comment on pull request #33317: URL: https://github.com/apache/spark/pull/33317#issuecomment-886617714 **[Test build #141635 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141635/testReport)** for PR 33317 at commit

[GitHub] [spark] SparkQA commented on pull request #33317: [SPARK-36095][CORE] Grouping exception in core/rdd

2021-07-26 Thread GitBox
SparkQA commented on pull request #33317: URL: https://github.com/apache/spark/pull/33317#issuecomment-886757905 **[Test build #141635 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141635/testReport)** for PR 33317 at commit

[GitHub] [spark] yaooqinn commented on pull request #33468: [SPARK-36247][SQL] Check string length for char/varchar and apply type coercion in UPDATE/MERGE command

2021-07-26 Thread GitBox
yaooqinn commented on pull request #33468: URL: https://github.com/apache/spark/pull/33468#issuecomment-886754538 LGTM

[GitHub] [spark] SparkQA commented on pull request #33521: [SPARK-36142][PYTHON] Follow Pandas when pow between Series with Na and bool literal

2021-07-26 Thread GitBox
SparkQA commented on pull request #33521: URL: https://github.com/apache/spark/pull/33521#issuecomment-886752202 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46155/

[GitHub] [spark] MaxGekk commented on a change in pull request #33516: [SPARK-34249][DOCS] Add documentation for ANSI implicit cast rules

2021-07-26 Thread GitBox
MaxGekk commented on a change in pull request #33516: URL: https://github.com/apache/spark/pull/33516#discussion_r676646906 ## File path: docs/sql-ref-ansi-compliance.md ## @@ -160,6 +160,81 @@ SELECT * FROM t; +---+ ``` +### Type coercion + Type Promotion and

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31517: [SPARK-34309][BUILD][CORE][SQL][K8S]Use Caffeine instead of Guava Cache

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #31517: URL: https://github.com/apache/spark/pull/31517#issuecomment-886745559 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141633/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33490: [SPARK-36286][SQL] Block some invalid datetime string

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33490: URL: https://github.com/apache/spark/pull/33490#issuecomment-886746097 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141630/

[GitHub] [spark] AmplabJenkins commented on pull request #31517: [SPARK-34309][BUILD][CORE][SQL][K8S]Use Caffeine instead of Guava Cache

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #31517: URL: https://github.com/apache/spark/pull/31517#issuecomment-886745559 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141633/

[GitHub] [spark] SparkQA commented on pull request #33364: [SPARK-36161][PYTHON] Add type check on dropDuplicates pyspark function

2021-07-26 Thread GitBox
SparkQA commented on pull request #33364: URL: https://github.com/apache/spark/pull/33364#issuecomment-886745661

[GitHub] [spark] AmplabJenkins commented on pull request #33490: [SPARK-36286][SQL] Block some invalid datetime string

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33490: URL: https://github.com/apache/spark/pull/33490#issuecomment-886746097 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141630/

[GitHub] [spark] mridulm commented on pull request #33426: [SPARK-32920][FOLLOW-UP] Fix shuffleMergeFinalized directly calling rdd.getNumPartitions as RDD is not serialized to executor

2021-07-26 Thread GitBox
mridulm commented on pull request #33426: URL: https://github.com/apache/spark/pull/33426#issuecomment-886745871 Merged to master/branch-3.2 +CC @gengliangwang Thanks for fixing this @venkata91 ! Thanks for the review @Ngone51 :-)

[GitHub] [spark] SparkQA commented on pull request #33468: [SPARK-36247][SQL] Check string length for char/varchar and apply type coercion in UPDATE/MERGE command

2021-07-26 Thread GitBox
SparkQA commented on pull request #33468: URL: https://github.com/apache/spark/pull/33468#issuecomment-886745694 **[Test build #141640 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141640/testReport)** for PR 33468 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33518: [SPARK-34619][SQL][DOCS] Describe ANSI interval types at the `Data types` page of the SQL reference

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33518: URL: https://github.com/apache/spark/pull/33518#issuecomment-886744848 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46153/

[GitHub] [spark] SparkQA removed a comment on pull request #31517: [SPARK-34309][BUILD][CORE][SQL][K8S]Use Caffeine instead of Guava Cache

2021-07-26 Thread GitBox
SparkQA removed a comment on pull request #31517: URL: https://github.com/apache/spark/pull/31517#issuecomment-886613290 **[Test build #141633 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141633/testReport)** for PR 31517 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33488: [SPARK-36241][SQL] Support creating tables with void column

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33488: URL: https://github.com/apache/spark/pull/33488#issuecomment-886744053 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141622/

[GitHub] [spark] SparkQA commented on pull request #33518: [SPARK-34619][SQL][DOCS] Describe ANSI interval types at the `Data types` page of the SQL reference

2021-07-26 Thread GitBox
SparkQA commented on pull request #33518: URL: https://github.com/apache/spark/pull/33518#issuecomment-886744802 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46153/

[GitHub] [spark] SparkQA removed a comment on pull request #33490: [SPARK-36286][SQL] Block some invalid datetime string

2021-07-26 Thread GitBox
SparkQA removed a comment on pull request #33490: URL: https://github.com/apache/spark/pull/33490#issuecomment-886536073 **[Test build #141630 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141630/testReport)** for PR 33490 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #33518: [SPARK-34619][SQL][DOCS] Describe ANSI interval types at the `Data types` page of the SQL reference

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33518: URL: https://github.com/apache/spark/pull/33518#issuecomment-886744848 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46153/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33364: [SPARK-36161][PYTHON] Add type check on dropDuplicates pyspark function

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33364: URL: https://github.com/apache/spark/pull/33364#issuecomment-886744044 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141638/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33521: [SPARK-36142][PYTHON] Follow Pandas when pow between Series with Na and bool literal

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33521: URL: https://github.com/apache/spark/pull/33521#issuecomment-886744063 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141639/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33429: [SPARK-36217][SQL] Rename CustomShuffleReader and OptimizeLocalShuffleReader in AQE

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33429: URL: https://github.com/apache/spark/pull/33429#issuecomment-886744049 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141636/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33484: [SPARK-36263][SQL][PYTHON] Add Dataframe.observation to PySpark

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33484: URL: https://github.com/apache/spark/pull/33484#issuecomment-886744056 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141634/

[GitHub] [spark] AmplabJenkins commented on pull request #33484: [SPARK-36263][SQL][PYTHON] Add Dataframe.observation to PySpark

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33484: URL: https://github.com/apache/spark/pull/33484#issuecomment-886744056 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141634/

[GitHub] [spark] asfgit closed pull request #33426: [SPARK-32920][FOLLOW-UP] Fix shuffleMergeFinalized directly calling rdd.getNumPartitions as RDD is not serialized to executor

2021-07-26 Thread GitBox
asfgit closed pull request #33426: URL: https://github.com/apache/spark/pull/33426

[GitHub] [spark] AmplabJenkins commented on pull request #33429: [SPARK-36217][SQL] Rename CustomShuffleReader and OptimizeLocalShuffleReader in AQE

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33429: URL: https://github.com/apache/spark/pull/33429#issuecomment-886744049 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141636/

[GitHub] [spark] AmplabJenkins commented on pull request #33364: [SPARK-36161][PYTHON] Add type check on dropDuplicates pyspark function

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33364: URL: https://github.com/apache/spark/pull/33364#issuecomment-886744044 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141638/
