[GitHub] [spark] AmplabJenkins commented on pull request #34333: [SPARK-37062][SS] Introduce a new data source for providing consistent set of rows per microbatch

2021-10-21 Thread GitBox
AmplabJenkins commented on pull request #34333: URL: https://github.com/apache/spark/pull/34333#issuecomment-948394208 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144487/ -- This

[GitHub] [spark] SparkQA removed a comment on pull request #34333: [SPARK-37062][SS] Introduce a new data source for providing consistent set of rows per microbatch

2021-10-21 Thread GitBox
SparkQA removed a comment on pull request #34333: URL: https://github.com/apache/spark/pull/34333#issuecomment-948236461 **[Test build #144487 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144487/testReport)** for PR 34333 at commit

[GitHub] [spark] SparkQA commented on pull request #34333: [SPARK-37062][SS] Introduce a new data source for providing consistent set of rows per microbatch

2021-10-21 Thread GitBox
SparkQA commented on pull request #34333: URL: https://github.com/apache/spark/pull/34333#issuecomment-948392566 **[Test build #144487 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144487/testReport)** for PR 34333 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34291: [SPARK-37020][SQL] DS V2 LIMIT push down

2021-10-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34291: URL: https://github.com/apache/spark/pull/34291#issuecomment-948391342 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48965/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33828: [SPARK-36579][CORE][SQL] Make spark source stagingDir can be customized

2021-10-21 Thread GitBox
AmplabJenkins removed a comment on pull request #33828: URL: https://github.com/apache/spark/pull/33828#issuecomment-948391344 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48966/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34308: [SPARK-37035][SQL] Improve error message when use parquet vectorize reader

2021-10-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34308: URL: https://github.com/apache/spark/pull/34308#issuecomment-948391341 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48964/

[GitHub] [spark] SparkQA commented on pull request #34241: [SPARK-36975][SQL] Correct the hive client calls‘s metrics in HiveClientImpl

2021-10-21 Thread GitBox
SparkQA commented on pull request #34241: URL: https://github.com/apache/spark/pull/34241#issuecomment-948392519 **[Test build #144498 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144498/testReport)** for PR 34241 at commit

[GitHub] [spark] SparkQA commented on pull request #34352: [SPARK-37018][SQL] Spark SQL should support create function with Aggregator

2021-10-21 Thread GitBox
SparkQA commented on pull request #34352: URL: https://github.com/apache/spark/pull/34352#issuecomment-948392347 **[Test build #144497 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144497/testReport)** for PR 34352 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #34353: Set spark.sql.files.openCostInBytes to bytesConf

2021-10-21 Thread GitBox
AmplabJenkins commented on pull request #34353: URL: https://github.com/apache/spark/pull/34353#issuecomment-948391980 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] AmplabJenkins commented on pull request #33828: [SPARK-36579][CORE][SQL] Make spark source stagingDir can be customized

2021-10-21 Thread GitBox
AmplabJenkins commented on pull request #33828: URL: https://github.com/apache/spark/pull/33828#issuecomment-948391344 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48966/ --

[GitHub] [spark] AmplabJenkins commented on pull request #34308: [SPARK-37035][SQL] Improve error message when use parquet vectorize reader

2021-10-21 Thread GitBox
AmplabJenkins commented on pull request #34308: URL: https://github.com/apache/spark/pull/34308#issuecomment-948391341 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48964/ --

[GitHub] [spark] AmplabJenkins commented on pull request #34291: [SPARK-37020][SQL] DS V2 LIMIT push down

2021-10-21 Thread GitBox
AmplabJenkins commented on pull request #34291: URL: https://github.com/apache/spark/pull/34291#issuecomment-948391342 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48965/ --

[GitHub] [spark] SparkQA commented on pull request #34346: [SPARK-36645][SQL][FOLLOWUP] Disable min/max push down for Parquet Binary

2021-10-21 Thread GitBox
SparkQA commented on pull request #34346: URL: https://github.com/apache/spark/pull/34346#issuecomment-948390148 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48968/ -- This is an automated message from the Apache

[GitHub] [spark] HyukjinKwon commented on pull request #34353: Set spark.sql.files.openCostInBytes to bytesConf

2021-10-21 Thread GitBox
HyukjinKwon commented on pull request #34353: URL: https://github.com/apache/spark/pull/34353#issuecomment-948389308 Let's also file a JIRA, see also https://spark.apache.org/contributing.html -- This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [spark] SparkQA commented on pull request #34308: [SPARK-37035][SQL] Improve error message when use parquet vectorize reader

2021-10-21 Thread GitBox
SparkQA commented on pull request #34308: URL: https://github.com/apache/spark/pull/34308#issuecomment-948388688 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48964/ -- This is an automated message from the

[GitHub] [spark] HyukjinKwon commented on a change in pull request #34353: Set spark.sql.files.openCostInBytes to bytesConf

2021-10-21 Thread GitBox
HyukjinKwon commented on a change in pull request #34353: URL: https://github.com/apache/spark/pull/34353#discussion_r733449140 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -1415,8 +1415,8 @@ object SQLConf { " bigger

[GitHub] [spark] SparkQA commented on pull request #33828: [SPARK-36579][CORE][SQL] Make spark source stagingDir can be customized

2021-10-21 Thread GitBox
SparkQA commented on pull request #33828: URL: https://github.com/apache/spark/pull/33828#issuecomment-948385275 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48966/ -- This is an automated message from the

[GitHub] [spark] SparkQA commented on pull request #34350: [SPARK-37081][SQL][TESTS] Upgrade the version of RDBMS and corresponding JDBC drivers used by docker-integration-tests

2021-10-21 Thread GitBox
SparkQA commented on pull request #34350: URL: https://github.com/apache/spark/pull/34350#issuecomment-948383482 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48967/ -- This is an automated message from the Apache

[GitHub] [spark] RabbidHY opened a new pull request #34353: Set spark.sql.files.openCostInBytes to bytesConf

2021-10-21 Thread GitBox
RabbidHY opened a new pull request #34353: URL: https://github.com/apache/spark/pull/34353 ### What changes were proposed in this pull request? Set `spark.sql.files.openCostInBytes` to bytesConf. ### Why are the changes needed? The name is _*InBytes_, but it actually

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #34241: [SPARK-36975][SQL] Correct the hive client calls‘s metrics in HiveClientImpl

2021-10-21 Thread GitBox
AngersZhuuuu commented on a change in pull request #34241: URL: https://github.com/apache/spark/pull/34241#discussion_r733435747 ## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ## @@ -372,13 +374,15 @@ private[hive] class

[GitHub] [spark] SparkQA commented on pull request #34291: [SPARK-37020][SQL] DS V2 LIMIT push down

2021-10-21 Thread GitBox
SparkQA commented on pull request #34291: URL: https://github.com/apache/spark/pull/34291#issuecomment-948375430 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48965/ -- This is an automated message from the

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #34241: [SPARK-36975][SQL] Correct the hive client calls‘s metrics in HiveClientImpl

2021-10-21 Thread GitBox
AngersZhuuuu commented on a change in pull request #34241: URL: https://github.com/apache/spark/pull/34241#discussion_r733435006 ## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ## @@ -372,13 +374,15 @@ private[hive] class

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #34241: [SPARK-36975][SQL] Correct the hive client calls‘s metrics in HiveClientImpl

2021-10-21 Thread GitBox
AngersZhuuuu commented on a change in pull request #34241: URL: https://github.com/apache/spark/pull/34241#discussion_r733433538 ## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ## @@ -372,13 +374,15 @@ private[hive] class

[GitHub] [spark] beliefer closed pull request #34303: [SPARK-37018][SQL] Spark SQL should support create function with Aggregator

2021-10-21 Thread GitBox
beliefer closed pull request #34303: URL: https://github.com/apache/spark/pull/34303 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [spark] beliefer commented on pull request #34303: [SPARK-37018][SQL] Spark SQL should support create function with Aggregator

2021-10-21 Thread GitBox
beliefer commented on pull request #34303: URL: https://github.com/apache/spark/pull/34303#issuecomment-948364624 Because https://github.com/apache/spark/pull/34340 refactors the architecture for registering user-defined functions, I opened https://github.com/apache/spark/pull/34352 replaces

[GitHub] [spark] beliefer opened a new pull request #34352: [SPARK-37018][SQL] Spark SQL should support create function with Aggregator

2021-10-21 Thread GitBox
beliefer opened a new pull request #34352: URL: https://github.com/apache/spark/pull/34352 ### What changes were proposed in this pull request? Spark SQL does not yet support creating a function from an `Aggregator`, and it has deprecated `UserDefinedAggregateFunction`. If we want to remove

[GitHub] [spark] SparkQA removed a comment on pull request #34337: [SPARK-37066][SQL] Improve error message to show file path when failed to read next file

2021-10-21 Thread GitBox
SparkQA removed a comment on pull request #34337: URL: https://github.com/apache/spark/pull/34337#issuecomment-948231893 **[Test build #144485 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144485/testReport)** for PR 34337 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34337: [SPARK-37066][SQL] Improve error message to show file path when failed to read next file

2021-10-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34337: URL: https://github.com/apache/spark/pull/34337#issuecomment-948350476 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144485/

[GitHub] [spark] AmplabJenkins commented on pull request #34337: [SPARK-37066][SQL] Improve error message to show file path when failed to read next file

2021-10-21 Thread GitBox
AmplabJenkins commented on pull request #34337: URL: https://github.com/apache/spark/pull/34337#issuecomment-948350476 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144485/ -- This

[GitHub] [spark] cloud-fan commented on a change in pull request #34241: [SPARK-36975][SQL] Correct the hive client calls‘s metrics in HiveClientImpl

2021-10-21 Thread GitBox
cloud-fan commented on a change in pull request #34241: URL: https://github.com/apache/spark/pull/34241#discussion_r733409636 ## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ## @@ -963,36 +975,45 @@ private[hive] class

[GitHub] [spark] SparkQA commented on pull request #34337: [SPARK-37066][SQL] Improve error message to show file path when failed to read next file

2021-10-21 Thread GitBox
SparkQA commented on pull request #34337: URL: https://github.com/apache/spark/pull/34337#issuecomment-948348508 **[Test build #144485 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144485/testReport)** for PR 34337 at commit

[GitHub] [spark] cloud-fan commented on a change in pull request #34241: [SPARK-36975][SQL] Correct the hive client calls‘s metrics in HiveClientImpl

2021-10-21 Thread GitBox
cloud-fan commented on a change in pull request #34241: URL: https://github.com/apache/spark/pull/34241#discussion_r733407580 ## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ## @@ -398,7 +402,9 @@ private[hive] class HiveClientImpl(

[GitHub] [spark] cloud-fan commented on a change in pull request #34241: [SPARK-36975][SQL] Correct the hive client calls‘s metrics in HiveClientImpl

2021-10-21 Thread GitBox
cloud-fan commented on a change in pull request #34241: URL: https://github.com/apache/spark/pull/34241#discussion_r733406754 ## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ## @@ -372,13 +374,15 @@ private[hive] class

[GitHub] [spark] cloud-fan commented on a change in pull request #34241: [SPARK-36975][SQL] Correct the hive client calls‘s metrics in HiveClientImpl

2021-10-21 Thread GitBox
cloud-fan commented on a change in pull request #34241: URL: https://github.com/apache/spark/pull/34241#discussion_r733406087 ## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ## @@ -372,13 +374,15 @@ private[hive] class

[GitHub] [spark] SparkQA commented on pull request #34350: [SPARK-37081][SQL][TESTS] Upgrade the version of RDBMS and corresponding JDBC drivers used by docker-integration-tests

2021-10-21 Thread GitBox
SparkQA commented on pull request #34350: URL: https://github.com/apache/spark/pull/34350#issuecomment-948346877 **[Test build #144496 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144496/testReport)** for PR 34350 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #34351: [SPARK-37071][CORE] Make OpenHashMap serialize without reference tracking

2021-10-21 Thread GitBox
AmplabJenkins commented on pull request #34351: URL: https://github.com/apache/spark/pull/34351#issuecomment-948345276 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34338: [SPARK-37067][SQL] Use ZoneId.of() to handle timezone string in DatetimeUtils

2021-10-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34338: URL: https://github.com/apache/spark/pull/34338#issuecomment-948344642 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48962/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34341: [SPARK-37076][SQL] Implement StructType.toString explicitly for Scala 2.13

2021-10-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34341: URL: https://github.com/apache/spark/pull/34341#issuecomment-948344637 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144480/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34313: [SPARK-37013][SQL] Forbid `%0$` usage explicitly to ensure `format_string` has same behavior when using Java 8 and Java 17

2021-10-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34313: URL: https://github.com/apache/spark/pull/34313#issuecomment-948344638 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48963/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34350: [SPARK-37081][SQL][TESTS] Upgrade the version of RDBMS and corresponding JDBC drivers used by docker-integration-tests

2021-10-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34350: URL: https://github.com/apache/spark/pull/34350#issuecomment-948344639 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144488/

[GitHub] [spark] AmplabJenkins commented on pull request #34338: [SPARK-37067][SQL] Use ZoneId.of() to handle timezone string in DatetimeUtils

2021-10-21 Thread GitBox
AmplabJenkins commented on pull request #34338: URL: https://github.com/apache/spark/pull/34338#issuecomment-948344642 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48962/ --

[GitHub] [spark] AmplabJenkins commented on pull request #34341: [SPARK-37076][SQL] Implement StructType.toString explicitly for Scala 2.13

2021-10-21 Thread GitBox
AmplabJenkins commented on pull request #34341: URL: https://github.com/apache/spark/pull/34341#issuecomment-948344637 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144480/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #34313: [SPARK-37013][SQL] Forbid `%0$` usage explicitly to ensure `format_string` has same behavior when using Java 8 and Java 17

2021-10-21 Thread GitBox
AmplabJenkins commented on pull request #34313: URL: https://github.com/apache/spark/pull/34313#issuecomment-948344638 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48963/ --

[GitHub] [spark] AmplabJenkins commented on pull request #34350: [SPARK-37081][SQL][TESTS] Upgrade the version of RDBMS and corresponding JDBC drivers used by docker-integration-tests

2021-10-21 Thread GitBox
AmplabJenkins commented on pull request #34350: URL: https://github.com/apache/spark/pull/34350#issuecomment-948344639 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144488/ -- This

[GitHub] [spark] SparkQA commented on pull request #34313: [SPARK-37013][SQL] Forbid `%0$` usage explicitly to ensure `format_string` has same behavior when using Java 8 and Java 17

2021-10-21 Thread GitBox
SparkQA commented on pull request #34313: URL: https://github.com/apache/spark/pull/34313#issuecomment-948343025 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48963/ -- This is an automated message from the

[GitHub] [spark] SparkQA commented on pull request #34308: [SPARK-37035][SQL] Improve error message when use parquet vectorize reader

2021-10-21 Thread GitBox
SparkQA commented on pull request #34308: URL: https://github.com/apache/spark/pull/34308#issuecomment-948342774 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48964/ -- This is an automated message from the Apache

[GitHub] [spark] cloud-fan commented on a change in pull request #34291: [SPARK-37020][SQL] DS V2 LIMIT push down

2021-10-21 Thread GitBox
cloud-fan commented on a change in pull request #34291: URL: https://github.com/apache/spark/pull/34291#discussion_r733400349 ## File path: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala ## @@ -92,6 +93,50 @@ class JDBCV2Suite extends QueryTest with

[GitHub] [spark] SparkQA commented on pull request #34291: [SPARK-37020][SQL] DS V2 LIMIT push down

2021-10-21 Thread GitBox
SparkQA commented on pull request #34291: URL: https://github.com/apache/spark/pull/34291#issuecomment-948341787 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48965/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA removed a comment on pull request #34341: [SPARK-37076][SQL] Implement StructType.toString explicitly for Scala 2.13

2021-10-21 Thread GitBox
SparkQA removed a comment on pull request #34341: URL: https://github.com/apache/spark/pull/34341#issuecomment-948181872 **[Test build #144480 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144480/testReport)** for PR 34341 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #34350: [SPARK-37081][SQL][TESTS] Upgrade the version of RDBMS and corresponding JDBC drivers used by docker-integration-tests

2021-10-21 Thread GitBox
SparkQA removed a comment on pull request #34350: URL: https://github.com/apache/spark/pull/34350#issuecomment-948253954 **[Test build #144488 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144488/testReport)** for PR 34350 at commit

[GitHub] [spark] cloud-fan commented on a change in pull request #34291: [SPARK-37020][SQL] DS V2 LIMIT push down

2021-10-21 Thread GitBox
cloud-fan commented on a change in pull request #34291: URL: https://github.com/apache/spark/pull/34291#discussion_r733399323 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2ScanRelationPushDown.scala ## @@ -252,6 +284,7 @@ case class

[GitHub] [spark] SparkQA commented on pull request #33828: [SPARK-36579][CORE][SQL] Make spark source stagingDir can be customized

2021-10-21 Thread GitBox
SparkQA commented on pull request #33828: URL: https://github.com/apache/spark/pull/33828#issuecomment-948340628 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48966/ -- This is an automated message from the Apache

[GitHub] [spark] cloud-fan commented on a change in pull request #34291: [SPARK-37020][SQL] DS V2 LIMIT push down

2021-10-21 Thread GitBox
cloud-fan commented on a change in pull request #34291: URL: https://github.com/apache/spark/pull/34291#discussion_r733398430 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2ScanRelationPushDown.scala ## @@ -225,6 +225,38 @@ object

[GitHub] [spark] cloud-fan commented on a change in pull request #34291: [SPARK-37020][SQL] DS V2 LIMIT push down

2021-10-21 Thread GitBox
cloud-fan commented on a change in pull request #34291: URL: https://github.com/apache/spark/pull/34291#discussion_r733398197 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2ScanRelationPushDown.scala ## @@ -225,6 +225,38 @@ object

[GitHub] [spark] cloud-fan commented on a change in pull request #34291: [SPARK-37020][SQL] DS V2 LIMIT push down

2021-10-21 Thread GitBox
cloud-fan commented on a change in pull request #34291: URL: https://github.com/apache/spark/pull/34291#discussion_r733396995 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala ## @@ -179,6 +179,7 @@ object JDBCRDD extends

[GitHub] [spark] cloud-fan commented on a change in pull request #34291: [SPARK-37020][SQL] DS V2 LIMIT push down

2021-10-21 Thread GitBox
cloud-fan commented on a change in pull request #34291: URL: https://github.com/apache/spark/pull/34291#discussion_r733396178 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala ## @@ -149,11 +150,14 @@ case class

[GitHub] [spark] eejbyfeldt opened a new pull request #34351: [SPARK-37071][CORE] Make OpenHashMap serialize without reference tracking

2021-10-21 Thread GitBox
eejbyfeldt opened a new pull request #34351: URL: https://github.com/apache/spark/pull/34351 ### What changes were proposed in this pull request? Change the anonymous functions in OpenHashMap to member methods. This avoids having a member which captures the OpenHashMap object in

[GitHub] [spark] SparkQA commented on pull request #34338: [SPARK-37067][SQL] Use ZoneId.of() to handle timezone string in DatetimeUtils

2021-10-21 Thread GitBox
SparkQA commented on pull request #34338: URL: https://github.com/apache/spark/pull/34338#issuecomment-948330761 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48962/ -- This is an automated message from the

[GitHub] [spark] sarutak commented on pull request #34350: [SPARK-37081][SQL][TESTS] Upgrade the version of RDBMS and corresponding JDBC drivers used by docker-integration-tests

2021-10-21 Thread GitBox
sarutak commented on pull request #34350: URL: https://github.com/apache/spark/pull/34350#issuecomment-948327621 retest this please. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] cloud-fan commented on a change in pull request #34291: [SPARK-37020][SQL] DS V2 LIMIT push down

2021-10-21 Thread GitBox
cloud-fan commented on a change in pull request #34291: URL: https://github.com/apache/spark/pull/34291#discussion_r732567591 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2ScanRelationPushDown.scala ## @@ -225,6 +225,25 @@ object

[GitHub] [spark] cloud-fan commented on a change in pull request #34334: [SPARK-36763][SQL] Pull out complex sorting expressions

2021-10-21 Thread GitBox
cloud-fan commented on a change in pull request #34334: URL: https://github.com/apache/spark/pull/34334#discussion_r733385010 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/PullOutComplexExpressions.scala ## @@ -75,6 +83,26 @@ object

[GitHub] [spark] SparkQA commented on pull request #34350: [SPARK-37081][SQL][TESTS] Upgrade the version of RDBMS and corresponding JDBC drivers used by docker-integration-tests

2021-10-21 Thread GitBox
SparkQA commented on pull request #34350: URL: https://github.com/apache/spark/pull/34350#issuecomment-948324018 **[Test build #144488 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144488/testReport)** for PR 34350 at commit

[GitHub] [spark] cloud-fan commented on a change in pull request #34337: [SPARK-37066][SQL] Improve error message to show file path when failed to read next file

2021-10-21 Thread GitBox
cloud-fan commented on a change in pull request #34337: URL: https://github.com/apache/spark/pull/34337#discussion_r733378723 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileScanRDD.scala ## @@ -177,6 +177,10 @@ class FileScanRDD(

[GitHub] [spark] huaxingao commented on a change in pull request #34311: [SPARK-37038][SQL][WIP] DSV2 Sample Push Down

2021-10-21 Thread GitBox
huaxingao commented on a change in pull request #34311: URL: https://github.com/apache/spark/pull/34311#discussion_r733377718 ## File path: external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/PostgresIntegrationSuite.scala ## @@ -49,6 +49,8 @@ class

[GitHub] [spark] cloud-fan commented on a change in pull request #34337: [SPARK-37066][SQL] Improve error message to show file path when failed to read next file

2021-10-21 Thread GitBox
cloud-fan commented on a change in pull request #34337: URL: https://github.com/apache/spark/pull/34337#discussion_r733377326 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileScanRDD.scala ## @@ -177,6 +177,10 @@ class FileScanRDD(

[GitHub] [spark] SparkQA commented on pull request #34341: [SPARK-37076][SQL] Implement StructType.toString explicitly for Scala 2.13

2021-10-21 Thread GitBox
SparkQA commented on pull request #34341: URL: https://github.com/apache/spark/pull/34341#issuecomment-948318397 **[Test build #144480 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144480/testReport)** for PR 34341 at commit

[GitHub] [spark] Yikun commented on a change in pull request #34314: [SPARK-36231][PYTHON] Support arithmetic operations of decimal(nan) series

2021-10-21 Thread GitBox
Yikun commented on a change in pull request #34314: URL: https://github.com/apache/spark/pull/34314#discussion_r733376501 ## File path: python/pyspark/pandas/data_type_ops/num_ops.py ## @@ -447,10 +447,29 @@ def nan_to_null(self, index_ops: IndexOpsLike) -> IndexOpsLike:

[GitHub] [spark] cloud-fan commented on pull request #34308: [SPARK-37035][SQL] Improve error message when use parquet vectorize reader

2021-10-21 Thread GitBox
cloud-fan commented on pull request #34308: URL: https://github.com/apache/spark/pull/34308#issuecomment-948317448 Is it possible to write a test? You can commit a corrupted parquet file into the code base for testing. -- This is an automated message from the Apache Git Service. To

[GitHub] [spark] SparkQA commented on pull request #34346: [SPARK-36645][SQL][FOLLOWUP] Disable min/max push down for Parquet Binary

2021-10-21 Thread GitBox
SparkQA commented on pull request #34346: URL: https://github.com/apache/spark/pull/34346#issuecomment-948317076 **[Test build #144495 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144495/testReport)** for PR 34346 at commit

[GitHub] [spark] SparkQA commented on pull request #33828: [SPARK-36579][CORE][SQL] Make spark source stagingDir can be customized

2021-10-21 Thread GitBox
SparkQA commented on pull request #33828: URL: https://github.com/apache/spark/pull/33828#issuecomment-948306867 **[Test build #144494 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144494/testReport)** for PR 33828 at commit

[GitHub] [spark] SparkQA commented on pull request #34313: [SPARK-37013][SQL] Forbid `%0$` usage explicitly to ensure `format_string` has same behavior when using Java 8 and Java 17

2021-10-21 Thread GitBox
SparkQA commented on pull request #34313: URL: https://github.com/apache/spark/pull/34313#issuecomment-948306729 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48963/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #34291: [SPARK-37020][SQL] DS V2 LIMIT push down

2021-10-21 Thread GitBox
SparkQA commented on pull request #34291: URL: https://github.com/apache/spark/pull/34291#issuecomment-948306406 **[Test build #144493 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144493/testReport)** for PR 34291 at commit

[GitHub] [spark] SparkQA commented on pull request #34308: [SPARK-37035][SQL] Improve error message when use parquet vectorize reader

2021-10-21 Thread GitBox
SparkQA commented on pull request #34308: URL: https://github.com/apache/spark/pull/34308#issuecomment-948306363 **[Test build #144492 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144492/testReport)** for PR 34308 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34308: [SPARK-37035][SQL] Improve error message when use parquet vectorize reader

2021-10-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34308: URL: https://github.com/apache/spark/pull/34308#issuecomment-948305330 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48961/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34350: [SPARK-37081][SQL][TESTS] Upgrade the version of RDBMS and corresponding JDBC drivers used by docker-integration-tests

2021-10-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34350: URL: https://github.com/apache/spark/pull/34350#issuecomment-948305331 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48960/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34349: [SPARK-37080][INFRA] Add benchmark tool guide in pull request template

2021-10-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34349: URL: https://github.com/apache/spark/pull/34349#issuecomment-948305329 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144482/

[GitHub] [spark] AmplabJenkins commented on pull request #34308: [SPARK-37035][SQL] Improve error message when use parquet vectorize reader

2021-10-21 Thread GitBox
AmplabJenkins commented on pull request #34308: URL: https://github.com/apache/spark/pull/34308#issuecomment-948305330 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48961/ --

[GitHub] [spark] AmplabJenkins commented on pull request #34349: [SPARK-37080][INFRA] Add benchmark tool guide in pull request template

2021-10-21 Thread GitBox
AmplabJenkins commented on pull request #34349: URL: https://github.com/apache/spark/pull/34349#issuecomment-948305329 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144482/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #34350: [SPARK-37081][SQL][TESTS] Upgrade the version of RDBMS and corresponding JDBC drivers used by docker-integration-tests

2021-10-21 Thread GitBox
AmplabJenkins commented on pull request #34350: URL: https://github.com/apache/spark/pull/34350#issuecomment-948305331 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48960/ --

[GitHub] [spark] SparkQA removed a comment on pull request #34349: [SPARK-37080][INFRA] Add benchmark tool guide in pull request template

2021-10-21 Thread GitBox
SparkQA removed a comment on pull request #34349: URL: https://github.com/apache/spark/pull/34349#issuecomment-948204019 **[Test build #144482 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144482/testReport)** for PR 34349 at commit

[GitHub] [spark] SparkQA commented on pull request #34338: [SPARK-37067][SQL] Use ZoneId.of() to handle timezone string in DatetimeUtils

2021-10-21 Thread GitBox
SparkQA commented on pull request #34338: URL: https://github.com/apache/spark/pull/34338#issuecomment-948302756 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48962/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #34350: [SPARK-37081][SQL][TESTS] Upgrade the version of RDBMS and corresponding JDBC drivers used by docker-integration-tests

2021-10-21 Thread GitBox
SparkQA commented on pull request #34350: URL: https://github.com/apache/spark/pull/34350#issuecomment-948301761 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48960/ -- This is an automated message from the

[GitHub] [spark] SparkQA commented on pull request #34308: [SPARK-37035][SQL] Improve error message when use parquet vectorize reader

2021-10-21 Thread GitBox
SparkQA commented on pull request #34308: URL: https://github.com/apache/spark/pull/34308#issuecomment-948301562 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48961/ -- This is an automated message from the

[GitHub] [spark] HyukjinKwon edited a comment on pull request #29719: [SPARK-32846][SQL][PYTHON] Support createDataFrame from an RDD of pd.DataFrames

2021-10-21 Thread GitBox
HyukjinKwon edited a comment on pull request #29719: URL: https://github.com/apache/spark/pull/29719#issuecomment-948296942 @linar-jether, would you mind sharing your pseudo codes? I am trying to figure out the general approach to address this problem (e.g., SPARK-32846, SPARK-30153,

[GitHub] [spark] HyukjinKwon edited a comment on pull request #26783: [SPARK-30153][PYTHON][WIP] Extend data exchange options for vectorized UDF functions with vanilla Arrow serialization

2021-10-21 Thread GitBox
HyukjinKwon edited a comment on pull request #26783: URL: https://github.com/apache/spark/pull/26783#issuecomment-948297109 @LucaCanali, would you mind sharing your pseudo codes? I am trying to figure out the general approach to address this problem (e.g., SPARK-32846, SPARK-30153,

[GitHub] [spark] HyukjinKwon edited a comment on pull request #29719: [SPARK-32846][SQL][PYTHON] Support createDataFrame from an RDD of pd.DataFrames

2021-10-21 Thread GitBox
HyukjinKwon edited a comment on pull request #29719: URL: https://github.com/apache/spark/pull/29719#issuecomment-948296942 @linar-jether, would you mind sharing your sudo codes? I am trying to figure out the general approach to address this problem (e.g., SPARK-32846, SPARK-30153,

[GitHub] [spark] HyukjinKwon edited a comment on pull request #26783: [SPARK-30153][PYTHON][WIP] Extend data exchange options for vectorized UDF functions with vanilla Arrow serialization

2021-10-21 Thread GitBox
HyukjinKwon edited a comment on pull request #26783: URL: https://github.com/apache/spark/pull/26783#issuecomment-948297109 @LucaCanali, would you mind sharing your sudo codes? I am trying to figure out the general approach to address this problem (e.g., SPARK-32846, SPARK-30153,

[GitHub] [spark] cloud-fan commented on pull request #34245: [SPARK-33277][PYSPARK][SQL] Writer thread must not access input after task completion listener returns

2021-10-21 Thread GitBox
cloud-fan commented on pull request #34245: URL: https://github.com/apache/spark/pull/34245#issuecomment-948297803 I didn't realize that the linked JIRA ticket is the old one. @ankurdave can you create a new JIRA ticket for this bug? thanks! -- This is an automated message from the

[GitHub] [spark] HyukjinKwon commented on pull request #26783: [SPARK-30153][PYTHON][WIP] Extend data exchange options for vectorized UDF functions with vanilla Arrow serialization

2021-10-21 Thread GitBox
HyukjinKwon commented on pull request #26783: URL: https://github.com/apache/spark/pull/26783#issuecomment-948297109 @LucaCanali, would you mind sharing your sudo codes? I am trying to figure out the general approach to address this problem. -- This is an automated message from the

[GitHub] [spark] HyukjinKwon commented on pull request #29719: [SPARK-32846][SQL][PYTHON] Support createDataFrame from an RDD of pd.DataFrames

2021-10-21 Thread GitBox
HyukjinKwon commented on pull request #29719: URL: https://github.com/apache/spark/pull/29719#issuecomment-948296942 @linar-jether, would you mind sharing your sudo codes? I am trying to figure out the general approach to address this problem. -- This is an automated message from the

[GitHub] [spark] cloud-fan closed pull request #34245: [SPARK-33277][PYSPARK][SQL] Writer thread must not access input after task completion listener returns

2021-10-21 Thread GitBox
cloud-fan closed pull request #34245: URL: https://github.com/apache/spark/pull/34245 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [spark] cloud-fan commented on pull request #34245: [SPARK-33277][PYSPARK][SQL] Writer thread must not access input after task completion listener returns

2021-10-21 Thread GitBox
cloud-fan commented on pull request #34245: URL: https://github.com/apache/spark/pull/34245#issuecomment-948296665 thanks, merging to master/3.2! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [spark] AngersZhuuuu commented on pull request #33828: [SPARK-36579][CORE][SQL] Make spark source stagingDir can be customized

2021-10-21 Thread GitBox
AngersZhuuuu commented on pull request #33828: URL: https://github.com/apache/spark/pull/33828#issuecomment-948290065 retest this please -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] AngersZhuuuu commented on pull request #34308: [SPARK-37035][SQL] Improve error message when use parquet vectorize reader

2021-10-21 Thread GitBox
AngersZhuuuu commented on pull request #34308: URL: https://github.com/apache/spark/pull/34308#issuecomment-948289668 retest this please -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] SparkQA commented on pull request #34349: [SPARK-37080][INFRA] Add benchmark tool guide in pull request template

2021-10-21 Thread GitBox
SparkQA commented on pull request #34349: URL: https://github.com/apache/spark/pull/34349#issuecomment-948284941 **[Test build #144482 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144482/testReport)** for PR 34349 at commit
