[GitHub] [spark] HyukjinKwon edited a comment on pull request #26783: [SPARK-30153][PYTHON][WIP] Extend data exchange options for vectorized UDF functions with vanilla Arrow serialization

2021-10-21 Thread GitBox
HyukjinKwon edited a comment on pull request #26783: URL: https://github.com/apache/spark/pull/26783#issuecomment-948297109 @LucaCanali, would you mind sharing your pseudo codes? I am trying to figure out the general approach to address this problem (e.g., SPARK-32846, SPARK-30153,

[GitHub] [spark] SparkQA commented on pull request #34349: [SPARK-37080][INFRA] Add benchmark tool guide in pull request template

2021-10-21 Thread GitBox
SparkQA commented on pull request #34349: URL: https://github.com/apache/spark/pull/34349#issuecomment-948284941 **[Test build #144482 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144482/testReport)** for PR 34349 at commit

[GitHub] [spark] AngersZhuuuu commented on pull request #33828: [SPARK-36579][CORE][SQL] Make spark source stagingDir can be customized

2021-10-21 Thread GitBox
AngersZhuuuu commented on pull request #33828: URL: https://github.com/apache/spark/pull/33828#issuecomment-948290065 retest this please -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] AngersZhuuuu commented on pull request #34308: [SPARK-37035][SQL] Improve error message when use parquet vectorize reader

2021-10-21 Thread GitBox
AngersZhuuuu commented on pull request #34308: URL: https://github.com/apache/spark/pull/34308#issuecomment-948289668 retest this please

[GitHub] [spark] cloud-fan closed pull request #34245: [SPARK-33277][PYSPARK][SQL] Writer thread must not access input after task completion listener returns

2021-10-21 Thread GitBox
cloud-fan closed pull request #34245: URL: https://github.com/apache/spark/pull/34245

[GitHub] [spark] HyukjinKwon commented on pull request #29719: [SPARK-32846][SQL][PYTHON] Support createDataFrame from an RDD of pd.DataFrames

2021-10-21 Thread GitBox
HyukjinKwon commented on pull request #29719: URL: https://github.com/apache/spark/pull/29719#issuecomment-948296942 @linar-jether, would you mind sharing your pseudo codes? I am trying to figure out the general approach to address this problem.

[GitHub] [spark] HyukjinKwon commented on pull request #26783: [SPARK-30153][PYTHON][WIP] Extend data exchange options for vectorized UDF functions with vanilla Arrow serialization

2021-10-21 Thread GitBox
HyukjinKwon commented on pull request #26783: URL: https://github.com/apache/spark/pull/26783#issuecomment-948297109 @LucaCanali, would you mind sharing your pseudo codes? I am trying to figure out the general approach to address this problem.

[GitHub] [spark] cloud-fan commented on pull request #34245: [SPARK-33277][PYSPARK][SQL] Writer thread must not access input after task completion listener returns

2021-10-21 Thread GitBox
cloud-fan commented on pull request #34245: URL: https://github.com/apache/spark/pull/34245#issuecomment-948296665 thanks, merging to master/3.2!

[GitHub] [spark] HyukjinKwon edited a comment on pull request #29719: [SPARK-32846][SQL][PYTHON] Support createDataFrame from an RDD of pd.DataFrames

2021-10-21 Thread GitBox
HyukjinKwon edited a comment on pull request #29719: URL: https://github.com/apache/spark/pull/29719#issuecomment-948296942 @linar-jether, would you mind sharing your pseudo codes? I am trying to figure out the general approach to address this problem (e.g., SPARK-32846, SPARK-30153,

[GitHub] [spark] cloud-fan commented on pull request #34245: [SPARK-33277][PYSPARK][SQL] Writer thread must not access input after task completion listener returns

2021-10-21 Thread GitBox
cloud-fan commented on pull request #34245: URL: https://github.com/apache/spark/pull/34245#issuecomment-948297803 I didn't realize that the linked JIRA ticket is the old one. @ankurdave can you create a new JIRA ticket for this bug? thanks!

[GitHub] [spark] HyukjinKwon edited a comment on pull request #26783: [SPARK-30153][PYTHON][WIP] Extend data exchange options for vectorized UDF functions with vanilla Arrow serialization

2021-10-21 Thread GitBox
HyukjinKwon edited a comment on pull request #26783: URL: https://github.com/apache/spark/pull/26783#issuecomment-948297109 @LucaCanali, would you mind sharing your pseudo codes? I am trying to figure out the general approach to address this problem (e.g., SPARK-32846, SPARK-30153,

[GitHub] [spark] HyukjinKwon edited a comment on pull request #29719: [SPARK-32846][SQL][PYTHON] Support createDataFrame from an RDD of pd.DataFrames

2021-10-21 Thread GitBox
HyukjinKwon edited a comment on pull request #29719: URL: https://github.com/apache/spark/pull/29719#issuecomment-948296942 @linar-jether, would you mind sharing your pseudo codes? I am trying to figure out the general approach to address this problem (e.g., SPARK-32846, SPARK-30153,

[GitHub] [spark] AmplabJenkins commented on pull request #34350: [SPARK-37081][SQL][TESTS] Upgrade the version of RDBMS and corresponding JDBC drivers used by docker-integration-tests

2021-10-21 Thread GitBox
AmplabJenkins commented on pull request #34350: URL: https://github.com/apache/spark/pull/34350#issuecomment-948305331 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48960/

[GitHub] [spark] AmplabJenkins commented on pull request #34349: [SPARK-37080][INFRA] Add benchmark tool guide in pull request template

2021-10-21 Thread GitBox
AmplabJenkins commented on pull request #34349: URL: https://github.com/apache/spark/pull/34349#issuecomment-948305329 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144482/

[GitHub] [spark] AmplabJenkins commented on pull request #34308: [SPARK-37035][SQL] Improve error message when use parquet vectorize reader

2021-10-21 Thread GitBox
AmplabJenkins commented on pull request #34308: URL: https://github.com/apache/spark/pull/34308#issuecomment-948305330 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48961/

[GitHub] [spark] cloud-fan commented on a change in pull request #34291: [SPARK-37020][SQL] DS V2 LIMIT push down

2021-10-21 Thread GitBox
cloud-fan commented on a change in pull request #34291: URL: https://github.com/apache/spark/pull/34291#discussion_r733396995 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala ## @@ -179,6 +179,7 @@ object JDBCRDD extends

[GitHub] [spark] SparkQA commented on pull request #34291: [SPARK-37020][SQL] DS V2 LIMIT push down

2021-10-21 Thread GitBox
SparkQA commented on pull request #34291: URL: https://github.com/apache/spark/pull/34291#issuecomment-948341787 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48965/

[GitHub] [spark] cloud-fan commented on a change in pull request #34291: [SPARK-37020][SQL] DS V2 LIMIT push down

2021-10-21 Thread GitBox
cloud-fan commented on a change in pull request #34291: URL: https://github.com/apache/spark/pull/34291#discussion_r733400349 ## File path: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala ## @@ -92,6 +93,50 @@ class JDBCV2Suite extends QueryTest with

[GitHub] [spark] SparkQA commented on pull request #34337: [SPARK-37066][SQL] Improve error message to show file path when failed to read next file

2021-10-21 Thread GitBox
SparkQA commented on pull request #34337: URL: https://github.com/apache/spark/pull/34337#issuecomment-948348508 **[Test build #144485 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144485/testReport)** for PR 34337 at commit

[GitHub] [spark] SparkQA commented on pull request #33828: [SPARK-36579][CORE][SQL] Make spark source stagingDir can be customized

2021-10-21 Thread GitBox
SparkQA commented on pull request #33828: URL: https://github.com/apache/spark/pull/33828#issuecomment-948385275 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48966/

[GitHub] [spark] HyukjinKwon commented on a change in pull request #34353: Set spark.sql.files.openCostInBytes to bytesConf

2021-10-21 Thread GitBox
HyukjinKwon commented on a change in pull request #34353: URL: https://github.com/apache/spark/pull/34353#discussion_r733449140 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -1415,8 +1415,8 @@ object SQLConf { " bigger

[GitHub] [spark] SparkQA removed a comment on pull request #34333: [SPARK-37062][SS] Introduce a new data source for providing consistent set of rows per microbatch

2021-10-21 Thread GitBox
SparkQA removed a comment on pull request #34333: URL: https://github.com/apache/spark/pull/34333#issuecomment-948236461 **[Test build #144487 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144487/testReport)** for PR 34333 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #34333: [SPARK-37062][SS] Introduce a new data source for providing consistent set of rows per microbatch

2021-10-21 Thread GitBox
AmplabJenkins commented on pull request #34333: URL: https://github.com/apache/spark/pull/34333#issuecomment-948394208 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144487/

[GitHub] [spark] SparkQA commented on pull request #34346: [SPARK-36645][SQL][FOLLOWUP] Disable min/max push down for Parquet Binary

2021-10-21 Thread GitBox
SparkQA commented on pull request #34346: URL: https://github.com/apache/spark/pull/34346#issuecomment-948427391 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48968/

[GitHub] [spark] SparkQA removed a comment on pull request #34324: [SPARK-37015][PYTHON] Inline type hints for python/pyspark/streaming/dstream.py

2021-10-21 Thread GitBox
SparkQA removed a comment on pull request #34324: URL: https://github.com/apache/spark/pull/34324#issuecomment-948395660 **[Test build #144499 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144499/testReport)** for PR 34324 at commit

[GitHub] [spark] SparkQA commented on pull request #34354: [WIP][SPARK-37085][PYTHON][SQL] Add list/tuple overloads to array, struct, create_map, map_concat

2021-10-21 Thread GitBox
SparkQA commented on pull request #34354: URL: https://github.com/apache/spark/pull/34354#issuecomment-948441267 **[Test build #144500 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144500/testReport)** for PR 34354 at commit

[GitHub] [spark] SparkQA commented on pull request #34324: [SPARK-37015][PYTHON] Inline type hints for python/pyspark/streaming/dstream.py

2021-10-21 Thread GitBox
SparkQA commented on pull request #34324: URL: https://github.com/apache/spark/pull/34324#issuecomment-948441375 **[Test build #144501 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144501/testReport)** for PR 34324 at commit

[GitHub] [spark] PengleiShi commented on a change in pull request #33914: [SPARK-32268][SQL] Dynamic bloom filter join pruning

2021-10-21 Thread GitBox
PengleiShi commented on a change in pull request #33914: URL: https://github.com/apache/spark/pull/33914#discussion_r733515433 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/dynamicpruning/DynamicBloomFilterPruning.scala ## @@ -0,0 +1,191 @@ +/* + *

[GitHub] [spark] linhongliu-db commented on pull request #34338: [SPARK-37067][SQL] Use ZoneId.of() to handle timezone string in DatetimeUtils

2021-10-21 Thread GitBox
linhongliu-db commented on pull request #34338: URL: https://github.com/apache/spark/pull/34338#issuecomment-948457573 cc @cloud-fan

[GitHub] [spark] SparkQA commented on pull request #34338: [SPARK-37067][SQL] Use ZoneId.of() to handle timezone string in DatetimeUtils

2021-10-21 Thread GitBox
SparkQA commented on pull request #34338: URL: https://github.com/apache/spark/pull/34338#issuecomment-948463781 **[Test build #144490 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144490/testReport)** for PR 34338 at commit

[GitHub] [spark] SparkQA commented on pull request #34354: [WIP][SPARK-37085][PYTHON][SQL] Add list/tuple overloads to array, struct, create_map, map_concat

2021-10-21 Thread GitBox
SparkQA commented on pull request #34354: URL: https://github.com/apache/spark/pull/34354#issuecomment-948463850 **[Test build #144500 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144500/testReport)** for PR 34354 at commit

[GitHub] [spark] SparkQA commented on pull request #34354: [WIP][SPARK-37085][PYTHON][SQL] Add list/tuple overloads to array, struct, create_map, map_concat

2021-10-21 Thread GitBox
SparkQA commented on pull request #34354: URL: https://github.com/apache/spark/pull/34354#issuecomment-948478104 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48972/

[GitHub] [spark] SparkQA commented on pull request #34352: [SPARK-37018][SQL] Spark SQL should support create function with Aggregator

2021-10-21 Thread GitBox
SparkQA commented on pull request #34352: URL: https://github.com/apache/spark/pull/34352#issuecomment-948486712 **[Test build #144497 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144497/testReport)** for PR 34352 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34352: [SPARK-37018][SQL] Spark SQL should support create function with Aggregator

2021-10-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34352: URL: https://github.com/apache/spark/pull/34352#issuecomment-948486976 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144497/

[GitHub] [spark] SparkQA commented on pull request #34308: [SPARK-37035][SQL] Improve error message when use parquet vectorize reader

2021-10-21 Thread GitBox
SparkQA commented on pull request #34308: URL: https://github.com/apache/spark/pull/34308#issuecomment-948301562 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48961/

[GitHub] [spark] SparkQA commented on pull request #34308: [SPARK-37035][SQL] Improve error message when use parquet vectorize reader

2021-10-21 Thread GitBox
SparkQA commented on pull request #34308: URL: https://github.com/apache/spark/pull/34308#issuecomment-948306363 **[Test build #144492 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144492/testReport)** for PR 34308 at commit

[GitHub] [spark] SparkQA commented on pull request #34291: [SPARK-37020][SQL] DS V2 LIMIT push down

2021-10-21 Thread GitBox
SparkQA commented on pull request #34291: URL: https://github.com/apache/spark/pull/34291#issuecomment-948306406 **[Test build #144493 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144493/testReport)** for PR 34291 at commit

[GitHub] [spark] SparkQA commented on pull request #34350: [SPARK-37081][SQL][TESTS] Upgrade the version of RDBMS and corresponding JDBC drivers used by docker-integration-tests

2021-10-21 Thread GitBox
SparkQA commented on pull request #34350: URL: https://github.com/apache/spark/pull/34350#issuecomment-948324018 **[Test build #144488 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144488/testReport)** for PR 34350 at commit

[GitHub] [spark] cloud-fan commented on a change in pull request #34291: [SPARK-37020][SQL] DS V2 LIMIT push down

2021-10-21 Thread GitBox
cloud-fan commented on a change in pull request #34291: URL: https://github.com/apache/spark/pull/34291#discussion_r733398197 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2ScanRelationPushDown.scala ## @@ -225,6 +225,38 @@ object

[GitHub] [spark] cloud-fan commented on a change in pull request #34291: [SPARK-37020][SQL] DS V2 LIMIT push down

2021-10-21 Thread GitBox
cloud-fan commented on a change in pull request #34291: URL: https://github.com/apache/spark/pull/34291#discussion_r733398430 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2ScanRelationPushDown.scala ## @@ -225,6 +225,38 @@ object

[GitHub] [spark] SparkQA commented on pull request #34313: [SPARK-37013][SQL] Forbid `%0$` usage explicitly to ensure `format_string` has same behavior when using Java 8 and Java 17

2021-10-21 Thread GitBox
SparkQA commented on pull request #34313: URL: https://github.com/apache/spark/pull/34313#issuecomment-948343025 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48963/

[GitHub] [spark] beliefer commented on pull request #34303: [SPARK-37018][SQL] Spark SQL should support create function with Aggregator

2021-10-21 Thread GitBox
beliefer commented on pull request #34303: URL: https://github.com/apache/spark/pull/34303#issuecomment-948364624 Because https://github.com/apache/spark/pull/34340 refactored the architecture for registering user-defined functions, I opened https://github.com/apache/spark/pull/34352 replaces

[GitHub] [spark] beliefer closed pull request #34303: [SPARK-37018][SQL] Spark SQL should support create function with Aggregator

2021-10-21 Thread GitBox
beliefer closed pull request #34303: URL: https://github.com/apache/spark/pull/34303

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #34241: [SPARK-36975][SQL] Correct the hive client calls‘s metrics in HiveClientImpl

2021-10-21 Thread GitBox
AngersZhuuuu commented on a change in pull request #34241: URL: https://github.com/apache/spark/pull/34241#discussion_r733435747 ## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ## @@ -372,13 +374,15 @@ private[hive] class

[GitHub] [spark] SparkQA commented on pull request #34350: [SPARK-37081][SQL][TESTS] Upgrade the version of RDBMS and corresponding JDBC drivers used by docker-integration-tests

2021-10-21 Thread GitBox
SparkQA commented on pull request #34350: URL: https://github.com/apache/spark/pull/34350#issuecomment-948414507 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48967/

[GitHub] [spark] SparkQA commented on pull request #34338: [SPARK-37067][SQL] Use ZoneId.of() to handle timezone string in DatetimeUtils

2021-10-21 Thread GitBox
SparkQA commented on pull request #34338: URL: https://github.com/apache/spark/pull/34338#issuecomment-948485617 **[Test build #144502 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144502/testReport)** for PR 34338 at commit

[GitHub] [spark] SparkQA commented on pull request #34337: [SPARK-37066][SQL] Improve error message to show file path when failed to read next file

2021-10-21 Thread GitBox
SparkQA commented on pull request #34337: URL: https://github.com/apache/spark/pull/34337#issuecomment-948485642 **[Test build #144503 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144503/testReport)** for PR 34337 at commit

[GitHub] [spark] SparkQA commented on pull request #34241: [SPARK-36975][SQL] Correct the hive client calls‘s metrics in HiveClientImpl

2021-10-21 Thread GitBox
SparkQA commented on pull request #34241: URL: https://github.com/apache/spark/pull/34241#issuecomment-948485880 **[Test build #144505 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144505/testReport)** for PR 34241 at commit

[GitHub] [spark] SparkQA commented on pull request #34296: [SPARK-36989][TESTS][PYTHON] Add type hints data tests

2021-10-21 Thread GitBox
SparkQA commented on pull request #34296: URL: https://github.com/apache/spark/pull/34296#issuecomment-948485764 **[Test build #144504 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144504/testReport)** for PR 34296 at commit

[GitHub] [spark] cloud-fan commented on a change in pull request #34337: [SPARK-37066][SQL] Improve error message to show file path when failed to read next file

2021-10-21 Thread GitBox
cloud-fan commented on a change in pull request #34337: URL: https://github.com/apache/spark/pull/34337#discussion_r733377326 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileScanRDD.scala ## @@ -177,6 +177,10 @@ class FileScanRDD(

[GitHub] [spark] huaxingao commented on a change in pull request #34311: [SPARK-37038][SQL][WIP] DSV2 Sample Push Down

2021-10-21 Thread GitBox
huaxingao commented on a change in pull request #34311: URL: https://github.com/apache/spark/pull/34311#discussion_r733377718 ## File path: external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/PostgresIntegrationSuite.scala ## @@ -49,6 +49,8 @@ class

[GitHub] [spark] cloud-fan commented on a change in pull request #34291: [SPARK-37020][SQL] DS V2 LIMIT push down

2021-10-21 Thread GitBox
cloud-fan commented on a change in pull request #34291: URL: https://github.com/apache/spark/pull/34291#discussion_r733396178 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala ## @@ -149,11 +150,14 @@ case class

[GitHub] [spark] SparkQA commented on pull request #34350: [SPARK-37081][SQL][TESTS] Upgrade the version of RDBMS and corresponding JDBC drivers used by docker-integration-tests

2021-10-21 Thread GitBox
SparkQA commented on pull request #34350: URL: https://github.com/apache/spark/pull/34350#issuecomment-948346877 **[Test build #144496 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144496/testReport)** for PR 34350 at commit

[GitHub] [spark] cloud-fan commented on a change in pull request #34241: [SPARK-36975][SQL] Correct the hive client calls‘s metrics in HiveClientImpl

2021-10-21 Thread GitBox
cloud-fan commented on a change in pull request #34241: URL: https://github.com/apache/spark/pull/34241#discussion_r733406087 ## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ## @@ -372,13 +374,15 @@ private[hive] class

[GitHub] [spark] SparkQA commented on pull request #34333: [SPARK-37062][SS] Introduce a new data source for providing consistent set of rows per microbatch

2021-10-21 Thread GitBox
SparkQA commented on pull request #34333: URL: https://github.com/apache/spark/pull/34333#issuecomment-948392566 **[Test build #144487 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144487/testReport)** for PR 34333 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34308: [SPARK-37035][SQL] Improve error message when use parquet vectorize reader

2021-10-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34308: URL: https://github.com/apache/spark/pull/34308#issuecomment-948391341 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48964/

[GitHub] [spark] SparkQA commented on pull request #34241: [SPARK-36975][SQL] Correct the hive client calls‘s metrics in HiveClientImpl

2021-10-21 Thread GitBox
SparkQA commented on pull request #34241: URL: https://github.com/apache/spark/pull/34241#issuecomment-948392519 **[Test build #144498 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144498/testReport)** for PR 34241 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33828: [SPARK-36579][CORE][SQL] Make spark source stagingDir can be customized

2021-10-21 Thread GitBox
AmplabJenkins removed a comment on pull request #33828: URL: https://github.com/apache/spark/pull/33828#issuecomment-948391344 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48966/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34291: [SPARK-37020][SQL] DS V2 LIMIT push down

2021-10-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34291: URL: https://github.com/apache/spark/pull/34291#issuecomment-948391342 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48965/

[GitHub] [spark] zero323 opened a new pull request #34354: [WIP][SPARK-37085][PYTHON][SQL] Add list/tuple overloads to array, struct, create_map, map_concat

2021-10-21 Thread GitBox
zero323 opened a new pull request #34354: URL: https://github.com/apache/spark/pull/34354 ### What changes were proposed in this pull request? This PR adds overloads to the following `pyspark.sql.functions`: - `array` - `struct` - `create_map` -

[GitHub] [spark] SparkQA commented on pull request #34241: [SPARK-36975][SQL] Correct the hive client calls‘s metrics in HiveClientImpl

2021-10-21 Thread GitBox
SparkQA commented on pull request #34241: URL: https://github.com/apache/spark/pull/34241#issuecomment-948459864 **[Test build #144498 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144498/testReport)** for PR 34241 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #34354: [WIP][SPARK-37085][PYTHON][SQL] Add list/tuple overloads to array, struct, create_map, map_concat

2021-10-21 Thread GitBox
SparkQA removed a comment on pull request #34354: URL: https://github.com/apache/spark/pull/34354#issuecomment-948441267 **[Test build #144500 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144500/testReport)** for PR 34354 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #34338: [SPARK-37067][SQL] Use ZoneId.of() to handle timezone string in DatetimeUtils

2021-10-21 Thread GitBox
SparkQA removed a comment on pull request #34338: URL: https://github.com/apache/spark/pull/34338#issuecomment-948276447 **[Test build #144490 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144490/testReport)** for PR 34338 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #34241: [SPARK-36975][SQL] Correct the hive client calls‘s metrics in HiveClientImpl

2021-10-21 Thread GitBox
SparkQA removed a comment on pull request #34241: URL: https://github.com/apache/spark/pull/34241#issuecomment-948392519 **[Test build #144498 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144498/testReport)** for PR 34241 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #34350: [SPARK-37081][SQL][TESTS] Upgrade the version of RDBMS and corresponding JDBC drivers used by docker-integration-tests

2021-10-21 Thread GitBox
SparkQA removed a comment on pull request #34350: URL: https://github.com/apache/spark/pull/34350#issuecomment-948346877 **[Test build #144496 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144496/testReport)** for PR 34350 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #34324: [SPARK-37015][PYTHON] Inline type hints for python/pyspark/streaming/dstream.py

2021-10-21 Thread GitBox
SparkQA removed a comment on pull request #34324: URL: https://github.com/apache/spark/pull/34324#issuecomment-948441375 **[Test build #144501 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144501/testReport)** for PR 34324 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #34241: [SPARK-36975][SQL] Correct the hive client calls‘s metrics in HiveClientImpl

2021-10-21 Thread GitBox
AmplabJenkins commented on pull request #34241: URL: https://github.com/apache/spark/pull/34241#issuecomment-948485085

[GitHub] [spark] AmplabJenkins commented on pull request #34350: [SPARK-37081][SQL][TESTS] Upgrade the version of RDBMS and corresponding JDBC drivers used by docker-integration-tests

2021-10-21 Thread GitBox
AmplabJenkins commented on pull request #34350: URL: https://github.com/apache/spark/pull/34350#issuecomment-948485089 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144496/

[GitHub] [spark] AmplabJenkins commented on pull request #34354: [WIP][SPARK-37085][PYTHON][SQL] Add list/tuple overloads to array, struct, create_map, map_concat

2021-10-21 Thread GitBox
AmplabJenkins commented on pull request #34354: URL: https://github.com/apache/spark/pull/34354#issuecomment-948485092 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144500/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34324: [SPARK-37015][PYTHON] Inline type hints for python/pyspark/streaming/dstream.py

2021-10-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34324: URL: https://github.com/apache/spark/pull/34324#issuecomment-948485083

[GitHub] [spark] AmplabJenkins commented on pull request #34352: [SPARK-37018][SQL] Spark SQL should support create function with Aggregator

2021-10-21 Thread GitBox
AmplabJenkins commented on pull request #34352: URL: https://github.com/apache/spark/pull/34352#issuecomment-948485088 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48969/

[GitHub] [spark] AmplabJenkins commented on pull request #34338: [SPARK-37067][SQL] Use ZoneId.of() to handle timezone string in DatetimeUtils

2021-10-21 Thread GitBox
AmplabJenkins commented on pull request #34338: URL: https://github.com/apache/spark/pull/34338#issuecomment-948485095 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144490/

[GitHub] [spark] AmplabJenkins commented on pull request #34324: [SPARK-37015][PYTHON] Inline type hints for python/pyspark/streaming/dstream.py

2021-10-21 Thread GitBox
AmplabJenkins commented on pull request #34324: URL: https://github.com/apache/spark/pull/34324#issuecomment-948485083 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34350: [SPARK-37081][SQL][TESTS] Upgrade the version of RDBMS and corresponding JDBC drivers used by docker-integration-tests

2021-10-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34350: URL: https://github.com/apache/spark/pull/34350#issuecomment-948485089 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144496/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34241: [SPARK-36975][SQL] Correct the hive client calls‘s metrics in HiveClientImpl

2021-10-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34241: URL: https://github.com/apache/spark/pull/34241#issuecomment-948485085 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34354: [WIP][SPARK-37085][PYTHON][SQL] Add list/tuple overloads to array, struct, create_map, map_concat

2021-10-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34354: URL: https://github.com/apache/spark/pull/34354#issuecomment-948485092 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144500/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34338: [SPARK-37067][SQL] Use ZoneId.of() to handle timezone string in DatetimeUtils

2021-10-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34338: URL: https://github.com/apache/spark/pull/34338#issuecomment-948485095 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144490/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34352: [SPARK-37018][SQL] Spark SQL should support create function with Aggregator

2021-10-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34352: URL: https://github.com/apache/spark/pull/34352#issuecomment-948485088 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48969/

[GitHub] [spark] SparkQA commented on pull request #34338: [SPARK-37067][SQL] Use ZoneId.of() to handle timezone string in DatetimeUtils

2021-10-21 Thread GitBox
SparkQA commented on pull request #34338: URL: https://github.com/apache/spark/pull/34338#issuecomment-948302756 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48962/ -- This is an automated message from the Apache

[GitHub] [spark] sarutak commented on pull request #34350: [SPARK-37081][SQL][TESTS] Upgrade the version of RDBMS and corresponding JDBC drivers used by docker-integration-tests

2021-10-21 Thread GitBox
sarutak commented on pull request #34350: URL: https://github.com/apache/spark/pull/34350#issuecomment-948327621 retest this please. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] cloud-fan commented on a change in pull request #34291: [SPARK-37020][SQL] DS V2 LIMIT push down

2021-10-21 Thread GitBox
cloud-fan commented on a change in pull request #34291: URL: https://github.com/apache/spark/pull/34291#discussion_r733399323 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2ScanRelationPushDown.scala ## @@ -252,6 +284,7 @@ case class

[GitHub] [spark] SparkQA removed a comment on pull request #34341: [SPARK-37076][SQL] Implement StructType.toString explicitly for Scala 2.13

2021-10-21 Thread GitBox
SparkQA removed a comment on pull request #34341: URL: https://github.com/apache/spark/pull/34341#issuecomment-948181872 **[Test build #144480 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144480/testReport)** for PR 34341 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #34350: [SPARK-37081][SQL][TESTS] Upgrade the version of RDBMS and corresponding JDBC drivers used by docker-integration-tests

2021-10-21 Thread GitBox
SparkQA removed a comment on pull request #34350: URL: https://github.com/apache/spark/pull/34350#issuecomment-948253954 **[Test build #144488 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144488/testReport)** for PR 34350 at commit

[GitHub] [spark] SparkQA commented on pull request #33828: [SPARK-36579][CORE][SQL] Make spark source stagingDir can be customized

2021-10-21 Thread GitBox
SparkQA commented on pull request #33828: URL: https://github.com/apache/spark/pull/33828#issuecomment-948340628 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48966/ -- This is an automated message from the Apache

[GitHub] [spark] AmplabJenkins commented on pull request #34353: Set spark.sql.files.openCostInBytes to bytesConf

2021-10-21 Thread GitBox
AmplabJenkins commented on pull request #34353: URL: https://github.com/apache/spark/pull/34353#issuecomment-948391980 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] SparkQA commented on pull request #34352: [SPARK-37018][SQL] Spark SQL should support create function with Aggregator

2021-10-21 Thread GitBox
SparkQA commented on pull request #34352: URL: https://github.com/apache/spark/pull/34352#issuecomment-948392347 **[Test build #144497 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144497/testReport)** for PR 34352 at commit

[GitHub] [spark] cloud-fan commented on a change in pull request #34352: [SPARK-37018][SQL] Spark SQL should support create function with Aggregator

2021-10-21 Thread GitBox
cloud-fan commented on a change in pull request #34352: URL: https://github.com/apache/spark/pull/34352#discussion_r733473500 ## File path: sql/core/src/main/scala/org/apache/spark/sql/internal/BaseSessionStateBuilder.scala ## @@ -410,6 +415,35 @@ class

[GitHub] [spark] zero323 commented on pull request #34354: [WIP][SPARK-37085][PYTHON][SQL] Add list/tuple overloads to array, struct, create_map, map_concat

2021-10-21 Thread GitBox
zero323 commented on pull request #34354: URL: https://github.com/apache/spark/pull/34354#issuecomment-948430499 New annotations are already implemented, but I think we might have to redefine `ColumnOrName` to fully support these, so I'll keep this as a draft for now. FYI

[GitHub] [spark] SparkQA commented on pull request #34324: [SPARK-37015][PYTHON] Inline type hints for python/pyspark/streaming/dstream.py

2021-10-21 Thread GitBox
SparkQA commented on pull request #34324: URL: https://github.com/apache/spark/pull/34324#issuecomment-948432822 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48970/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #33828: [SPARK-36579][CORE][SQL] Make spark source stagingDir can be customized

2021-10-21 Thread GitBox
SparkQA commented on pull request #33828: URL: https://github.com/apache/spark/pull/33828#issuecomment-948435508 **[Test build #144494 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144494/testReport)** for PR 33828 at commit

[GitHub] [spark] SparkQA commented on pull request #34350: [SPARK-37081][SQL][TESTS] Upgrade the version of RDBMS and corresponding JDBC drivers used by docker-integration-tests

2021-10-21 Thread GitBox
SparkQA commented on pull request #34350: URL: https://github.com/apache/spark/pull/34350#issuecomment-948301761 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48960/ -- This is an automated message from the

[GitHub] [spark] SparkQA commented on pull request #34313: [SPARK-37013][SQL] Forbid `%0$` usage explicitly to ensure `format_string` has same behavior when using Java 8 and Java 17

2021-10-21 Thread GitBox
SparkQA commented on pull request #34313: URL: https://github.com/apache/spark/pull/34313#issuecomment-948306729 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48963/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #33828: [SPARK-36579][CORE][SQL] Make spark source stagingDir can be customized

2021-10-21 Thread GitBox
SparkQA commented on pull request #33828: URL: https://github.com/apache/spark/pull/33828#issuecomment-948306867 **[Test build #144494 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144494/testReport)** for PR 33828 at commit

[GitHub] [spark] SparkQA commented on pull request #34341: [SPARK-37076][SQL] Implement StructType.toString explicitly for Scala 2.13

2021-10-21 Thread GitBox
SparkQA commented on pull request #34341: URL: https://github.com/apache/spark/pull/34341#issuecomment-948318397 **[Test build #144480 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144480/testReport)** for PR 34341 at commit

[GitHub] [spark] Yikun commented on a change in pull request #34314: [SPARK-36231][PYTHON] Support arithmetic operations of decimal(nan) series

2021-10-21 Thread GitBox
Yikun commented on a change in pull request #34314: URL: https://github.com/apache/spark/pull/34314#discussion_r733376501 ## File path: python/pyspark/pandas/data_type_ops/num_ops.py ## @@ -447,10 +447,29 @@ def nan_to_null(self, index_ops: IndexOpsLike) -> IndexOpsLike:

[GitHub] [spark] eejbyfeldt opened a new pull request #34351: [SPARK-37071][CORE] Make OpenHashMap serialize without reference tracking

2021-10-21 Thread GitBox
eejbyfeldt opened a new pull request #34351: URL: https://github.com/apache/spark/pull/34351 ### What changes were proposed in this pull request? Change the anonymous functions in OpenHashMap to member methods. This avoids having a member which captures the OpenHashMap object in

[GitHub] [spark] AmplabJenkins commented on pull request #34350: [SPARK-37081][SQL][TESTS] Upgrade the version of RDBMS and corresponding JDBC drivers used by docker-integration-tests

2021-10-21 Thread GitBox
AmplabJenkins commented on pull request #34350: URL: https://github.com/apache/spark/pull/34350#issuecomment-948344639 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144488/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #34338: [SPARK-37067][SQL] Use ZoneId.of() to handle timezone string in DatetimeUtils

2021-10-21 Thread GitBox
AmplabJenkins commented on pull request #34338: URL: https://github.com/apache/spark/pull/34338#issuecomment-948344642 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48962/ --

[GitHub] [spark] AmplabJenkins commented on pull request #34341: [SPARK-37076][SQL] Implement StructType.toString explicitly for Scala 2.13

2021-10-21 Thread GitBox
AmplabJenkins commented on pull request #34341: URL: https://github.com/apache/spark/pull/34341#issuecomment-948344637 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144480/ -- This

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34350: [SPARK-37081][SQL][TESTS] Upgrade the version of RDBMS and corresponding JDBC drivers used by docker-integration-tests

2021-10-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34350: URL: https://github.com/apache/spark/pull/34350#issuecomment-948344639 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144488/
