[GitHub] [spark] AmplabJenkins commented on pull request #32949: [WIP][SPARK-35749][SPARK-35773][SQL] Parse unit list interval literals as tightest year-month/day-time interval types

2021-06-18 Thread GitBox
AmplabJenkins commented on pull request #32949: URL: https://github.com/apache/spark/pull/32949#issuecomment-863452749 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139921/ -- This

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32963: [SPARK-35378][SQL][FOLLOWUP] isLocal should consider CommandResult

2021-06-18 Thread GitBox
AmplabJenkins removed a comment on pull request #32963: URL: https://github.com/apache/spark/pull/32963#issuecomment-863960196 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] MaxGekk closed pull request #32953: [SPARK-35095][SS][TESTS] Use ANSI intervals in streaming join tests

2021-06-18 Thread GitBox
MaxGekk closed pull request #32953: URL: https://github.com/apache/spark/pull/32953

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32961: [SPARK-35694][INFRA][TEST-HADOOP3.2][TEST-JAVA11] Increase JVM stack size to 128M for Maven

2021-06-18 Thread GitBox
AmplabJenkins removed a comment on pull request #32961: URL: https://github.com/apache/spark/pull/32961#issuecomment-863848568

[GitHub] [spark] ulysses-you commented on a change in pull request #32960: [SPARK-35813][SQL][DOCS] Add new adaptive config into sql-performance-tuning docs

2021-06-18 Thread GitBox
ulysses-you commented on a change in pull request #32960: URL: https://github.com/apache/spark/pull/32960#discussion_r654154456 ## File path: docs/sql-performance-tuning.md ## @@ -273,7 +273,32 @@ This feature coalesces the post shuffle partitions based on the map output

[GitHub] [spark] AmplabJenkins commented on pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast

2021-06-18 Thread GitBox
AmplabJenkins commented on pull request #32940: URL: https://github.com/apache/spark/pull/32940#issuecomment-863452742

[GitHub] [spark] AmplabJenkins commented on pull request #32952: [SPARK-35799][SS] Fix the allUpdatesTimeMs metric measuring in FlatMapGroupsWithStateExec

2021-06-18 Thread GitBox
AmplabJenkins commented on pull request #32952: URL: https://github.com/apache/spark/pull/32952#issuecomment-863529561

[GitHub] [spark] AmplabJenkins commented on pull request #32953: [SPARK-35095][SS][TESTS] Use ANSI intervals in streaming join tests

2021-06-18 Thread GitBox
AmplabJenkins commented on pull request #32953: URL: https://github.com/apache/spark/pull/32953#issuecomment-863458860

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast

2021-06-18 Thread GitBox
AngersZhuuuu commented on a change in pull request #32940: URL: https://github.com/apache/spark/pull/32940#discussion_r654484165 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala ## @@ -108,18 +108,22 @@ object IntervalUtils {

[GitHub] [spark] gengliangwang closed pull request #32936: [SPARK-35720][SQL] Support casting of String to timestamp without time zone type

2021-06-18 Thread GitBox
gengliangwang closed pull request #32936: URL: https://github.com/apache/spark/pull/32936

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32852: [SPARK-35283][SQL] Support query some DDL with CTES

2021-06-18 Thread GitBox
AmplabJenkins removed a comment on pull request #32852: URL: https://github.com/apache/spark/pull/32852#issuecomment-863783347

[GitHub] [spark] SparkQA commented on pull request #32965: [SPARK-35478][PYTHON] Fix Jenkins' linter.

2021-06-18 Thread GitBox
SparkQA commented on pull request #32965: URL: https://github.com/apache/spark/pull/32965#issuecomment-864257707

[GitHub] [spark] SparkQA removed a comment on pull request #32881: [SPARK-33298][CORE] Decouple file naming from FileCommitProtocol

2021-06-18 Thread GitBox
SparkQA removed a comment on pull request #32881: URL: https://github.com/apache/spark/pull/32881#issuecomment-863722742

[GitHub] [spark] Ngone51 commented on a change in pull request #32868: [SPARK-35714][CORE] Bug fix for deadlock during the executor shutdown

2021-06-18 Thread GitBox
Ngone51 commented on a change in pull request #32868: URL: https://github.com/apache/spark/pull/32868#discussion_r654210782 ## File path: core/src/main/scala/org/apache/spark/deploy/worker/WorkerWatcher.scala ## @@ -45,7 +50,14 @@ private[spark] class WorkerWatcher(

[GitHub] [spark] HeartSaVioR commented on pull request #32938: [SPARK-35800][SS] Improving GroupState testability by introducing TestGroupState

2021-06-18 Thread GitBox
HeartSaVioR commented on pull request #32938: URL: https://github.com/apache/spark/pull/32938#issuecomment-864040572 Please fill in the PR description according to the PR template in the Apache Spark community's guidelines.

[GitHub] [spark] cloud-fan commented on a change in pull request #32868: [SPARK-35714][CORE] Bug fix for deadlock during the executor shutdown

2021-06-18 Thread GitBox
cloud-fan commented on a change in pull request #32868: URL: https://github.com/apache/spark/pull/32868#discussion_r654190470 ## File path: core/src/main/scala/org/apache/spark/deploy/worker/WorkerWatcher.scala ## @@ -45,7 +50,14 @@ private[spark] class WorkerWatcher(

[GitHub] [spark] ueshin commented on pull request #32821: [SPARK-35342][PYTHON] Introduce DecimalOps and make `isnull` method data-type-based

2021-06-18 Thread GitBox
ueshin commented on pull request #32821: URL: https://github.com/apache/spark/pull/32821#issuecomment-864187846 Thanks! merging to master.

[GitHub] [spark] Yikun commented on a change in pull request #32867: [SPARK-35721][PYTHON] Path level discover for python unittests

2021-06-18 Thread GitBox
Yikun commented on a change in pull request #32867: URL: https://github.com/apache/spark/pull/32867#discussion_r654522707 ## File path: dev/sparktestsupport/modules.py ## @@ -58,14 +95,19 @@ def __init__(self, name, dependencies, source_file_regexes, build_profile_flags=

[GitHub] [spark] xkrogen commented on pull request #32810: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

2021-06-18 Thread GitBox
xkrogen commented on pull request #32810: URL: https://github.com/apache/spark/pull/32810#issuecomment-864156403 Pushed up a new version which expands test cases, extracts some shared constants in the test, and simplifies the logic in getUserClasspathUrls and makes the assumptions more

[GitHub] [spark] github-actions[bot] closed pull request #29247: [SPARK-32446][SHS] Add percentile distribution REST API of peak memory metrics for all executors

2021-06-18 Thread GitBox
github-actions[bot] closed pull request #29247: URL: https://github.com/apache/spark/pull/29247

[GitHub] [spark] SparkQA commented on pull request #32957: [SPARK-35472][PYTHON] Fix disallow_untyped_defs mypy checks.

2021-06-18 Thread GitBox
SparkQA commented on pull request #32957: URL: https://github.com/apache/spark/pull/32957#issuecomment-863706939

[GitHub] [spark] SparkQA commented on pull request #32821: [SPARK-35342][PYTHON] Introduce DecimalOps and make `isnull` method data-type-based

2021-06-18 Thread GitBox
SparkQA commented on pull request #32821: URL: https://github.com/apache/spark/pull/32821#issuecomment-863851709

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32952: [SPARK-35799][SS] Fix the allUpdatesTimeMs metric measuring in FlatMapGroupsWithStateExec

2021-06-18 Thread GitBox
AmplabJenkins removed a comment on pull request #32952: URL: https://github.com/apache/spark/pull/32952#issuecomment-863529561

[GitHub] [spark] dongjoon-hyun commented on pull request #32826: [SPARK-35670][BUILD] Upgrade ZSTD-JNI to 1.5.0-2

2021-06-18 Thread GitBox
dongjoon-hyun commented on pull request #32826: URL: https://github.com/apache/spark/pull/32826#issuecomment-863451112

[GitHub] [spark] ueshin opened a new pull request #32957: [SPARK-35472][PYTHON] Fix disallow_untyped_defs mypy checks.

2021-06-18 Thread GitBox
ueshin opened a new pull request #32957: URL: https://github.com/apache/spark/pull/32957 ### What changes were proposed in this pull request? Adds more type annotations in the file `python/pyspark/pandas/generic.py` and fixes the mypy check failures. ### Why are the changes

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32742: [SPARK-35608][SQL] Support AQE optimizer side transformUpWithPruning

2021-06-18 Thread GitBox
AmplabJenkins removed a comment on pull request #32742: URL: https://github.com/apache/spark/pull/32742#issuecomment-863887632

[GitHub] [spark] SparkQA removed a comment on pull request #32953: [SPARK-35095][SS][TESTS] Use ANSI intervals in streaming join tests

2021-06-18 Thread GitBox
SparkQA removed a comment on pull request #32953: URL: https://github.com/apache/spark/pull/32953#issuecomment-863454208

[GitHub] [spark] SparkQA commented on pull request #32936: [SPARK-35720][SQL] Support casting of String to timestamp without time zone type

2021-06-18 Thread GitBox
SparkQA commented on pull request #32936: URL: https://github.com/apache/spark/pull/32936#issuecomment-863445690

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32861: [SPARK-35710] [SQL] Support DPP + AQE when there is no reused broadcast exchange

2021-06-18 Thread GitBox
AmplabJenkins removed a comment on pull request #32861: URL: https://github.com/apache/spark/pull/32861#issuecomment-863856057

[GitHub] [spark] AngersZhuuuu commented on pull request #32943: [SPARK-35735][SQL] Take into account day-time interval fields in cast

2021-06-18 Thread GitBox
AngersZhuuuu commented on pull request #32943: URL: https://github.com/apache/spark/pull/32943#issuecomment-863701186 retest this please

[GitHub] [spark] MaxGekk commented on a change in pull request #32945: [SPARK-35769][SQL] Truncate java.time.Period by fields of year-month interval type

2021-06-18 Thread GitBox
MaxGekk commented on a change in pull request #32945: URL: https://github.com/apache/spark/pull/32945#discussion_r654150276 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala ## @@ -284,7 +285,10 @@ object RandomDataGenerator {

[GitHub] [spark] SparkQA removed a comment on pull request #32945: [SPARK-35769][SQL] Truncate java.time.Period by fields of year-month interval type

2021-06-18 Thread GitBox
SparkQA removed a comment on pull request #32945: URL: https://github.com/apache/spark/pull/32945#issuecomment-863330070

[GitHub] [spark] SparkQA commented on pull request #32962: [SPARK-35303][SPARK-35498][PYTHON][FOLLOW-UP] Refactor inheritable thread logic, and use it in codebase for pinned thread mode

2021-06-18 Thread GitBox
SparkQA commented on pull request #32962: URL: https://github.com/apache/spark/pull/32962#issuecomment-863851453

[GitHub] [spark] dongjoon-hyun commented on pull request #32753: [SPARK-34859][SQL] Handle column index when using vectorized Parquet reader

2021-06-18 Thread GitBox
dongjoon-hyun commented on pull request #32753: URL: https://github.com/apache/spark/pull/32753#issuecomment-863562089 Thank you for updates, @sunchao !

[GitHub] [spark] AmplabJenkins commented on pull request #32861: [SPARK-35710] [SQL] Support DPP + AQE when there is no reused broadcast exchange

2021-06-18 Thread GitBox
AmplabJenkins commented on pull request #32861: URL: https://github.com/apache/spark/pull/32861#issuecomment-863856057

[GitHub] [spark] SparkQA removed a comment on pull request #32867: [SPARK-35721][PYTHON] Path level discover for python unittests

2021-06-18 Thread GitBox
SparkQA removed a comment on pull request #32867: URL: https://github.com/apache/spark/pull/32867#issuecomment-864120019 **[Test build #139996 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139996/testReport)** for PR 32867 at commit

[GitHub] [spark] ueshin closed pull request #32821: [SPARK-35342][PYTHON] Introduce DecimalOps and make `isnull` method data-type-based

2021-06-18 Thread GitBox
ueshin closed pull request #32821: URL: https://github.com/apache/spark/pull/32821

[GitHub] [spark] HyukjinKwon closed pull request #32429: [SPARK-35303][PYTHON] Enable pinned thread mode by default

2021-06-18 Thread GitBox
HyukjinKwon closed pull request #32429: URL: https://github.com/apache/spark/pull/32429

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32958: [SPARK-35065][SQL] Group exception messages in spark/sql (core)

2021-06-18 Thread GitBox
AmplabJenkins removed a comment on pull request #32958: URL: https://github.com/apache/spark/pull/32958#issuecomment-863741345

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #32928: [WIP][SPARK-35784] Implementation for RocksDB instance

2021-06-18 Thread GitBox
dongjoon-hyun commented on a change in pull request #32928: URL: https://github.com/apache/spark/pull/32928#discussion_r654049392 ## File path: sql/core/pom.xml ## @@ -35,6 +35,11 @@ + <dependency> + <groupId>org.rocksdb</groupId> + <artifactId>rocksdbjni</artifactId> + <version>6.2.2</version> Review comment:

[GitHub] [spark] cloud-fan commented on a change in pull request #32881: [SPARK-33298][CORE] Decouple file naming from FileCommitProtocol

2021-06-18 Thread GitBox
cloud-fan commented on a change in pull request #32881: URL: https://github.com/apache/spark/pull/32881#discussion_r654156008 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -1192,6 +1192,13 @@ object SQLConf {

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32957: [SPARK-35472][PYTHON] Fix disallow_untyped_defs mypy checks.

2021-06-18 Thread GitBox
AmplabJenkins removed a comment on pull request #32957: URL: https://github.com/apache/spark/pull/32957#issuecomment-863721014

[GitHub] [spark] zhengruifeng commented on pull request #32927: [SPARK-35678][ML][FOLLOWUP] softmax support offset and step

2021-06-18 Thread GitBox
zhengruifeng commented on pull request #32927: URL: https://github.com/apache/spark/pull/32927#issuecomment-863775035 Thanks @huaxingao @srowen for reviewing!

[GitHub] [spark] AmplabJenkins commented on pull request #32867: [SPARK-35721][PYTHON] Path level discover for python unittests

2021-06-18 Thread GitBox
AmplabJenkins commented on pull request #32867: URL: https://github.com/apache/spark/pull/32867#issuecomment-864143367

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32049: [SPARK-34952][SQL] Aggregate (Min/Max/Count) push down for Parquet

2021-06-18 Thread GitBox
AmplabJenkins removed a comment on pull request #32049: URL: https://github.com/apache/spark/pull/32049#issuecomment-864234691 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44524/

[GitHub] [spark] gengliangwang edited a comment on pull request #32879: [SPARK-35694][INFRA][FollowUp] Increase the default JVM stack size of SBT/Maven

2021-06-18 Thread GitBox
gengliangwang edited a comment on pull request #32879: URL: https://github.com/apache/spark/pull/32879#issuecomment-863723364 @LuciferYang Thanks for reporting. It seems that it happens on the job https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-3.2-jdk-11/

[GitHub] [spark] HeartSaVioR commented on a change in pull request #32747: [SPARK-35611][SS] Introduce the strategy on mismatched offset for start offset timestamp on Kafka data source

2021-06-18 Thread GitBox
HeartSaVioR commented on a change in pull request #32747: URL: https://github.com/apache/spark/pull/32747#discussion_r654408254 ## File path: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetReaderConsumer.scala ## @@ -270,6 +262,35 @@

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32955: [WIP] Support creating a Column of numpy literal value in pandas-on-Spark

2021-06-18 Thread GitBox
AmplabJenkins removed a comment on pull request #32955: URL: https://github.com/apache/spark/pull/32955#issuecomment-863497676

[GitHub] [spark] MaxGekk commented on a change in pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast

2021-06-18 Thread GitBox
MaxGekk commented on a change in pull request #32940: URL: https://github.com/apache/spark/pull/32940#discussion_r653869032 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala ## @@ -108,18 +108,22 @@ object IntervalUtils {

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32881: [SPARK-33298][CORE] Decouple file naming from FileCommitProtocol

2021-06-18 Thread GitBox
AmplabJenkins removed a comment on pull request #32881: URL: https://github.com/apache/spark/pull/32881#issuecomment-863726505

[GitHub] [spark] ueshin closed pull request #32886: [SPARK-35478][PYTHON] Enable disallow_untyped_defs mypy check for pyspark.pandas.window.

2021-06-18 Thread GitBox
ueshin closed pull request #32886: URL: https://github.com/apache/spark/pull/32886

[GitHub] [spark] ueshin commented on a change in pull request #32821: [SPARK-35342][PYTHON] Introduce DecimalOps and make `isnull` method data-type-based

2021-06-18 Thread GitBox
ueshin commented on a change in pull request #32821: URL: https://github.com/apache/spark/pull/32821#discussion_r654013813 ## File path: python/pyspark/pandas/data_type_ops/base.py ## @@ -300,6 +303,13 @@ def prepare(self, col: pd.Series) -> pd.Series: """Prepare

[GitHub] [spark] SparkQA commented on pull request #32859: [SPARK-35708][PYTHON][TEST] Add BaseTest for DataTypeOps

2021-06-18 Thread GitBox
SparkQA commented on pull request #32859: URL: https://github.com/apache/spark/pull/32859#issuecomment-863815064

[GitHub] [spark] rdblue commented on a change in pull request #32921: [WIP][SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-18 Thread GitBox
rdblue commented on a change in pull request #32921: URL: https://github.com/apache/spark/pull/32921#discussion_r654632259 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsDynamicFiltering.java ## @@ -0,0 +1,55 @@ +/* + * Licensed to the

[GitHub] [spark] AmplabJenkins commented on pull request #32881: [SPARK-33298][CORE] Decouple file naming from FileCommitProtocol

2021-06-18 Thread GitBox
AmplabJenkins commented on pull request #32881: URL: https://github.com/apache/spark/pull/32881#issuecomment-863726505

[GitHub] [spark] viirya commented on pull request #32702: [SPARK-35565][SS] Add config for ignoring metadata directory of FileStreamSink

2021-06-18 Thread GitBox
viirya commented on pull request #32702: URL: https://github.com/apache/spark/pull/32702#issuecomment-864207085 Thanks @HeartSaVioR @xuanyuanking! I've updated the change.

[GitHub] [spark] github-actions[bot] closed pull request #31639: [SPARK-34528][SQL] Named explicitly field in struct of a catalog view

2021-06-18 Thread GitBox
github-actions[bot] closed pull request #31639: URL: https://github.com/apache/spark/pull/31639

[GitHub] [spark] SparkQA commented on pull request #32950: [SPARK-35726][SQL] Truncate java.time.Duration by fields of day-time interval type

2021-06-18 Thread GitBox
SparkQA commented on pull request #32950: URL: https://github.com/apache/spark/pull/32950#issuecomment-863760836

[GitHub] [spark] xkrogen commented on a change in pull request #32810: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

2021-06-18 Thread GitBox
xkrogen commented on a change in pull request #32810: URL: https://github.com/apache/spark/pull/32810#discussion_r654550571 ## File path: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ## @@ -1464,6 +1464,34 @@ private object Client extends

[GitHub] [spark] pingsutw opened a new pull request #32964: [SPARK-35811][PYTHON] Deprecate DataFrame.to_spark_io

2021-06-18 Thread GitBox
pingsutw opened a new pull request #32964: URL: https://github.com/apache/spark/pull/32964 ### What changes were proposed in this pull request? Deprecate the `DataFrame.to_spark_io` ### Why are the changes needed? We should deprecate the

[GitHub] [spark] HyukjinKwon removed a comment on pull request #32894: [SPARK-35747][CORE] Avoid printing full Exception stack trace, if Hbase/Kafka/Hive services are not running in a secure cluster

2021-06-18 Thread GitBox
HyukjinKwon removed a comment on pull request #32894: URL: https://github.com/apache/spark/pull/32894#issuecomment-863954343 retest this please

[GitHub] [spark] AngersZhuuuu commented on pull request #32945: [SPARK-35769][SQL] Truncate java.time.Period by fields of year-month interval type

2021-06-18 Thread GitBox
AngersZhuuuu commented on pull request #32945: URL: https://github.com/apache/spark/pull/32945#issuecomment-863713825 > @AngersZhuuuu The failed test is related to the changes: > > ``` > org.scalatest.exceptions.TestFailedException: P-96322443Y-9M did not equal P-96322443Y >

[GitHub] [spark] MaxGekk commented on pull request #32953: [SPARK-35095][SS][TESTS] Use ANSI intervals in streaming join tests

2021-06-18 Thread GitBox
MaxGekk commented on pull request #32953: URL: https://github.com/apache/spark/pull/32953#issuecomment-863517408 +1, LGTM. Merging to master. Thank you, @sarutak .

[GitHub] [spark] SparkQA commented on pull request #32960: [SPARK-35813][SQL][DOCS] Add new adaptive config into sql-performance-tuning docs

2021-06-18 Thread GitBox
SparkQA commented on pull request #32960: URL: https://github.com/apache/spark/pull/32960#issuecomment-863784508

[GitHub] [spark] AmplabJenkins commented on pull request #32810: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

2021-06-18 Thread GitBox
AmplabJenkins commented on pull request #32810: URL: https://github.com/apache/spark/pull/32810#issuecomment-864206889

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32401: [SPARK-35276][CORE] Calculate checksum for shuffle data and write as checksum file

2021-06-18 Thread GitBox
AmplabJenkins removed a comment on pull request #32401: URL: https://github.com/apache/spark/pull/32401#issuecomment-863459427 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139937/

[GitHub] [spark] AmplabJenkins commented on pull request #32961: [SPARK-35694][INFRA][TEST-HADOOP3.2][TEST-JAVA11] Increase JVM stack size to 128M for Maven

2021-06-18 Thread GitBox
AmplabJenkins commented on pull request #32961: URL: https://github.com/apache/spark/pull/32961#issuecomment-863848568

[GitHub] [spark] mridulm commented on a change in pull request #32401: [SPARK-35276][CORE] Calculate checksum for shuffle data and write as checksum file

2021-06-18 Thread GitBox
mridulm commented on a change in pull request #32401: URL: https://github.com/apache/spark/pull/32401#discussion_r653777439 ## File path: core/src/main/scala/org/apache/spark/io/MutableCheckedOutputStream.scala ## @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] SparkQA commented on pull request #32961: [SPARK-35694][INFRA][TEST-JAVA11] Increase JVM stack size to 128M for Maven

2021-06-18 Thread GitBox
SparkQA commented on pull request #32961: URL: https://github.com/apache/spark/pull/32961#issuecomment-863824199

[GitHub] [spark] peter-toth commented on pull request #32947: [SPARK-35798][SQL] Fix SparkPlan.sqlContext usage

2021-06-18 Thread GitBox
peter-toth commented on pull request #32947: URL: https://github.com/apache/spark/pull/32947#issuecomment-863556056 Thanks for the review.

[GitHub] [spark] huaxingao commented on pull request #32927: [SPARK-35678][ML][FOLLOWUP] softmax support offset and step

2021-06-18 Thread GitBox
huaxingao commented on pull request #32927: URL: https://github.com/apache/spark/pull/32927#issuecomment-863775124

[GitHub] [spark] Yikun commented on a change in pull request #32821: [SPARK-35342][PYTHON] Introduce DecimalOps and make `isnull` method data-type-based

2021-06-18 Thread GitBox
Yikun commented on a change in pull request #32821: URL: https://github.com/apache/spark/pull/32821#discussion_r654208604 ## File path: python/pyspark/pandas/data_type_ops/base.py ## @@ -300,6 +303,13 @@ def prepare(self, col: pd.Series) -> pd.Series: """Prepare

[GitHub] [spark] ueshin commented on pull request #32956: [SPARK-35469][PYTHON] Fix disallow_untyped_defs mypy checks.

2021-06-18 Thread GitBox
ueshin commented on pull request #32956: URL: https://github.com/apache/spark/pull/32956#issuecomment-863635606 cc @HyukjinKwon @itholic @xinrong-databricks -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] SparkQA removed a comment on pull request #32936: [SPARK-35720][SQL] Support casting of String to timestamp without time zone type

2021-06-18 Thread GitBox
SparkQA removed a comment on pull request #32936: URL: https://github.com/apache/spark/pull/32936#issuecomment-863278493 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [spark] SparkQA removed a comment on pull request #32861: [SPARK-35710] [SQL] Support DPP + AQE when there is no reused broadcast exchange

2021-06-18 Thread GitBox
SparkQA removed a comment on pull request #32861: URL: https://github.com/apache/spark/pull/32861#issuecomment-863851686 **[Test build #139979 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139979/testReport)** for PR 32861 at commit

[GitHub] [spark] SparkQA commented on pull request #32949: [WIP][SPARK-35749][SPARK-35773][SQL] Parse unit list interval literals as tightest year-month/day-time interval types

2021-06-18 Thread GitBox
SparkQA commented on pull request #32949: URL: https://github.com/apache/spark/pull/32949#issuecomment-863433808 **[Test build #139921 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139921/testReport)** for PR 32949 at commit

[GitHub] [spark] viirya edited a comment on pull request #32136: [SPARK-35022][CORE] Task Scheduling Plugin in Spark

2021-06-18 Thread GitBox
viirya edited a comment on pull request #32136: URL: https://github.com/apache/spark/pull/32136#issuecomment-864257650 Oh, we can continue the discussion here too if you prefer. I just think that as we are not to review the code here, maybe it is good to close the PR. -- This is an

[GitHub] [spark] AngersZhuuuu commented on pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast

2021-06-18 Thread GitBox
AngersZhuuuu commented on pull request #32940: URL: https://github.com/apache/spark/pull/32940#issuecomment-864087991 > @AngersZhuuuu Please, resolve conflicts. Done -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] SparkQA removed a comment on pull request #32950: [SPARK-35726][SQL] Truncate java.time.Duration by fields of day-time interval type

2021-06-18 Thread GitBox
SparkQA removed a comment on pull request #32950: URL: https://github.com/apache/spark/pull/32950#issuecomment-863322717 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [spark] SparkQA removed a comment on pull request #32747: [SPARK-35611][SS] Introduce the strategy on mismatched offset for start offset timestamp on Kafka data source

2021-06-18 Thread GitBox
SparkQA removed a comment on pull request #32747: URL: https://github.com/apache/spark/pull/32747#issuecomment-864038419 **[Test build #139991 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139991/testReport)** for PR 32747 at commit

[GitHub] [spark] SparkQA commented on pull request #32953: [SPARK-35095][SS] Use ANSI intervals in streaming join tests

2021-06-18 Thread GitBox
SparkQA commented on pull request #32953: URL: https://github.com/apache/spark/pull/32953#issuecomment-863454208 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32753: [SPARK-34859][SQL] Handle column index when using vectorized Parquet reader

2021-06-18 Thread GitBox
AmplabJenkins removed a comment on pull request #32753: URL: https://github.com/apache/spark/pull/32753#issuecomment-863565038 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] SparkQA removed a comment on pull request #32874: [SPARK-35699][K8S] Improve error message when creating k8s pod failed.

2021-06-18 Thread GitBox
SparkQA removed a comment on pull request #32874: URL: https://github.com/apache/spark/pull/32874#issuecomment-863927807 **[Test build #139986 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139986/testReport)** for PR 32874 at commit

[GitHub] [spark] pingsutw commented on a change in pull request #32874: [SPARK-35699][K8S] Improve error message when creating k8s pod failed.

2021-06-18 Thread GitBox
pingsutw commented on a change in pull request #32874: URL: https://github.com/apache/spark/pull/32874#discussion_r654291077 ## File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/KubernetesClientApplication.scala ## @@ -136,7

[GitHub] [spark] SparkQA commented on pull request #32945: [SPARK-35769][SQL] Truncate java.time.Period by fields of year-month interval type

2021-06-18 Thread GitBox
SparkQA commented on pull request #32945: URL: https://github.com/apache/spark/pull/32945#issuecomment-863496712 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For

[GitHub] [spark] SparkQA commented on pull request #32401: [SPARK-35276][CORE] Calculate checksum for shuffle data and write as checksum file

2021-06-18 Thread GitBox
SparkQA commented on pull request #32401: URL: https://github.com/apache/spark/pull/32401#issuecomment-863458339 **[Test build #139937 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139937/testReport)** for PR 32401 at commit

[GitHub] [spark] Yikun commented on a change in pull request #32859: [SPARK-35708][PYTHON][TEST] Add BaseTest for DataTypeOps

2021-06-18 Thread GitBox
Yikun commented on a change in pull request #32859: URL: https://github.com/apache/spark/pull/32859#discussion_r654179073 ## File path: python/pyspark/pandas/tests/data_type_ops/test_base.py ## @@ -0,0 +1,87 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one

[GitHub] [spark] viirya closed pull request #32136: [SPARK-35022][CORE] Task Scheduling Plugin in Spark

2021-06-18 Thread GitBox
viirya closed pull request #32136: URL: https://github.com/apache/spark/pull/32136 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please

[GitHub] [spark] SparkQA removed a comment on pull request #32957: [SPARK-35472][PYTHON] Fix disallow_untyped_defs mypy checks.

2021-06-18 Thread GitBox
SparkQA removed a comment on pull request #32957: URL: https://github.com/apache/spark/pull/32957#issuecomment-863706939 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [spark] ueshin opened a new pull request #32965: [SPARK-35478][PYTHON] Fix Jenkins' linter.

2021-06-18 Thread GitBox
ueshin opened a new pull request #32965: URL: https://github.com/apache/spark/pull/32965 ### What changes were proposed in this pull request? This is a follow-up of #32886 to fix the Jenkins' linter. ### Why are the changes needed? The PR #32886 was mistakenly merged

[GitHub] [spark] SparkQA removed a comment on pull request #32956: [SPARK-35469][PYTHON] Fix disallow_untyped_defs mypy checks.

2021-06-18 Thread GitBox
SparkQA removed a comment on pull request #32956: URL: https://github.com/apache/spark/pull/32956#issuecomment-863595160 **[Test build #139953 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139953/testReport)** for PR 32956 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #28885: [SPARK-29375][SPARK-28940][SPARK-32041][SQL] Whole plan exchange and subquery reuse

2021-06-18 Thread GitBox
AmplabJenkins commented on pull request #28885: URL: https://github.com/apache/spark/pull/28885#issuecomment-863565048 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For

[GitHub] [spark] MaxGekk commented on pull request #32945: [SPARK-35769][SQL] Truncate java.time.Period by fields of year-month interval type

2021-06-18 Thread GitBox
MaxGekk commented on pull request #32945: URL: https://github.com/apache/spark/pull/32945#issuecomment-863556042 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For

[GitHub] [spark] wangyum closed pull request #32291: [SPARK-35185][SQL] Improve Distinct statistics estimation

2021-06-18 Thread GitBox
wangyum closed pull request #32291: URL: https://github.com/apache/spark/pull/32291 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service,

[GitHub] [spark] HyukjinKwon commented on pull request #32961: [SPARK-35694][INFRA][TEST-JAVA11] Increase JVM stack size to 128M for Maven

2021-06-18 Thread GitBox
HyukjinKwon commented on pull request #32961: URL: https://github.com/apache/spark/pull/32961#issuecomment-863830847 128 is very big .. are we sure this fixes the problem? I tried 256 to fix up AppVeyor CI but it didn't work: https://github.com/apache/spark/pull/32901 We can merge

[GitHub] [spark] HeartSaVioR commented on a change in pull request #32702: [SPARK-35565][SS] Add config for ignoring metadata directory of FileStreamSink

2021-06-18 Thread GitBox
HeartSaVioR commented on a change in pull request #32702: URL: https://github.com/apache/spark/pull/32702#discussion_r654400586 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -1568,6 +1568,16 @@ object SQLConf { .booleanConf

[GitHub] [spark] toujours33 commented on a change in pull request #32948: [SPARK-35796][Test] fix ut assertionError caused by unexpected path get from CanonicalFile on MacOs higher than 10.15

2021-06-18 Thread GitBox
toujours33 commented on a change in pull request #32948: URL: https://github.com/apache/spark/pull/32948#discussion_r654101887 ## File path: core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala ## @@ -482,7 +482,7 @@ class SparkSubmitSuite val (childArgs,

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #32923: [SPARK-35783][SQL] Set the list of read columns in the task configuration to reduce reading of ORC data.

2021-06-18 Thread GitBox
dongjoon-hyun edited a comment on pull request #32923: URL: https://github.com/apache/spark/pull/32923#issuecomment-863609248 For me, this is a performance improvement, @zhengruifeng . -- This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] SparkQA removed a comment on pull request #32955: [WIP] Support creating a Column of numpy literal value in pandas-on-Spark

2021-06-18 Thread GitBox
SparkQA removed a comment on pull request #32955: URL: https://github.com/apache/spark/pull/32955#issuecomment-863459194 **[Test build #139943 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139943/testReport)** for PR 32955 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #32753: [SPARK-34859][SQL] Handle column index when using vectorized Parquet reader

2021-06-18 Thread GitBox
SparkQA removed a comment on pull request #32753: URL: https://github.com/apache/spark/pull/32753#issuecomment-863575428 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
