[GitHub] [spark] AmplabJenkins removed a comment on pull request #33239: [SPARK-36030][SQL] Support DS v2 metrics at writing path

2021-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #33239: URL: https://github.com/apache/spark/pull/33239#issuecomment-883757343 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

[GitHub] [spark] SparkQA commented on pull request #33310: [SPARK-36105][SQL] OptimizeLocalShuffleReader support reading data of multiple mappers in one task

2021-07-20 Thread GitBox
SparkQA commented on pull request #33310: URL: https://github.com/apache/spark/pull/33310#issuecomment-883791422 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45881/

[GitHub] [spark] SparkQA commented on pull request #33336: [SPARK-36132][SS][SQL] Support initial state for batch mode of flatMapGroupsWithState

2021-07-20 Thread GitBox
SparkQA commented on pull request #33336: URL: https://github.com/apache/spark/pull/33336#issuecomment-883794333 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45880/

[GitHub] [spark] SparkQA commented on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-20 Thread GitBox
SparkQA commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-883794413 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45879/

[GitHub] [spark] HyukjinKwon commented on pull request #33429: [SPARK-36217][SQL] Rename CustomShuffleReader and OptimizeLocalShuffleReader in AQE

2021-07-20 Thread GitBox
HyukjinKwon commented on pull request #33429: URL: https://github.com/apache/spark/pull/33429#issuecomment-883800454 I'll merge in a few days if there are no more comments.

[GitHub] [spark] tobiasedwards commented on pull request #33428: [SPARK-36220][PYTHON] Fix pyspark.sql.types.Row type annotation

2021-07-20 Thread GitBox
tobiasedwards commented on pull request #33428: URL: https://github.com/apache/spark/pull/33428#issuecomment-883801095 Thanks @zero323 for chiming in on this, that's really interesting insight. I'm new to working with PySpark and hadn't realised it was a discouraged use case. Given

[GitHub] [spark] HyukjinKwon commented on pull request #33444: [WIP][SPARK-36227][SQL][3.2] Remove TimestampNTZ type support in Spark 3.2

2021-07-20 Thread GitBox
HyukjinKwon commented on pull request #33444: URL: https://github.com/apache/spark/pull/33444#issuecomment-883802208 Why don't we just simply document?

[GitHub] [spark] HyukjinKwon commented on pull request #33444: [WIP][SPARK-36227][SQL][3.2] Remove TimestampNTZ type support in Spark 3.2

2021-07-20 Thread GitBox
HyukjinKwon commented on pull request #33444: URL: https://github.com/apache/spark/pull/33444#issuecomment-883802642 in fact we didn't even yet document it as an official type in https://spark.apache.org/docs/latest/sql-ref-datatypes.html. If we're worried about being used, I would just ex

[GitHub] [spark] AmplabJenkins commented on pull request #33336: [SPARK-36132][SS][SQL] Support initial state for batch mode of flatMapGroupsWithState

2021-07-20 Thread GitBox
AmplabJenkins commented on pull request #33336: URL: https://github.com/apache/spark/pull/33336#issuecomment-883803307 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45880/

[GitHub] [spark] AmplabJenkins commented on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-20 Thread GitBox
AmplabJenkins commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-883803308 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45879/

[GitHub] [spark] SparkQA commented on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-20 Thread GitBox
SparkQA commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-883803789 **[Test build #141366 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141366/testReport)** for PR 33447 at commit [`2f8ea89`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #33270: [SPARK-35956][K8S] Support auto assigning labels to decommissioning pods

2021-07-20 Thread GitBox
SparkQA commented on pull request #33270: URL: https://github.com/apache/spark/pull/33270#issuecomment-883803930 **[Test build #141368 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141368/testReport)** for PR 33270 at commit [`e460e52`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #33336: [SPARK-36132][SS][SQL] Support initial state for batch mode of flatMapGroupsWithState

2021-07-20 Thread GitBox
SparkQA commented on pull request #33336: URL: https://github.com/apache/spark/pull/33336#issuecomment-883803893 **[Test build #141367 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141367/testReport)** for PR 33336 at commit [`4572809`](https://github.com

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33336: [SPARK-36132][SS][SQL] Support initial state for batch mode of flatMapGroupsWithState

2021-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #33336: URL: https://github.com/apache/spark/pull/33336#issuecomment-883803307 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45880/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-883803308 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45879/

[GitHub] [spark] HyukjinKwon commented on a change in pull request #33422: [SPARK-34806][SQL] Add Observation helper for Dataset.observe

2021-07-20 Thread GitBox
HyukjinKwon commented on a change in pull request #33422: URL: https://github.com/apache/spark/pull/33422#discussion_r673587148 ## File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ## @@ -1947,6 +1947,31 @@ class Dataset[T] private[sql]( CollectMetrics(

[GitHub] [spark] HyukjinKwon commented on pull request #33399: [SPARK-36211][PYTHON] Correct typing of `udf` return value

2021-07-20 Thread GitBox
HyukjinKwon commented on pull request #33399: URL: https://github.com/apache/spark/pull/33399#issuecomment-883804824 I'll leave it to @zero323.

[GitHub] [spark] HyukjinKwon commented on a change in pull request #33441: [SPARK-36202][SQL] Unify schema check code of FileFormat

2021-07-20 Thread GitBox
HyukjinKwon commented on a change in pull request #33441: URL: https://github.com/apache/spark/pull/33441#discussion_r673588586 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala ## @@ -933,19 +934,34 @@ object DDLUtils { case HIVE_

[GitHub] [spark] SparkQA commented on pull request #33310: [SPARK-36105][SQL] OptimizeLocalShuffleReader support reading data of multiple mappers in one task

2021-07-20 Thread GitBox
SparkQA commented on pull request #33310: URL: https://github.com/apache/spark/pull/33310#issuecomment-883807367 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45881/

[GitHub] [spark] AmplabJenkins commented on pull request #33310: [SPARK-36105][SQL] OptimizeLocalShuffleReader support reading data of multiple mappers in one task

2021-07-20 Thread GitBox
AmplabJenkins commented on pull request #33310: URL: https://github.com/apache/spark/pull/33310#issuecomment-883807581 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45881/

[GitHub] [spark] HyukjinKwon commented on a change in pull request #33441: [SPARK-36202][SQL] Unify schema check code of FileFormat

2021-07-20 Thread GitBox
HyukjinKwon commented on a change in pull request #33441: URL: https://github.com/apache/spark/pull/33441#discussion_r673589222 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/FileDataSourceV2.scala ## @@ -112,4 +112,21 @@ trait FileDataSour

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33310: [SPARK-36105][SQL] OptimizeLocalShuffleReader support reading data of multiple mappers in one task

2021-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #33310: URL: https://github.com/apache/spark/pull/33310#issuecomment-883807581 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45881/

[GitHub] [spark] HyukjinKwon commented on a change in pull request #33441: [SPARK-36202][SQL] Unify schema check code of FileFormat

2021-07-20 Thread GitBox
HyukjinKwon commented on a change in pull request #33441: URL: https://github.com/apache/spark/pull/33441#discussion_r673590254 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/FileDataSourceV2.scala ## @@ -112,4 +112,21 @@ trait FileDataSour

[GitHub] [spark] HyukjinKwon commented on a change in pull request #33402: [SPARK-36189][PYTHON] Improve bool, string, numeric DataTypeOps tests by avoiding joins

2021-07-20 Thread GitBox
HyukjinKwon commented on a change in pull request #33402: URL: https://github.com/apache/spark/pull/33402#discussion_r673591078 ## File path: python/pyspark/pandas/tests/data_type_ops/test_boolean_ops.py ## @@ -24,289 +24,273 @@ from pandas.api.types import CategoricalDtype

[GitHub] [spark] HyukjinKwon commented on a change in pull request #33436: [SPARK-35912][SQL] Fix nullability of `spark.read.json/spark.read.csv`

2021-07-20 Thread GitBox
HyukjinKwon commented on a change in pull request #33436: URL: https://github.com/apache/spark/pull/33436#discussion_r673592374 ## File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ## @@ -521,7 +521,7 @@ class DataFrameReader private[sql](sparkSessio

[GitHub] [spark] SparkQA commented on pull request #33270: [SPARK-35956][K8S] Support auto assigning labels to decommissioning pods

2021-07-20 Thread GitBox
SparkQA commented on pull request #33270: URL: https://github.com/apache/spark/pull/33270#issuecomment-883812071 **[Test build #141368 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141368/testReport)** for PR 33270 at commit [`e460e52`](https://github.co

[GitHub] [spark] SparkQA removed a comment on pull request #33270: [SPARK-35956][K8S] Support auto assigning labels to decommissioning pods

2021-07-20 Thread GitBox
SparkQA removed a comment on pull request #33270: URL: https://github.com/apache/spark/pull/33270#issuecomment-883803930 **[Test build #141368 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141368/testReport)** for PR 33270 at commit [`e460e52`](https://gi

[GitHub] [spark] ueshin commented on pull request #33400: [SPARK-36186][PYTHON] Add as_ordered/as_unordered to CategoricalAccessor and CategoricalIndex

2021-07-20 Thread GitBox
ueshin commented on pull request #33400: URL: https://github.com/apache/spark/pull/33400#issuecomment-883813511 The tests for pandas-on-Spark and the linter passed. I'd merge this to master/3.2.

[GitHub] [spark] ueshin closed pull request #33400: [SPARK-36186][PYTHON] Add as_ordered/as_unordered to CategoricalAccessor and CategoricalIndex

2021-07-20 Thread GitBox
ueshin closed pull request #33400: URL: https://github.com/apache/spark/pull/33400

[GitHub] [spark] HyukjinKwon commented on a change in pull request #33310: [SPARK-36105][SQL] OptimizeLocalShuffleReader support reading data of multiple mappers in one task

2021-07-20 Thread GitBox
HyukjinKwon commented on a change in pull request #33310: URL: https://github.com/apache/spark/pull/33310#discussion_r673595867 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/ShuffledRowRDD.scala ## @@ -58,6 +58,12 @@ case class PartialMapperPartitionSpec

[GitHub] [spark] HyukjinKwon commented on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-20 Thread GitBox
HyukjinKwon commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-883816993 Thanks for investigating this @viirya.

[GitHub] [spark] cutiechi commented on a change in pull request #33257: [SPARK-36039][K8S] Fix executor pod hadoop conf mount

2021-07-20 Thread GitBox
cutiechi commented on a change in pull request #33257: URL: https://github.com/apache/spark/pull/33257#discussion_r673598521 ## File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/HadoopConfExecutorFeatureStep.scala ## @@ -0,0 +1,12

[GitHub] [spark] ueshin opened a new pull request #33448: [SPARK-36188][PYTHON] Add categories setter to CategoricalAccessor and CategoricalIndex

2021-07-20 Thread GitBox
ueshin opened a new pull request #33448: URL: https://github.com/apache/spark/pull/33448 ### What changes were proposed in this pull request? Add categories setter to `CategoricalAccessor` and `CategoricalIndex`. ### Why are the changes needed? We should implement ca

[GitHub] [spark] HeartSaVioR commented on pull request #33433: [SPARK-36172][SS] Document session window into Structured Streaming guide doc

2021-07-20 Thread GitBox
HeartSaVioR commented on pull request #33433: URL: https://github.com/apache/spark/pull/33433#issuecomment-883818598 Thanks all for reviewing! Merging to master/3.2

[GitHub] [spark] HeartSaVioR closed pull request #33433: [SPARK-36172][SS] Document session window into Structured Streaming guide doc

2021-07-20 Thread GitBox
HeartSaVioR closed pull request #33433: URL: https://github.com/apache/spark/pull/33433

[GitHub] [spark] SparkQA commented on pull request #33336: [SPARK-36132][SS][SQL] Support initial state for batch mode of flatMapGroupsWithState

2021-07-20 Thread GitBox
SparkQA commented on pull request #33336: URL: https://github.com/apache/spark/pull/33336#issuecomment-883823819 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45883/

[GitHub] [spark] SparkQA commented on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-20 Thread GitBox
SparkQA commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-883824116 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45882/

[GitHub] [spark] AmplabJenkins commented on pull request #33270: [SPARK-35956][K8S] Support auto assigning labels to decommissioning pods

2021-07-20 Thread GitBox
AmplabJenkins commented on pull request #33270: URL: https://github.com/apache/spark/pull/33270#issuecomment-883824452 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141368/

[GitHub] [spark] beliefer commented on pull request #33430: [SPARK-36046][SQL][FOLLOWUP] Implement prettyName for MakeTimestampNTZ and MakeTimestampLTZ

2021-07-20 Thread GitBox
beliefer commented on pull request #33430: URL: https://github.com/apache/spark/pull/33430#issuecomment-883824484 @gengliangwang Thank you for review. @HyukjinKwon @dongjoon-hyun Thank you too.

[GitHub] [spark] beliefer edited a comment on pull request #33430: [SPARK-36046][SQL][FOLLOWUP] Implement prettyName for MakeTimestampNTZ and MakeTimestampLTZ

2021-07-20 Thread GitBox
beliefer edited a comment on pull request #33430: URL: https://github.com/apache/spark/pull/33430#issuecomment-883824484 @gengliangwang Thank you for review. @HyukjinKwon @dongjoon-hyun Thanks too.

[GitHub] [spark] HeartSaVioR commented on pull request #33433: [SPARK-36172][SS] Document session window into Structured Streaming guide doc

2021-07-20 Thread GitBox
HeartSaVioR commented on pull request #33433: URL: https://github.com/apache/spark/pull/33433#issuecomment-883824840 Thanks. I merged this into master/3.2.

[GitHub] [spark] Ngone51 commented on pull request #33116: [SPARK-35259][SHUFFLE] Rename ExternalBlockHandler Timer variables to remove incorrect millis suffix

2021-07-20 Thread GitBox
Ngone51 commented on pull request #33116: URL: https://github.com/apache/spark/pull/33116#issuecomment-883824828 Sure, I'll take a look later.

[GitHub] [spark] SparkQA commented on pull request #33270: [SPARK-35956][K8S] Support auto assigning labels to decommissioning pods

2021-07-20 Thread GitBox
SparkQA commented on pull request #33270: URL: https://github.com/apache/spark/pull/33270#issuecomment-883824864 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45884/

[GitHub] [spark] beliefer commented on pull request #33439: [SPARK-36222][SQL] Step by days in the Sequence expression for dates

2021-07-20 Thread GitBox
beliefer commented on pull request #33439: URL: https://github.com/apache/spark/pull/33439#issuecomment-883825003 Thank you @MaxGekk and @srowen for review.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33270: [SPARK-35956][K8S] Support auto assigning labels to decommissioning pods

2021-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #33270: URL: https://github.com/apache/spark/pull/33270#issuecomment-883824452 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141368/

[GitHub] [spark] HyukjinKwon commented on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-20 Thread GitBox
HyukjinKwon commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-883825434 FWIW, I created a ticket at GitHub Actions: https://user-images.githubusercontent.com/6477701/126418579-fd5f6b94-b01c-4c7a-b215-4f32cc1b7fb6.png

[GitHub] [spark] Ngone51 commented on a change in pull request #33425: [SPARK-32919][FOLLOW-UP] Filter out driver in the merger locations and fix the return type of RemoveShufflePushMergerLocations

2021-07-20 Thread GitBox
Ngone51 commented on a change in pull request #33425: URL: https://github.com/apache/spark/pull/33425#discussion_r673605760 ## File path: core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala ## @@ -2093,6 +2093,9 @@ class BlockManagerSuite extends SparkFunSuite

[GitHub] [spark] SparkQA commented on pull request #33448: [SPARK-36188][PYTHON] Add categories setter to CategoricalAccessor and CategoricalIndex

2021-07-20 Thread GitBox
SparkQA commented on pull request #33448: URL: https://github.com/apache/spark/pull/33448#issuecomment-883825717 **[Test build #141369 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141369/testReport)** for PR 33448 at commit [`9e8ade5`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-20 Thread GitBox
SparkQA commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-883825743 **[Test build #141370 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141370/testReport)** for PR 33447 at commit [`aacc1dd`](https://github.com

[GitHub] [spark] viirya commented on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-20 Thread GitBox
viirya commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-883827110 Thank you @HyukjinKwon

[GitHub] [spark] srowen opened a new pull request #33449: [SPARK-35310][MLLIB] Update to breeze 1.2

2021-07-20 Thread GitBox
srowen opened a new pull request #33449: URL: https://github.com/apache/spark/pull/33449 ### What changes were proposed in this pull request? Update to the latest breeze 1.2 ### Why are the changes needed? Minor bug fixes ### Does this PR introduce _any_ user-faci

[GitHub] [spark] srowen commented on a change in pull request #33449: [SPARK-35310][MLLIB] Update to breeze 1.2

2021-07-20 Thread GitBox
srowen commented on a change in pull request #33449: URL: https://github.com/apache/spark/pull/33449#discussion_r673607385 ## File path: mllib/src/test/scala/org/apache/spark/ml/optim/WeightedLeastSquaresSuite.scala ## @@ -531,7 +534,8 @@ class WeightedLeastSquaresSuite extend

[GitHub] [spark] SparkQA commented on pull request #33449: [SPARK-35310][MLLIB] Update to breeze 1.2

2021-07-20 Thread GitBox
SparkQA commented on pull request #33449: URL: https://github.com/apache/spark/pull/33449#issuecomment-883828834 **[Test build #141371 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141371/testReport)** for PR 33449 at commit [`9bf2482`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-20 Thread GitBox
SparkQA commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-883830414 **[Test build #141372 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141372/testReport)** for PR 33447 at commit [`84d7451`](https://github.com

[GitHub] [spark] AngersZhuuuu commented on pull request #33253: [SPARK-36038][CORE] Speculation metrics summary at stage level

2021-07-20 Thread GitBox
AngersZhuuuu commented on pull request #33253: URL: https://github.com/apache/spark/pull/33253#issuecomment-883831121 LGTM, WDYT? @sarutak

[GitHub] [spark] mridulm commented on a change in pull request #33426: [SPARK-32920][FOLLOW-UP] Fix shuffleMergeFinalized directly calling rdd.getNumPartitions as RDD is not serialized to executor

2021-07-20 Thread GitBox
mridulm commented on a change in pull request #33426: URL: https://github.com/apache/spark/pull/33426#discussion_r673611951 ## File path: core/src/main/scala/org/apache/spark/shuffle/ShuffleWriteProcessor.scala ## @@ -64,7 +64,8 @@ private[spark] class ShuffleWriteProcessor ex

[GitHub] [spark] mridulm commented on a change in pull request #33426: [SPARK-32920][FOLLOW-UP] Fix shuffleMergeFinalized directly calling rdd.getNumPartitions as RDD is not serialized to executor

2021-07-20 Thread GitBox
mridulm commented on a change in pull request #33426: URL: https://github.com/apache/spark/pull/33426#discussion_r673612301 ## File path: core/src/main/scala/org/apache/spark/Dependency.scala ## @@ -97,6 +97,8 @@ class ShuffleDependency[K: ClassTag, V: ClassTag, C: ClassTag](

[GitHub] [spark] ulysses-you commented on a change in pull request #33310: [SPARK-36105][SQL] OptimizeLocalShuffleReader support reading data of multiple mappers in one task

2021-07-20 Thread GitBox
ulysses-you commented on a change in pull request #33310: URL: https://github.com/apache/spark/pull/33310#discussion_r673612589 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/ShuffledRowRDD.scala ## @@ -181,6 +187,9 @@ class ShuffledRowRDD( case

[GitHub] [spark] mridulm commented on pull request #33034: WIP: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle

2021-07-20 Thread GitBox
mridulm commented on pull request #33034: URL: https://github.com/apache/spark/pull/33034#issuecomment-883834355 Is this still WIP ?

[GitHub] [spark] cfmcgrady commented on a change in pull request #33436: [SPARK-35912][SQL] Fix nullability of `spark.read.json/spark.read.csv`

2021-07-20 Thread GitBox
cfmcgrady commented on a change in pull request #33436: URL: https://github.com/apache/spark/pull/33436#discussion_r673613103 ## File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ## @@ -521,7 +521,7 @@ class DataFrameReader private[sql](sparkSession:

[GitHub] [spark] venkata91 commented on pull request #33034: WIP: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle

2021-07-20 Thread GitBox
venkata91 commented on pull request #33034: URL: https://github.com/apache/spark/pull/33034#issuecomment-883835982 > Is this still WIP ? Yeah I am in the process of breaking this into 2 PRs. Will update here once that is done.

[GitHub] [spark] srowen closed pull request #33263: [SPARK-35027][CORE] Close the inputStream in FileAppender when writin…

2021-07-20 Thread GitBox
srowen closed pull request #33263: URL: https://github.com/apache/spark/pull/33263

[GitHub] [spark] srowen commented on pull request #33263: [SPARK-35027][CORE] Close the inputStream in FileAppender when writin…

2021-07-20 Thread GitBox
srowen commented on pull request #33263: URL: https://github.com/apache/spark/pull/33263#issuecomment-883836234 Merged to master/3.2/3.1/3.0

[GitHub] [spark] SparkQA commented on pull request #33336: [SPARK-36132][SS][SQL] Support initial state for batch mode of flatMapGroupsWithState

2021-07-20 Thread GitBox
SparkQA commented on pull request #33336: URL: https://github.com/apache/spark/pull/33336#issuecomment-883837684 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45883/

[GitHub] [spark] SparkQA commented on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-20 Thread GitBox
SparkQA commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-883839467 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45882/

[GitHub] [spark] srowen closed pull request #32895: [SPARK-35658][DOCS] Document Parquet encryption feature in Spark SQL

2021-07-20 Thread GitBox
srowen closed pull request #32895: URL: https://github.com/apache/spark/pull/32895

[GitHub] [spark] srowen commented on pull request #32895: [SPARK-35658][DOCS] Document Parquet encryption feature in Spark SQL

2021-07-20 Thread GitBox
srowen commented on pull request #32895: URL: https://github.com/apache/spark/pull/32895#issuecomment-883840316 Merged to master

[GitHub] [spark] SparkQA commented on pull request #33270: [SPARK-35956][K8S] Support auto assigning labels to decommissioning pods

2021-07-20 Thread GitBox
SparkQA commented on pull request #33270: URL: https://github.com/apache/spark/pull/33270#issuecomment-883840340 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45884/

[GitHub] [spark] srowen closed pull request #33362: [SPARK-36153][SQL][DOCS] Update transform doc to match the current code

2021-07-20 Thread GitBox
srowen closed pull request #33362: URL: https://github.com/apache/spark/pull/33362

[GitHub] [spark] SparkQA commented on pull request #33448: [SPARK-36188][PYTHON] Add categories setter to CategoricalAccessor and CategoricalIndex

2021-07-20 Thread GitBox
SparkQA commented on pull request #33448: URL: https://github.com/apache/spark/pull/33448#issuecomment-883840924 **[Test build #141369 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141369/testReport)** for PR 33448 at commit [`9e8ade5`](https://github.co

[GitHub] [spark] srowen commented on pull request #33362: [SPARK-36153][SQL][DOCS] Update transform doc to match the current code

2021-07-20 Thread GitBox
srowen commented on pull request #33362: URL: https://github.com/apache/spark/pull/33362#issuecomment-883841000 Merged to master

[GitHub] [spark] viirya commented on pull request #33445: [SPARK-36228][SQL] Skip splitting a skewed partition when some map outputs are removed

2021-07-20 Thread GitBox
viirya commented on pull request #33445: URL: https://github.com/apache/spark/pull/33445#issuecomment-883841192 retest this please -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific co

[GitHub] [spark] HyukjinKwon commented on pull request #33444: [WIP][SPARK-36227][SQL][3.2] Remove TimestampNTZ type support in Spark 3.2

2021-07-20 Thread GitBox
HyukjinKwon commented on pull request #33444: URL: https://github.com/apache/spark/pull/33444#issuecomment-883841574 okie, had a short offline discussion. if this is all we need to change (or if the change is not so big), I am fine. -- This is an automated message from the Apache Git Ser

[GitHub] [spark] AngersZhuuuu commented on pull request #33363: [SPARK-36156][SQL] SCRIPT TRANSFORM ROW FORMAT DELIMITED should respect `NULL DEFINED AS` and default value should be `\N`

2021-07-20 Thread GitBox
AngersZhuuuu commented on pull request #33363: URL: https://github.com/apache/spark/pull/33363#issuecomment-883845097 @srowen Since you have merged the doc pr, I think you also need to check this -- This is an automated message from the Apache Git Service. To respond to the message, ple

[GitHub] [spark] SparkQA commented on pull request #33239: [SPARK-36030][SQL] Support DS v2 metrics at writing path

2021-07-20 Thread GitBox
SparkQA commented on pull request #33239: URL: https://github.com/apache/spark/pull/33239#issuecomment-883845400 **[Test build #141362 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141362/testReport)** for PR 33239 at commit [`25cc546`](https://github.co

[GitHub] [spark] JkSelf commented on pull request #33445: [SPARK-36228][SQL] Skip splitting a skewed partition when some map outputs are removed

2021-07-20 Thread GitBox
JkSelf commented on pull request #33445: URL: https://github.com/apache/spark/pull/33445#issuecomment-883845739 LGTM. +1 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #33296: [SPARK-34402][SQL] Group exception about data format schema

2021-07-20 Thread GitBox
AngersZhuuuu commented on a change in pull request #33296: URL: https://github.com/apache/spark/pull/33296#discussion_r673623765 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ## @@ -3002,8 +3002,8 @@ class HiveDDLSuite

[GitHub] [spark] srowen commented on pull request #33363: [SPARK-36156][SQL] SCRIPT TRANSFORM ROW FORMAT DELIMITED should respect `NULL DEFINED AS` and default value should be `\N`

2021-07-20 Thread GitBox
srowen commented on pull request #33363: URL: https://github.com/apache/spark/pull/33363#issuecomment-883846147 Oh, I see, that wasn't existing behavior. Eh, I am not as well placed to evaluate the actual change - is this really not a behavior change? -- This is an automated message from

[GitHub] [spark] AmplabJenkins commented on pull request #33336: [SPARK-36132][SS][SQL] Support initial state for batch mode of flatMapGroupsWithState

2021-07-20 Thread GitBox
AmplabJenkins commented on pull request #33336: URL: https://github.com/apache/spark/pull/33336#issuecomment-883846279 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45883/ -- T

[GitHub] [spark] AmplabJenkins commented on pull request #33270: [SPARK-35956][K8S] Support auto assigning labels to decommissioning pods

2021-07-20 Thread GitBox
AmplabJenkins commented on pull request #33270: URL: https://github.com/apache/spark/pull/33270#issuecomment-883846283 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45884/ -- T

[GitHub] [spark] AmplabJenkins commented on pull request #33239: [SPARK-36030][SQL] Support DS v2 metrics at writing path

2021-07-20 Thread GitBox
AmplabJenkins commented on pull request #33239: URL: https://github.com/apache/spark/pull/33239#issuecomment-883846280 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141362/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-20 Thread GitBox
AmplabJenkins commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-883846277 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45882/ -- T

[GitHub] [spark] AmplabJenkins commented on pull request #33448: [SPARK-36188][PYTHON] Add categories setter to CategoricalAccessor and CategoricalIndex

2021-07-20 Thread GitBox
AmplabJenkins commented on pull request #33448: URL: https://github.com/apache/spark/pull/33448#issuecomment-883846278 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141369/ -- This

[GitHub] [spark] AngersZhuuuu commented on pull request #33363: [SPARK-36156][SQL] SCRIPT TRANSFORM ROW FORMAT DELIMITED should respect `NULL DEFINED AS` and default value should be `\N`

2021-07-20 Thread GitBox
AngersZhuuuu commented on pull request #33363: URL: https://github.com/apache/spark/pull/33363#issuecomment-883846489 > Oh, I see, that wasn't existing behavior. Eh, I am not as well placed to evaluate the actual change - is this really not a behavior change? It should be a bug fix a

[GitHub] [spark] SparkQA removed a comment on pull request #33239: [SPARK-36030][SQL] Support DS v2 metrics at writing path

2021-07-20 Thread GitBox
SparkQA removed a comment on pull request #33239: URL: https://github.com/apache/spark/pull/33239#issuecomment-883733691 **[Test build #141362 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141362/testReport)** for PR 33239 at commit [`25cc546`](https://gi

[GitHub] [spark] SparkQA removed a comment on pull request #33448: [SPARK-36188][PYTHON] Add categories setter to CategoricalAccessor and CategoricalIndex

2021-07-20 Thread GitBox
SparkQA removed a comment on pull request #33448: URL: https://github.com/apache/spark/pull/33448#issuecomment-883825717 **[Test build #141369 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141369/testReport)** for PR 33448 at commit [`9e8ade5`](https://gi

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33270: [SPARK-35956][K8S] Support auto assigning labels to decommissioning pods

2021-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #33270: URL: https://github.com/apache/spark/pull/33270#issuecomment-883846283 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45884/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33448: [SPARK-36188][PYTHON] Add categories setter to CategoricalAccessor and CategoricalIndex

2021-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #33448: URL: https://github.com/apache/spark/pull/33448#issuecomment-883846278 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141369/ -

[GitHub] [spark] AngersZhuuuu edited a comment on pull request #33363: [SPARK-36156][SQL] SCRIPT TRANSFORM ROW FORMAT DELIMITED should respect `NULL DEFINED AS` and default value should be `\N`

2021-07-20 Thread GitBox
AngersZhuuuu edited a comment on pull request #33363: URL: https://github.com/apache/spark/pull/33363#issuecomment-883846489 > Oh, I see, that wasn't existing behavior. Eh, I am not as well placed to evaluate the actual change - is this really not a behavior change? It's a behavior c

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33336: [SPARK-36132][SS][SQL] Support initial state for batch mode of flatMapGroupsWithState

2021-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #33336: URL: https://github.com/apache/spark/pull/33336#issuecomment-883846279 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45883/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-883846277 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45882/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33239: [SPARK-36030][SQL] Support DS v2 metrics at writing path

2021-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #33239: URL: https://github.com/apache/spark/pull/33239#issuecomment-883846280 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141362/ -

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33436: [SPARK-35912][SQL] Fix nullability of `spark.read.json/spark.read.csv`

2021-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #33436: URL: https://github.com/apache/spark/pull/33436#issuecomment-883590988 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use t

[GitHub] [spark] SparkQA commented on pull request #33445: [SPARK-36228][SQL] Skip splitting a skewed partition when some map outputs are removed

2021-07-20 Thread GitBox
SparkQA commented on pull request #33445: URL: https://github.com/apache/spark/pull/33445#issuecomment-883847208 **[Test build #141373 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141373/testReport)** for PR 33445 at commit [`1caa391`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #33296: [SPARK-34402][SQL] Group exception about data format schema

2021-07-20 Thread GitBox
SparkQA commented on pull request #33296: URL: https://github.com/apache/spark/pull/33296#issuecomment-883847328 **[Test build #141376 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141376/testReport)** for PR 33296 at commit [`19aa188`](https://github.com

[GitHub] [spark] yaooqinn commented on pull request #33424: [SPARK-36213][SQL] Normalize PartitionSpec for Describe Table Command with PartitionSpec

2021-07-20 Thread GitBox
yaooqinn commented on pull request #33424: URL: https://github.com/apache/spark/pull/33424#issuecomment-883847337 retest this please -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the s

[GitHub] [spark] SparkQA commented on pull request #33352: [SPARK-34952][SQL] DSv2 Aggregate push down APIs

2021-07-20 Thread GitBox
SparkQA commented on pull request #33352: URL: https://github.com/apache/spark/pull/33352#issuecomment-883847251 **[Test build #141375 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141375/testReport)** for PR 33352 at commit [`74e4c3b`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #33436: [SPARK-35912][SQL] Fix nullability of `spark.read.json/spark.read.csv`

2021-07-20 Thread GitBox
SparkQA commented on pull request #33436: URL: https://github.com/apache/spark/pull/33436#issuecomment-883847204 **[Test build #141374 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141374/testReport)** for PR 33436 at commit [`741d0c9`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #33424: [SPARK-36213][SQL] Normalize PartitionSpec for Describe Table Command with PartitionSpec

2021-07-20 Thread GitBox
SparkQA commented on pull request #33424: URL: https://github.com/apache/spark/pull/33424#issuecomment-883848584 **[Test build #141377 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141377/testReport)** for PR 33424 at commit [`afa5539`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #33448: [SPARK-36188][PYTHON] Add categories setter to CategoricalAccessor and CategoricalIndex

2021-07-20 Thread GitBox
SparkQA commented on pull request #33448: URL: https://github.com/apache/spark/pull/33448#issuecomment-883850885 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45885/ -- This is an automated message from the Apache
