[GitHub] [spark] wangyum commented on a change in pull request #34810: [SPARK-37549][SQL] Support set parallel through data source properties

2021-12-05 Thread GitBox
wangyum commented on a change in pull request #34810: URL: https://github.com/apache/spark/pull/34810#discussion_r762771255 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileSourceParallelSuite.scala ## @@ -0,0 +1,61 @@ +/* + * Licensed to th

[GitHub] [spark] SparkQA commented on pull request #34810: [SPARK-37549][SQL] Support set parallel through data source properties

2021-12-05 Thread GitBox
SparkQA commented on pull request #34810: URL: https://github.com/apache/spark/pull/34810#issuecomment-986521063 **[Test build #145945 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145945/testReport)** for PR 34810 at commit [`dd20bbd`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #34807: [SPARK-37546][SQL] V2 ReplaceTableAsSelect command should qualify location

2021-12-05 Thread GitBox
SparkQA commented on pull request #34807: URL: https://github.com/apache/spark/pull/34807#issuecomment-986519018 **[Test build #145944 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145944/testReport)** for PR 34807 at commit [`e1104a0`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #34815: [SPARK-37555][SQL] spark-sql should pass last unclosed comment to backend

2021-12-05 Thread GitBox
SparkQA commented on pull request #34815: URL: https://github.com/apache/spark/pull/34815#issuecomment-986517659 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50417/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #34807: [SPARK-37546][SQL] V2 ReplaceTableAsSelect command should qualify location

2021-12-05 Thread GitBox
SparkQA commented on pull request #34807: URL: https://github.com/apache/spark/pull/34807#issuecomment-986517012 **[Test build #145943 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145943/testReport)** for PR 34807 at commit [`5ff0c26`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #34817: [WIP][SPARK-37552][SQL] Add the `convert_timezone()` function

2021-12-05 Thread GitBox
SparkQA commented on pull request #34817: URL: https://github.com/apache/spark/pull/34817#issuecomment-986516947 **[Test build #145942 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145942/testReport)** for PR 34817 at commit [`a65294b`](https://github.com

[GitHub] [spark] summaryzb commented on a change in pull request #34749: [SPARK-37493][CORE] show driver's gc time and duration time in executors page

2021-12-05 Thread GitBox
summaryzb commented on a change in pull request #34749: URL: https://github.com/apache/spark/pull/34749#discussion_r762764064 ## File path: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ## @@ -249,21 +249,18 @@ private[spark] class EventLoggingListe

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34815: [SPARK-37555][SQL] spark-sql should pass last unclosed comment to backend

2021-12-05 Thread GitBox
AmplabJenkins removed a comment on pull request #34815: URL: https://github.com/apache/spark/pull/34815#issuecomment-986515982 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

[GitHub] [spark] AmplabJenkins commented on pull request #34815: [SPARK-37555][SQL] spark-sql should pass last unclosed comment to backend

2021-12-05 Thread GitBox
AmplabJenkins commented on pull request #34815: URL: https://github.com/apache/spark/pull/34815#issuecomment-986515983 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To un

[GitHub] [spark] AmplabJenkins commented on pull request #34814: [SPARK-37004][PYTHON] Upgrade to Py4J 0.10.9.3

2021-12-05 Thread GitBox
AmplabJenkins commented on pull request #34814: URL: https://github.com/apache/spark/pull/34814#issuecomment-986515984 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50414/ -- T

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34814: [SPARK-37004][PYTHON] Upgrade to Py4J 0.10.9.3

2021-12-05 Thread GitBox
AmplabJenkins removed a comment on pull request #34814: URL: https://github.com/apache/spark/pull/34814#issuecomment-986515984 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50414/

[GitHub] [spark] SparkQA removed a comment on pull request #34815: [SPARK-37555][SQL] spark-sql should pass last unclosed comment to backend

2021-12-05 Thread GitBox
SparkQA removed a comment on pull request #34815: URL: https://github.com/apache/spark/pull/34815#issuecomment-986491923 **[Test build #145941 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145941/testReport)** for PR 34815 at commit [`c67dcaf`](https://gi

[GitHub] [spark] SparkQA commented on pull request #34815: [SPARK-37555][SQL] spark-sql should pass last unclosed comment to backend

2021-12-05 Thread GitBox
SparkQA commented on pull request #34815: URL: https://github.com/apache/spark/pull/34815#issuecomment-986515035 **[Test build #145941 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145941/testReport)** for PR 34815 at commit [`c67dcaf`](https://github.co

[GitHub] [spark] SparkQA commented on pull request #34638: [SPARK-37360][SQL] Support TimestampNTZ in JSON data source

2021-12-05 Thread GitBox
SparkQA commented on pull request #34638: URL: https://github.com/apache/spark/pull/34638#issuecomment-986514121 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50416/ -- This is an automated message from the Apache

[GitHub] [spark] toujours33 commented on pull request #34720: [SPARK-37469][WebUI] unified shuffle read block time to shuffle read fetch wait time in StagePage

2021-12-05 Thread GitBox
toujours33 commented on pull request #34720: URL: https://github.com/apache/spark/pull/34720#issuecomment-986512558 Sorry for the long delay due to some personal reasons, please take a look~ @sarutak @mridulm -- This is an automated message from the Apache Git Service. To respond to t

[GitHub] [spark] c21 commented on a change in pull request #34702: [SPARK-37455][SQL] Replace hash with sort aggregate if child is already sorted

2021-12-05 Thread GitBox
c21 commented on a change in pull request #34702: URL: https://github.com/apache/spark/pull/34702#discussion_r762759532 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/ReplaceHashWithSortAgg.scala ## @@ -0,0 +1,106 @@ +/* + * Licensed to the Apache Softwar

[GitHub] [spark] SparkQA commented on pull request #34815: [SPARK-37555][SQL] spark-sql should pass last unclosed comment to backend

2021-12-05 Thread GitBox
SparkQA commented on pull request #34815: URL: https://github.com/apache/spark/pull/34815#issuecomment-986511781 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50415/ -- This is an automated message from the Apache

[GitHub] [spark] summaryzb commented on a change in pull request #34749: [SPARK-37493][CORE] show driver's gc time and duration time in executors page

2021-12-05 Thread GitBox
summaryzb commented on a change in pull request #34749: URL: https://github.com/apache/spark/pull/34749#discussion_r76275 ## File path: core/src/main/scala/org/apache/spark/metrics/ExecutorMetricType.scala ## @@ -137,7 +138,9 @@ case object GarbageCollectionMetrics extends

[GitHub] [spark] yaooqinn edited a comment on pull request #34735: [SPARK-37481][Core][WebUI] Fix disappearance of skipped stages after they retry

2021-12-05 Thread GitBox
yaooqinn edited a comment on pull request #34735: URL: https://github.com/apache/spark/pull/34735#issuecomment-986509805 > But showing a skipped stage, when it runs initially, as "retry 1" can be confusing Per the figure in PR desc, we will both keep the skipped info and retry info,

[GitHub] [spark] yaooqinn commented on pull request #34735: [SPARK-37481][Core][WebUI] Fix disappearance of skipped stages after they retry

2021-12-05 Thread GitBox
yaooqinn commented on pull request #34735: URL: https://github.com/apache/spark/pull/34735#issuecomment-986509805 > But showing a skipped stage, when it runs initially, as "retry 1" can be confusing Per the figure in PR desc, we will both keep the skipped info and retry info, it's c

[GitHub] [spark] summaryzb commented on a change in pull request #34749: [SPARK-37493][CORE] show driver's gc time and duration time in executors page

2021-12-05 Thread GitBox
summaryzb commented on a change in pull request #34749: URL: https://github.com/apache/spark/pull/34749#discussion_r762756637 ## File path: core/src/main/scala/org/apache/spark/metrics/ExecutorMetricType.scala ## @@ -137,7 +138,9 @@ case object GarbageCollectionMetrics extends

[GitHub] [spark] MaxGekk opened a new pull request #34817: [WIP][SPARK-37552][SQL] Add the `convert_timezone()` function

2021-12-05 Thread GitBox
MaxGekk opened a new pull request #34817: URL: https://github.com/apache/spark/pull/34817 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was

[GitHub] [spark] SparkQA removed a comment on pull request #34815: [SPARK-37555][SQL] spark-sql should pass last unclosed comment to backend

2021-12-05 Thread GitBox
SparkQA removed a comment on pull request #34815: URL: https://github.com/apache/spark/pull/34815#issuecomment-986484982 **[Test build #145939 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145939/testReport)** for PR 34815 at commit [`e874bbd`](https://gi

[GitHub] [spark] SparkQA commented on pull request #34815: [SPARK-37555][SQL] spark-sql should pass last unclosed comment to backend

2021-12-05 Thread GitBox
SparkQA commented on pull request #34815: URL: https://github.com/apache/spark/pull/34815#issuecomment-986505796 **[Test build #145939 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145939/testReport)** for PR 34815 at commit [`e874bbd`](https://github.co

[GitHub] [spark] yaooqinn commented on pull request #34735: [SPARK-37481][Core][WebUI] Fix disappearance of skipped stages after they retry

2021-12-05 Thread GitBox
yaooqinn commented on pull request #34735: URL: https://github.com/apache/spark/pull/34735#issuecomment-986503970 > Are we trying to preserve skipped stages when they are retried? If yes, why ? Yes. When the skipped stages are retried, 1. the skipped info on UI get lost 2. t

[GitHub] [spark] summaryzb commented on a change in pull request #34749: [SPARK-37493][CORE] show driver's gc time and duration time in executors page

2021-12-05 Thread GitBox
summaryzb commented on a change in pull request #34749: URL: https://github.com/apache/spark/pull/34749#discussion_r762750607 ## File path: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ## @@ -249,21 +249,18 @@ private[spark] class EventLoggingListe

[GitHub] [spark] SparkQA commented on pull request #34814: [SPARK-37004][PYTHON] Upgrade to Py4J 0.10.9.3

2021-12-05 Thread GitBox
SparkQA commented on pull request #34814: URL: https://github.com/apache/spark/pull/34814#issuecomment-986499331 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50414/ -- This is an automated message from the A

[GitHub] [spark] cloud-fan commented on a change in pull request #34807: [SPARK-37546][SQL] V2 ReplaceTableAsSelect command should qualify location

2021-12-05 Thread GitBox
cloud-fan commented on a change in pull request #34807: URL: https://github.com/apache/spark/pull/34807#discussion_r762748238 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala ## @@ -165,9 +165,10 @@ class DataSource

[GitHub] [spark] cloud-fan commented on a change in pull request #34807: [SPARK-37546][SQL] V2 ReplaceTableAsSelect command should qualify location

2021-12-05 Thread GitBox
cloud-fan commented on a change in pull request #34807: URL: https://github.com/apache/spark/pull/34807#discussion_r762747825 ## File path: sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala ## @@ -408,17 +408,24 @@ class DataSourceV2SQLSuite

[GitHub] [spark] AngersZhuuuu commented on pull request #34815: [SPARK-37555][SQL] spark-sql should pass last unclosed comment to backend

2021-12-05 Thread GitBox
AngersZhuuuu commented on pull request #34815: URL: https://github.com/apache/spark/pull/34815#issuecomment-986496652 ping @cloud-fan -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

[GitHub] [spark] SparkQA commented on pull request #34815: [SPARK-37555][SQL] spark-sql should pass last unclosed comment to backend

2021-12-05 Thread GitBox
SparkQA commented on pull request #34815: URL: https://github.com/apache/spark/pull/34815#issuecomment-986491923 **[Test build #145941 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145941/testReport)** for PR 34815 at commit [`c67dcaf`](https://github.com

[GitHub] [spark] cloud-fan commented on a change in pull request #34702: [SPARK-37455][SQL] Replace hash with sort aggregate if child is already sorted

2021-12-05 Thread GitBox
cloud-fan commented on a change in pull request #34702: URL: https://github.com/apache/spark/pull/34702#discussion_r762743589 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/ReplaceHashWithSortAgg.scala ## @@ -0,0 +1,106 @@ +/* + * Licensed to the Apache S

[GitHub] [spark] SparkQA commented on pull request #34638: [SPARK-37360][SQL] Support TimestampNTZ in JSON data source

2021-12-05 Thread GitBox
SparkQA commented on pull request #34638: URL: https://github.com/apache/spark/pull/34638#issuecomment-986485140 **[Test build #145940 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145940/testReport)** for PR 34638 at commit [`a626c9d`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #34815: [SPARK-37555][SQL] spark-sql should pass last unclosed comment to backend

2021-12-05 Thread GitBox
SparkQA commented on pull request #34815: URL: https://github.com/apache/spark/pull/34815#issuecomment-986484982 **[Test build #145939 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145939/testReport)** for PR 34815 at commit [`e874bbd`](https://github.com

[GitHub] [spark] AmplabJenkins commented on pull request #34816: SPARK-37556: Deser void class fail with Java serialization

2021-12-05 Thread GitBox
AmplabJenkins commented on pull request #34816: URL: https://github.com/apache/spark/pull/34816#issuecomment-986484752 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL a

[GitHub] [spark] mridulm commented on pull request #34735: [SPARK-37481][Core][WebUI] Fix disappearance of skipped stages after they retry

2021-12-05 Thread GitBox
mridulm commented on pull request #34735: URL: https://github.com/apache/spark/pull/34735#issuecomment-986484286 I am trying to understand the change here, I saw the gist but it was not clear to me why/what the expected behavior we are trying to expose here. Are we trying to preserve

[GitHub] [spark] SparkQA commented on pull request #34814: [SPARK-37004][PYTHON] Upgrade to Py4J 0.10.9.3

2021-12-05 Thread GitBox
SparkQA commented on pull request #34814: URL: https://github.com/apache/spark/pull/34814#issuecomment-986478905 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50414/ -- This is an automated message from the Apache

[GitHub] [spark] mridulm commented on a change in pull request #34749: [SPARK-37493][CORE] show driver's gc time and duration time in executors page

2021-12-05 Thread GitBox
mridulm commented on a change in pull request #34749: URL: https://github.com/apache/spark/pull/34749#discussion_r762723491 ## File path: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ## @@ -249,21 +249,18 @@ private[spark] class EventLoggingListene

[GitHub] [spark] mridulm commented on a change in pull request #34749: [SPARK-37493][CORE] show driver's gc time and duration time in executors page

2021-12-05 Thread GitBox
mridulm commented on a change in pull request #34749: URL: https://github.com/apache/spark/pull/34749#discussion_r762722711 ## File path: core/src/main/scala/org/apache/spark/metrics/ExecutorMetricType.scala ## @@ -137,7 +138,9 @@ case object GarbageCollectionMetrics extends E

[GitHub] [spark] sadikovi commented on pull request #34638: [SPARK-37360][SQL] Support TimestampNTZ in JSON data source

2021-12-05 Thread GitBox
sadikovi commented on pull request #34638: URL: https://github.com/apache/spark/pull/34638#issuecomment-986477557 JSONOptions have the following comment for `inferTimestamp`: > Enables inferring of TimestampType and TimestampNTZType from strings matched to the corresponding timestamp pa

[GitHub] [spark] sadikovi commented on a change in pull request #34638: [SPARK-37360][SQL] Support TimestampNTZ in JSON data source

2021-12-05 Thread GitBox
sadikovi commented on a change in pull request #34638: URL: https://github.com/apache/spark/pull/34638#discussion_r762730267 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala ## @@ -2746,6 +2746,182 @@ abstract class JsonSuit

[GitHub] [spark] cloud-fan commented on pull request #34806: [SPARK-37545][SQL] V2 CreateTableAsSelect command should qualify location

2021-12-05 Thread GitBox
cloud-fan commented on pull request #34806: URL: https://github.com/apache/spark/pull/34806#issuecomment-986473762 late LGTM -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [spark] sadikovi commented on a change in pull request #34638: [SPARK-37360][SQL] Support TimestampNTZ in JSON data source

2021-12-05 Thread GitBox
sadikovi commented on a change in pull request #34638: URL: https://github.com/apache/spark/pull/34638#discussion_r762727935 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonInferSchema.scala ## @@ -144,6 +150,9 @@ private[sql] class JsonInferSc

[GitHub] [spark] AngersZhuuuu opened a new pull request #34815: [SPARK-37555][SQL] spark-sql should pass last unclosed comment to backend

2021-12-05 Thread GitBox
AngersZhuuuu opened a new pull request #34815: URL: https://github.com/apache/spark/pull/34815 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? Yes, if a user writes a wrong com

[GitHub] [spark] yaooqinn commented on pull request #34735: [SPARK-37481][Core][WebUI] Fix disappearance of skipped stages after they retry

2021-12-05 Thread GitBox
yaooqinn commented on pull request #34735: URL: https://github.com/apache/spark/pull/34735#issuecomment-986464047 FYI, https://gist.github.com/yaooqinn/6acb7b74b343a6a6dffe8401f6b7b45c#gistcomment-3977382 @mridulm -- This is an automated message from the Apache Git Service. To respond

[GitHub] [spark] mridulm commented on pull request #34735: [SPARK-37481][Core][WebUI] Fix disappearance of skipped stages after they retry

2021-12-05 Thread GitBox
mridulm commented on pull request #34735: URL: https://github.com/apache/spark/pull/34735#issuecomment-986461809 What is the root cause of this issue ? Can you describe why this is happening ? -- This is an automated message from the Apache Git Service. To respond to the message, ple

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34813: [SPARK-37554][BUILD] Add PyArrow, pandas and plotly to release Docker image dependencies

2021-12-05 Thread GitBox
AmplabJenkins removed a comment on pull request #34813: URL: https://github.com/apache/spark/pull/34813#issuecomment-986460825 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145936/ -

[GitHub] [spark] AmplabJenkins commented on pull request #34813: [SPARK-37554][BUILD] Add PyArrow, pandas and plotly to release Docker image dependencies

2021-12-05 Thread GitBox
AmplabJenkins commented on pull request #34813: URL: https://github.com/apache/spark/pull/34813#issuecomment-986460825 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145936/ -- This

[GitHub] [spark] SparkQA removed a comment on pull request #34813: [SPARK-37554][BUILD] Add PyArrow, pandas and plotly to release Docker image dependencies

2021-12-05 Thread GitBox
SparkQA removed a comment on pull request #34813: URL: https://github.com/apache/spark/pull/34813#issuecomment-986391829 **[Test build #145936 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145936/testReport)** for PR 34813 at commit [`92ed81c`](https://gi

[GitHub] [spark] SparkQA commented on pull request #34813: [SPARK-37554][BUILD] Add PyArrow, pandas and plotly to release Docker image dependencies

2021-12-05 Thread GitBox
SparkQA commented on pull request #34813: URL: https://github.com/apache/spark/pull/34813#issuecomment-986460082 **[Test build #145936 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145936/testReport)** for PR 34813 at commit [`92ed81c`](https://github.co

[GitHub] [spark] HyukjinKwon closed pull request #34809: [SPARK-37550][SQL][DOCS] Add an example of parsing jsonStr with complex types for from_json

2021-12-05 Thread GitBox
HyukjinKwon closed pull request #34809: URL: https://github.com/apache/spark/pull/34809 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-un

[GitHub] [spark] HyukjinKwon commented on pull request #34809: [SPARK-37550][SQL][DOCS] Add an example of parsing jsonStr with complex types for from_json

2021-12-05 Thread GitBox
HyukjinKwon commented on pull request #34809: URL: https://github.com/apache/spark/pull/34809#issuecomment-986459020 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

[GitHub] [spark] HyukjinKwon commented on pull request #34814: [SPARK-37004][PYTHON] Upgrade to Py4J 0.10.9.3

2021-12-05 Thread GitBox
HyukjinKwon commented on pull request #34814: URL: https://github.com/apache/spark/pull/34814#issuecomment-986458108 This has to be backported to `branch-3.2`. cc @huaxingao, @dongjoon-hyun, @mengxr, @joshRosen FYI -- This is an automated message from the Apache Git Service. To res

[GitHub] [spark] SparkQA commented on pull request #34814: [SPARK-37004][PYTHON] Upgrade to Py4J 0.10.9.3

2021-12-05 Thread GitBox
SparkQA commented on pull request #34814: URL: https://github.com/apache/spark/pull/34814#issuecomment-986458032 **[Test build #145938 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145938/testReport)** for PR 34814 at commit [`15bca82`](https://github.com

[GitHub] [spark] HyukjinKwon opened a new pull request #34814: [SPARK-37004][PYTHON] Upgrade to Py4J 0.10.9.3

2021-12-05 Thread GitBox
HyukjinKwon opened a new pull request #34814: URL: https://github.com/apache/spark/pull/34814 ### What changes were proposed in this pull request? This PR upgrades Py4J from 0.10.9.2 to 0.10.9.3 which contains the bug fix (https://github.com/bartdag/py4j/pull/440) that directly affec

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34809: [SPARK-37550][SQL][DOCS] Add an example of parsing jsonStr with complex types for from_json

2021-12-05 Thread GitBox
AmplabJenkins removed a comment on pull request #34809: URL: https://github.com/apache/spark/pull/34809#issuecomment-986457535 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50413/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34760: [SPARK-37506][CORE][SQL][DSTREAM][GRAPHX][ML][MLLIB][SS][EXAMPLES] Change the never changed 'var' to 'val'

2021-12-05 Thread GitBox
AmplabJenkins removed a comment on pull request #34760: URL: https://github.com/apache/spark/pull/34760#issuecomment-986457536 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145935/ -

[GitHub] [spark] AmplabJenkins commented on pull request #34760: [SPARK-37506][CORE][SQL][DSTREAM][GRAPHX][ML][MLLIB][SS][EXAMPLES] Change the never changed 'var' to 'val'

2021-12-05 Thread GitBox
AmplabJenkins commented on pull request #34760: URL: https://github.com/apache/spark/pull/34760#issuecomment-986457536 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145935/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #34809: [SPARK-37550][SQL][DOCS] Add an example of parsing jsonStr with complex types for from_json

2021-12-05 Thread GitBox
AmplabJenkins commented on pull request #34809: URL: https://github.com/apache/spark/pull/34809#issuecomment-986457535 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50413/ -- T

[GitHub] [spark] SparkQA removed a comment on pull request #34760: [SPARK-37506][CORE][SQL][DSTREAM][GRAPHX][ML][MLLIB][SS][EXAMPLES] Change the never changed 'var' to 'val'

2021-12-05 Thread GitBox
SparkQA removed a comment on pull request #34760: URL: https://github.com/apache/spark/pull/34760#issuecomment-986388261 **[Test build #145935 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145935/testReport)** for PR 34760 at commit [`7f60b2e`](https://gi

[GitHub] [spark] SparkQA commented on pull request #34760: [SPARK-37506][CORE][SQL][DSTREAM][GRAPHX][ML][MLLIB][SS][EXAMPLES] Change the never changed 'var' to 'val'

2021-12-05 Thread GitBox
SparkQA commented on pull request #34760: URL: https://github.com/apache/spark/pull/34760#issuecomment-986450705 **[Test build #145935 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145935/testReport)** for PR 34760 at commit [`7f60b2e`](https://github.co

[GitHub] [spark] SparkQA commented on pull request #34809: [SPARK-37550][SQL][DOCS] Add an example of parsing jsonStr with complex types for from_json

2021-12-05 Thread GitBox
SparkQA commented on pull request #34809: URL: https://github.com/apache/spark/pull/34809#issuecomment-986445150 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50413/ -- This is an automated message from the A

[GitHub] [spark] wangyum commented on a change in pull request #34810: [SPARK-37549][SQL] Support set parallel through data source properties

2021-12-05 Thread GitBox
wangyum commented on a change in pull request #34810: URL: https://github.com/apache/spark/pull/34810#discussion_r762704186 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/SourceOptions.scala ## @@ -47,4 +50,6 @@ object SourceOptions { val R

[GitHub] [spark] wangyum commented on pull request #34811: [SPARK-37451][SQL] Fix cast string type to decimal type if spark.sql.legacy.allowNegativeScaleOfDecimal is enabled

2021-12-05 Thread GitBox
wangyum commented on pull request #34811: URL: https://github.com/apache/spark/pull/34811#issuecomment-986439317 cc @cloud-fan -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comme

[GitHub] [spark] mridulm commented on pull request #34009: [SPARK-34378][SQL][AVRO] Enhance AvroSerializer validation to allow extra nullable Avro fields

2021-12-05 Thread GitBox
mridulm commented on pull request #34009: URL: https://github.com/apache/spark/pull/34009#issuecomment-986437222 The change looks fine to me, but like @HyukjinKwon I don't have too much context here. Would be great to hear from folks who might have worked on this in the past. +CC @genglian

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34760: [SPARK-37506][CORE][SQL][DSTREAM][GRAPHX][ML][MLLIB][SS][EXAMPLES] Change the never changed 'var' to 'val'

2021-12-05 Thread GitBox
AmplabJenkins removed a comment on pull request #34760: URL: https://github.com/apache/spark/pull/34760#issuecomment-986434293 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50411/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34813: [SPARK-37554][BUILD] Add PyArrow, pandas and plotly to release Docker image dependencies

2021-12-05 Thread GitBox
AmplabJenkins removed a comment on pull request #34813: URL: https://github.com/apache/spark/pull/34813#issuecomment-986434291 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50412/

[GitHub] [spark] AmplabJenkins commented on pull request #34813: [SPARK-37554][BUILD] Add PyArrow, pandas and plotly to release Docker image dependencies

2021-12-05 Thread GitBox
AmplabJenkins commented on pull request #34813: URL: https://github.com/apache/spark/pull/34813#issuecomment-986434291 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50412/ -- T

[GitHub] [spark] AmplabJenkins commented on pull request #34760: [SPARK-37506][CORE][SQL][DSTREAM][GRAPHX][ML][MLLIB][SS][EXAMPLES] Change the never changed 'var' to 'val'

2021-12-05 Thread GitBox
AmplabJenkins commented on pull request #34760: URL: https://github.com/apache/spark/pull/34760#issuecomment-986434293 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50411/ -- T

[GitHub] [spark] SparkQA commented on pull request #34809: [SPARK-37550][SQL][DOCS] Add an example of parsing jsonStr with complex types for from_json

2021-12-05 Thread GitBox
SparkQA commented on pull request #34809: URL: https://github.com/apache/spark/pull/34809#issuecomment-986428570 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50413/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #34813: [SPARK-37554][BUILD] Add PyArrow, pandas and plotly to release Docker image dependencies

2021-12-05 Thread GitBox
SparkQA commented on pull request #34813: URL: https://github.com/apache/spark/pull/34813#issuecomment-986423493 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50412/ -- This is an automated message from the A

[GitHub] [spark] SparkQA commented on pull request #34760: [SPARK-37506][CORE][SQL][DSTREAM][GRAPHX][ML][MLLIB][SS][EXAMPLES] Change the never changed 'var' to 'val'

2021-12-05 Thread GitBox
SparkQA commented on pull request #34760: URL: https://github.com/apache/spark/pull/34760#issuecomment-986420287 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50411/ -- This is an automated message from the A

[GitHub] [spark] SparkQA commented on pull request #34809: [SPARK-37550][SQL][DOCS] Add an example of parsing jsonStr with complex types for from_json

2021-12-05 Thread GitBox
SparkQA commented on pull request #34809: URL: https://github.com/apache/spark/pull/34809#issuecomment-986410815 **[Test build #145937 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145937/testReport)** for PR 34809 at commit [`27a355e`](https://github.com

[GitHub] [spark] yaooqinn commented on a change in pull request #34794: [SPARK-37532][CORE] Limit the length of RDD name

2021-12-05 Thread GitBox
yaooqinn commented on a change in pull request #34794: URL: https://github.com/apache/spark/pull/34794#discussion_r762684556 ## File path: core/src/main/scala/org/apache/spark/internal/config/package.scala ## @@ -2267,4 +2267,13 @@ package object config { .version("3.3.0

[GitHub] [spark] SparkQA commented on pull request #34813: [SPARK-37554][BUILD] Add PyArrow, pandas and plotly to release Docker image dependencies

2021-12-05 Thread GitBox
SparkQA commented on pull request #34813: URL: https://github.com/apache/spark/pull/34813#issuecomment-986408489 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50412/ -- This is an automated message from the Apache

[GitHub] [spark] oeuf commented on a change in pull request #34812: [WIP][PARK-37553][PYTHON] Fix underscore (`_`) bug in pyspark.pandas.frames.DataFrame.pivot_table

2021-12-05 Thread GitBox
oeuf commented on a change in pull request #34812: URL: https://github.com/apache/spark/pull/34812#discussion_r762681880 ## File path: python/pyspark/pandas/frame.py ## @@ -6054,17 +6056,21 @@ def pivot_table( # E.g. if column is b and values is ['b','e'],

[GitHub] [spark] byyue commented on a change in pull request #34809: [SPARK-37550][SQL][DOCS] Add an example of parsing jsonStr with complex types for from_json

2021-12-05 Thread GitBox
byyue commented on a change in pull request #34809: URL: https://github.com/apache/spark/pull/34809#discussion_r762680503 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ## @@ -522,6 +522,8 @@ case class JsonTuple(child

[GitHub] [spark] srowen commented on a change in pull request #34794: [SPARK-37532][CORE] Limit the length of RDD name

2021-12-05 Thread GitBox
srowen commented on a change in pull request #34794: URL: https://github.com/apache/spark/pull/34794#discussion_r762680264 ## File path: core/src/main/scala/org/apache/spark/internal/config/package.scala ## @@ -2267,4 +2267,13 @@ package object config { .version("3.3.0")

[GitHub] [spark] SparkQA commented on pull request #34760: [SPARK-37506][CORE][SQL][DSTREAM][GRAPHX][ML][MLLIB][SS][EXAMPLES] Change the never changed 'var' to 'val'

2021-12-05 Thread GitBox
SparkQA commented on pull request #34760: URL: https://github.com/apache/spark/pull/34760#issuecomment-986404569 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50411/ -- This is an automated message from the Apache

[GitHub] [spark] yaooqinn commented on a change in pull request #34794: [SPARK-37532][CORE] Limit the length of RDD name

2021-12-05 Thread GitBox
yaooqinn commented on a change in pull request #34794: URL: https://github.com/apache/spark/pull/34794#discussion_r762676403 ## File path: core/src/main/scala/org/apache/spark/internal/config/package.scala ## @@ -2267,4 +2267,13 @@ package object config { .version("3.3.0

[GitHub] [spark] HyukjinKwon commented on pull request #34813: [SPARK-37554][BUILD] Add PyArrow, pandas and plotly to release Docker image dependencies

2021-12-05 Thread GitBox
HyukjinKwon commented on pull request #34813: URL: https://github.com/apache/spark/pull/34813#issuecomment-986395961 Oh, my mistake. He actually already did https://github.com/bartdag/py4j/tree/0.10.9.3. I will upgrade it in PySpark soon by EOD today. -- This is an automated message fro

[GitHub] [spark] HyukjinKwon commented on pull request #34813: [SPARK-37554][BUILD] Add PyArrow, pandas and plotly to release Docker image dependencies

2021-12-05 Thread GitBox
HyukjinKwon commented on pull request #34813: URL: https://github.com/apache/spark/pull/34813#issuecomment-986394758 BTW, I am trying to resolve SPARK-37004 (3.2.1 blocker) as soon as possible. Once Py4J releases, we will resolve this by upgrading Py4J version soon. I asked the author to m

[GitHub] [spark] HyukjinKwon closed pull request #34813: [SPARK-37554][BUILD] Add PyArrow, pandas and plotly to release Docker image dependencies

2021-12-05 Thread GitBox
HyukjinKwon closed pull request #34813: URL: https://github.com/apache/spark/pull/34813 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-un

[GitHub] [spark] dongjoon-hyun commented on pull request #34813: [SPARK-37554][BUILD] Add PyArrow, pandas and plotly to release Docker image dependencies

2021-12-05 Thread GitBox
dongjoon-hyun commented on pull request #34813: URL: https://github.com/apache/spark/pull/34813#issuecomment-986392811 Sure, no problem. BTW, cc @huaxingao since she is the next release manager. -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] HyukjinKwon commented on pull request #34813: [SPARK-37554][BUILD] Add PyArrow, pandas and plotly to release Docker image dependencies

2021-12-05 Thread GitBox
HyukjinKwon commented on pull request #34813: URL: https://github.com/apache/spark/pull/34813#issuecomment-986392630 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

[GitHub] [spark] HyukjinKwon commented on pull request #34813: [SPARK-37554][BUILD] Add PyArrow, pandas and plotly to release Docker image dependencies

2021-12-05 Thread GitBox
HyukjinKwon commented on pull request #34813: URL: https://github.com/apache/spark/pull/34813#issuecomment-986392530 Thanks @dongjoon-hyun. I am merging this in as CI won't verify this change in any event. -- This is an automated message from the Apache Git Service. To respond to the mes

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34809: [SPARK-37550][SQL][DOCS] Add an example of parsing jsonStr with complex types for from_json

2021-12-05 Thread GitBox
AmplabJenkins removed a comment on pull request #34809: URL: https://github.com/apache/spark/pull/34809#issuecomment-986391374 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145934/ -

[GitHub] [spark] SparkQA commented on pull request #34813: [SPARK-37554][BUILD] Add PyArrow, pandas and plotly to release Docker image dependencies

2021-12-05 Thread GitBox
SparkQA commented on pull request #34813: URL: https://github.com/apache/spark/pull/34813#issuecomment-986391829 **[Test build #145936 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145936/testReport)** for PR 34813 at commit [`92ed81c`](https://github.com

[GitHub] [spark] AmplabJenkins commented on pull request #34809: [SPARK-37550][SQL][DOCS] Add an example of parsing jsonStr with complex types for from_json

2021-12-05 Thread GitBox
AmplabJenkins commented on pull request #34809: URL: https://github.com/apache/spark/pull/34809#issuecomment-986391374 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145934/ -- This

[GitHub] [spark] SparkQA removed a comment on pull request #34809: [SPARK-37550][SQL][DOCS] Add an example of parsing jsonStr with complex types for from_json

2021-12-05 Thread GitBox
SparkQA removed a comment on pull request #34809: URL: https://github.com/apache/spark/pull/34809#issuecomment-986340478 **[Test build #145934 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145934/testReport)** for PR 34809 at commit [`4a314cd`](https://gi

[GitHub] [spark] SparkQA commented on pull request #34809: [SPARK-37550][SQL][DOCS] Add an example of parsing jsonStr with complex types for from_json

2021-12-05 Thread GitBox
SparkQA commented on pull request #34809: URL: https://github.com/apache/spark/pull/34809#issuecomment-986391242 **[Test build #145934 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145934/testReport)** for PR 34809 at commit [`4a314cd`](https://github.co

[GitHub] [spark] HyukjinKwon opened a new pull request #34813: [SPARK-37554][BUILD] Add PyArrow, pandas and plotly to release Docker image dependencies

2021-12-05 Thread GitBox
HyukjinKwon opened a new pull request #34813: URL: https://github.com/apache/spark/pull/34813 ### What changes were proposed in this pull request? This PR proposes to add plotly, pyarrow and pandas dependencies for generating the API documentation for pandas API on Spark. The

[GitHub] [spark] HyukjinKwon commented on pull request #34813: [SPARK-37554][BUILD] Add PyArrow, pandas and plotly to release Docker image dependencies

2021-12-05 Thread GitBox
HyukjinKwon commented on pull request #34813: URL: https://github.com/apache/spark/pull/34813#issuecomment-986390469 cc @gengliangwang and @dongjoon-hyun FYI -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL abo

[GitHub] [spark] SparkQA commented on pull request #34760: [SPARK-37506][CORE][SQL][DSTREAM][GRAPHX][ML][MLLIB][SS][EXAMPLES] Change the never changed 'var' to 'val'

2021-12-05 Thread GitBox
SparkQA commented on pull request #34760: URL: https://github.com/apache/spark/pull/34760#issuecomment-986388261 **[Test build #145935 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145935/testReport)** for PR 34760 at commit [`7f60b2e`](https://github.com

[GitHub] [spark] LuciferYang commented on a change in pull request #34760: [SPARK-37506][CORE][SQL][DSTREAM][GRAPHX][ML][MLLIB][SS][EXAMPLES] Change the never changed 'var' to 'val'

2021-12-05 Thread GitBox
LuciferYang commented on a change in pull request #34760: URL: https://github.com/apache/spark/pull/34760#discussion_r762669659 ## File path: core/src/main/scala/org/apache/spark/status/LiveEntity.scala ## @@ -908,7 +908,7 @@ private[spark] class LiveMiscellaneousProcess(val p

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34809: [SPARK-37550][SQL][DOCS] Add an example of parsing jsonStr with complex types for from_json

2021-12-05 Thread GitBox
AmplabJenkins removed a comment on pull request #34809: URL: https://github.com/apache/spark/pull/34809#issuecomment-986387198 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50410/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34812: [WIP][PARK-37553][PYTHON] Fix underscore (`_`) bug in pyspark.pandas.frames.DataFrame.pivot_table

2021-12-05 Thread GitBox
AmplabJenkins removed a comment on pull request #34812: URL: https://github.com/apache/spark/pull/34812#issuecomment-986387197 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50409/

[GitHub] [spark] AmplabJenkins commented on pull request #34812: [WIP][PARK-37553][PYTHON] Fix underscore (`_`) bug in pyspark.pandas.frames.DataFrame.pivot_table

2021-12-05 Thread GitBox
AmplabJenkins commented on pull request #34812: URL: https://github.com/apache/spark/pull/34812#issuecomment-986387197 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50409/ -- T

[GitHub] [spark] AmplabJenkins commented on pull request #34809: [SPARK-37550][SQL][DOCS] Add an example of parsing jsonStr with complex types for from_json

2021-12-05 Thread GitBox
AmplabJenkins commented on pull request #34809: URL: https://github.com/apache/spark/pull/34809#issuecomment-986387198 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50410/ -- T
