[GitHub] [spark] HyukjinKwon commented on a diff in pull request #41711: [SPARK-44155][SQL] Adding a dev utility to improve error messages based on LLM

2023-06-23 Thread via GitHub
HyukjinKwon commented on code in PR #41711: URL: https://github.com/apache/spark/pull/41711#discussion_r1240603584 ## dev/error_message_refiner.py: ## @@ -0,0 +1,235 @@ +#!/usr/bin/env python3 + +# +# Licensed to the Apache Software Foundation (ASF) under one or more +#

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #41711: [SPARK-44155][SQL] Adding a dev utility to improve error messages based on LLM

2023-06-23 Thread via GitHub
HyukjinKwon commented on code in PR #41711: URL: https://github.com/apache/spark/pull/41711#discussion_r1240603499 ## dev/api_key.txt: ## @@ -0,0 +1 @@ +# Please REMOVE this comment and enter the API key here. You can obtain an API key from

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #41711: [SPARK-44155][SQL] Adding a dev utility to improve error messages based on LLM

2023-06-23 Thread via GitHub
HyukjinKwon commented on code in PR #41711: URL: https://github.com/apache/spark/pull/41711#discussion_r1240603443 ## dev/error_message_refiner.py: ## @@ -0,0 +1,235 @@ +#!/usr/bin/env python3 + +# +# Licensed to the Apache Software Foundation (ASF) under one or more +#

[GitHub] [spark] bersprockets commented on a diff in pull request #41712: [SPARK-44132][SQL] join using Stream of column name fails codegen

2023-06-23 Thread via GitHub
bersprockets commented on code in PR #41712: URL: https://github.com/apache/spark/pull/41712#discussion_r1240569151 ## sql/core/src/test/scala/org/apache/spark/sql/JoinSuite.scala: ## @@ -1685,4 +1685,24 @@ class JoinSuite extends QueryTest with SharedSparkSession with

[GitHub] [spark] itholic commented on a diff in pull request #41711: [SPARK-44155][SQL] Adding a dev utility to improve error messages based on LLM

2023-06-23 Thread via GitHub
itholic commented on code in PR #41711: URL: https://github.com/apache/spark/pull/41711#discussion_r1240578537 ## dev/error_message_refiner.py: ## @@ -0,0 +1,219 @@ +#!/usr/bin/env python3 + +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor

[GitHub] [spark] szehon-ho commented on a diff in pull request #41614: [SPARK-44060][SQL] Code-gen for build side outer shuffled hash join

2023-06-23 Thread via GitHub
szehon-ho commented on code in PR #41614: URL: https://github.com/apache/spark/pull/41614#discussion_r1240571397 ## sql/core/src/main/scala/org/apache/spark/sql/execution/joins/ShuffledHashJoinExec.scala: ## @@ -578,7 +594,9 @@ case class ShuffledHashJoinExec( |

[GitHub] [spark] dongjoon-hyun commented on pull request #36374: [SPARK-39006][K8S] Show a directional error message for executor PVC dynamic allocation failure

2023-06-23 Thread via GitHub
dongjoon-hyun commented on PR #36374: URL: https://github.com/apache/spark/pull/36374#issuecomment-1605232031 Thank you for confirming. Apache Spark 3.4.1 is released officially. - https://lists.apache.org/list.html?d...@spark.apache.org - https://spark.apache.org/downloads.html

[GitHub] [spark] szehon-ho commented on a diff in pull request #41614: [SPARK-44060][SQL] Code-gen for build side outer shuffled hash join

2023-06-23 Thread via GitHub
szehon-ho commented on code in PR #41614: URL: https://github.com/apache/spark/pull/41614#discussion_r1240569908 ## sql/core/src/main/scala/org/apache/spark/sql/execution/joins/ShuffledHashJoinExec.scala: ## @@ -578,7 +594,9 @@ case class ShuffledHashJoinExec( |

[GitHub] [spark] bersprockets commented on pull request #41712: [SPARK-44132][SQL] join using Stream of column name fails codegen

2023-06-23 Thread via GitHub
bersprockets commented on PR #41712: URL: https://github.com/apache/spark/pull/41712#issuecomment-1605228576 For the title, may I suggest: ``` [SPARK-44132][SQL] Materialize `Stream` of join column names to avoid codegen failure ``` For the description, may I suggest: ###

[GitHub] [spark] szehon-ho commented on a diff in pull request #41614: [SPARK-44060][SQL] Code-gen for build side outer shuffled hash join

2023-06-23 Thread via GitHub
szehon-ho commented on code in PR #41614: URL: https://github.com/apache/spark/pull/41614#discussion_r1240567843 ## sql/core/src/main/scala/org/apache/spark/sql/execution/joins/ShuffledHashJoinExec.scala: ## @@ -364,7 +366,13 @@ case class ShuffledHashJoinExec( override def

[GitHub] [spark] amaliujia commented on pull request #41716: [SPARK-44164][SQL] Extract toAttribute method from StructField to Util class

2023-06-23 Thread via GitHub
amaliujia commented on PR #41716: URL: https://github.com/apache/spark/pull/41716#issuecomment-1605197640 @hvanhovell @cloud-fan -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] amaliujia opened a new pull request, #41716: [SPARK-44164][SQL] Extract toAttribute method from StructField to Util class

2023-06-23 Thread via GitHub
amaliujia opened a new pull request, #41716: URL: https://github.com/apache/spark/pull/41716 ### What changes were proposed in this pull request? Extract toAttribute method from StructField to Util class. ### Why are the changes needed? StructField should be

[GitHub] [spark] github-actions[bot] closed pull request #38202: [SPARK-40763][K8S] Should expose driver service name to config for user features

2023-06-23 Thread via GitHub
github-actions[bot] closed pull request #38202: [SPARK-40763][K8S] Should expose driver service name to config for user features URL: https://github.com/apache/spark/pull/38202

[GitHub] [spark] github-actions[bot] commented on pull request #38885: [WIP][SPARK-41367][SQL] Enable V2 file tables in read paths in session catalog

2023-06-23 Thread via GitHub
github-actions[bot] commented on PR #38885: URL: https://github.com/apache/spark/pull/38885#issuecomment-1605183251 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] closed pull request #39985: [SPARK-42412][WIP] Initial prototype implementation of PySpark ML via Spark connect

2023-06-23 Thread via GitHub
github-actions[bot] closed pull request #39985: [SPARK-42412][WIP] Initial prototype implementation of PySpark ML via Spark connect URL: https://github.com/apache/spark/pull/39985

[GitHub] [spark] github-actions[bot] closed pull request #40391: [SPARK-42766][YARN] YarnAllocator filter excluded nodes when launching containers

2023-06-23 Thread via GitHub
github-actions[bot] closed pull request #40391: [SPARK-42766][YARN] YarnAllocator filter excluded nodes when launching containers URL: https://github.com/apache/spark/pull/40391

[GitHub] [spark] github-actions[bot] closed pull request #40122: [SPARK-42349][PYTHON] Support pandas cogroup with multiple df

2023-06-23 Thread via GitHub
github-actions[bot] closed pull request #40122: [SPARK-42349][PYTHON] Support pandas cogroup with multiple df URL: https://github.com/apache/spark/pull/40122

[GitHub] [spark] github-actions[bot] closed pull request #40411: [SPARK-42781][DOCS][PYTHON] provide one format for writing to kafka

2023-06-23 Thread via GitHub
github-actions[bot] closed pull request #40411: [SPARK-42781][DOCS][PYTHON] provide one format for writing to kafka URL: https://github.com/apache/spark/pull/40411

[GitHub] [spark] github-actions[bot] closed pull request #40419: [SPARK-42789][SQL] Rewrite multiple GetJsonObjects to a JsonTuple if their json expressions are the same

2023-06-23 Thread via GitHub
github-actions[bot] closed pull request #40419: [SPARK-42789][SQL] Rewrite multiple GetJsonObjects to a JsonTuple if their json expressions are the same URL: https://github.com/apache/spark/pull/40419

[GitHub] [spark] dongjoon-hyun closed pull request #41715: [SPARK-44163][PYTHON] Handle `ModuleNotFoundError` in addition to `ImportError`

2023-06-23 Thread via GitHub
dongjoon-hyun closed pull request #41715: [SPARK-44163][PYTHON] Handle `ModuleNotFoundError` in addition to `ImportError` URL: https://github.com/apache/spark/pull/41715

[GitHub] [spark] dtenedor commented on pull request #41486: [SPARK-43986][SQL] Create error classes for HyperLogLog function call failures

2023-06-23 Thread via GitHub
dtenedor commented on PR #41486: URL: https://github.com/apache/spark/pull/41486#issuecomment-1605060822 Hi @MaxGekk, can we trouble you to help us merge this when you have a moment, please? :)

[GitHub] [spark] dongjoon-hyun opened a new pull request, #41715: [SPARK-44163][PYTHON] Handle `ModuleNotFoundError` in addition to `ImportError`

2023-06-23 Thread via GitHub
dongjoon-hyun opened a new pull request, #41715: URL: https://github.com/apache/spark/pull/41715 … ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

[GitHub] [spark] dtenedor commented on a diff in pull request #41486: [SPARK-43986][SQL] Create error classes for HyperLogLog function call failures

2023-06-23 Thread via GitHub
dtenedor commented on code in PR #41486: URL: https://github.com/apache/spark/pull/41486#discussion_r1240463584 ## core/src/main/resources/error/error-classes.json: ## @@ -757,6 +757,21 @@ "The expression cannot be used as a grouping expression because its data type

[GitHub] [spark] dtenedor commented on a diff in pull request #41486: [SPARK-43986][SQL] Create error classes for HyperLogLog function call failures

2023-06-23 Thread via GitHub
dtenedor commented on code in PR #41486: URL: https://github.com/apache/spark/pull/41486#discussion_r1240463321 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datasketchesExpressions.scala: ## @@ -98,15 +104,22 @@ case class HllUnion(first: Expression,

[GitHub] [spark] dtenedor commented on a diff in pull request #41486: [SPARK-43986][SQL] Create error classes for HyperLogLog function call failures

2023-06-23 Thread via GitHub
dtenedor commented on code in PR #41486: URL: https://github.com/apache/spark/pull/41486#discussion_r1240463194 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/datasketchesAggregates.scala: ## @@ -189,21 +189,20 @@ object HllSketchAgg {

[GitHub] [spark] mkaravel commented on a diff in pull request #41486: [SPARK-43986][SQL] Create error classes for HyperLogLog function call failures

2023-06-23 Thread via GitHub
mkaravel commented on code in PR #41486: URL: https://github.com/apache/spark/pull/41486#discussion_r1240413693 ## core/src/main/resources/error/error-classes.json: ## @@ -757,6 +757,21 @@ "The expression cannot be used as a grouping expression because its data type

[GitHub] [spark] szehon-ho commented on pull request #41683: [SPARK-36680][SQL] Supports Dynamic Table Options for Spark SQL

2023-06-23 Thread via GitHub
szehon-ho commented on PR #41683: URL: https://github.com/apache/spark/pull/41683#issuecomment-1604982731 Hm, the build failure doesn't seem related: ``` python/pyspark/mllib/clustering.py:781: error: Decorated property not supported [misc] python/pyspark/mllib/clustering.py:949:

[GitHub] [spark] amaliujia commented on pull request #41714: [SPARK-44160][SQL][CONNECT] Extract shared code from StructType

2023-06-23 Thread via GitHub
amaliujia commented on PR #41714: URL: https://github.com/apache/spark/pull/41714#issuecomment-1604972107 cc @hvanhovell @cloud-fan

[GitHub] [spark] siying commented on pull request #41578: [SPARK-44044][SS] Improve Error message for Window functions with streaming

2023-06-23 Thread via GitHub
siying commented on PR #41578: URL: https://github.com/apache/spark/pull/41578#issuecomment-1604966953 @MaxGekk I addressed the comments. The CI failure is on python link. I don't know why it happens but I don't see how it is related.

[GitHub] [spark] amaliujia opened a new pull request, #41714: [SPARK-44160][SQL][CONNECT] Extract shared code from StructType

2023-06-23 Thread via GitHub
amaliujia opened a new pull request, #41714: URL: https://github.com/apache/spark/pull/41714 ### What changes were proposed in this pull request? The StructType has some methods that require CatalystParser and Catalyst expression. We are not planning to move the parser and

[GitHub] [spark] dongjoon-hyun closed pull request #41713: [SPARK-44158][K8S] Remove unused `spark.kubernetes.executor.lostCheckmaxAttempts`

2023-06-23 Thread via GitHub
dongjoon-hyun closed pull request #41713: [SPARK-44158][K8S] Remove unused `spark.kubernetes.executor.lostCheckmaxAttempts` URL: https://github.com/apache/spark/pull/41713

[GitHub] [spark] dongjoon-hyun commented on pull request #41713: [SPARK-44158][K8S] Remove unused `spark.kubernetes.executor.lostCheckmaxAttempts`

2023-06-23 Thread via GitHub
dongjoon-hyun commented on PR #41713: URL: https://github.com/apache/spark/pull/41713#issuecomment-1604941186 Thank you so much! Merged to master/3.4/3.3.

[GitHub] [spark] vicennial commented on pull request #41701: [CONNECT][SPARK-44146] Isolate Spark Connect Session jars and classfiles

2023-06-23 Thread via GitHub
vicennial commented on PR #41701: URL: https://github.com/apache/spark/pull/41701#issuecomment-1604932773 The PR is ready to review (the failing tests are flaky; I've submitted a rerun request).

[GitHub] [spark] dongjoon-hyun commented on pull request #41713: [SPARK-44158][K8S] Remove unused `spark.kubernetes.executor.lostCheckmaxAttempts`

2023-06-23 Thread via GitHub
dongjoon-hyun commented on PR #41713: URL: https://github.com/apache/spark/pull/41713#issuecomment-1604904486 Could you review this PR when you have some time, @viirya ?

[GitHub] [spark] jdesjean commented on pull request #41443: [SPARK-43923][CONNECT] Post listenerBus events during ExecutePlanRequest

2023-06-23 Thread via GitHub
jdesjean commented on PR #41443: URL: https://github.com/apache/spark/pull/41443#issuecomment-1604796450 > > @beliefer the event API is open by design, it is made for others to use it. For example thriftserver uses it. This particular PR will be foundation on which we will be building a UI

[GitHub] [spark] dongjoon-hyun opened a new pull request, #41713: [SPARK-44158][K8S] Remove unused `spark.kubernetes.executor.lostCheckmaxAttempts`

2023-06-23 Thread via GitHub
dongjoon-hyun opened a new pull request, #41713: URL: https://github.com/apache/spark/pull/41713 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was

[GitHub] [spark] amaliujia commented on a diff in pull request #41559: [SPARK-44030][SQL] Implement DataTypeExpression to offer Unapply for expression

2023-06-23 Thread via GitHub
amaliujia commented on code in PR #41559: URL: https://github.com/apache/spark/pull/41559#discussion_r1240066813 ## sql/catalyst/src/main/scala/org/apache/spark/sql/types/DataTypeExpression.scala: ## @@ -0,0 +1,99 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] [spark] dongjoon-hyun commented on pull request #41707: [SPARK-44151][BUILD] Upgrade `commons-codec` to 1.16.0

2023-06-23 Thread via GitHub
dongjoon-hyun commented on PR #41707: URL: https://github.com/apache/spark/pull/41707#issuecomment-1604550160 Merged to master for Apache Spark 3.5.0. Thank you, @panbingkun and @HyukjinKwon .

[GitHub] [spark] dongjoon-hyun closed pull request #41707: [SPARK-44151][BUILD] Upgrade `commons-codec` to 1.16.0

2023-06-23 Thread via GitHub
dongjoon-hyun closed pull request #41707: [SPARK-44151][BUILD] Upgrade `commons-codec` to 1.16.0 URL: https://github.com/apache/spark/pull/41707

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #41709: [SPARK-44153][CORE][UI] Support `Heap Histogram` column in `Executors` tab

2023-06-23 Thread via GitHub
dongjoon-hyun commented on code in PR #41709: URL: https://github.com/apache/spark/pull/41709#discussion_r1240040307 ## core/src/main/scala/org/apache/spark/util/Utils.scala: ## @@ -2287,6 +2287,22 @@ private[spark] object Utils extends Logging with SparkClassUtils {

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #41700: [SPARK-44139][SQL] Discard completely pushed down filters in group-based MERGE operations

2023-06-23 Thread via GitHub
dongjoon-hyun commented on code in PR #41700: URL: https://github.com/apache/spark/pull/41700#discussion_r1240037361 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/GroupBasedRowLevelOperationScanPlanning.scala: ## @@ -56,29 +62,67 @@ object

[GitHub] [spark] aokolnychyi commented on a diff in pull request #41700: [SPARK-44139][SQL] Discard completely pushed down filters in group-based MERGE operations

2023-06-23 Thread via GitHub
aokolnychyi commented on code in PR #41700: URL: https://github.com/apache/spark/pull/41700#discussion_r1240020952 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/GroupBasedRowLevelOperationScanPlanning.scala: ## @@ -56,29 +62,67 @@ object

[GitHub] [spark] aokolnychyi commented on pull request #41700: [SPARK-44139][SQL] Discard completely pushed down filters in group-based MERGE operations

2023-06-23 Thread via GitHub
aokolnychyi commented on PR #41700: URL: https://github.com/apache/spark/pull/41700#issuecomment-1604524387 Thank you, @dongjoon-hyun!

[GitHub] [spark] gatorsmile commented on a diff in pull request #41711: [SPARK-44155][SQL] Adding a dev utility to improve error messages based on LLM

2023-06-23 Thread via GitHub
gatorsmile commented on code in PR #41711: URL: https://github.com/apache/spark/pull/41711#discussion_r1239940518 ## dev/error_message_refiner.py: ## @@ -0,0 +1,219 @@ +#!/usr/bin/env python3 + +# +# Licensed to the Apache Software Foundation (ASF) under one or more +#

[GitHub] [spark] steven-aerts commented on pull request #41688: [WIP][SPARK-44132][SQL][TESTS] nesting full outer join confuses codegen

2023-06-23 Thread via GitHub
steven-aerts commented on PR #41688: URL: https://github.com/apache/spark/pull/41688#issuecomment-1604244490 Eclipsed by #41712.

[GitHub] [spark] steven-aerts closed pull request #41688: [WIP][SPARK-44132][SQL][TESTS] nesting full outer join confuses codegen

2023-06-23 Thread via GitHub
steven-aerts closed pull request #41688: [WIP][SPARK-44132][SQL][TESTS] nesting full outer join confuses codegen URL: https://github.com/apache/spark/pull/41688

[GitHub] [spark] steven-aerts opened a new pull request, #41712: [SPARK-44132][SQL] join using Stream of column name fails codegen

2023-06-23 Thread via GitHub
steven-aerts opened a new pull request, #41712: URL: https://github.com/apache/spark/pull/41712 When nesting multiple joins using column names that are a `Stream`, the lazily evaluated stream can sometimes generate faulty codegen data, resulting in an NPE or bad access of the data.

[GitHub] [spark] 0xdarkman commented on pull request #36374: [SPARK-39006][K8S] Show a directional error message for executor PVC dynamic allocation failure

2023-06-23 Thread via GitHub
0xdarkman commented on PR #36374: URL: https://github.com/apache/spark/pull/36374#issuecomment-1604226893 I am running this version 3.4.1 and it looks good.

[GitHub] [spark] itholic commented on pull request #41711: [SPARK-44155][SQL] Adding a dev utility to improve error messages based on LLM

2023-06-23 Thread via GitHub
itholic commented on PR #41711: URL: https://github.com/apache/spark/pull/41711#issuecomment-1604003249 BTW, I set the category as `SQL` because it works based on the SQL error class, but I'm not sure if this is correct.

[GitHub] [spark] itholic commented on pull request #41711: [SPARK-44155][SQL] Adding a dev utility to improve error messages based on LLM

2023-06-23 Thread via GitHub
itholic commented on PR #41711: URL: https://github.com/apache/spark/pull/41711#issuecomment-1603986988 cc @gengliangwang @allisonwang-db @MaxGekk @cloud-fan could you take a look at this PR when you find some time?

[GitHub] [spark] itholic opened a new pull request, #41711: [SPARK-44155][SQL] Adding a dev utility to improve error messages based on LLM

2023-06-23 Thread via GitHub
itholic opened a new pull request, #41711: URL: https://github.com/apache/spark/pull/41711 ### What changes were proposed in this pull request? This PR proposes adding a utility script that can help increase productivity in error message improvement tasks. When passing the

[GitHub] [spark] pan3793 commented on a diff in pull request #41709: [SPARK-44153][CORE][UI] Support `Heap Histogram` column in `Executors` tab

2023-06-23 Thread via GitHub
pan3793 commented on code in PR #41709: URL: https://github.com/apache/spark/pull/41709#discussion_r1239566530 ## core/src/main/scala/org/apache/spark/util/Utils.scala: ## @@ -2287,6 +2287,22 @@ private[spark] object Utils extends Logging with SparkClassUtils {

[GitHub] [spark] zzzzming95 commented on a diff in pull request #41154: [SPARK-43327][CORE][3.3] Trigger `committer.setupJob` before plan execute in `FileFormatWriter#write`

2023-06-23 Thread via GitHub
zzzzming95 commented on code in PR #41154: URL: https://github.com/apache/spark/pull/41154#discussion_r1239560157 ## sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala: ## @@ -1275,4 +1275,27 @@ class DataFrameReaderWriterSuite extends QueryTest

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #41709: [SPARK-44153][CORE][UI] Support `Heap Histogram` column in `Executors` tab

2023-06-23 Thread via GitHub
dongjoon-hyun commented on code in PR #41709: URL: https://github.com/apache/spark/pull/41709#discussion_r1239558578 ## core/src/main/scala/org/apache/spark/util/Utils.scala: ## @@ -2287,6 +2287,22 @@ private[spark] object Utils extends Logging with SparkClassUtils {

[GitHub] [spark] peter-toth commented on pull request #41677: [WIP][SPARK-35564][SQL] Improve subexpression elimination

2023-06-23 Thread via GitHub
peter-toth commented on PR #41677: URL: https://github.com/apache/spark/pull/41677#issuecomment-1603934838 The failure in `[Run / Linters, licenses, dependencies and documentation generation]` seems unrelated.

[GitHub] [spark] pan3793 commented on a diff in pull request #41709: [SPARK-44153][CORE][UI] Support `Heap Histogram` column in `Executors` tab

2023-06-23 Thread via GitHub
pan3793 commented on code in PR #41709: URL: https://github.com/apache/spark/pull/41709#discussion_r1239521669 ## core/src/main/scala/org/apache/spark/util/Utils.scala: ## @@ -2287,6 +2287,22 @@ private[spark] object Utils extends Logging with SparkClassUtils {

[GitHub] [spark] dongjoon-hyun commented on pull request #41709: [SPARK-44153][CORE][UI] Support `Heap Histogram` column in `Executors` tab

2023-06-23 Thread via GitHub
dongjoon-hyun commented on PR #41709: URL: https://github.com/apache/spark/pull/41709#issuecomment-1603874002 Merged to master for Apache Spark 3.5.0. Thank you, @viirya and @pan3793 .

[GitHub] [spark] dongjoon-hyun closed pull request #41709: [SPARK-44153][CORE][UI] Support `Heap Histogram` column in `Executors` tab

2023-06-23 Thread via GitHub
dongjoon-hyun closed pull request #41709: [SPARK-44153][CORE][UI] Support `Heap Histogram` column in `Executors` tab URL: https://github.com/apache/spark/pull/41709

[GitHub] [spark] EnricoMi commented on pull request #40122: [SPARK-42349][PYTHON] Support pandas cogroup with multiple df

2023-06-23 Thread via GitHub
EnricoMi commented on PR #40122: URL: https://github.com/apache/spark/pull/40122#issuecomment-1603851057 @HyukjinKwon @cloud-fan @sunchao @zhengruifeng this is so sad. Can we get some feedback on whether this contribution is welcome or whether any more effort is just wasted?

[GitHub] [spark] amaliujia commented on pull request #41710: [SPARK-43255][SQL] Replace the error class _LEGACY_ERROR_TEMP_2020 by an internal error

2023-06-23 Thread via GitHub
amaliujia commented on PR #41710: URL: https://github.com/apache/spark/pull/41710#issuecomment-1603761304 I did a search and looks like `constructorNotFoundError` may fit into internal errors which throws when a constructor is not found during reflection related operation, which is also at

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #41709: [SPARK-44153][CORE][UI] Support `Heap Histogram` column in `Executors` tab

2023-06-23 Thread via GitHub
dongjoon-hyun commented on code in PR #41709: URL: https://github.com/apache/spark/pull/41709#discussion_r1239373825 ## core/src/main/scala/org/apache/spark/util/Utils.scala: ## @@ -2287,6 +2287,22 @@ private[spark] object Utils extends Logging with SparkClassUtils {

[GitHub] [spark] HyukjinKwon closed pull request #41708: [SPARK-44135][PYTHON][CONNECT][DOCS] Document Spark Connect only API in PySpark

2023-06-23 Thread via GitHub
HyukjinKwon closed pull request #41708: [SPARK-44135][PYTHON][CONNECT][DOCS] Document Spark Connect only API in PySpark URL: https://github.com/apache/spark/pull/41708

[GitHub] [spark] HyukjinKwon commented on pull request #41708: [SPARK-44135][PYTHON][CONNECT][DOCS] Document Spark Connect only API in PySpark

2023-06-23 Thread via GitHub
HyukjinKwon commented on PR #41708: URL: https://github.com/apache/spark/pull/41708#issuecomment-1603750548 Merged to master

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #41709: [SPARK-44153][CORE][UI] Support `Heap Histogram` column in `Executors` tab

2023-06-23 Thread via GitHub
dongjoon-hyun commented on code in PR #41709: URL: https://github.com/apache/spark/pull/41709#discussion_r1239373194 ## core/src/main/scala/org/apache/spark/util/Utils.scala: ## @@ -2287,6 +2287,22 @@ private[spark] object Utils extends Logging with SparkClassUtils {

[GitHub] [spark] pan3793 commented on a diff in pull request #41709: [SPARK-44153][CORE][UI] Support `Heap Histogram` column in `Executors` tab

2023-06-23 Thread via GitHub
pan3793 commented on code in PR #41709: URL: https://github.com/apache/spark/pull/41709#discussion_r1239357126 ## core/src/main/scala/org/apache/spark/util/Utils.scala: ## @@ -2287,6 +2287,22 @@ private[spark] object Utils extends Logging with SparkClassUtils {