[GitHub] [spark] HyukjinKwon closed pull request #42989: [SPARK-45210][DOCS][3.4] Switch languages consistently across docs for all code snippets (Spark 3.4 and below)

2023-09-18 Thread via GitHub
HyukjinKwon closed pull request #42989: [SPARK-45210][DOCS][3.4] Switch languages consistently across docs for all code snippets (Spark 3.4 and below) URL: https://github.com/apache/spark/pull/42989 -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] HyukjinKwon commented on pull request #42989: [SPARK-45210][DOCS][3.4] Switch languages consistently across docs for all code snippets (Spark 3.4 and below)

2023-09-18 Thread via GitHub
HyukjinKwon commented on PR #42989: URL: https://github.com/apache/spark/pull/42989#issuecomment-1724870114 Merged to branch-3.4, branch-3.3, branch-3.2, and branch-3.1.

[GitHub] [spark] panbingkun commented on pull request #42917: [SPARK-45163][SQL] Merge UNSUPPORTED_VIEW_OPERATION & UNSUPPORTED_TABLE_OPERATION & fix some issue

2023-09-18 Thread via GitHub
panbingkun commented on PR #42917: URL: https://github.com/apache/spark/pull/42917#issuecomment-1724869552 > looks much better now, thanks for your patience! Thank you very much for patiently reviewing the code. ❤️

[GitHub] [spark] panbingkun commented on a diff in pull request #42917: [SPARK-45163][SQL] Merge UNSUPPORTED_VIEW_OPERATION & UNSUPPORTED_TABLE_OPERATION & fix some issue

2023-09-18 Thread via GitHub
panbingkun commented on code in PR #42917: URL: https://github.com/apache/spark/pull/42917#discussion_r1329599785 ## common/utils/src/main/resources/error/error-classes.json: ## @@ -860,6 +860,50 @@ "Exceeds char/varchar type length limitation: ." ] }, +

[GitHub] [spark] panbingkun commented on pull request #42990: [SPARK-45212][INFRA] Install independent Python linter dependencies for branch-3.5

2023-09-18 Thread via GitHub
panbingkun commented on PR #42990: URL: https://github.com/apache/spark/pull/42990#issuecomment-1724855910 +1, LGTM.

[GitHub] [spark] rednaxelafx commented on pull request #42988: [WIP][SPARK-45209][CORE][UI] Flame Graph Support For Executor Thread Dump Page

2023-09-18 Thread via GitHub
rednaxelafx commented on PR #42988: URL: https://github.com/apache/spark/pull/42988#issuecomment-1724837849 One important aspect of flame graphs is the semantics of the "width" of the bars. It can be defined to mean anything, e.g. aggregated profiling ticks (i.e. number of samples) or wall

[GitHub] [spark] zhengruifeng commented on pull request #42990: [SPARK-45212][INFRA] Install independent Python linter dependencies for branch-3.5

2023-09-18 Thread via GitHub
zhengruifeng commented on PR #42990: URL: https://github.com/apache/spark/pull/42990#issuecomment-1724834827 LGTM

[GitHub] [spark] rangadi commented on a diff in pull request #42986: [SPARK-44463][SS][CONNECT] Improve error handling for Connect steaming Python worker

2023-09-18 Thread via GitHub
rangadi commented on code in PR #42986: URL: https://github.com/apache/spark/pull/42986#discussion_r1329577300 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/StreamingQueryListenerHelper.scala: ## @@ -76,16 +78,21 @@ class

[GitHub] [spark] bogao007 commented on a diff in pull request #42986: [SPARK-44463][SS][CONNECT] Improve error handling for Connect steaming Python worker

2023-09-18 Thread via GitHub
bogao007 commented on code in PR #42986: URL: https://github.com/apache/spark/pull/42986#discussion_r1329574367 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/StreamingQueryListenerHelper.scala: ## @@ -76,16 +78,21 @@ class

[GitHub] [spark] dongjoon-hyun closed pull request #42914: [SPARK-44910][SQL][3.4] Encoders.bean does not support superclasses with generic type arguments

2023-09-18 Thread via GitHub
dongjoon-hyun closed pull request #42914: [SPARK-44910][SQL][3.4] Encoders.bean does not support superclasses with generic type arguments URL: https://github.com/apache/spark/pull/42914

[GitHub] [spark] dongjoon-hyun commented on pull request #42914: [SPARK-44910][SQL][3.4] Encoders.bean does not support superclasses with generic type arguments

2023-09-18 Thread via GitHub
dongjoon-hyun commented on PR #42914: URL: https://github.com/apache/spark/pull/42914#issuecomment-1724820449 Merged to branch-3.4.

[GitHub] [spark] dongjoon-hyun commented on pull request #42634: [SPARK-44910][SQL] Encoders.bean does not support superclasses with generic type arguments

2023-09-18 Thread via GitHub
dongjoon-hyun commented on PR #42634: URL: https://github.com/apache/spark/pull/42634#issuecomment-1724819818 Merged to master/3.5.

[GitHub] [spark] dongjoon-hyun closed pull request #42634: [SPARK-44910][SQL] Encoders.bean does not support superclasses with generic type arguments

2023-09-18 Thread via GitHub
dongjoon-hyun closed pull request #42634: [SPARK-44910][SQL] Encoders.bean does not support superclasses with generic type arguments URL: https://github.com/apache/spark/pull/42634

[GitHub] [spark] dongjoon-hyun closed pull request #42956: [SPARK-43654][CONNECT][PS][TESTS] Enable `InternalFrameParityTests.test_from_pandas`

2023-09-18 Thread via GitHub
dongjoon-hyun closed pull request #42956: [SPARK-43654][CONNECT][PS][TESTS] Enable `InternalFrameParityTests.test_from_pandas` URL: https://github.com/apache/spark/pull/42956

[GitHub] [spark] yaooqinn commented on pull request #42988: [WIP][SPARK-45209][CORE][UI] Flame Graph Support For Executor Thread Dump Page

2023-09-18 Thread via GitHub
yaooqinn commented on PR #42988: URL: https://github.com/apache/spark/pull/42988#issuecomment-1724810773 This PR mainly focuses on the UI, independent of the profiling steps. What we might have in the future are: - Flame Graph Support For Task Thread Page, which

[GitHub] [spark] LuciferYang opened a new pull request, #42990: [SPARK-45212][INFRA] Install independent Python linter dependencies for branch-3.5

2023-09-18 Thread via GitHub
LuciferYang opened a new pull request, #42990: URL: https://github.com/apache/spark/pull/42990 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###

[GitHub] [spark] mridulm commented on pull request #42988: [WIP][SPARK-45209][CORE][UI] Flame Graph Support For Executor Thread Dump Page

2023-09-18 Thread via GitHub
mridulm commented on PR #42988: URL: https://github.com/apache/spark/pull/42988#issuecomment-1724800071 The UI looks nice ! Thanks for working on this @yaooqinn :-) My main concern is around effectively capturing stack frames without safepoint bias, correlating it to the specific

[GitHub] [spark] sunchao commented on a diff in pull request #42612: [SPARK-44913][SQL] DS V2 supports push down V2 UDF that has magic method

2023-09-18 Thread via GitHub
sunchao commented on code in PR #42612: URL: https://github.com/apache/spark/pull/42612#discussion_r1329548074 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala: ## @@ -279,7 +283,9 @@ case class StaticInvoke( inputTypes:

[GitHub] [spark] allisonwang-db commented on a diff in pull request #42949: [SPARK-45093][CONNECT][PYTHON] Error reporting for addArtifacts query

2023-09-18 Thread via GitHub
allisonwang-db commented on code in PR #42949: URL: https://github.com/apache/spark/pull/42949#discussion_r1329547039 ## python/pyspark/sql/connect/client/artifact.py: ## @@ -243,11 +244,15 @@ def _create_requests( self, *path: str, pyfile: bool, archive: bool, file:

[GitHub] [spark] LuciferYang commented on pull request #42981: [SPARK-45211][CONNECT] Eliminated ambiguous references in `CloseableIterator#apply` to fix Scala 2.13 daily test

2023-09-18 Thread via GitHub
LuciferYang commented on PR #42981: URL: https://github.com/apache/spark/pull/42981#issuecomment-1724793150 Thanks @juliuszsompolski

[GitHub] [spark] ConeyLiu commented on a diff in pull request #42612: [SPARK-44913][SQL] DS V2 supports push down V2 UDF that has magic method

2023-09-18 Thread via GitHub
ConeyLiu commented on code in PR #42612: URL: https://github.com/apache/spark/pull/42612#discussion_r1329541422 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala: ## @@ -279,7 +283,9 @@ case class StaticInvoke( inputTypes:

[GitHub] [spark] ConeyLiu commented on a diff in pull request #42612: [SPARK-44913][SQL] DS V2 supports push down V2 UDF that has magic method

2023-09-18 Thread via GitHub
ConeyLiu commented on code in PR #42612: URL: https://github.com/apache/spark/pull/42612#discussion_r1329539929 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala: ## @@ -279,7 +283,9 @@ case class StaticInvoke( inputTypes:

[GitHub] [spark] zhengruifeng commented on pull request #42988: [WIP][SPARK-45209][CORE][UI] Flame Graph Support For Executor Thread Dump Page

2023-09-18 Thread via GitHub
zhengruifeng commented on PR #42988: URL: https://github.com/apache/spark/pull/42988#issuecomment-1724783234 awesome!

[GitHub] [spark] HyukjinKwon commented on pull request #42988: [WIP][SPARK-45209][CORE][UI] Flame Graph Support For Executor Thread Dump Page

2023-09-18 Thread via GitHub
HyukjinKwon commented on PR #42988: URL: https://github.com/apache/spark/pull/42988#issuecomment-1724783071 Looks cool. cc @mridulm FYI

[GitHub] [spark] LuciferYang commented on pull request #42981: [SPARK-45211][CONNECT] Eliminated ambiguous references in `CloseableIterator#apply` to fix Scala 2.13 daily test

2023-09-18 Thread via GitHub
LuciferYang commented on PR #42981: URL: https://github.com/apache/spark/pull/42981#issuecomment-1724777489 The connect module tests succeed with Scala 2.12 with this PR: https://github.com/LuciferYang/spark/runs/16908220090

[GitHub] [spark] cloud-fan commented on a diff in pull request #42612: [SPARK-44913][SQL] DS V2 supports push down V2 UDF that has magic method

2023-09-18 Thread via GitHub
cloud-fan commented on code in PR #42612: URL: https://github.com/apache/spark/pull/42612#discussion_r1329529759 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala: ## @@ -279,7 +283,9 @@ case class StaticInvoke( inputTypes:

[GitHub] [spark] cloud-fan commented on pull request #42917: [SPARK-45163][SQL] Merge UNSUPPORTED_VIEW_OPERATION & UNSUPPORTED_TABLE_OPERATION & fix some issue

2023-09-18 Thread via GitHub
cloud-fan commented on PR #42917: URL: https://github.com/apache/spark/pull/42917#issuecomment-1724772520 looks much better now, thanks for your patience!

[GitHub] [spark] yaooqinn commented on pull request #42969: [SPARK-45192][UI] Fix overdue lineInterpolate parameter for graphviz edge

2023-09-18 Thread via GitHub
yaooqinn commented on PR #42969: URL: https://github.com/apache/spark/pull/42969#issuecomment-1724771959 cc @sarutak @HyukjinKwon @dongjoon-hyun thanks

[GitHub] [spark] cloud-fan commented on a diff in pull request #42917: [SPARK-45163][SQL] Merge UNSUPPORTED_VIEW_OPERATION & UNSUPPORTED_TABLE_OPERATION & fix some issue

2023-09-18 Thread via GitHub
cloud-fan commented on code in PR #42917: URL: https://github.com/apache/spark/pull/42917#discussion_r1329519381 ## common/utils/src/main/resources/error/error-classes.json: ## @@ -860,6 +860,50 @@ "Exceeds char/varchar type length limitation: ." ] }, +

[GitHub] [spark] HyukjinKwon commented on pull request #42989: [SPARK-45210][DOCS][3.4] Switch languages consistently across docs for all code snippets (Spark 3.4 and below)

2023-09-18 Thread via GitHub
HyukjinKwon commented on PR #42989: URL: https://github.com/apache/spark/pull/42989#issuecomment-1724767346 cc @gengliangwang @panbingkun @sarutak @zhengruifeng FYI

[GitHub] [spark] HyukjinKwon opened a new pull request, #42989: [SPARK-45210][DOCS][3.4] Switch languages consistently across docs for all code snippets (Spark 3.4 and below)

2023-09-18 Thread via GitHub
HyukjinKwon opened a new pull request, #42989: URL: https://github.com/apache/spark/pull/42989 ### What changes were proposed in this pull request? This PR proposes to recover the availability of switching languages consistently across docs for all code snippets in Spark 3.4 and below by

[GitHub] [spark] cloud-fan commented on a diff in pull request #42917: [SPARK-45163][SQL] Merge UNSUPPORTED_VIEW_OPERATION & UNSUPPORTED_TABLE_OPERATION & fix some issue

2023-09-18 Thread via GitHub
cloud-fan commented on code in PR #42917: URL: https://github.com/apache/spark/pull/42917#discussion_r1329515855 ## common/utils/src/main/resources/error/error-classes.json: ## @@ -860,6 +860,50 @@ "Exceeds char/varchar type length limitation: ." ] }, +

[GitHub] [spark] yaooqinn commented on pull request #42982: [SPARK-45202][BUILD] Fix lint-js tool and js format

2023-09-18 Thread via GitHub
yaooqinn commented on PR #42982: URL: https://github.com/apache/spark/pull/42982#issuecomment-1724765818 The job containing lint-js passed. Thanks @sarutak @dongjoon-hyun @HyukjinKwon , merged to master

[GitHub] [spark] yaooqinn closed pull request #42982: [SPARK-45202][BUILD] Fix lint-js tool and js format

2023-09-18 Thread via GitHub
yaooqinn closed pull request #42982: [SPARK-45202][BUILD] Fix lint-js tool and js format URL: https://github.com/apache/spark/pull/42982

[GitHub] [spark] mridulm commented on a diff in pull request #42357: [SPARK-44306][YARN] Group FileStatus with few RPC calls within Yarn Client

2023-09-18 Thread via GitHub
mridulm commented on code in PR #42357: URL: https://github.com/apache/spark/pull/42357#discussion_r1329512411 ## resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala: ## @@ -533,9 +536,12 @@ private[spark] class Client( // If preload is enabled,

[GitHub] [spark] yaooqinn opened a new pull request, #42988: [WIP][SPARK-45209][CORE][UI] Flame Graph Support For Executor Thread Dump Page

2023-09-18 Thread via GitHub
yaooqinn opened a new pull request, #42988: URL: https://github.com/apache/spark/pull/42988 ### What changes were proposed in this pull request? This PR draws a CPU Flame Graph by Java stack traces for executors and drivers. Currently, the Java stack traces is just a

[GitHub] [spark] panbingkun commented on a diff in pull request #42917: [SPARK-45163][SQL] Merge UNSUPPORTED_VIEW_OPERATION & UNSUPPORTED_TABLE_OPERATION & fix some issue

2023-09-18 Thread via GitHub
panbingkun commented on code in PR #42917: URL: https://github.com/apache/spark/pull/42917#discussion_r1329502401 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -1142,7 +1142,7 @@ class Analyzer(override val catalogManager:

[GitHub] [spark] panbingkun commented on a diff in pull request #42917: [SPARK-45163][SQL] Merge UNSUPPORTED_VIEW_OPERATION & UNSUPPORTED_TABLE_OPERATION & fix some issue

2023-09-18 Thread via GitHub
panbingkun commented on code in PR #42917: URL: https://github.com/apache/spark/pull/42917#discussion_r1329502973 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -1142,7 +1142,7 @@ class Analyzer(override val catalogManager:

[GitHub] [spark] ConeyLiu commented on a diff in pull request #42612: [SPARK-44913][SQL] DS V2 supports push down V2 UDF that has magic method

2023-09-18 Thread via GitHub
ConeyLiu commented on code in PR #42612: URL: https://github.com/apache/spark/pull/42612#discussion_r1329489593 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala: ## @@ -279,7 +283,9 @@ case class StaticInvoke( inputTypes:

[GitHub] [spark] yaooqinn commented on a diff in pull request #42982: [SPARK-45202][BUILD] Fix lint-js tool and js format

2023-09-18 Thread via GitHub
yaooqinn commented on code in PR #42982: URL: https://github.com/apache/spark/pull/42982#discussion_r1329460092 ## dev/lint-js: ## @@ -44,8 +44,14 @@ if ! npm ls eslint > /dev/null; then npm ci eslint fi -npx eslint -c "$SPARK_ROOT_DIR/dev/eslint.json" $LINT_TARGET_FILES

[GitHub] [spark] yaooqinn commented on a diff in pull request #42982: [SPARK-45202][BUILD] Fix lint-js tool and js format

2023-09-18 Thread via GitHub
yaooqinn commented on code in PR #42982: URL: https://github.com/apache/spark/pull/42982#discussion_r1329459486 ## dev/lint-js: ## @@ -44,8 +44,14 @@ if ! npm ls eslint > /dev/null; then npm ci eslint fi -npx eslint -c "$SPARK_ROOT_DIR/dev/eslint.json" $LINT_TARGET_FILES

[GitHub] [spark] rangadi commented on a diff in pull request #42986: [SPARK-44463][SS][CONNECT] Improve error handling for Connect steaming Python worker

2023-09-18 Thread via GitHub
rangadi commented on code in PR #42986: URL: https://github.com/apache/spark/pull/42986#discussion_r1329448797 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/StreamingForeachBatchHelper.scala: ## @@ -125,8 +128,21 @@ object

[GitHub] [spark] ulysses-you commented on a diff in pull request #42967: [SPARK-45191][SQL] InMemoryTableScanExec simpleStringWithNodeId adds columnar info

2023-09-18 Thread via GitHub
ulysses-you commented on code in PR #42967: URL: https://github.com/apache/spark/pull/42967#discussion_r1329450429 ## sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala: ## @@ -264,8 +269,7 @@ case class CachedRDDBuilder( } private

[GitHub] [spark] cloud-fan commented on a diff in pull request #42612: [SPARK-44913][SQL] DS V2 supports push down V2 UDF that has magic method

2023-09-18 Thread via GitHub
cloud-fan commented on code in PR #42612: URL: https://github.com/apache/spark/pull/42612#discussion_r1329447311 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala: ## @@ -279,7 +283,9 @@ case class StaticInvoke( inputTypes:

[GitHub] [spark] bogao007 commented on a diff in pull request #42986: [SPARK-44463][SS][CONNECT] Improve error handling for Connect steaming Python worker

2023-09-18 Thread via GitHub
bogao007 commented on code in PR #42986: URL: https://github.com/apache/spark/pull/42986#discussion_r1329447168 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/StreamingQueryListenerHelper.scala: ## @@ -76,16 +78,21 @@ class

[GitHub] [spark] copperybean commented on pull request #42495: [SPARK-44812][SQL] Push all predicates according to EqualTo and EqualNullSafe

2023-09-18 Thread via GitHub
copperybean commented on PR #42495: URL: https://github.com/apache/spark/pull/42495#issuecomment-1724694181 @cloud-fan @wangyum Could you review this PR, please.

[GitHub] [spark] itholic commented on pull request #42793: [SPARK-45065][PYTHON][PS] Support Pandas 2.1.0

2023-09-18 Thread via GitHub
itholic commented on PR #42793: URL: https://github.com/apache/spark/pull/42793#issuecomment-1724692463 Thanks all!

[GitHub] [spark] WweiL commented on a diff in pull request #42986: [SPARK-44463][SS][CONNECT] Improve error handling for Connect steaming Python worker

2023-09-18 Thread via GitHub
WweiL commented on code in PR #42986: URL: https://github.com/apache/spark/pull/42986#discussion_r1329441472 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/StreamingQueryListenerHelper.scala: ## @@ -76,16 +78,21 @@ class

[GitHub] [spark] HeartSaVioR commented on pull request #42895: [SPARK-45138][SS] Define a new error class and apply it when checkpointing state to DFS fails

2023-09-18 Thread via GitHub
HeartSaVioR commented on PR #42895: URL: https://github.com/apache/spark/pull/42895#issuecomment-1724683270 Maybe it's the first time you are contributing to Apache Spark? If then, congrats on your first contribution! https://spark.apache.org/contributing.html Please check the

[GitHub] [spark] HeartSaVioR commented on a diff in pull request #42940: [SPARK-45178][SS] Fallback to execute a single batch for Trigger.AvailableNow with unsupported sources rather than using wrappe

2023-09-18 Thread via GitHub
HeartSaVioR commented on code in PR #42940: URL: https://github.com/apache/spark/pull/42940#discussion_r1329427441 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala: ## @@ -52,11 +52,40 @@ class MicroBatchExecution( @volatile

[GitHub] [spark] panbingkun commented on a diff in pull request #42917: [SPARK-45163][SQL] Merge UNSUPPORTED_VIEW_OPERATION & UNSUPPORTED_TABLE_OPERATION & fix some issue

2023-09-18 Thread via GitHub
panbingkun commented on code in PR #42917: URL: https://github.com/apache/spark/pull/42917#discussion_r1329425447 ## common/utils/src/main/resources/error/error-classes.json: ## @@ -860,6 +860,35 @@ "Exceeds char/varchar type length limitation: ." ] }, +

[GitHub] [spark] HeartSaVioR commented on a diff in pull request #42940: [SPARK-45178][SS] Fallback to execute a single batch for Trigger.AvailableNow with unsupported sources rather than using wrappe

2023-09-18 Thread via GitHub
HeartSaVioR commented on code in PR #42940: URL: https://github.com/apache/spark/pull/42940#discussion_r1329424458 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala: ## @@ -52,11 +52,40 @@ class MicroBatchExecution( @volatile

[GitHub] [spark] HeartSaVioR commented on a diff in pull request #42940: [SPARK-45178][SS] Fallback to execute a single batch for Trigger.AvailableNow with unsupported sources rather than using wrappe

2023-09-18 Thread via GitHub
HeartSaVioR commented on code in PR #42940: URL: https://github.com/apache/spark/pull/42940#discussion_r1329424143 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/memory.scala: ## @@ -201,7 +207,15 @@ case class MemoryStream[A : Encoder]( override def

[GitHub] [spark] bogao007 commented on a diff in pull request #42986: [SPARK-44463][SS][CONNECT] Improve error handling for Connect steaming Python worker

2023-09-18 Thread via GitHub
bogao007 commented on code in PR #42986: URL: https://github.com/apache/spark/pull/42986#discussion_r1329422764 ## python/pyspark/sql/connect/streaming/worker/foreach_batch_worker.py: ## @@ -69,8 +73,32 @@ def process(df_id, batch_id): # type: ignore[no-untyped-def] while

[GitHub] [spark] anishshri-db commented on a diff in pull request #42940: [SPARK-45178][SS] Fallback to execute a single batch for Trigger.AvailableNow with unsupported sources rather than using wrapp

2023-09-18 Thread via GitHub
anishshri-db commented on code in PR #42940: URL: https://github.com/apache/spark/pull/42940#discussion_r1329420413 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/AvailableNowDataStreamWrapper.scala: ## @@ -28,6 +28,12 @@ import

[GitHub] [spark] cloud-fan commented on a diff in pull request #42971: [SPARK-43979][SQL][FOLLOWUP] Handle non alias-only project case

2023-09-18 Thread via GitHub
cloud-fan commented on code in PR #42971: URL: https://github.com/apache/spark/pull/42971#discussion_r1329419904 ## sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala: ## @@ -1410,6 +1410,21 @@ class AnalysisSuite extends AnalysisTest with

[GitHub] [spark] HeartSaVioR commented on a diff in pull request #42940: [SPARK-45178][SS] Fallback to execute a single batch for Trigger.AvailableNow with unsupported sources rather than using wrappe

2023-09-18 Thread via GitHub
HeartSaVioR commented on code in PR #42940: URL: https://github.com/apache/spark/pull/42940#discussion_r1329418792 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/AvailableNowDataStreamWrapper.scala: ## @@ -28,6 +28,12 @@ import

[GitHub] [spark] anishshri-db commented on a diff in pull request #42940: [SPARK-45178][SS] Fallback to execute a single batch for Trigger.AvailableNow with unsupported sources rather than using wrapp

2023-09-18 Thread via GitHub
anishshri-db commented on code in PR #42940: URL: https://github.com/apache/spark/pull/42940#discussion_r1329417880 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala: ## @@ -52,11 +52,40 @@ class MicroBatchExecution( @volatile

[GitHub] [spark] HeartSaVioR commented on a diff in pull request #42940: [SPARK-45178][SS] Fallback to execute a single batch for Trigger.AvailableNow with unsupported sources rather than using wrappe

2023-09-18 Thread via GitHub
HeartSaVioR commented on code in PR #42940: URL: https://github.com/apache/spark/pull/42940#discussion_r1329417758 ## sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala: ## @@ -2180,6 +2180,17 @@ object SQLConf { .booleanConf

[GitHub] [spark] itholic commented on a diff in pull request #40642: [SPARK-43010][PYTHON] Migrate Column errors into error class

2023-09-18 Thread via GitHub
itholic commented on code in PR #40642: URL: https://github.com/apache/spark/pull/40642#discussion_r1329417329 ## python/pyspark/errors/error_classes.py: ## @@ -24,6 +24,21 @@ "Argument `` is required when ." ] }, + "CANNOT_ACCESS_TO_DUNDER": { Review Comment:

[GitHub] [spark] Hisoka-X commented on pull request #42960: [SPARK-45078][SQL][3.4] Fix `array_insert` ImplicitCastInputTypes not work

2023-09-18 Thread via GitHub
Hisoka-X commented on PR #42960: URL: https://github.com/apache/spark/pull/42960#issuecomment-1724658573 Thanks @MaxGekk @dongjoon-hyun

[GitHub] [spark] HeartSaVioR commented on a diff in pull request #42940: [SPARK-45178][SS] Fallback to execute a single batch for Trigger.AvailableNow with unsupported sources rather than using wrappe

2023-09-18 Thread via GitHub
HeartSaVioR commented on code in PR #42940: URL: https://github.com/apache/spark/pull/42940#discussion_r1329416576 ## sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala: ## @@ -2180,6 +2180,17 @@ object SQLConf { .booleanConf

[GitHub] [spark] HeartSaVioR commented on a diff in pull request #42940: [SPARK-45178][SS] Fallback to execute a single batch for Trigger.AvailableNow with unsupported sources rather than using wrappe

2023-09-18 Thread via GitHub
HeartSaVioR commented on code in PR #42940: URL: https://github.com/apache/spark/pull/42940#discussion_r1329415427 ## sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala: ## @@ -2180,6 +2180,17 @@ object SQLConf { .booleanConf

[GitHub] [spark] github-actions[bot] closed pull request #41498: [SPARK-44001][Protobuf] spark protobuf: handle well known wrapper types

2023-09-18 Thread via GitHub
github-actions[bot] closed pull request #41498: [SPARK-44001][Protobuf] spark protobuf: handle well known wrapper types URL: https://github.com/apache/spark/pull/41498

[GitHub] [spark] zhengruifeng commented on pull request #42977: [SPARK-44788][PYTHON][DOCS][FOLLOW-UP] Move `from_xml`/`schema_of_xml` to `Xml Functions`

2023-09-18 Thread via GitHub
zhengruifeng commented on PR #42977: URL: https://github.com/apache/spark/pull/42977#issuecomment-1724649338 thank you guys!

[GitHub] [spark] sunchao commented on a diff in pull request #42612: [SPARK-44913][SQL] DS V2 supports push down V2 UDF that has magic method

2023-09-18 Thread via GitHub
sunchao commented on code in PR #42612: URL: https://github.com/apache/spark/pull/42612#discussion_r1329406870 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala: ## @@ -279,7 +283,9 @@ case class StaticInvoke( inputTypes:

[GitHub] [spark] HyukjinKwon closed pull request #42968: [SPARK-45113][PYTHON][DOCS][FOLLOWUP] Add sorting to the example of `collect_set/collect_list` to ensure stable results

2023-09-18 Thread via GitHub
HyukjinKwon closed pull request #42968: [SPARK-45113][PYTHON][DOCS][FOLLOWUP] Add sorting to the example of `collect_set/collect_list` to ensure stable results URL: https://github.com/apache/spark/pull/42968 -- This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] HyukjinKwon commented on pull request #42968: [SPARK-45113][PYTHON][DOCS][FOLLOWUP] Add sorting to the example of `collect_set/collect_list` to ensure stable results

2023-09-18 Thread via GitHub
HyukjinKwon commented on PR #42968: URL: https://github.com/apache/spark/pull/42968#issuecomment-1724630071 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] HyukjinKwon commented on pull request #42986: [SPARK-44463][SS][CONNECT] Improve error handling for Connect steaming Python worker

2023-09-18 Thread via GitHub
HyukjinKwon commented on PR #42986: URL: https://github.com/apache/spark/pull/42986#issuecomment-1724629485 Otherwise looks sane to me. cc @ueshin -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #42986: [SPARK-44463][SS][CONNECT] Improve error handling for Connect steaming Python worker

2023-09-18 Thread via GitHub
HyukjinKwon commented on code in PR #42986: URL: https://github.com/apache/spark/pull/42986#discussion_r1329393005 ## python/pyspark/sql/connect/streaming/worker/foreach_batch_worker.py: ## @@ -69,8 +73,32 @@ def process(df_id, batch_id): # type: ignore[no-untyped-def]

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #42986: [SPARK-44463][SS][CONNECT] Improve error handling for Connect steaming Python worker

2023-09-18 Thread via GitHub
HyukjinKwon commented on code in PR #42986: URL: https://github.com/apache/spark/pull/42986#discussion_r1329392786 ## python/pyspark/sql/connect/streaming/worker/foreach_batch_worker.py: ## @@ -69,8 +73,32 @@ def process(df_id, batch_id): # type: ignore[no-untyped-def]

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #42986: [SPARK-44463][SS][CONNECT] Improve error handling for Connect steaming Python worker

2023-09-18 Thread via GitHub
HyukjinKwon commented on code in PR #42986: URL: https://github.com/apache/spark/pull/42986#discussion_r1329392064 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/StreamingForeachBatchHelper.scala: ## @@ -125,8 +128,21 @@ object

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #42949: [SPARK-45093][CONNECT][PYTHON] Error reporting for addArtifacts query

2023-09-18 Thread via GitHub
HyukjinKwon commented on code in PR #42949: URL: https://github.com/apache/spark/pull/42949#discussion_r1329390408 ## python/pyspark/sql/connect/client/logging.py: ## @@ -0,0 +1,42 @@ +import logging Review Comment: Seems like linter complains that there's no license header

[GitHub] [spark] HyukjinKwon commented on pull request #42965: [SPARK-45167][CONNECT][PYTHON][FOLLOW-UP] Use lighter threading Rlock, and use the existing eventually util function

2023-09-18 Thread via GitHub
HyukjinKwon commented on PR #42965: URL: https://github.com/apache/spark/pull/42965#issuecomment-1724619612 Merged to branch-3.5 too. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] HyukjinKwon closed pull request #42973: [SPARK-45167][CONNECT][PYTHON][3.5] Python client must call `release_all`

2023-09-18 Thread via GitHub
HyukjinKwon closed pull request #42973: [SPARK-45167][CONNECT][PYTHON][3.5] Python client must call `release_all` URL: https://github.com/apache/spark/pull/42973 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] heyihong opened a new pull request, #42987: [SPARK-45207][SQL][CONNECT] Implement FetchErrorDetails RPC

2023-09-18 Thread via GitHub
heyihong opened a new pull request, #42987: URL: https://github.com/apache/spark/pull/42987 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How

[GitHub] [spark] HyukjinKwon commented on pull request #42973: [SPARK-45167][CONNECT][PYTHON][3.5] Python client must call `release_all`

2023-09-18 Thread via GitHub
HyukjinKwon commented on PR #42973: URL: https://github.com/apache/spark/pull/42973#issuecomment-1724618155 Merged to branch-3.5. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] HyukjinKwon closed pull request #42910: [SPARK-45133][CONNECT][TESTS][FOLLOWUP] Add test that queries transition to FINISHED

2023-09-18 Thread via GitHub
HyukjinKwon closed pull request #42910: [SPARK-45133][CONNECT][TESTS][FOLLOWUP] Add test that queries transition to FINISHED URL: https://github.com/apache/spark/pull/42910 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [spark] HyukjinKwon commented on pull request #42910: [SPARK-45133][CONNECT][TESTS][FOLLOWUP] Add test that queries transition to FINISHED

2023-09-18 Thread via GitHub
HyukjinKwon commented on PR #42910: URL: https://github.com/apache/spark/pull/42910#issuecomment-1724617224 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] heyihong commented on a diff in pull request #42377: [SPARK-44622][SQL][CONNECT] Implement error enrichment and setting server-side stacktrace

2023-09-18 Thread via GitHub
heyihong commented on code in PR #42377: URL: https://github.com/apache/spark/pull/42377#discussion_r1329325417 ## connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcExceptionConverter.scala: ## @@ -26,47 +26,131 @@ import

[GitHub] [spark] dongjoon-hyun commented on pull request #42793: [SPARK-45065][PYTHON][PS] Support Pandas 2.1.0

2023-09-18 Thread via GitHub
dongjoon-hyun commented on PR #42793: URL: https://github.com/apache/spark/pull/42793#issuecomment-1724614064 Thank you, @itholic and all! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] dongjoon-hyun closed pull request #42793: [SPARK-45065][PYTHON][PS] Support Pandas 2.1.0

2023-09-18 Thread via GitHub
dongjoon-hyun closed pull request #42793: [SPARK-45065][PYTHON][PS] Support Pandas 2.1.0 URL: https://github.com/apache/spark/pull/42793 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] anishshri-db commented on a diff in pull request #42940: [SPARK-45178][SS] Fallback to execute a single batch for Trigger.AvailableNow with unsupported sources rather than using wrapp

2023-09-18 Thread via GitHub
anishshri-db commented on code in PR #42940: URL: https://github.com/apache/spark/pull/42940#discussion_r1329378160 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala: ## @@ -52,11 +52,40 @@ class MicroBatchExecution( @volatile

[GitHub] [spark] anishshri-db commented on a diff in pull request #42940: [SPARK-45178][SS] Fallback to execute a single batch for Trigger.AvailableNow with unsupported sources rather than using wrapp

2023-09-18 Thread via GitHub
anishshri-db commented on code in PR #42940: URL: https://github.com/apache/spark/pull/42940#discussion_r1329377950 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala: ## @@ -52,11 +52,40 @@ class MicroBatchExecution( @volatile

[GitHub] [spark] heyihong commented on a diff in pull request #42377: [SPARK-44622][SQL][CONNECT] Implement error enrichment and setting server-side stacktrace

2023-09-18 Thread via GitHub
heyihong commented on code in PR #42377: URL: https://github.com/apache/spark/pull/42377#discussion_r1329300383 ## connector/connect/common/src/main/protobuf/spark/connect/base.proto: ## @@ -778,6 +778,67 @@ message ReleaseExecuteResponse { optional string operation_id = 2;

[GitHub] [spark] heyihong commented on a diff in pull request #42377: [SPARK-44622][SQL][CONNECT] Implement error enrichment and setting server-side stacktrace

2023-09-18 Thread via GitHub
heyihong commented on code in PR #42377: URL: https://github.com/apache/spark/pull/42377#discussion_r1329300029 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/utils/ErrorUtils.scala: ## @@ -57,28 +69,105 @@ private[connect] object ErrorUtils extends

[GitHub] [spark] bogao007 commented on a diff in pull request #42986: [SPARK-44463][SS][CONNECT] Improve error handling for Connect steaming Python worker

2023-09-18 Thread via GitHub
bogao007 commented on code in PR #42986: URL: https://github.com/apache/spark/pull/42986#discussion_r1329356792 ## python/pyspark/sql/connect/streaming/worker/listener_worker.py: ## @@ -83,7 +86,14 @@ def process(listener_event_str, listener_event_type): # type:

[GitHub] [spark] mridulm commented on pull request #42893: [SPARK-44459][Core] Add System.runFinalization() to periodic cleanup

2023-09-18 Thread via GitHub
mridulm commented on PR #42893: URL: https://github.com/apache/spark/pull/42893#issuecomment-1724562144 If this is specific to this deployment, as @srowen mentioned, why not do this in user code/library? You can run a thread which periodically does this -- This is an automated

[GitHub] [spark] heyihong commented on a diff in pull request #42377: [SPARK-44622][SQL][CONNECT] Implement error enrichment and setting server-side stacktrace

2023-09-18 Thread via GitHub
heyihong commented on code in PR #42377: URL: https://github.com/apache/spark/pull/42377#discussion_r1329353558 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala: ## @@ -291,15 +307,16 @@ object SparkConnectService extends

[GitHub] [spark] heyihong commented on a diff in pull request #42377: [SPARK-44622][SQL][CONNECT] Implement error enrichment and setting server-side stacktrace

2023-09-18 Thread via GitHub
heyihong commented on code in PR #42377: URL: https://github.com/apache/spark/pull/42377#discussion_r1329352625 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectFetchErrorDetailsHandler.scala: ## @@ -0,0 +1,52 @@ +/* + * Licensed to

[GitHub] [spark] mridulm commented on pull request #42426: [SPARK-44756][CORE] Executor hangs when RetryingBlockTransferor fails to initiate retry

2023-09-18 Thread via GitHub
mridulm commented on PR #42426: URL: https://github.com/apache/spark/pull/42426#issuecomment-1724557591 Very good callout @Ngone51 , we should probably add a check style error as well to prevent its usage -- This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] heyihong commented on a diff in pull request #42377: [SPARK-44622][SQL][CONNECT] Implement error enrichment and setting server-side stacktrace

2023-09-18 Thread via GitHub
heyihong commented on code in PR #42377: URL: https://github.com/apache/spark/pull/42377#discussion_r1329349965 ## connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcExceptionConverter.scala: ## @@ -93,33 +177,44 @@ private[client] object

[GitHub] [spark] WweiL commented on a diff in pull request #42986: [SPARK-44463][SS][CONNECT] Improve error handling for Connect steaming Python worker

2023-09-18 Thread via GitHub
WweiL commented on code in PR #42986: URL: https://github.com/apache/spark/pull/42986#discussion_r1329337505 ## python/pyspark/sql/connect/streaming/worker/listener_worker.py: ## @@ -83,7 +86,14 @@ def process(listener_event_str, listener_event_type): # type:
