[GitHub] [spark] HeartSaVioR commented on pull request #42895: [SPARK-45138][SS] Define a new error class and apply it when checkpointing state to DFS fails

2023-09-19 Thread via GitHub
HeartSaVioR commented on PR #42895: URL: https://github.com/apache/spark/pull/42895#issuecomment-1727033619 https://github.com/neilramaswamy/nr-spark/actions/runs/6242426233/job/16951813562 This failure seems to be a real one. SparkThrowableSuite provides a guide to regenerate the erro

[GitHub] [spark] gengliangwang closed pull request #42998: [3.5][SPARK-45189][SQL] Creating UnresolvedRelation from TableIdentifier should include the catalog field

2023-09-19 Thread via GitHub
gengliangwang closed pull request #42998: [3.5][SPARK-45189][SQL] Creating UnresolvedRelation from TableIdentifier should include the catalog field URL: https://github.com/apache/spark/pull/42998 -- This is an automated message from the Apache Git Service. To respond to the message, please l

[GitHub] [spark] gengliangwang commented on pull request #42998: [3.5][SPARK-45189][SQL] Creating UnresolvedRelation from TableIdentifier should include the catalog field

2023-09-19 Thread via GitHub
gengliangwang commented on PR #42998: URL: https://github.com/apache/spark/pull/42998#issuecomment-1727017308 Merging to 3.5

[GitHub] [spark] zhengruifeng commented on pull request #42860: [SPARK-45107][PYTHON][DOCS] Refine docstring of explode

2023-09-19 Thread via GitHub
zhengruifeng commented on PR #42860: URL: https://github.com/apache/spark/pull/42860#issuecomment-1727016141 ``` starting python compilation test... python compilation succeeded. starting black test... black checks failed: Oh no! 💥 💔 💥 The required version `23.9.1` does not

[GitHub] [spark] LuciferYang commented on pull request #43005: [WIP][SPARK-44112][BUILD][INFRA] Drop Java 8 and 11 support

2023-09-19 Thread via GitHub
LuciferYang commented on PR #43005: URL: https://github.com/apache/spark/pull/43005#issuecomment-1727009057 > +1 for switching to Scala 2.13 first. Please lead the activity. I trust your domain expertise. :) > > > OK ~

[GitHub] [spark] dongjoon-hyun commented on pull request #43005: [WIP][SPARK-44112][BUILD][INFRA] Drop Java 8 and 11 support

2023-09-19 Thread via GitHub
dongjoon-hyun commented on PR #43005: URL: https://github.com/apache/spark/pull/43005#issuecomment-1727008229 +1 for switching to Scala 2.13 first. Please lead the activity. I trust your domain expertise. :) > If this is the case, can we drop Scala 2.12 supports first, then upgrade the

[GitHub] [spark] LuciferYang commented on pull request #43005: [WIP][SPARK-44112][BUILD][INFRA] Drop Java 8 and 11 support

2023-09-19 Thread via GitHub
LuciferYang commented on PR #43005: URL: https://github.com/apache/spark/pull/43005#issuecomment-1727006590 @dongjoon-hyun I'd like to discuss with you that https://github.com/LuciferYang/spark/actions/runs/6243680485/job/16949769483 has tested Java 17 + Scala 2.12.18, and a lot of tests fa

[GitHub] [spark] dongjoon-hyun commented on pull request #43007: [SPARK-45229][CORE] Show the number of drivers waiting in SUBMITTED status in MasterPage

2023-09-19 Thread via GitHub
dongjoon-hyun commented on PR #43007: URL: https://github.com/apache/spark/pull/43007#issuecomment-1726997821 Thank you, @yaooqinn !

[GitHub] [spark] yaooqinn commented on pull request #43007: [SPARK-45229][CORE] Show the number of drivers waiting in SUBMITTED status in MasterPage

2023-09-19 Thread via GitHub
yaooqinn commented on PR #43007: URL: https://github.com/apache/spark/pull/43007#issuecomment-1726994641 LGTM

[GitHub] [spark] dongjoon-hyun commented on pull request #43007: [SPARK-45229][CORE] Show the number of drivers waiting in SUBMITTED status in MasterPage

2023-09-19 Thread via GitHub
dongjoon-hyun commented on PR #43007: URL: https://github.com/apache/spark/pull/43007#issuecomment-1726981592 Could you review this PR when you have some time, @yaooqinn ?

[GitHub] [spark] dongjoon-hyun opened a new pull request, #43007: [SPARK-45229][CORE] Show the number of drivers waiting in SUBMITTED status in MasterPage

2023-09-19 Thread via GitHub
dongjoon-hyun opened a new pull request, #43007: URL: https://github.com/apache/spark/pull/43007 … ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

[GitHub] [spark] HyukjinKwon closed pull request #43000: [SPARK-45225] [SQL] XML: XSD file URL support

2023-09-19 Thread via GitHub
HyukjinKwon closed pull request #43000: [SPARK-45225] [SQL] XML: XSD file URL support URL: https://github.com/apache/spark/pull/43000

[GitHub] [spark] HyukjinKwon commented on pull request #43000: [SPARK-45225] [SQL] XML: XSD file URL support

2023-09-19 Thread via GitHub
HyukjinKwon commented on PR #43000: URL: https://github.com/apache/spark/pull/43000#issuecomment-1726941236 Merged to master.

[GitHub] [spark] hvanhovell closed pull request #42377: [SPARK-44622][SQL][CONNECT] Implement FetchErrorDetails RPC

2023-09-19 Thread via GitHub
hvanhovell closed pull request #42377: [SPARK-44622][SQL][CONNECT] Implement FetchErrorDetails RPC URL: https://github.com/apache/spark/pull/42377

[GitHub] [spark] hvanhovell commented on pull request #42377: [SPARK-44622][SQL][CONNECT] Implement FetchErrorDetails RPC

2023-09-19 Thread via GitHub
hvanhovell commented on PR #42377: URL: https://github.com/apache/spark/pull/42377#issuecomment-1726931731 Merging to master.

[GitHub] [spark] gaoyajun02 commented on pull request #43004: [SPARK-45134][SHUFFLE] Avoid repeated fallback when failed to fetch remote push-merged block meta

2023-09-19 Thread via GitHub
gaoyajun02 commented on PR #43004: URL: https://github.com/apache/spark/pull/43004#issuecomment-1726890650 @mridulm @tgravescs @attilapiros @Ngone51 @Victsm @otterc Please help review this change.

[GitHub] [spark] zhengruifeng commented on pull request #43001: [SPARK-45218][PYTHON][DOCS] Refine docstring of Column.isin

2023-09-19 Thread via GitHub
zhengruifeng commented on PR #43001: URL: https://github.com/apache/spark/pull/43001#issuecomment-1726888546 ``` starting black test... black checks failed: Oh no! 💥 💔 💥 The required version `23.9.1` does not match the running version `22.6.0`! Please run 'dev/reformat-python' sc

[GitHub] [spark] itholic commented on pull request #42994: [SPARK-43433][PS] Match `GroupBy.nth` behavior to the latest Pandas

2023-09-19 Thread via GitHub
itholic commented on PR #42994: URL: https://github.com/apache/spark/pull/42994#issuecomment-1726884937 I don't see any failure on my local testing, but it complains on GitHub CI. Let me take a deeper look why this happens.

[GitHub] [spark] HyukjinKwon closed pull request #42986: [SPARK-44463][SS][CONNECT] Improve error handling for Connect steaming Python worker

2023-09-19 Thread via GitHub
HyukjinKwon closed pull request #42986: [SPARK-44463][SS][CONNECT] Improve error handling for Connect steaming Python worker URL: https://github.com/apache/spark/pull/42986

[GitHub] [spark] HyukjinKwon commented on pull request #42986: [SPARK-44463][SS][CONNECT] Improve error handling for Connect steaming Python worker

2023-09-19 Thread via GitHub
HyukjinKwon commented on PR #42986: URL: https://github.com/apache/spark/pull/42986#issuecomment-1726838303 Merged to master.

[GitHub] [spark] itholic commented on a diff in pull request #43002: [SPARK-43498][PS][TESTS] Enable `StatsTests.test_axis_on_dataframe` for pandas 2.0.0.

2023-09-19 Thread via GitHub
itholic commented on code in PR #43002: URL: https://github.com/apache/spark/pull/43002#discussion_r1330938324 ## python/pyspark/pandas/tests/test_stats.py: ## @@ -180,6 +176,10 @@ def test_axis_on_dataframe(self): }, index=range(10, 15001, 10),

[GitHub] [spark] gaoyajun02 commented on pull request #43004: [SPARK-45134][SHUFFLE] Avoid repeated fallback when failed to fetch remote push-merged block meta

2023-09-19 Thread via GitHub
gaoyajun02 commented on PR #43004: URL: https://github.com/apache/spark/pull/43004#issuecomment-1726832193 In fact, there is another fix, which is to add a check for outstandingRpcs in handleFailure of the network module. https://github.com/apache/spark/blob/227e262025229a67f43a8de45

[GitHub] [spark] ConeyLiu commented on a diff in pull request #42612: [SPARK-44913][SQL] DS V2 supports push down V2 UDF that has magic method

2023-09-19 Thread via GitHub
ConeyLiu commented on code in PR #42612: URL: https://github.com/apache/spark/pull/42612#discussion_r1330934027 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala: ## @@ -279,7 +283,9 @@ case class StaticInvoke( inputTypes: Seq[Ab

[GitHub] [spark] LuciferYang opened a new pull request, #43005: [SPARK-44112][BUILD][INFRA] Drop Java 8 and 11 support

2023-09-19 Thread via GitHub
LuciferYang opened a new pull request, #43005: URL: https://github.com/apache/spark/pull/43005 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How

[GitHub] [spark] robreeves opened a new pull request, #43006: prototype to delete previous partitions before commit

2023-09-19 Thread via GitHub
robreeves opened a new pull request, #43006: URL: https://github.com/apache/spark/pull/43006 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How w

[GitHub] [spark] gaoyajun02 opened a new pull request, #43004: [SPARK-45134][SHUFFLE] Avoid repeated fallback when failed to fetch remote push-merged block meta

2023-09-19 Thread via GitHub
gaoyajun02 opened a new pull request, #43004: URL: https://github.com/apache/spark/pull/43004 ### What changes were proposed in this pull request? Add inflightMergedBlocks to avoid repeated fallback when failed to fetch remote push-merged block meta ### Why are the changes needed?

[GitHub] [spark] panbingkun commented on pull request #43003: [SPARK-45226][PYTHON][DOCS] Refine docstring of `rand/randn`

2023-09-19 Thread via GitHub
panbingkun commented on PR #43003: URL: https://github.com/apache/spark/pull/43003#issuecomment-1726802936 cc @zhengruifeng @HyukjinKwon

[GitHub] [spark] panbingkun commented on pull request #43003: [SPARK-45226][PYTHON][DOCS] Refine docstring of `rand/randn`

2023-09-19 Thread via GitHub
panbingkun commented on PR #43003: URL: https://github.com/apache/spark/pull/43003#issuecomment-1726802554 - Before: https://github.com/apache/spark/assets/15246973/3d2b46b8-b453-4257-b710-81aa1793a1e1 - After: https://github.com/apache/spark/assets/15246973/05d8ad6f-0c8

[GitHub] [spark] panbingkun opened a new pull request, #43003: [SPARK-45226][PYTHON][DOCS] Refine docstring of `rand/randn`

2023-09-19 Thread via GitHub
panbingkun opened a new pull request, #43003: URL: https://github.com/apache/spark/pull/43003 ### What changes were proposed in this pull request? The pr aims to refine docstring of `rand/randn`. ### Why are the changes needed? - We need to add a call without seed in the example,

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #43002: [SPARK-43498][PS][TESTS] Enable `StatsTests.test_axis_on_dataframe` for pandas 2.0.0.

2023-09-19 Thread via GitHub
dongjoon-hyun commented on code in PR #43002: URL: https://github.com/apache/spark/pull/43002#discussion_r1330893931 ## python/pyspark/pandas/tests/test_stats.py: ## @@ -180,6 +176,10 @@ def test_axis_on_dataframe(self): }, index=range(10, 15001

[GitHub] [spark] itholic opened a new pull request, #43002: [SPARK-43498][PS][TESTS] Enable `StatsTests.test_axis_on_dataframe` for pandas 2.0.0.

2023-09-19 Thread via GitHub
itholic opened a new pull request, #43002: URL: https://github.com/apache/spark/pull/43002 ### What changes were proposed in this pull request? This PR proposes to enable `StatsTests.test_axis_on_dataframe` for pandas 2.0.0. ### Why are the changes needed? Increase the t

[GitHub] [spark] hvanhovell commented on a diff in pull request #42987: [SPARK-45207][SQL][CONNECT] Implement Error Reconstruction for Scala Client

2023-09-19 Thread via GitHub
hvanhovell commented on code in PR #42987: URL: https://github.com/apache/spark/pull/42987#discussion_r1330878085 ## connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcExceptionConverter.scala: ## @@ -24,49 +24,135 @@ import scala.reflect.ClassTag i

[GitHub] [spark] hvanhovell commented on a diff in pull request #42987: [SPARK-45207][SQL][CONNECT] Implement Error Reconstruction for Scala Client

2023-09-19 Thread via GitHub
hvanhovell commented on code in PR #42987: URL: https://github.com/apache/spark/pull/42987#discussion_r1330877688 ## sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala: ## @@ -2882,8 +2882,7 @@ object SQLConf { "level settings.") .version("3.

[GitHub] [spark] HyukjinKwon commented on pull request #42949: [SPARK-45093][CONNECT][PYTHON] Error reporting for addArtifacts query

2023-09-19 Thread via GitHub
HyukjinKwon commented on PR #42949: URL: https://github.com/apache/spark/pull/42949#issuecomment-1726777160 Ah, the master branch and your branch should be synced to the latest master branch. Black was lately upgraded.

[GitHub] [spark] hvanhovell commented on a diff in pull request #42987: [SPARK-45207][SQL][CONNECT] Implement Error Reconstruction for Scala Client

2023-09-19 Thread via GitHub
hvanhovell commented on code in PR #42987: URL: https://github.com/apache/spark/pull/42987#discussion_r1330873176 ## connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcExceptionConverter.scala: ## @@ -93,33 +179,65 @@ private[client] object GrpcExcep

[GitHub] [spark] HeartSaVioR closed pull request #42940: [SPARK-45178][SS] Fallback to execute a single batch for Trigger.AvailableNow with unsupported sources rather than using wrapper

2023-09-19 Thread via GitHub
HeartSaVioR closed pull request #42940: [SPARK-45178][SS] Fallback to execute a single batch for Trigger.AvailableNow with unsupported sources rather than using wrapper URL: https://github.com/apache/spark/pull/42940

[GitHub] [spark] HeartSaVioR commented on pull request #42940: [SPARK-45178][SS] Fallback to execute a single batch for Trigger.AvailableNow with unsupported sources rather than using wrapper

2023-09-19 Thread via GitHub
HeartSaVioR commented on PR #42940: URL: https://github.com/apache/spark/pull/42940#issuecomment-1726773266 Thanks, merging to master. (I'll probably raise a discussion to dev@ for porting back to 3.5/3.4.)

[GitHub] [spark] yaooqinn commented on pull request #42969: [SPARK-45192][UI] Fix overdue lineInterpolate parameter for graphviz edge

2023-09-19 Thread via GitHub
yaooqinn commented on PR #42969: URL: https://github.com/apache/spark/pull/42969#issuecomment-1726771680 Thank you all, merged to master

[GitHub] [spark] yaooqinn closed pull request #42969: [SPARK-45192][UI] Fix overdue lineInterpolate parameter for graphviz edge

2023-09-19 Thread via GitHub
yaooqinn closed pull request #42969: [SPARK-45192][UI] Fix overdue lineInterpolate parameter for graphviz edge URL: https://github.com/apache/spark/pull/42969

[GitHub] [spark] hvanhovell commented on a diff in pull request #42987: [SPARK-45207][SQL][CONNECT] Implement Error Reconstruction for Scala Client

2023-09-19 Thread via GitHub
hvanhovell commented on code in PR #42987: URL: https://github.com/apache/spark/pull/42987#discussion_r1330870268 ## connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcExceptionConverter.scala: ## @@ -24,49 +24,135 @@ import scala.reflect.ClassTag i

[GitHub] [spark] HeartSaVioR commented on pull request #42940: [SPARK-45178][SS] Fallback to execute a single batch for Trigger.AvailableNow with unsupported sources rather than using wrapper

2023-09-19 Thread via GitHub
HeartSaVioR commented on PR #42940: URL: https://github.com/apache/spark/pull/42940#issuecomment-1726769354 https://github.com/HeartSaVioR/spark/runs/16947355981 Looks like the test is flaky rather than a consistent failure; it passed before rebasing and also passed locally. https://

[GitHub] [spark] itholic commented on pull request #40420: [SPARK-42617][PS] Support `isocalendar` from the pandas 2.0.0

2023-09-19 Thread via GitHub
itholic commented on PR #40420: URL: https://github.com/apache/spark/pull/40420#issuecomment-1726767540 @dzhigimont Can we just make the CI pass for now? I can help in the follow-ups after merging this one. Seems like the mypy checks is failing for now: ``` starting mypy annotat

[GitHub] [spark] itholic commented on a diff in pull request #40420: [SPARK-42617][PS] Support `isocalendar` from the pandas 2.0.0

2023-09-19 Thread via GitHub
itholic commented on code in PR #40420: URL: https://github.com/apache/spark/pull/40420#discussion_r1330865687 ## python/pyspark/pandas/datetimes.py: ## @@ -116,26 +117,59 @@ def pandas_microsecond(s) -> ps.Series[np.int32]: # type: ignore[no-untyped-def def nanosecond(se

[GitHub] [spark] yaooqinn commented on pull request #42988: [WIP][SPARK-45209][CORE][UI] Flame Graph Support For Executor Thread Dump Page

2023-09-19 Thread via GitHub
yaooqinn commented on PR #42988: URL: https://github.com/apache/spark/pull/42988#issuecomment-1726761486 @rednaxelafx Yes

[GitHub] [spark] hvanhovell commented on a diff in pull request #42987: [SPARK-45207][SQL][CONNECT] Implement Error Reconstruction for Scala Client

2023-09-19 Thread via GitHub
hvanhovell commented on code in PR #42987: URL: https://github.com/apache/spark/pull/42987#discussion_r1330859817 ## connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/streaming/ClientStreamingQuerySuite.scala: ## @@ -175,6 +175,37 @@ class ClientStreamingQuerySuit

[GitHub] [spark] hvanhovell commented on a diff in pull request #42987: [SPARK-45207][SQL][CONNECT] Implement Error Reconstruction for Scala Client

2023-09-19 Thread via GitHub
hvanhovell commented on code in PR #42987: URL: https://github.com/apache/spark/pull/42987#discussion_r1330859551 ## connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/CustomSparkConnectBlockingStub.scala: ## @@ -27,11 +27,17 @@ private[connect] class Cu

[GitHub] [spark] rednaxelafx commented on pull request #42988: [WIP][SPARK-45209][CORE][UI] Flame Graph Support For Executor Thread Dump Page

2023-09-19 Thread via GitHub
rednaxelafx commented on PR #42988: URL: https://github.com/apache/spark/pull/42988#issuecomment-1726713952 > Currently, the width represents the number of samples. In the current implementation in this PR, the width is essentially representing the "number of threads" that's sharing t

[GitHub] [spark] itholic commented on a diff in pull request #40642: [SPARK-43010][PYTHON] Migrate Column errors into error class

2023-09-19 Thread via GitHub
itholic commented on code in PR #40642: URL: https://github.com/apache/spark/pull/40642#discussion_r1330825147 ## python/pyspark/errors/error_classes.py: ## @@ -24,6 +24,21 @@ "Argument `` is required when ." ] }, + "CANNOT_ACCESS_TO_DUNDER": { Review Comment:

[GitHub] [spark] warrenzhu25 commented on pull request #42999: [SPARK-45217][CORE] Support change log level of specific package or class

2023-09-19 Thread via GitHub
warrenzhu25 commented on PR #42999: URL: https://github.com/apache/spark/pull/42999#issuecomment-1726690587 > Why don't we use `log4j2.properties`? Added use cases in the description.

[GitHub] [spark] sandip-db commented on pull request #43000: [SPARK-45225] [SQL] XML: XSD file URL support

2023-09-19 Thread via GitHub
sandip-db commented on PR #43000: URL: https://github.com/apache/spark/pull/43000#issuecomment-1726677981 @HyukjinKwon

[GitHub] [spark] allisonwang-db opened a new pull request, #43001: [SPARK-45218][PYTHON][DOCS] Refine docstring of Column.isin

2023-09-19 Thread via GitHub
allisonwang-db opened a new pull request, #43001: URL: https://github.com/apache/spark/pull/43001 ### What changes were proposed in this pull request? This PR refines the docstring of `Column.isin` by updating the examples. ### Why are the changes needed? To impro

[GitHub] [spark] sandip-db opened a new pull request, #43000: [SPARK-45225] [SQL] XML: XSD file URL support

2023-09-19 Thread via GitHub
sandip-db opened a new pull request, #43000: URL: https://github.com/apache/spark/pull/43000 ### What changes were proposed in this pull request? Add support to read XSD file URL. ### Why are the changes needed? Add support to read XSD file URL. ### Does this PR introduce

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #42999: [SPARK-45217][CORE] Support change log level of specific package or class

2023-09-19 Thread via GitHub
dongjoon-hyun commented on code in PR #42999: URL: https://github.com/apache/spark/pull/42999#discussion_r1330789111 ## core/src/main/scala/org/apache/spark/util/Utils.scala: ## @@ -1013,7 +1013,7 @@ private[spark] object Utils * In case of IPv6, getHostAddress may return '0

[GitHub] [spark] WweiL commented on pull request #42859: [SPARK-43299][SS][CONNECT] Convert StreamingQueryException in Scala Client

2023-09-19 Thread via GitHub
WweiL commented on PR #42859: URL: https://github.com/apache/spark/pull/42859#issuecomment-1726612337 We should also uncomment and verify these comments: https://github.com/apache/spark/blob/1fdd46f173f7bc90e0523eb0a2d5e8e27e990102/connector/connect/client/jvm/src/main/scala/org/apache/sp

[GitHub] [spark] HeartSaVioR commented on pull request #42940: [SPARK-45178][SS] Fallback to execute a single batch for Trigger.AvailableNow with unsupported sources rather than using wrapper

2023-09-19 Thread via GitHub
HeartSaVioR commented on PR #42940: URL: https://github.com/apache/spark/pull/42940#issuecomment-1726601027 Ah I commented too late :) Thanks for your support!

[GitHub] [spark] HeartSaVioR commented on pull request #42940: [SPARK-45178][SS] Fallback to execute a single batch for Trigger.AvailableNow with unsupported sources rather than using wrapper

2023-09-19 Thread via GitHub
HeartSaVioR commented on PR #42940: URL: https://github.com/apache/spark/pull/42940#issuecomment-1726600452 > As we are already in the status that sources not supporting Trigger.AvailableNow can work with the wrapper instead of simply failing since Trigger.Once is deprecated, it sounds reas

[GitHub] [spark] HeartSaVioR commented on pull request #42940: [SPARK-45178][SS] Fallback to execute a single batch for Trigger.AvailableNow with unsupported sources rather than using wrapper

2023-09-19 Thread via GitHub
HeartSaVioR commented on PR #42940: URL: https://github.com/apache/spark/pull/42940#issuecomment-1726572679 Looks like a Python-side env issue. I'll rebase and see whether it is already fixed in the latest master.

[GitHub] [spark] MaxGekk commented on pull request #42939: [SPARK-43254][SQL] Assign a name to the error _LEGACY_ERROR_TEMP_2018

2023-09-19 Thread via GitHub
MaxGekk commented on PR #42939: URL: https://github.com/apache/spark/pull/42939#issuecomment-1726452334 @dengziming Could you rebase on the recent master, please.

[GitHub] [spark] heyihong commented on a diff in pull request #42377: [SPARK-44622][SQL][CONNECT] Implement FetchErrorDetails RPC

2023-09-19 Thread via GitHub
heyihong commented on code in PR #42377: URL: https://github.com/apache/spark/pull/42377#discussion_r1330657004 ## connector/connect/server/src/test/scala/org/apache/spark/sql/connect/service/FetchErrorDetailsHandlerSuite.scala: ## @@ -0,0 +1,120 @@ +/* + * Licensed to the Apach

[GitHub] [spark] heyihong commented on a diff in pull request #42377: [SPARK-44622][SQL][CONNECT] Implement FetchErrorDetails RPC

2023-09-19 Thread via GitHub
heyihong commented on code in PR #42377: URL: https://github.com/apache/spark/pull/42377#discussion_r1330656751 ## connector/connect/server/src/test/scala/org/apache/spark/sql/connect/service/FetchErrorDetailsHandlerSuite.scala: ## @@ -0,0 +1,120 @@ +/* + * Licensed to the Apach

[GitHub] [spark] sarutak commented on pull request #42989: [SPARK-45210][DOCS][3.4] Switch languages consistently across docs for all code snippets (Spark 3.4 and below)

2023-09-19 Thread via GitHub
sarutak commented on PR #42989: URL: https://github.com/apache/spark/pull/42989#issuecomment-1726386211 Late LGTM. Thanks!

[GitHub] [spark] hvanhovell commented on a diff in pull request #42377: [SPARK-44622][SQL][CONNECT] Implement FetchErrorDetails RPC

2023-09-19 Thread via GitHub
hvanhovell commented on code in PR #42377: URL: https://github.com/apache/spark/pull/42377#discussion_r1330636049 ## connector/connect/server/src/test/scala/org/apache/spark/sql/connect/service/FetchErrorDetailsHandlerSuite.scala: ## @@ -0,0 +1,120 @@ +/* + * Licensed to the Apa

[GitHub] [spark] hvanhovell commented on a diff in pull request #42377: [SPARK-44622][SQL][CONNECT] Implement FetchErrorDetails RPC

2023-09-19 Thread via GitHub
hvanhovell commented on code in PR #42377: URL: https://github.com/apache/spark/pull/42377#discussion_r1330634306 ## connector/connect/server/src/test/scala/org/apache/spark/sql/connect/service/FetchErrorDetailsHandlerSuite.scala: ## @@ -0,0 +1,120 @@ +/* + * Licensed to the Apa

[GitHub] [spark] hvanhovell commented on a diff in pull request #42377: [SPARK-44622][SQL][CONNECT] Implement FetchErrorDetails RPC

2023-09-19 Thread via GitHub
hvanhovell commented on code in PR #42377: URL: https://github.com/apache/spark/pull/42377#discussion_r1330633746 ## connector/connect/server/src/test/scala/org/apache/spark/sql/connect/service/FetchErrorDetailsHandlerSuite.scala: ## @@ -0,0 +1,120 @@ +/* + * Licensed to the Apa

[GitHub] [spark] warrenzhu25 opened a new pull request, #42999: [SPARK-45217][CORE] Support change log level of specific package or class

2023-09-19 Thread via GitHub
warrenzhu25 opened a new pull request, #42999: URL: https://github.com/apache/spark/pull/42999 ### What changes were proposed in this pull request? Add `SparkContext.setLogLevel(loggerName: String, logLevel: String)` to support change log level of specific package or class ### Why

[GitHub] [spark] heyihong commented on a diff in pull request #42377: [SPARK-44622][SQL][CONNECT] Implement FetchErrorDetails RPC

2023-09-19 Thread via GitHub
heyihong commented on code in PR #42377: URL: https://github.com/apache/spark/pull/42377#discussion_r1330626485 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/utils/ErrorUtils.scala: ## @@ -66,14 +137,26 @@ private[connect] object ErrorUtils extends Logg

[GitHub] [spark] hvanhovell commented on a diff in pull request #42377: [SPARK-44622][SQL][CONNECT] Implement FetchErrorDetails RPC

2023-09-19 Thread via GitHub
hvanhovell commented on code in PR #42377: URL: https://github.com/apache/spark/pull/42377#discussion_r1330612639 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/utils/ErrorUtils.scala: ## @@ -66,14 +137,26 @@ private[connect] object ErrorUtils extends Lo

[GitHub] [spark] heyihong commented on a diff in pull request #42377: [SPARK-44622][SQL][CONNECT] Implement FetchErrorDetails RPC

2023-09-19 Thread via GitHub
heyihong commented on code in PR #42377: URL: https://github.com/apache/spark/pull/42377#discussion_r1330611551 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/utils/ErrorUtils.scala: ## @@ -66,14 +137,26 @@ private[connect] object ErrorUtils extends Logg

[GitHub] [spark] hvanhovell commented on a diff in pull request #42377: [SPARK-44622][SQL][CONNECT] Implement FetchErrorDetails RPC

2023-09-19 Thread via GitHub
hvanhovell commented on code in PR #42377: URL: https://github.com/apache/spark/pull/42377#discussion_r1330599584 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/utils/ErrorUtils.scala: ## @@ -66,14 +137,26 @@ private[connect] object ErrorUtils extends Lo

[GitHub] [spark] hvanhovell commented on a diff in pull request #42377: [SPARK-44622][SQL][CONNECT] Implement FetchErrorDetails RPC

2023-09-19 Thread via GitHub
hvanhovell commented on code in PR #42377: URL: https://github.com/apache/spark/pull/42377#discussion_r1330596618 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/utils/ErrorUtils.scala: ## @@ -57,7 +76,59 @@ private[connect] object ErrorUtils extends Logg

[GitHub] [spark] heyihong commented on a diff in pull request #42377: [SPARK-44622][SQL][CONNECT] Implement FetchErrorDetails RPC

2023-09-19 Thread via GitHub
heyihong commented on code in PR #42377: URL: https://github.com/apache/spark/pull/42377#discussion_r1330596345 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/config/Connect.scala: ## @@ -213,4 +214,21 @@ object Connect { .version("3.5.0") .int

[GitHub] [spark] hvanhovell commented on a diff in pull request #42377: [SPARK-44622][SQL][CONNECT] Implement FetchErrorDetails RPC

2023-09-19 Thread via GitHub
hvanhovell commented on code in PR #42377: URL: https://github.com/apache/spark/pull/42377#discussion_r1330581451 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/config/Connect.scala: ## @@ -213,4 +214,21 @@ object Connect { .version("3.5.0") .i

[GitHub] [spark] heyihong commented on a diff in pull request #42377: [SPARK-44622][SQL][CONNECT] Implement FetchErrorDetails RPC

2023-09-19 Thread via GitHub
heyihong commented on code in PR #42377: URL: https://github.com/apache/spark/pull/42377#discussion_r1330578327 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SessionHolder.scala: ## @@ -45,6 +49,15 @@ case class SessionHolder(userId: String, ses

[GitHub] [spark] hvanhovell commented on a diff in pull request #42377: [SPARK-44622][SQL][CONNECT] Implement FetchErrorDetails RPC

2023-09-19 Thread via GitHub
hvanhovell commented on code in PR #42377: URL: https://github.com/apache/spark/pull/42377#discussion_r1330569555 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SessionHolder.scala: ## @@ -45,6 +49,15 @@ case class SessionHolder(userId: String, s

[GitHub] [spark] mridulm commented on a diff in pull request #42950: [SPARK-45182][CORE] Ignore task completion from old stage after retrying indeterminate stages

2023-09-19 Thread via GitHub
mridulm commented on code in PR #42950: URL: https://github.com/apache/spark/pull/42950#discussion_r1330551562 ## core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala: ## @@ -1903,13 +1903,20 @@ private[spark] class DAGScheduler( case smt: ShuffleMapTas

[GitHub] [spark] mridulm commented on pull request #42950: [SPARK-45182][CORE] Ignore task completion from old stage after retrying indeterminate stages

2023-09-19 Thread via GitHub
mridulm commented on PR #42950: URL: https://github.com/apache/spark/pull/42950#issuecomment-1726285311 +CC @Ngone51 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To uns

[GitHub] [spark] gengliangwang commented on pull request #42998: [3.5][SPARK-45189][SQL] Creating UnresolvedRelation from TableIdentifier should include the catalog field

2023-09-19 Thread via GitHub
gengliangwang commented on PR #42998: URL: https://github.com/apache/spark/pull/42998#issuecomment-1726239688 This is backporting https://github.com/apache/spark/pull/42964 to branch-3.5 -- This is an automated message from the Apache Git Service. To respond to the message, please log on t

[GitHub] [spark] gengliangwang opened a new pull request, #42998: [3.5][SPARK-45189][SQL] Creating UnresolvedRelation from TableIdentifier should include the catalog field

2023-09-19 Thread via GitHub
gengliangwang opened a new pull request, #42998: URL: https://github.com/apache/spark/pull/42998 ### What changes were proposed in this pull request? Creating UnresolvedRelation from TableIdentifier should include the catalog field ### Why are the changes needed?

[GitHub] [spark] mridulm commented on pull request #42357: [SPARK-44306][YARN] Group FileStatus with few RPC calls within Yarn Client

2023-09-19 Thread via GitHub
mridulm commented on PR #42357: URL: https://github.com/apache/spark/pull/42357#issuecomment-1726224793 Merged to master. Thanks for fixing this @shuwang21 ! Thanks for the reviews @xkrogen, @venkata91 and @ShreyeshArangath :-) -- This is an automated message from the Apache Git Serv

[GitHub] [spark] hvanhovell commented on a diff in pull request #42377: [SPARK-44622][SQL][CONNECT] Implement FetchErrorDetails RPC

2023-09-19 Thread via GitHub
hvanhovell commented on code in PR #42377: URL: https://github.com/apache/spark/pull/42377#discussion_r1330505500 ## connector/connect/common/src/main/protobuf/spark/connect/base.proto: ## @@ -778,6 +778,67 @@ message ReleaseExecuteResponse { optional string operation_id = 2;

[GitHub] [spark] dongjoon-hyun commented on pull request #42943: [SPARK-45175][K8S] download krb5.conf from remote storage in spark-submit on k8s

2023-09-19 Thread via GitHub
dongjoon-hyun commented on PR #42943: URL: https://github.com/apache/spark/pull/42943#issuecomment-1726223482 I'm just wondering if this is a recommended way in the Kerberos community. In any way, you are suggesting to bypass Kerberos security environment in order to download `krb5.conf` an

[GitHub] [spark] mridulm closed pull request #42357: [SPARK-44306][YARN] Group FileStatus with few RPC calls within Yarn Client

2023-09-19 Thread via GitHub
mridulm closed pull request #42357: [SPARK-44306][YARN] Group FileStatus with few RPC calls within Yarn Client URL: https://github.com/apache/spark/pull/42357 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [spark] dongjoon-hyun closed pull request #42992: [SPARK-45215][SQL][TESTS] Combine HiveCatalogedDDLSuite and HiveDDLSuite

2023-09-19 Thread via GitHub
dongjoon-hyun closed pull request #42992: [SPARK-45215][SQL][TESTS] Combine HiveCatalogedDDLSuite and HiveDDLSuite URL: https://github.com/apache/spark/pull/42992 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL ab

[GitHub] [spark] nchammas commented on a diff in pull request #40642: [SPARK-43010][PYTHON] Migrate Column errors into error class

2023-09-19 Thread via GitHub
nchammas commented on code in PR #40642: URL: https://github.com/apache/spark/pull/40642#discussion_r1330499826 ## python/pyspark/errors/error_classes.py: ## @@ -24,6 +24,21 @@ "Argument `` is required when ." ] }, + "CANNOT_ACCESS_TO_DUNDER": { Review Comment:

[GitHub] [spark] grundprinzip commented on a diff in pull request #42377: [SPARK-44622][SQL][CONNECT] Implement FetchErrorDetails RPC

2023-09-19 Thread via GitHub
grundprinzip commented on code in PR #42377: URL: https://github.com/apache/spark/pull/42377#discussion_r1330495974 ## connector/connect/common/src/main/protobuf/spark/connect/base.proto: ## @@ -778,6 +778,67 @@ message ReleaseExecuteResponse { optional string operation_id =

[GitHub] [spark] hvanhovell commented on a diff in pull request #42377: [SPARK-44622][SQL][CONNECT] Implement FetchErrorDetails RPC

2023-09-19 Thread via GitHub
hvanhovell commented on code in PR #42377: URL: https://github.com/apache/spark/pull/42377#discussion_r1330495612 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/utils/ErrorUtils.scala: ## @@ -91,35 +91,40 @@ private[connect] object ErrorUtils extends Log

[GitHub] [spark] dongjoon-hyun closed pull request #42991: [SPARK-43453][PS] Ignore the `names` of `MultiIndex` when `axis=1` for `concat`

2023-09-19 Thread via GitHub
dongjoon-hyun closed pull request #42991: [SPARK-43453][PS] Ignore the `names` of `MultiIndex` when `axis=1` for `concat` URL: https://github.com/apache/spark/pull/42991 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [spark] allisonwang-db commented on a diff in pull request #42949: [SPARK-45093][CONNECT][PYTHON] Error reporting for addArtifacts query

2023-09-19 Thread via GitHub
allisonwang-db commented on code in PR #42949: URL: https://github.com/apache/spark/pull/42949#discussion_r1330447085 ## python/pyspark/sql/connect/client/artifact.py: ## @@ -243,11 +244,15 @@ def _create_requests( self, *path: str, pyfile: bool, archive: bool, file: bo

[GitHub] [spark] hvanhovell commented on a diff in pull request #42377: [SPARK-44622][SQL][CONNECT] Implement FetchErrorDetails RPC

2023-09-19 Thread via GitHub
hvanhovell commented on code in PR #42377: URL: https://github.com/apache/spark/pull/42377#discussion_r1330446640 ## connector/connect/common/src/main/protobuf/spark/connect/base.proto: ## @@ -778,6 +778,67 @@ message ReleaseExecuteResponse { optional string operation_id = 2;

[GitHub] [spark] heyihong commented on a diff in pull request #42377: [SPARK-44622][SQL][CONNECT] Implement FetchErrorDetails RPC

2023-09-19 Thread via GitHub
heyihong commented on code in PR #42377: URL: https://github.com/apache/spark/pull/42377#discussion_r1330443354 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectFetchErrorDetailsHandler.scala: ## @@ -0,0 +1,57 @@ +/* + * Licensed to the

[GitHub] [spark] hvanhovell commented on a diff in pull request #42995: [SPARK-45136][CONNECT] Enhance ClosureCleaner with Ammonite support

2023-09-19 Thread via GitHub
hvanhovell commented on code in PR #42995: URL: https://github.com/apache/spark/pull/42995#discussion_r1330369636 ## core/src/main/scala/org/apache/spark/util/Utils.scala: ## @@ -2313,7 +2234,7 @@ private[spark] object Utils e.getThrowables.asScala.exists(isBindCollisio

[GitHub] [spark] hvanhovell commented on a diff in pull request #42995: [SPARK-45136][CONNECT] Enhance ClosureCleaner with Ammonite support

2023-09-19 Thread via GitHub
hvanhovell commented on code in PR #42995: URL: https://github.com/apache/spark/pull/42995#discussion_r1330369234 ## core/src/main/scala/org/apache/spark/util/Utils.scala: ## @@ -93,11 +93,12 @@ private[spark] object CallSite { * Various utility methods used by Spark. */ pr

[GitHub] [spark] hvanhovell commented on a diff in pull request #42995: [SPARK-45136][CONNECT] Enhance ClosureCleaner with Ammonite support

2023-09-19 Thread via GitHub
hvanhovell commented on code in PR #42995: URL: https://github.com/apache/spark/pull/42995#discussion_r1330368310 ## common/utils/src/main/scala/org/apache/spark/util/SparkStreamUtils.scala: ## @@ -0,0 +1,109 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] [spark] mridulm commented on a diff in pull request #42426: [SPARK-44756][CORE] Executor hangs when RetryingBlockTransferor fails to initiate retry

2023-09-19 Thread via GitHub
mridulm commented on code in PR #42426: URL: https://github.com/apache/spark/pull/42426#discussion_r1330362915 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RetryingBlockTransferor.java: ## @@ -274,7 +287,13 @@ private void handleBlockTransferFailure(S

[GitHub] [spark] heyihong commented on a diff in pull request #42377: [SPARK-44622][SQL][CONNECT] Implement FetchErrorDetails RPC

2023-09-19 Thread via GitHub
heyihong commented on code in PR #42377: URL: https://github.com/apache/spark/pull/42377#discussion_r1330327877 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectFetchErrorDetailsHandler.scala: ## @@ -0,0 +1,57 @@ +/* + * Licensed to the

[GitHub] [spark] mridulm commented on a diff in pull request #42357: [SPARK-44306][YARN] Group FileStatus with few RPC calls within Yarn Client

2023-09-19 Thread via GitHub
mridulm commented on code in PR #42357: URL: https://github.com/apache/spark/pull/42357#discussion_r1330352323 ## resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala: ## @@ -533,9 +536,12 @@ private[spark] class Client( // If preload is enabled,
