[GitHub] [spark] zhengruifeng commented on a diff in pull request #43011: [WIP][SPARK-45232][DOC] Add missing function groups to SQL references

2023-09-20 Thread via GitHub
zhengruifeng commented on code in PR #43011: URL: https://github.com/apache/spark/pull/43011#discussion_r1331213799 ## sql/gen-sql-functions-docs.py: ## @@ -34,6 +34,8 @@ "math_funcs", "conditional_funcs", "generator_funcs", "predicate_funcs", "string_funcs",

[GitHub] [spark] MaxGekk closed pull request #42996: [SPARK-45224][PYTHON] Add examples w/ map and array as parameters of `sql()`

2023-09-20 Thread via GitHub
MaxGekk closed pull request #42996: [SPARK-45224][PYTHON] Add examples w/ map and array as parameters of `sql()` URL: https://github.com/apache/spark/pull/42996 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] heyihong commented on a diff in pull request #42987: [SPARK-45207][SQL][CONNECT] Implement Complete Error Reconstruction for Scala Client

2023-09-20 Thread via GitHub
heyihong commented on code in PR #42987: URL: https://github.com/apache/spark/pull/42987#discussion_r1331247912 ## sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala: ## @@ -2882,8 +2882,7 @@ object SQLConf { "level settings.")

[GitHub] [spark] MaxGekk opened a new pull request, #43014: [WIP][CONNECT][PYTHON] Support map and array parameters by `sql()`

2023-09-20 Thread via GitHub
MaxGekk opened a new pull request, #43014: URL: https://github.com/apache/spark/pull/43014 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How

[GitHub] [spark] itholic commented on a diff in pull request #42994: [SPARK-43433][PS] Match `GroupBy.nth` behavior to the latest Pandas

2023-09-20 Thread via GitHub
itholic commented on code in PR #42994: URL: https://github.com/apache/spark/pull/42994#discussion_r1331072003 ## python/pyspark/pandas/groupby.py: ## @@ -1155,14 +1152,32 @@ def nth(self, n: int) -> FrameLike: else: sdf =

[GitHub] [spark] cloud-fan commented on a diff in pull request #42864: [WIP][SPARK-45112][SQL] Use UnresolvedFunction based resolution in SQL Dataset functions

2023-09-20 Thread via GitHub
cloud-fan commented on code in PR #42864: URL: https://github.com/apache/spark/pull/42864#discussion_r1331077927 ## python/pyspark/sql/column.py: ## @@ -712,11 +712,11 @@ def __getitem__(self, k: Any) -> "Column": >>> df =

[GitHub] [spark] zhengruifeng commented on a diff in pull request #43011: [WIP][SPARK-45232][DOC] Add missing function groups to SQL references

2023-09-20 Thread via GitHub
zhengruifeng commented on code in PR #43011: URL: https://github.com/apache/spark/pull/43011#discussion_r1331200011 ## sql/gen-sql-functions-docs.py: ## @@ -34,6 +34,8 @@ "math_funcs", "conditional_funcs", "generator_funcs", "predicate_funcs", "string_funcs",

[GitHub] [spark] yaooqinn opened a new pull request, #43016: [SPARK-45077][UI][FOLLOWUP] Update comment to link the forked repo yaooqinn/dagre-d3

2023-09-20 Thread via GitHub
yaooqinn opened a new pull request, #43016: URL: https://github.com/apache/spark/pull/43016 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How

[GitHub] [spark] wankunde opened a new pull request, #43009: [SPARK-45230][SQL] Plan sorter for Aggregate after SMJ

2023-09-20 Thread via GitHub
wankunde opened a new pull request, #43009: URL: https://github.com/apache/spark/pull/43009 ### What changes were proposed in this pull request? This PR could be a followup of https://github.com/apache/spark/pull/42488 and https://github.com/apache/spark/pull/42557. If

[GitHub] [spark] heyihong commented on a diff in pull request #42987: [SPARK-45207][SQL][CONNECT] Implement Complete Error Reconstruction for Scala Client

2023-09-20 Thread via GitHub
heyihong commented on code in PR #42987: URL: https://github.com/apache/spark/pull/42987#discussion_r1331251817 ## sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala: ## @@ -2882,8 +2882,7 @@ object SQLConf { "level settings.")

[GitHub] [spark] LuciferYang commented on pull request #42908: [SPARK-44872][CONNECT][FOLLOWUP] Deflake ReattachableExecuteSuite and increase retry buffer

2023-09-20 Thread via GitHub
LuciferYang commented on PR #42908: URL: https://github.com/apache/spark/pull/42908#issuecomment-1727552095 Thanks @hvanhovell -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] cxzl25 commented on a diff in pull request #42199: [SPARK-44579][SQL] Support Interrupt On Cancel in SQLExecution

2023-09-20 Thread via GitHub
cxzl25 commented on code in PR #42199: URL: https://github.com/apache/spark/pull/42199#discussion_r1331640480 ## sql/core/src/main/scala/org/apache/spark/sql/execution/SQLExecution.scala: ## @@ -77,6 +79,11 @@ object SQLExecution { } val rootExecutionId =

[GitHub] [spark] heyihong commented on a diff in pull request #42987: [SPARK-45207][SQL][CONNECT] Implement Complete Error Reconstruction for Scala Client

2023-09-20 Thread via GitHub
heyihong commented on code in PR #42987: URL: https://github.com/apache/spark/pull/42987#discussion_r1331693297 ## sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala: ## @@ -2882,8 +2882,7 @@ object SQLConf { "level settings.")

[GitHub] [spark] zhengruifeng opened a new pull request, #43012: [SPARK-45234][PYTHON][DOCS] Refine DocString of `regr_*` functions

2023-09-20 Thread via GitHub
zhengruifeng opened a new pull request, #43012: URL: https://github.com/apache/spark/pull/43012 ### What changes were proposed in this pull request? Refine DocString of `regr_*` functions ### Why are the changes needed? fix the wildcard import ### Does this PR

[GitHub] [spark] peter-toth commented on a diff in pull request #42864: [WIP][SPARK-45112][SQL] Use UnresolvedFunction based resolution in SQL Dataset functions

2023-09-20 Thread via GitHub
peter-toth commented on code in PR #42864: URL: https://github.com/apache/spark/pull/42864#discussion_r1331113924 ## python/pyspark/sql/column.py: ## @@ -712,11 +712,11 @@ def __getitem__(self, k: Any) -> "Column": >>> df =

[GitHub] [spark] hdaikoku commented on pull request #42426: [SPARK-44756][CORE] Executor hangs when RetryingBlockTransferor fails to initiate retry

2023-09-20 Thread via GitHub
hdaikoku commented on PR #42426: URL: https://github.com/apache/spark/pull/42426#issuecomment-1727835491 > I think `SparkUncaughtExceptionHandler` should catch this OOM exception and abort the executor. I'm not sure if I'm following this. For this particular case, OOM was actually


[GitHub] [spark] MaxGekk commented on pull request #42996: [SPARK-45224][PYTHON] Add examples w/ map and array as parameters of `sql()`

2023-09-20 Thread via GitHub
MaxGekk commented on PR #42996: URL: https://github.com/apache/spark/pull/42996#issuecomment-1727192557 Merging to master. Thank you, @HyukjinKwon and @cloud-fan for review. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] peter-toth commented on a diff in pull request #42864: [WIP][SPARK-45112][SQL] Use UnresolvedFunction based resolution in SQL Dataset functions

2023-09-20 Thread via GitHub
peter-toth commented on code in PR #42864: URL: https://github.com/apache/spark/pull/42864#discussion_r1331110271 ## sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala: ## @@ -708,7 +708,7 @@ private[sql] object RelationalGroupedDataset { case

[GitHub] [spark] heyihong commented on a diff in pull request #42987: [SPARK-45207][SQL][CONNECT] Implement Complete Error Reconstruction for Scala Client

2023-09-20 Thread via GitHub
heyihong commented on code in PR #42987: URL: https://github.com/apache/spark/pull/42987#discussion_r1331243346 ## connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcExceptionConverter.scala: ## @@ -93,33 +179,65 @@ private[client] object


[GitHub] [spark] dzhigimont commented on a diff in pull request #40420: [SPARK-42617][PS] Support `isocalendar` from the pandas 2.0.0

2023-09-20 Thread via GitHub
dzhigimont commented on code in PR #40420: URL: https://github.com/apache/spark/pull/40420#discussion_r1331413958 ## python/pyspark/pandas/datetimes.py: ## @@ -116,26 +117,59 @@ def pandas_microsecond(s) -> ps.Series[np.int32]: # type: ignore[no-untyped-def def


[GitHub] [spark] bjornjorgensen commented on a diff in pull request #43005: [WIP][SPARK-44112][BUILD][INFRA] Drop Java 8 and 11 support

2023-09-20 Thread via GitHub
bjornjorgensen commented on code in PR #43005: URL: https://github.com/apache/spark/pull/43005#discussion_r1331229836 ## .github/workflows/build_coverage.yml: ## @@ -17,7 +17,7 @@ # under the License. # -name: "Build / Coverage (master, Scala 2.12, Hadoop 3, JDK 8)" +name:

[GitHub] [spark] zhengruifeng commented on a diff in pull request #43014: [SPARK-45235][CONNECT][PYTHON] Support map and array parameters by `sql()`

2023-09-20 Thread via GitHub
zhengruifeng commented on code in PR #43014: URL: https://github.com/apache/spark/pull/43014#discussion_r1331636427 ## python/pyspark/sql/connect/plan.py: ## @@ -1049,21 +1049,23 @@ def __init__(self, query: str, args: Optional[Union[Dict[str, Any], List]] = Non

[GitHub] [spark] cloud-fan closed pull request #42971: [SPARK-43979][SQL][FOLLOWUP] Handle non alias-only project case

2023-09-20 Thread via GitHub
cloud-fan closed pull request #42971: [SPARK-43979][SQL][FOLLOWUP] Handle non alias-only project case URL: https://github.com/apache/spark/pull/42971 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [spark] heyihong commented on a diff in pull request #42987: [SPARK-45207][SQL][CONNECT] Implement Error Reconstruction for Scala Client

2023-09-20 Thread via GitHub
heyihong commented on code in PR #42987: URL: https://github.com/apache/spark/pull/42987#discussion_r1331243346 ## connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcExceptionConverter.scala: ## @@ -93,33 +179,65 @@ private[client] object

[GitHub] [spark] cloud-fan commented on a diff in pull request #42864: [WIP][SPARK-45112][SQL] Use UnresolvedFunction based resolution in SQL Dataset functions

2023-09-20 Thread via GitHub
cloud-fan commented on code in PR #42864: URL: https://github.com/apache/spark/pull/42864#discussion_r1331079422 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala: ## @@ -442,6 +442,10 @@ case class InSubquery(values: Seq[Expression],

[GitHub] [spark] zhengruifeng opened a new pull request, #43011: [SPARK-45232][DOC] Add missing function groups to SQL references

2023-09-20 Thread via GitHub
zhengruifeng opened a new pull request, #43011: URL: https://github.com/apache/spark/pull/43011 ### What changes were proposed in this pull request? Add missing function groups to SQL references: - xml_funcs - lambda_funcs - collection_funcs - url_funcs - hash_funcsx

[GitHub] [spark] zhengruifeng commented on a diff in pull request #42864: [WIP][SPARK-45112][SQL] Use UnresolvedFunction based resolution in SQL Dataset functions

2023-09-20 Thread via GitHub
zhengruifeng commented on code in PR #42864: URL: https://github.com/apache/spark/pull/42864#discussion_r1331149367 ## sql/core/src/main/scala/org/apache/spark/sql/functions.scala: ## @@ -414,12 +407,13 @@ object functions { * @group agg_funcs * @since 1.3.0 */ -

[GitHub] [spark] LuciferYang commented on a diff in pull request #43008: [WIP][SPARK-44113][BUILD][INFRA][DOCS] Drop support for Scala 2.12

2023-09-20 Thread via GitHub
LuciferYang commented on code in PR #43008: URL: https://github.com/apache/spark/pull/43008#discussion_r1331207849 ## dev/change-scala-version.sh: ## @@ -19,7 +19,7 @@ set -e -VALID_VERSIONS=( 2.12 2.13 ) +VALID_VERSIONS=( 2.13 ) Review Comment: No further

[GitHub] [spark] beliefer commented on a diff in pull request #42612: [SPARK-44913][SQL] DS V2 supports push down V2 UDF that has magic method

2023-09-20 Thread via GitHub
beliefer commented on code in PR #42612: URL: https://github.com/apache/spark/pull/42612#discussion_r1331177757 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala: ## @@ -279,7 +283,9 @@ case class StaticInvoke( inputTypes:

[GitHub] [spark] hvanhovell commented on a diff in pull request #42987: [SPARK-45207][SQL][CONNECT] Implement Complete Error Reconstruction for Scala Client

2023-09-20 Thread via GitHub
hvanhovell commented on code in PR #42987: URL: https://github.com/apache/spark/pull/42987#discussion_r1331644650 ## sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala: ## @@ -2882,8 +2882,7 @@ object SQLConf { "level settings.")

[GitHub] [spark] LuciferYang commented on pull request #43015: [SPARK-45237][DOCS] Change the default value of `spark.history.store.hybridStore.diskBackend` in `monitoring.md` to `ROCKSDB`

2023-09-20 Thread via GitHub
LuciferYang commented on PR #43015: URL: https://github.com/apache/spark/pull/43015#issuecomment-1727645034 cc @dongjoon-hyun FYI: I think this one needs to be backported to branch-3.4 and branch-3.5. -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] zhengruifeng commented on a diff in pull request #43011: [SPARK-45232][DOC] Add missing function groups to SQL references

2023-09-20 Thread via GitHub
zhengruifeng commented on code in PR #43011: URL: https://github.com/apache/spark/pull/43011#discussion_r1331200011 ## sql/gen-sql-functions-docs.py: ## @@ -34,6 +34,8 @@ "math_funcs", "conditional_funcs", "generator_funcs", "predicate_funcs", "string_funcs",

[GitHub] [spark] HyukjinKwon closed pull request #43002: [SPARK-43498][PS][TESTS] Enable `StatsTests.test_axis_on_dataframe` for pandas 2.0.0.

2023-09-20 Thread via GitHub
HyukjinKwon closed pull request #43002: [SPARK-43498][PS][TESTS] Enable `StatsTests.test_axis_on_dataframe` for pandas 2.0.0. URL: https://github.com/apache/spark/pull/43002 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] dongjoon-hyun commented on pull request #43007: [SPARK-45229][CORE][UI] Show the number of drivers waiting in SUBMITTED status in MasterPage

2023-09-20 Thread via GitHub
dongjoon-hyun commented on PR #43007: URL: https://github.com/apache/spark/pull/43007#issuecomment-1727060848 Thank you for revising the PR title. Since `core` module test passed and I verified manually, let me merge this. -- This is an automated message from the Apache Git Service. To

[GitHub] [spark] peter-toth commented on a diff in pull request #42864: [WIP][SPARK-45112][SQL] Use UnresolvedFunction based resolution in SQL Dataset functions

2023-09-20 Thread via GitHub
peter-toth commented on code in PR #42864: URL: https://github.com/apache/spark/pull/42864#discussion_r1331311772 ## sql/core/src/test/scala/org/apache/spark/sql/IntegratedUDFTestUtils.scala: ## @@ -723,11 +728,14 @@ object IntegratedUDFTestUtils extends SQLHelper {

[GitHub] [spark] HyukjinKwon commented on pull request #43002: [SPARK-43498][PS][TESTS] Enable `StatsTests.test_axis_on_dataframe` for pandas 2.0.0.

2023-09-20 Thread via GitHub
HyukjinKwon commented on PR #43002: URL: https://github.com/apache/spark/pull/43002#issuecomment-1727167048 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] HyukjinKwon commented on pull request #43013: [MINOR][DOCS][CONNECT] Update notes about supported modules in PySpark API reference

2023-09-20 Thread via GitHub
HyukjinKwon commented on PR #43013: URL: https://github.com/apache/spark/pull/43013#issuecomment-1727296619 Build: https://github.com/HyukjinKwon/spark/actions/runs/6245820107/job/16955248336 -- This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [spark] LuciferYang opened a new pull request, #43008: [SPARK-44113][BUILD] Drop Scala 2.12 Support

2023-09-20 Thread via GitHub
LuciferYang opened a new pull request, #43008: URL: https://github.com/apache/spark/pull/43008 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###

[GitHub] [spark] HyukjinKwon opened a new pull request, #43013: [MINOR][DOCS][CONNECT] Update notes about supported modules in PySpark API reference

2023-09-20 Thread via GitHub
HyukjinKwon opened a new pull request, #43013: URL: https://github.com/apache/spark/pull/43013 ### What changes were proposed in this pull request? This PR proposes to add a couple of notes about which modules are supported by Spark Connect. ### Why are the changes needed?

[GitHub] [spark] cloud-fan commented on a diff in pull request #42864: [WIP][SPARK-45112][SQL] Use UnresolvedFunction based resolution in SQL Dataset functions

2023-09-20 Thread via GitHub
cloud-fan commented on code in PR #42864: URL: https://github.com/apache/spark/pull/42864#discussion_r1331143291 ## sql/core/src/test/scala/org/apache/spark/sql/IntegratedUDFTestUtils.scala: ## @@ -723,11 +728,14 @@ object IntegratedUDFTestUtils extends SQLHelper {

[GitHub] [spark] amaliujia opened a new pull request, #43010: [WIP]

2023-09-20 Thread via GitHub
amaliujia opened a new pull request, #43010: URL: https://github.com/apache/spark/pull/43010 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How

[GitHub] [spark] yaooqinn commented on a diff in pull request #42199: [SPARK-44579][SQL] Support Interrupt On Cancel in SQLExecution

2023-09-20 Thread via GitHub
yaooqinn commented on code in PR #42199: URL: https://github.com/apache/spark/pull/42199#discussion_r1331623060 ## sql/core/src/main/scala/org/apache/spark/sql/execution/SQLExecution.scala: ## @@ -77,6 +79,11 @@ object SQLExecution { } val rootExecutionId =

[GitHub] [spark] heyihong commented on a diff in pull request #42987: [SPARK-45207][SQL][CONNECT] Implement Complete Error Reconstruction for Scala Client

2023-09-20 Thread via GitHub
heyihong commented on code in PR #42987: URL: https://github.com/apache/spark/pull/42987#discussion_r1331297747 ## connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/streaming/ClientStreamingQuerySuite.scala: ## @@ -175,6 +175,37 @@ class ClientStreamingQuerySuite

[GitHub] [spark] cloud-fan commented on a diff in pull request #42864: [WIP][SPARK-45112][SQL] Use UnresolvedFunction based resolution in SQL Dataset functions

2023-09-20 Thread via GitHub
cloud-fan commented on code in PR #42864: URL: https://github.com/apache/spark/pull/42864#discussion_r1331145071 ## sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingSymmetricHashJoinHelperSuite.scala: ## @@ -49,7 +44,12 @@ class

[GitHub] [spark] WeichenXu123 commented on pull request #42382: [ML] Remove usage of RDD APIs for load/save in spark-ml

2023-09-20 Thread via GitHub
WeichenXu123 commented on PR #42382: URL: https://github.com/apache/spark/pull/42382#issuecomment-1727681685 @zhengruifeng Can we make the `saveMetadata` interface support both `sparkContext` and `sparkSession` arguments? And in the Spark repo, we always pass sparkSession as the

[GitHub] [spark] dzhigimont commented on pull request #40420: [SPARK-42617][PS] Support `isocalendar` from the pandas 2.0.0

2023-09-20 Thread via GitHub
dzhigimont commented on PR #40420: URL: https://github.com/apache/spark/pull/40420#issuecomment-1727427143 > @dzhigimont Can we just make the CI pass for now? I can help in the follow-ups after merging this one. > > Seems like the mypy checks are failing for now: > > ``` >

[GitHub] [spark] peter-toth commented on a diff in pull request #42864: [WIP][SPARK-45112][SQL] Use UnresolvedFunction based resolution in SQL Dataset functions

2023-09-20 Thread via GitHub
peter-toth commented on code in PR #42864: URL: https://github.com/apache/spark/pull/42864#discussion_r1331301998 ## sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingSymmetricHashJoinHelperSuite.scala: ## @@ -49,7 +44,12 @@ class


[GitHub] [spark] heyihong commented on a diff in pull request #42377: [SPARK-44622][SQL][CONNECT] Implement FetchErrorDetails RPC

2023-09-20 Thread via GitHub
heyihong commented on code in PR #42377: URL: https://github.com/apache/spark/pull/42377#discussion_r1323060159 ## connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/streaming/ClientStreamingQuerySuite.scala: ## @@ -175,6 +175,36 @@ class ClientStreamingQuerySuite

[GitHub] [spark] panbingkun commented on pull request #42993: [SPARK-45231][INFRA] Remove unrecognized and meaningless command about amm from the GA testing workflow.

2023-09-20 Thread via GitHub
panbingkun commented on PR #42993: URL: https://github.com/apache/spark/pull/42993#issuecomment-1727147543 cc @vicennial @dongjoon-hyun -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] dzhigimont commented on a diff in pull request #40420: [SPARK-42617][PS] Support `isocalendar` from the pandas 2.0.0

2023-09-20 Thread via GitHub
dzhigimont commented on code in PR #40420: URL: https://github.com/apache/spark/pull/40420#discussion_r1331412813 ## python/pyspark/pandas/indexes/datetimes.py: ## @@ -214,28 +215,8 @@ def microsecond(self) -> Index: ) return

[GitHub] [spark] zhengruifeng commented on a diff in pull request #42864: [WIP][SPARK-45112][SQL] Use UnresolvedFunction based resolution in SQL Dataset functions

2023-09-20 Thread via GitHub
zhengruifeng commented on code in PR #42864: URL: https://github.com/apache/spark/pull/42864#discussion_r1331147608 ## sql/core/src/main/scala/org/apache/spark/sql/functions.scala: ## @@ -414,12 +407,13 @@ object functions { * @group agg_funcs * @since 1.3.0 */ -

[GitHub] [spark] LuciferYang opened a new pull request, #43015: [SPARK-45237][DOCS] Change the default value of `spark.history.store.hybridStore.diskBackend` in `monitoring.md` to `ROCKSDB`

2023-09-20 Thread via GitHub
LuciferYang opened a new pull request, #43015: URL: https://github.com/apache/spark/pull/43015 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###

[GitHub] [spark] dzhigimont commented on pull request #40420: [SPARK-42617][PS] Support `isocalendar` from the pandas 2.0.0

2023-09-20 Thread via GitHub
dzhigimont commented on PR #40420: URL: https://github.com/apache/spark/pull/40420#issuecomment-1727680820 > @dzhigimont Can we just make the CI pass for now? I can help in the follow-ups after merging this one. > > Seems like the mypy checks are failing for now: > > ``` >

[GitHub] [spark] peter-toth commented on a diff in pull request #42864: [WIP][SPARK-45112][SQL] Use UnresolvedFunction based resolution in SQL Dataset functions

2023-09-20 Thread via GitHub
peter-toth commented on code in PR #42864: URL: https://github.com/apache/spark/pull/42864#discussion_r1331134534 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala: ## @@ -442,6 +442,10 @@ case class InSubquery(values: Seq[Expression],

[GitHub] [spark] cloud-fan commented on a diff in pull request #42199: [SPARK-44579][SQL] Support Interrupt On Cancel in SQLExecution

2023-09-20 Thread via GitHub
cloud-fan commented on code in PR #42199: URL: https://github.com/apache/spark/pull/42199#discussion_r1331581159 ## sql/core/src/main/scala/org/apache/spark/sql/execution/SQLExecution.scala: ## @@ -77,6 +79,11 @@ object SQLExecution { } val rootExecutionId =

[GitHub] [spark] zhengruifeng commented on pull request #43003: [SPARK-45226][PYTHON][DOCS] Refine docstring of `rand/randn`

2023-09-20 Thread via GitHub
zhengruifeng commented on PR #43003: URL: https://github.com/apache/spark/pull/43003#issuecomment-1727125799 thanks, merged to master -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] HyukjinKwon commented on pull request #41016: [SPARK-43341][SQL] Patch StructType.toDDL not picking up on non-nullability of nested column

2023-09-20 Thread via GitHub
HyukjinKwon commented on PR #41016: URL: https://github.com/apache/spark/pull/41016#issuecomment-1727125449 @BramBoog it has conflicts against the latest master branch. You would need to resolve the conflicts with `git fetch upstream` and `git rebase upstream/master` -- This is an automated

[GitHub] [spark] zhengruifeng closed pull request #43003: [SPARK-45226][PYTHON][DOCS] Refine docstring of `rand/randn`

2023-09-20 Thread via GitHub
zhengruifeng closed pull request #43003: [SPARK-45226][PYTHON][DOCS] Refine docstring of `rand/randn` URL: https://github.com/apache/spark/pull/43003 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [spark] zhengruifeng commented on pull request #43013: [MINOR][DOCS][CONNECT] Update notes about supported modules in PySpark API reference

2023-09-20 Thread via GitHub
zhengruifeng commented on PR #43013: URL: https://github.com/apache/spark/pull/43013#issuecomment-1727400614 `pyspark.ml.connect` only supports a small subset of `pyspark.ml` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] HyukjinKwon commented on pull request #42916: [MiNOR][DOCS] Fix a typo in HashAggregateExec.scala

2023-09-20 Thread via GitHub
HyukjinKwon commented on PR #42916: URL: https://github.com/apache/spark/pull/42916#issuecomment-1727123248 We have our own logic to detect forked repositories' GitHub Actions runs. You would need to go to settings in your forked repo, and enable it. For now, it seems I can't find the GitHub

[GitHub] [spark] Hisoka-X commented on a diff in pull request #42979: [SPARK-45035][SQL] Fix ignoreCorruptFiles with multiline CSV/JSON will report error

2023-09-20 Thread via GitHub
Hisoka-X commented on code in PR #42979: URL: https://github.com/apache/spark/pull/42979#discussion_r1331553477 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVDataSource.scala: ## @@ -190,12 +191,19 @@ object MultiLineCSVDataSource extends

[GitHub] [spark] HeartSaVioR commented on pull request #42895: [SPARK-45138][SS] Define a new error class and apply it when checkpointing state to DFS fails

2023-09-20 Thread via GitHub
HeartSaVioR commented on PR #42895: URL: https://github.com/apache/spark/pull/42895#issuecomment-1727033619 https://github.com/neilramaswamy/nr-spark/actions/runs/6242426233/job/16951813562 This failure seems to be a real one. SparkThrowableSuite provides a guide to regenerate the
