dongjoon-hyun commented on PR #44129:
URL: https://github.com/apache/spark/pull/44129#issuecomment-1837400198
Thank you so much. I addressed your comment by converting
`spark.deploy.recoveryCompressionCodec` to `optionalString`. The default
is no compression.
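As a rough illustration of an optional codec setting whose default is no compression, here is a minimal Python sketch. It is not Spark code: `encode_recovery_state` and the `"zlib"` codec name are hypothetical, chosen only because `zlib` ships with the Python standard library.

```python
import zlib
from typing import Optional


def encode_recovery_state(payload: bytes, codec: Optional[str] = None) -> bytes:
    """Hypothetical helper: compress only when a codec is explicitly set."""
    if codec is None:
        # Setting not provided: write the bytes as-is (no compression).
        return payload
    if codec == "zlib":
        return zlib.compress(payload)
    # Reject unknown codec names instead of silently ignoring them.
    raise ValueError(f"unsupported codec: {codec}")
```

With no codec configured the bytes pass through unchanged, which is the "default is no compression" behavior described in the comment.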
--
This is an automate
dongjoon-hyun commented on code in PR #44129:
URL: https://github.com/apache/spark/pull/44129#discussion_r1413018463
##
core/src/main/scala/org/apache/spark/internal/config/Deploy.scala:
##
@@ -39,6 +41,14 @@ private[spark] object Deploy {
.checkValues(RecoverySerializer.va
HyukjinKwon commented on code in PR #44129:
URL: https://github.com/apache/spark/pull/44129#discussion_r1413017750
##
core/src/main/scala/org/apache/spark/internal/config/Deploy.scala:
##
@@ -39,6 +41,14 @@ private[spark] object Deploy {
.checkValues(RecoverySerializer.valu
HyukjinKwon commented on code in PR #43982:
URL: https://github.com/apache/spark/pull/43982#discussion_r1413016701
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/UnwrapCastInBinaryComparison.scala:
##
@@ -138,6 +138,11 @@ object UnwrapCastInBinaryComparis
HyukjinKwon commented on code in PR #43982:
URL: https://github.com/apache/spark/pull/43982#discussion_r1413015199
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/UnwrapCastInBinaryComparison.scala:
##
@@ -138,6 +138,11 @@ object UnwrapCastInBinaryComparis
HyukjinKwon commented on code in PR #43982:
URL: https://github.com/apache/spark/pull/43982#discussion_r1413014878
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/UnwrapCastInBinaryComparison.scala:
##
@@ -138,6 +138,11 @@ object UnwrapCastInBinaryComparis
dongjoon-hyun commented on PR #44129:
URL: https://github.com/apache/spark/pull/44129#issuecomment-1837388900
Could you review this when you have some time, @HyukjinKwon ?
HyukjinKwon closed pull request #44115: [SPARK-46208][PS][DOCS] Adding a link
for latest Pandas API specifications.
URL: https://github.com/apache/spark/pull/44115
HyukjinKwon commented on PR #44115:
URL: https://github.com/apache/spark/pull/44115#issuecomment-1837387999
Merged to master.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
dongjoon-hyun commented on PR #43705:
URL: https://github.com/apache/spark/pull/43705#issuecomment-1837378858
In any case, Apache Spark 3.3 will reach the end of life in two weeks.
- https://lists.apache.org/thread/ml25jdzmtcl8r6fhg849zzmqz82qh3jw (`Apache
Spark 3.3.4 EOL Release?`)
dongjoon-hyun commented on PR #44114:
URL: https://github.com/apache/spark/pull/44114#issuecomment-1837378161
Merged to master for Apache Spark 4. Thank you, @itholic .
dongjoon-hyun closed pull request #44114: [SPARK-46206][PS] Use a narrower
scope exception for SQL processor
URL: https://github.com/apache/spark/pull/44114
dongjoon-hyun commented on PR #44122:
URL: https://github.com/apache/spark/pull/44122#issuecomment-1837377737
Merged to master for Apache Spark 4.
dongjoon-hyun closed pull request #44122:
[SPARK-46212][CORE][SQL][SS][CONNECT][MLLIB][GRAPHX][DSTREAM][PROTOBUF][EXAMPLES]
Use other functions to simplify the code pattern of `s.c.MapOps#view.mapValues`
URL: https://github.com/apache/spark/pull/44122
wangyum commented on PR #43982:
URL: https://github.com/apache/spark/pull/43982#issuecomment-1837365342
Merged to master.
wangyum closed pull request #43982: [SPARK-46069][SQL] Support unwrap timestamp
type to date type
URL: https://github.com/apache/spark/pull/43982
dongjoon-hyun commented on PR #44127:
URL: https://github.com/apache/spark/pull/44127#issuecomment-1837364849
Merged to master for Apache Spark 4.0.0
dongjoon-hyun closed pull request #44127: [SPARK-46215][CORE] Improve
`FileSystemPersistenceEngine` to allow nonexistent parents
URL: https://github.com/apache/spark/pull/44127
dongjoon-hyun commented on PR #44127:
URL: https://github.com/apache/spark/pull/44127#issuecomment-1837364358
Thank you so much, @wangyum !
dongjoon-hyun commented on PR #44127:
URL: https://github.com/apache/spark/pull/44127#issuecomment-1837345014
Could you review this PR, @wangyum ?
wangyum commented on PR #44128:
URL: https://github.com/apache/spark/pull/44128#issuecomment-1837343886
Thanks. Merged to master.
wangyum closed pull request #44128: [SPARK-43228][SQL] Join keys also match
PartitioningCollection in CoalesceBucketsInJoin
URL: https://github.com/apache/spark/pull/44128
dongjoon-hyun opened a new pull request, #44129:
URL: https://github.com/apache/spark/pull/44129
…
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
dongjoon-hyun commented on code in PR #44115:
URL: https://github.com/apache/spark/pull/44115#discussion_r1412910434
##
python/docs/source/reference/pyspark.pandas/index.rst:
##
@@ -23,7 +23,7 @@ Pandas API on Spark
This page gives an overview of all public pandas API on Spark.
wankunde closed pull request #44120: [SPARK-40609][SQL] Unwrap cast in the join
condition to unlock bucketed read
URL: https://github.com/apache/spark/pull/44120
itholic commented on PR #44114:
URL: https://github.com/apache/spark/pull/44114#issuecomment-1837310856
cc @HyukjinKwon @zhengruifeng
itholic commented on code in PR #44115:
URL: https://github.com/apache/spark/pull/44115#discussion_r1412909844
##
python/docs/source/reference/pyspark.pandas/index.rst:
##
@@ -23,7 +23,7 @@ Pandas API on Spark
This page gives an overview of all public pandas API on Spark.
..
wankunde opened a new pull request, #44128:
URL: https://github.com/apache/spark/pull/44128
### What changes were proposed in this pull request?
This PR updates `CoalesceBucketsInJoin.satisfiesOutputPartitioning` to
support matching `PartitioningCollection`. A common case is that we a
itholic commented on PR #44115:
URL: https://github.com/apache/spark/pull/44115#issuecomment-1837309501
Sounds reasonable to me. Let me just close this ticket.
itholic closed pull request #44115: [SPARK-46208][PS][DOCS] Use specific Pandas
version for API specifications.
URL: https://github.com/apache/spark/pull/44115
itholic commented on PR #44123:
URL: https://github.com/apache/spark/pull/44123#issuecomment-1837309298
> Since I'm not sure, I simply remove the following part from the PR
description.
Yeah, that was my mistake. Thanks for the correction!
dongjoon-hyun commented on PR #44127:
URL: https://github.com/apache/spark/pull/44127#issuecomment-1837308671
Could you review this PR, @HyukjinKwon ?
HyukjinKwon commented on code in PR #44115:
URL: https://github.com/apache/spark/pull/44115#discussion_r1412908509
##
python/docs/source/reference/pyspark.pandas/index.rst:
##
@@ -23,7 +23,7 @@ Pandas API on Spark
This page gives an overview of all public pandas API on Spark.
github-actions[bot] closed pull request #41199: [SPARK-43536][CORE] Fixing
statsd sink reporter
URL: https://github.com/apache/spark/pull/41199
github-actions[bot] commented on PR #42652:
URL: https://github.com/apache/spark/pull/42652#issuecomment-1837289982
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.
github-actions[bot] commented on PR #41251:
URL: https://github.com/apache/spark/pull/41251#issuecomment-1837289989
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.
dongjoon-hyun commented on PR #44127:
URL: https://github.com/apache/spark/pull/44127#issuecomment-1837282489
Could you review this when you have some time, @LuciferYang ?
dongjoon-hyun opened a new pull request, #44127:
URL: https://github.com/apache/spark/pull/44127
### What changes were proposed in this pull request?
This PR aims to improve `FileSystemPersistenceEngine` to allow nonexistent
parents.
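To make "allow nonexistent parents" concrete, here is a minimal Python sketch of the general idea: create any missing parent directories before writing a file. It is an assumption-laden illustration, not the PR's implementation; the `persist` helper is hypothetical.

```python
import os


def persist(path: str, data: bytes) -> None:
    """Hypothetical helper: write data to path, creating missing parents."""
    parent = os.path.dirname(path)
    if parent:
        # exist_ok=True makes this a no-op when the directory already exists.
        os.makedirs(parent, exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)
```

Calling `persist` with a path such as `recovery/a/b/state.bin` succeeds even when none of the intermediate directories exist yet.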
### Why are the changes needed?
To preve
dongjoon-hyun closed pull request #44123: [SPARK-46213][PYTHON] Introduce
`PySparkImportError` for error framework
URL: https://github.com/apache/spark/pull/44123
dongjoon-hyun commented on PR #44123:
URL: https://github.com/apache/spark/pull/44123#issuecomment-1837255140
Since I'm not sure, I simply removed this part, `to replace `IndexError` in
pyspark.sql.*.`, from the PR description.
Merged to master~ Thank you, @itholic .
dongjoon-hyun closed pull request #44116: [MINOR][DOCS] Add `since` tag for
`Scan.reportDriverMetrics`
URL: https://github.com/apache/spark/pull/44116
dongjoon-hyun commented on PR #44121:
URL: https://github.com/apache/spark/pull/44121#issuecomment-1837237412
Merged to master. Thank you, @itholic .
dongjoon-hyun closed pull request #44121:
[SPARK-41532][CONNECT][PYTHON][FOLLOWUP] Expose `SessionNotSameException` as
PySpark exceptions
URL: https://github.com/apache/spark/pull/44121
LuciferYang commented on PR #44126:
URL: https://github.com/apache/spark/pull/44126#issuecomment-1837235615
Thanks @dongjoon-hyun ~
LuciferYang commented on PR #44117:
URL: https://github.com/apache/spark/pull/44117#issuecomment-1837235510
Seems like I'm one step ahead, hahaha ~
dongjoon-hyun commented on PR #44117:
URL: https://github.com/apache/spark/pull/44117#issuecomment-1837235373
Oh, you are faster than me. Thank you a lot!
LuciferYang closed pull request #44117: [SPARK-46210][K8S][DOCS] Update
`YuniKorn` docs with v1.4
URL: https://github.com/apache/spark/pull/44117
dongjoon-hyun commented on PR #44117:
URL: https://github.com/apache/spark/pull/44117#issuecomment-1837235185
Thank you, @LuciferYang . Merged to master.
dongjoon-hyun commented on PR #44118:
URL: https://github.com/apache/spark/pull/44118#issuecomment-1837234980
Thank you, @LuciferYang . Merged to master.
dongjoon-hyun commented on PR #44126:
URL: https://github.com/apache/spark/pull/44126#issuecomment-1837234889
Merged to branch-3.5.
dongjoon-hyun closed pull request #44126: [SPARK-45975][SQL][TESTS][3.5] Reset
storeAssignmentPolicy to original
URL: https://github.com/apache/spark/pull/44126
dongjoon-hyun closed pull request #44118: [SPARK-46205][CORE][TESTS][FOLLOWUP]
Simplify PersistenceEngineBenchmark
URL: https://github.com/apache/spark/pull/44118
dongjoon-hyun commented on PR #43867:
URL: https://github.com/apache/spark/pull/43867#issuecomment-1837234483
Thank you, @LuciferYang .
dongjoon-hyun commented on PR #44117:
URL: https://github.com/apache/spark/pull/44117#issuecomment-1837232986
Could you review this doc PR, @LuciferYang ?
dongjoon-hyun commented on PR #44118:
URL: https://github.com/apache/spark/pull/44118#issuecomment-1837232922
Could you review this PR, @LuciferYang ?
dongjoon-hyun commented on PR #44006:
URL: https://github.com/apache/spark/pull/44006#issuecomment-1837232773
Merged to master for Apache Spark 4.0.0.
Could you make backporting PRs to make sure that all tests pass in the
release branches, @johanl-db ?
Thank you, @johanl-db
LuciferYang commented on PR #44126:
URL: https://github.com/apache/spark/pull/44126#issuecomment-1837232251
@dongjoon-hyun
For the master branch, this PR was merged before
https://github.com/apache/spark/pull/43867, so the test failure will not be
reproduced. But for branch-3.5, due t
dongjoon-hyun closed pull request #44006: [SPARK-46092][SQL] Don't push down
Parquet row group filters that overflow
URL: https://github.com/apache/spark/pull/44006
dongjoon-hyun commented on PR #44125:
URL: https://github.com/apache/spark/pull/44125#issuecomment-1837231001
Thank you, @panbingkun and @LuciferYang .
dongjoon-hyun closed pull request #44125: [SPARK-46214][BUILD] Upgrade
commons-io to 2.15.1
URL: https://github.com/apache/spark/pull/44125
dongjoon-hyun commented on PR #44126:
URL: https://github.com/apache/spark/pull/44126#issuecomment-1837230461
Ack, @LuciferYang .
- Did you see the same failure in `master` branch?
- If this doesn't help `branch-3.5`, we can simply close this PR.
MaxGekk commented on PR #44110:
URL: https://github.com/apache/spark/pull/44110#issuecomment-1837229888
> The change also includes a small unit benchmark for this particular case.
I wonder about other benchmarks. Do you observe perf regressions? I am asking
just in case.
dongjoon-hyun commented on PR #44117:
URL: https://github.com/apache/spark/pull/44117#issuecomment-1837213257
Thank you, @itholic !
LuciferYang commented on PR #44126:
URL: https://github.com/apache/spark/pull/44126#issuecomment-1837143832
- https://github.com/apache/spark/actions/runs/7041075997/job/19163039661
- https://github.com/apache/spark/actions/runs/7054970010/job/19204693143
```
[info] HiveSourceRow
LuciferYang opened a new pull request, #44126:
URL: https://github.com/apache/spark/pull/44126
### What changes were proposed in this pull request?
Reset storeAssignmentPolicy to original in HiveCompatibilitySuite.
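The general pattern of restoring a setting after a test can be sketched in Python as below. This is only an illustration under stated assumptions: the `temporary_setting` helper and the plain `dict` standing in for a configuration object are hypothetical, and the sketch does not reproduce how `HiveCompatibilitySuite` does it.

```python
from contextlib import contextmanager

_MISSING = object()  # sentinel: the key was absent before the override


@contextmanager
def temporary_setting(conf: dict, key: str, value: str):
    """Hypothetical helper: override conf[key], then restore the original."""
    original = conf.get(key, _MISSING)
    conf[key] = value
    try:
        yield conf
    finally:
        # Restore even if the body raised, so later tests see the old value.
        if original is _MISSING:
            conf.pop(key, None)
        else:
            conf[key] = original
```

Putting the restore step in `finally` is the point of the pattern: a failing test body cannot leak its override into suites that run afterwards.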
### Why are the changes needed?
STORE_ASSIGNMENT_POLICY was not r
LuciferYang commented on PR #43867:
URL: https://github.com/apache/spark/pull/43867#issuecomment-1837142397
Seems we should backport SPARK-45975 to
https://github.com/apache/spark/pull/43870 @wForget ?
panbingkun opened a new pull request, #44125:
URL: https://github.com/apache/spark/pull/44125
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How
panbingkun closed pull request #44124: [SPARK-46214][BUILD] Upgrade commons-io
to 2.15.1
URL: https://github.com/apache/spark/pull/44124
panbingkun opened a new pull request, #44124:
URL: https://github.com/apache/spark/pull/44124
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How
Yikun commented on PR #58:
URL: https://github.com/apache/spark-docker/pull/58#issuecomment-1837138971
@HyukjinKwon Thanks, merged.
- ghcr 3.4.2 publish (test):
https://github.com/apache/spark-docker/actions/runs/7070227446
Yikun closed pull request #58: [SPARK-46209] Add java 11 only yml for version
before 3.5
URL: https://github.com/apache/spark-docker/pull/58
itholic opened a new pull request, #44123:
URL: https://github.com/apache/spark/pull/44123
### What changes were proposed in this pull request?
This PR proposes to introduce `PySparkImportError` for error framework to
replace `IndexError` in pyspark.sql.*.
### Why are t
LuciferYang commented on PR #43867:
URL: https://github.com/apache/spark/pull/43867#issuecomment-1837128189
After merging this PR, there are test failures in branch-3.5. Can you take a
look at this issue, @wForget ?
https://github.com/apache/spark/actions/runs/7041075997/job/191630
LuciferYang commented on PR #44122:
URL: https://github.com/apache/spark/pull/44122#issuecomment-1837121298
cc @cloud-fan
LuciferYang opened a new pull request, #44122:
URL: https://github.com/apache/spark/pull/44122
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How
wankunde commented on PR #44120:
URL: https://github.com/apache/spark/pull/44120#issuecomment-1837105605
Retest this please
itholic commented on code in PR #44115:
URL: https://github.com/apache/spark/pull/44115#discussion_r1412770204
##
python/docs/source/reference/pyspark.pandas/index.rst:
##
@@ -23,7 +23,7 @@ Pandas API on Spark
This page gives an overview of all public pandas API on Spark.
..
itholic commented on PR #44115:
URL: https://github.com/apache/spark/pull/44115#issuecomment-1837100041
cc @HyukjinKwon @zhengruifeng
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific
itholic opened a new pull request, #44121:
URL: https://github.com/apache/spark/pull/44121
### What changes were proposed in this pull request?
This PR proposes to expose `SessionNotSameException` as PySpark exceptions.
### Why are the changes needed?
`SessionNotSameE
MaxGekk commented on code in PR #44093:
URL: https://github.com/apache/spark/pull/44093#discussion_r1412764471
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/executeImmediate.scala:
##
@@ -0,0 +1,130 @@
+/*
+ * Licensed to the Apache Software Foundation (A