HyukjinKwon closed pull request #40673:
[SPARK-41537][INFRA][CONNECT][FOLLOW-UP] Removes breaking changes within master
branch
URL: https://github.com/apache/spark/pull/40673
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and u
HyukjinKwon commented on PR #40673:
URL: https://github.com/apache/spark/pull/40673#issuecomment-1501048604
Merged to master.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
HyukjinKwon commented on PR #40525:
URL: https://github.com/apache/spark/pull/40525#issuecomment-1501048286
For example, you can't just add `from pyspark.sql.connect.column import
Column as ConnectColumn` in `pyspark/pandas/_typing.py`. Even if you don't use
Spark Connect, it will check the
HyukjinKwon commented on PR #40716:
URL: https://github.com/apache/spark/pull/40716#issuecomment-1501048066
Merged to master and branch-3.4.
HyukjinKwon closed pull request #40716: [SPARK-43075][CONNECT] Change `gRPC` to
`grpcio` when it is not installed.
URL: https://github.com/apache/spark/pull/40716
HyukjinKwon commented on PR #40525:
URL: https://github.com/apache/spark/pull/40525#issuecomment-1501047901
The problem is that you don't use Spark Connect but it complains that it
needs `grpcio`
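One common way to avoid this kind of failure is to defer an optional import until the feature that needs it is actually used. The sketch below is illustrative only and is not the code from any of the PRs discussed here; `load_optional` and `require` are hypothetical helper names. It relies on the fact that the `grpcio` package on PyPI is imported as the module `grpc`.

```python
import importlib
from typing import Any, Optional


def load_optional(module_name: str) -> Optional[Any]:
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError.
        return None


def require(module_name: str, pip_name: str) -> Any:
    """Import a module on demand and name the pip package in the error."""
    module = load_optional(module_name)
    if module is None:
        raise ImportError(
            f"{pip_name} must be installed to use this feature; "
            f"try `pip install {pip_name}`"
        )
    return module


# Hypothetical usage: only code paths that need Spark Connect would call
# require("grpc", "grpcio"); a plain `import` of the package would not.
```

With this shape, importing the top-level package stays cheap and dependency-free, and the error names the package the user has to install.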
itholic commented on PR #40525:
URL: https://github.com/apache/spark/pull/40525#issuecomment-1501047078
Thank you for the feedback, @bjornjorgensen !
IMHO, it seems more reasonable to add `grpcio` as a dependency for the
Pandas API on Spark instead of reverting all this change back (O
grundprinzip commented on code in PR #40695:
URL: https://github.com/apache/spark/pull/40695#discussion_r1161204815
##
python/pyspark/sql/connect/client.py:
##
@@ -867,6 +878,8 @@ def _analyze(self, method: str, **kwargs: Any) ->
AnalyzeResult:
req.unpersist.bl
ming95 commented on code in PR #40714:
URL: https://github.com/apache/spark/pull/40714#discussion_r1161191880
##
sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala:
##
@@ -125,6 +125,9 @@ private[sql] class SessionState(
plan: LogicalPlan,
AngersZh commented on PR #40701:
URL: https://github.com/apache/spark/pull/40701#issuecomment-1501012548
ping @cloud-fan
github-actions[bot] closed pull request #39021: [SPARK-41483][CORE] Last
metrics system report should have a timeout, avoid to lead shutdown hook timeout
URL: https://github.com/apache/spark/pull/39021
github-actions[bot] closed pull request #39259: [SPARK-41739][SQL] CheckRule
should not be executed when analyze view child
URL: https://github.com/apache/spark/pull/39259
amaliujia commented on code in PR #40693:
URL: https://github.com/apache/spark/pull/40693#discussion_r1161169706
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala:
##
@@ -915,9 +916,9 @@ case class Cast(
case TimestampType =>
buildCa
amaliujia commented on code in PR #40693:
URL: https://github.com/apache/spark/pull/40693#discussion_r1161167873
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala:
##
@@ -915,9 +916,9 @@ case class Cast(
case TimestampType =>
buildCa
HeartSaVioR commented on code in PR #40715:
URL: https://github.com/apache/spark/pull/40715#discussion_r1161163980
##
sql/core/benchmarks/StateStoreBasicOperationsBenchmark-jdk17-results.txt:
##
Review Comment:
I first skimmed through relative and thought it might be regres
HeartSaVioR closed pull request #40705: [SPARK-43067][SS] Correct the location
of error class resource file in Kafka connector
URL: https://github.com/apache/spark/pull/40705
HeartSaVioR commented on PR #40705:
URL: https://github.com/apache/spark/pull/40705#issuecomment-1500982175
I'll just merge this since the fix is super straightforward and CI passed.
bjornjorgensen opened a new pull request, #40716:
URL: https://github.com/apache/spark/pull/40716
### What changes were proposed in this pull request?
Change `gRPC` to `grpcio`.
This ONLY affects the printed message shown to users who haven't installed `gRPC`.
### Why are the changes needed
bjornjorgensen commented on PR #40525:
URL: https://github.com/apache/spark/pull/40525#issuecomment-1500961292
@itholic Thank you, great work :)
After this PR
`from pyspark import pandas as ps `
ModuleNotFoundError Traceback (most recent call last)
LuciferYang commented on PR #40715:
URL: https://github.com/apache/spark/pull/40715#issuecomment-1500901102
cc @dongjoon-hyun @HeartSaVioR FYI
LuciferYang opened a new pull request, #40715:
URL: https://github.com/apache/spark/pull/40715
### What changes were proposed in this pull request?
This PR regenerates the benchmark results of `StateStoreBasicOperationsBenchmark`.
### Why are the changes needed?
https://github.com
ming95 opened a new pull request, #40714:
URL: https://github.com/apache/spark/pull/40714
### What changes were proposed in this pull request?
Add a variant of `SessionState#executePlan` without the constant parameters.
### Why are the changes needed?
Before
HeartSaVioR closed pull request #40561: [SPARK-42931][SS] Introduce
dropDuplicatesWithinWatermark
URL: https://github.com/apache/spark/pull/40561
HeartSaVioR commented on PR #40561:
URL: https://github.com/apache/spark/pull/40561#issuecomment-1500894916
Thanks all for reviewing! Merging to master.
HeartSaVioR commented on PR #40561:
URL: https://github.com/apache/spark/pull/40561#issuecomment-1500894854
Confirmed CI passed for last commit.
https://github.com/HeartSaVioR/spark/runs/12606973127
zhouyifan279 commented on PR #40645:
URL: https://github.com/apache/spark/pull/40645#issuecomment-1500887774
> Any updates, @zhouyifan279 ?
@dongjoon-hyun sorry for the late response.
Speaking from my limited experience, I think very few users may encounter
the case you mentioned.
wankunde opened a new pull request, #40713:
URL: https://github.com/apache/spark/pull/40713
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How wa
HeartSaVioR commented on PR #40702:
URL: https://github.com/apache/spark/pull/40702#issuecomment-1500869567
Thanks for reviewing and merging!
hvanhovell commented on code in PR #40693:
URL: https://github.com/apache/spark/pull/40693#discussion_r1161097318
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala:
##
@@ -915,9 +916,9 @@ case class Cast(
case TimestampType =>
buildC
LuciferYang commented on PR #40699:
URL: https://github.com/apache/spark/pull/40699#issuecomment-1500823598
@Yikf Can you re-trigger GA?
zhengruifeng closed pull request #40263: [SPARK-42659][ML] Reimplement
`FPGrowthModel.transform` with dataframe operations
URL: https://github.com/apache/spark/pull/40263
zhengruifeng commented on PR #40695:
URL: https://github.com/apache/spark/pull/40695#issuecomment-1500819117
cc @WeichenXu123 @HyukjinKwon
zhengruifeng opened a new pull request, #40712:
URL: https://github.com/apache/spark/pull/40712
### What changes were proposed in this pull request?
Add constants for un-parameterized proto data types
### Why are the changes needed?
avoid recreating them
### Does this PR i
HyukjinKwon closed pull request #40702: [SPARK-43066][SQL] Add test for
dropDuplicates in JavaDatasetSuite
URL: https://github.com/apache/spark/pull/40702
HyukjinKwon commented on PR #40702:
URL: https://github.com/apache/spark/pull/40702#issuecomment-1500814524
Merged to master.
HyukjinKwon closed pull request #40698: [SPARK-43062][INFRA][PYTHON][TESTS] Add
options to lint-python to run each test separately
URL: https://github.com/apache/spark/pull/40698
HyukjinKwon commented on PR #40698:
URL: https://github.com/apache/spark/pull/40698#issuecomment-1500814404
Merged to master.
HyukjinKwon closed pull request #40525: [SPARK-42859][CONNECT][PS] Basic
support for pandas API on Spark Connect
URL: https://github.com/apache/spark/pull/40525
HyukjinKwon commented on PR #40525:
URL: https://github.com/apache/spark/pull/40525#issuecomment-1500814281
Merged to master.
mridulm commented on code in PR #40707:
URL: https://github.com/apache/spark/pull/40707#discussion_r1161075409
##
sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala:
##
@@ -2795,7 +2795,9 @@ private[sql] object QueryExecutionErrors extends
QueryE
mridulm commented on PR #40663:
URL: https://github.com/apache/spark/pull/40663#issuecomment-1500810809
Thanks for checking @dongjoon-hyun and @LuciferYang !
Great to finally have this issue fixed :-)
mridulm commented on PR #40690:
URL: https://github.com/apache/spark/pull/40690#issuecomment-1500810568
You are right about the null -> Int, should have checked better :-)