cloud-fan commented on PR #38404:
URL: https://github.com/apache/spark/pull/38404#issuecomment-1299658043
Seems there is a test failure:
```
SQLQueryTestSuite.interval.sql
org.scalatest.exceptions.TestFailedException: interval.sql
Expected "org.apache.spark.[SparkArithmeticExceptio
```
lyy-pineapple commented on PR #38171:
URL: https://github.com/apache/spark/pull/38171#issuecomment-1299657792
> How much confidence do we have in joni? Is it widely adopted by other
open-source projects? I'm a bit concerned about moving away from JDK regex and
picking a project that I just
cloud-fan commented on code in PR #38475:
URL: https://github.com/apache/spark/pull/38475#discussion_r1011251487
##
connector/connect/src/main/protobuf/spark/connect/relations.proto:
##
@@ -250,3 +251,15 @@ message SubqueryAlias {
// Optional. Qualifier of the alias.
repea
EnricoMi commented on code in PR #38223:
URL: https://github.com/apache/spark/pull/38223#discussion_r1011250306
##
python/pyspark/worker.py:
##
@@ -146,7 +146,74 @@ def verify_result_type(result):
)
-def wrap_cogrouped_map_pandas_udf(f, return_type, argspec):
+def verif
jerrypeng commented on PR #38430:
URL: https://github.com/apache/spark/pull/38430#issuecomment-1299643015
@HeartSaVioR @LuciferYang thank you for the review. I have addressed your
comments. PTAL. Thanks in advance!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
jerrypeng commented on code in PR #38430:
URL: https://github.com/apache/spark/pull/38430#discussion_r1011241156
##
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala:
##
@@ -19,7 +19,9 @@ package org.apache.spark.sql.execution.streaming
im
mridulm commented on PR #36165:
URL: https://github.com/apache/spark/pull/36165#issuecomment-1299640509
+CC @zhouyejoe
jerrypeng commented on code in PR #38430:
URL: https://github.com/apache/spark/pull/38430#discussion_r1011239497
##
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala:
##
@@ -64,6 +67,17 @@ class HDFSMetadataLog[T <: AnyRef : ClassTag](sparkSe
mridulm commented on PR #38064:
URL: https://github.com/apache/spark/pull/38064#issuecomment-1299638357
Can you pls take a look at the build failure @liuzqt ?
zhengruifeng commented on code in PR #38475:
URL: https://github.com/apache/spark/pull/38475#discussion_r1011238884
##
connector/connect/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala:
##
@@ -123,6 +125,24 @@ class SparkConnectPlanner(plan: proto.R
jerrypeng commented on code in PR #38430:
URL: https://github.com/apache/spark/pull/38430#discussion_r1011238096
##
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala:
##
@@ -277,10 +295,34 @@ class HDFSMetadataLog[T <: AnyRef :
ClassTag](spa
HeartSaVioR commented on code in PR #38430:
URL: https://github.com/apache/spark/pull/38430#discussion_r1011237000
##
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala:
##
@@ -64,6 +67,17 @@ class HDFSMetadataLog[T <: AnyRef : ClassTag](spark
zhengruifeng commented on PR #38475:
URL: https://github.com/apache/spark/pull/38475#issuecomment-1299635038
if we try to implement it in the client side, another problem is that it's
likely to reuse and depend on some functionality in `pyspark/sql`
LuciferYang commented on PR #38476:
URL: https://github.com/apache/spark/pull/38476#issuecomment-1299634288
> Sorry for the late reply. I want to know why GA doesn't have this issue?
master CI always seems healthy, how can we reproduce this? Let me investigate
this.
Run `dev/sbt-chec
mridulm commented on PR #38467:
URL: https://github.com/apache/spark/pull/38467#issuecomment-1299634164
If we are making this change, there are a bunch of other places which are
candidates for `needCreate = false` - can we include those as well ?
HyukjinKwon commented on PR #38470:
URL: https://github.com/apache/spark/pull/38470#issuecomment-1299633920
Maybe putting it at the top module level (`connector/connect/README.md`) for
now could be a good idea (?). Just wanted to avoid a different structure
compared to other components (`co
mridulm commented on PR #38333:
URL: https://github.com/apache/spark/pull/38333#issuecomment-1299631344
For cases like this, it might actually be better to add the node to deny
list and fail the task to recompute the parent stage ?
mridulm commented on PR #38428:
URL: https://github.com/apache/spark/pull/38428#issuecomment-1299630091
The PR as such looks reasonable to me - can we add a test to explicitly test
for EOF behavior ?
+CC @JoshRosen who had worked on this in the distant past :-)
+CC @Ngone51
grundprinzip commented on PR #38470:
URL: https://github.com/apache/spark/pull/38470#issuecomment-1299629632
What about we link to it from the top level Readme in the component?
The reason it's not in the code is that it's client-language agnostic.
HeartSaVioR commented on code in PR #38430:
URL: https://github.com/apache/spark/pull/38430#discussion_r1011232238
##
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala:
##
@@ -19,7 +19,9 @@ package org.apache.spark.sql.execution.streaming
zhengruifeng commented on PR #38471:
URL: https://github.com/apache/spark/pull/38471#issuecomment-1299626924
merged to master
zhengruifeng closed pull request #38471: [SPARK-40883][CONNECT][FOLLOW-UP]
Range.step is required and Python client should have a default value=1
URL: https://github.com/apache/spark/pull/38471
mridulm commented on code in PR #38428:
URL: https://github.com/apache/spark/pull/38428#discussion_r1011230912
##
core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala:
##
@@ -504,44 +505,31 @@ class ExternalAppendOnlyMap[K, V, C](
* If no more p
HyukjinKwon commented on PR #38470:
URL: https://github.com/apache/spark/pull/38470#issuecomment-1299624291
For developer documentation, it might be better placed under the sources as a
comment, e.g. `packages.scala`:
https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org
mridulm commented on code in PR #38428:
URL: https://github.com/apache/spark/pull/38428#discussion_r1011229842
##
core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala:
##
@@ -301,15 +300,18 @@ class KryoDeserializationStream(
private[this] var kryo: Kryo = s
dongjoon-hyun commented on PR #38474:
URL: https://github.com/apache/spark/pull/38474#issuecomment-1299623034
Thank you so much, @HyukjinKwon !
mridulm commented on code in PR #38428:
URL: https://github.com/apache/spark/pull/38428#discussion_r1011229019
##
core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala:
##
@@ -324,6 +326,36 @@ class KryoDeserializationStream(
}
}
}
+
+ final overri
HyukjinKwon closed pull request #38474: [SPARK-40991][PYTHON] Update
`cloudpickle` to v2.2.0
URL: https://github.com/apache/spark/pull/38474
HyukjinKwon commented on PR #38474:
URL: https://github.com/apache/spark/pull/38474#issuecomment-1299619944
Merged to master.
mridulm commented on PR #38371:
URL: https://github.com/apache/spark/pull/38371#issuecomment-1299617022
Merged to master, thanks for fixing this @JiexingLi !
Thanks for looking into this @HyukjinKwon :-)
asfgit closed pull request #38371: [SPARK-40968] Fix a few wrong/misleading
comments in DAGSchedulerSuite
URL: https://github.com/apache/spark/pull/38371
panbingkun commented on PR #38463:
URL: https://github.com/apache/spark/pull/38463#issuecomment-1299615584
cc @MaxGekk
mridulm commented on PR #38377:
URL: https://github.com/apache/spark/pull/38377#issuecomment-1299613189
Makes sense ... why not simply `val dfsLogFile = new Path(rootDir, appId +
DRIVER_LOG_FILE_SUFFIX)` instead btw ?
I am trying to see if I am missing anything here ...
grundprinzip commented on PR #38470:
URL: https://github.com/apache/spark/pull/38470#issuecomment-1299609859
@HyukjinKwon I will add a Jira; this is just the starting point to align on
where we want to go.
My idea would be that once this is merged I will create a pr for the python
clien
cloud-fan commented on PR #38171:
URL: https://github.com/apache/spark/pull/38171#issuecomment-1299607243
How much confidence do we have in joni? Is it widely adopted by other
open-source projects? I'm a bit concerned about moving away from JDK regex and
picking a project that I just heard
LuciferYang commented on PR #38476:
URL: https://github.com/apache/spark/pull/38476#issuecomment-1299589185
Sorry for the late reply. I want to know why GA doesn't have this issue?
master CI always seems healthy, how can we reproduce this? Let me investigate
this.
MaxGekk closed pull request #38478: [MINOR][SQL] Wrap `given` in backticks to
fix compilation warning
URL: https://github.com/apache/spark/pull/38478
MaxGekk commented on PR #38478:
URL: https://github.com/apache/spark/pull/38478#issuecomment-1299585051
+1, LGTM. Merging to master.
Thank you, @LuciferYang.
MaxGekk closed pull request #38438: [SPARK-40748][SQL] Migrate type check
failures of conditions onto error classes
URL: https://github.com/apache/spark/pull/38438
MaxGekk commented on PR #38438:
URL: https://github.com/apache/spark/pull/38438#issuecomment-1299581375
+1, LGTM. Merging to master.
Thank you, @panbingkun.
HeartSaVioR commented on PR #38404:
URL: https://github.com/apache/spark/pull/38404#issuecomment-1299562145
(Just to remind, please update PR title and description as this PR is no
longer a draft.)
amaliujia commented on PR #38477:
URL: https://github.com/apache/spark/pull/38477#issuecomment-1299548329
cc @HyukjinKwon @grundprinzip
dongjoon-hyun commented on PR #38476:
URL: https://github.com/apache/spark/pull/38476#issuecomment-1299540936
Oh, thank you for reverting, @linhongliu-db and @HyukjinKwon .
WeichenXu123 commented on code in PR #37734:
URL: https://github.com/apache/spark/pull/37734#discussion_r109299
##
python/pyspark/ml/functions.py:
##
@@ -106,6 +117,474 @@ def array_to_vector(col: Column) -> Column:
return
Column(sc._jvm.org.apache.spark.ml.functions.a
WeichenXu123 commented on code in PR #37734:
URL: https://github.com/apache/spark/pull/37734#discussion_r108516
##
python/pyspark/ml/functions.py:
##
@@ -106,6 +117,474 @@ def array_to_vector(col: Column) -> Column:
return
Column(sc._jvm.org.apache.spark.ml.functions.a
dongjoon-hyun commented on PR #38474:
URL: https://github.com/apache/spark/pull/38474#issuecomment-1299514591
Thank you for review, @HyukjinKwon and @itholic .
lyy-pineapple commented on PR #38171:
URL: https://github.com/apache/spark/pull/38171#issuecomment-1299505071
Added a new benchmark comparing Java 11 and Java 17. cc @cloud-fan
@LuciferYang
LuciferYang opened a new pull request, #38478:
URL: https://github.com/apache/spark/pull/38478
### What changes were proposed in this pull request?
A minor change to fix a Scala-related compilation warning:
```
[WARNING]
/spark-source/sql/catalyst/src/main/scala/org/apache/sp
```
amaliujia opened a new pull request, #38477:
URL: https://github.com/apache/spark/pull/38477
### What changes were proposed in this pull request?
This PR consolidates the development facing documentation of Spark Connect
Python client into existing PySpark development doc (mor
LuciferYang commented on code in PR #38465:
URL: https://github.com/apache/spark/pull/38465#discussion_r1011088808
##
core/benchmarks/MapStatusesConvertBenchmark-jdk11-results.txt:
##
@@ -2,12 +2,12 @@
MapStatuses Convert Benchmark
beliefer commented on code in PR #38461:
URL: https://github.com/apache/spark/pull/38461#discussion_r1011086001
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/MergeScalarSubqueries.scala:
##
@@ -346,25 +346,19 @@ object MergeScalarSubqueries extends Rule[
ulysses-you commented on PR #36698:
URL: https://github.com/apache/spark/pull/36698#issuecomment-1299431744
@gengliangwang it is a bug fix and also an improvement that saves an
unnecessary cast. The query produces an unexpected precision and scale.
Before: `decimal(28,2)`, after: `decim
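The precision/scale discussion above follows the common SQL decimal-widening rule for UNION: the result keeps the larger scale and the larger count of integral digits, with precision capped at the system maximum (38 in Spark). This is a sketch of that rule, not Spark's exact implementation:

```python
def widen_decimal(p1, s1, p2, s2, max_precision=38):
    """Sketch of the common decimal type-widening rule for a UNION of
    DECIMAL(p1,s1) and DECIMAL(p2,s2): keep the larger scale and the
    larger integral-digit count, capping precision at max_precision."""
    scale = max(s1, s2)
    integral = max(p1 - s1, p2 - s2)
    precision = min(integral + scale, max_precision)
    return precision, scale

# A union of decimal(28,2) with decimal(28,2) stays decimal(28,2).
print(widen_decimal(28, 2, 28, 2))  # -> (28, 2)
```

Under this rule a mismatched pair such as `decimal(28,2)` and `decimal(10,4)` widens to `decimal(30,4)` (26 integral digits plus the larger scale of 4).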
itholic commented on PR #38474:
URL: https://github.com/apache/spark/pull/38474#issuecomment-1299422328
+1 for upgrading the `cloudpickle` version
itholic commented on code in PR #38465:
URL: https://github.com/apache/spark/pull/38465#discussion_r1011048696
##
core/benchmarks/MapStatusesConvertBenchmark-jdk11-results.txt:
##
@@ -2,12 +2,12 @@
MapStatuses Convert Benchmark
HyukjinKwon commented on PR #38470:
URL: https://github.com/apache/spark/pull/38470#issuecomment-1299418917
Maybe it's better to have a JIRA. BTW, I wonder if we have an e2e example that
users can copy and paste to try (e.g., like most of the docs in
https://spark.apache.org/docs/latest/index.htm
HyukjinKwon commented on code in PR #38470:
URL: https://github.com/apache/spark/pull/38470#discussion_r1011045449
##
connector/connect/doc/client_connection_string.md:
##
@@ -0,0 +1,110 @@
+# Connecting to Spark Connect using Clients
Review Comment:
The usage documentation
HyukjinKwon closed pull request #38473: [SPARK-40990][PYTHON] DataFrame
creation from 2d NumPy array with arbitrary columns
URL: https://github.com/apache/spark/pull/38473
HyukjinKwon commented on PR #38473:
URL: https://github.com/apache/spark/pull/38473#issuecomment-1299413739
Merged to master.
HyukjinKwon closed pull request #38476: Revert "[SPARK-40976][BUILD] Upgrade
sbt to 1.7.3"
URL: https://github.com/apache/spark/pull/38476
HyukjinKwon commented on PR #38476:
URL: https://github.com/apache/spark/pull/38476#issuecomment-1299411174
Merged to master, since this is a clean revert.
HyukjinKwon closed pull request #38409: [SPARK-40930][CONNECT] Support
Collect() in Python client
URL: https://github.com/apache/spark/pull/38409
HyukjinKwon commented on PR #38409:
URL: https://github.com/apache/spark/pull/38409#issuecomment-1299410618
Merged to master.
linhongliu-db commented on PR #38476:
URL: https://github.com/apache/spark/pull/38476#issuecomment-1299401798
BTW, I really couldn't understand how this is problematic:
https://github.com/sbt/sbt/compare/v1.7.2...v1.7.3
linhongliu-db commented on PR #38476:
URL: https://github.com/apache/spark/pull/38476#issuecomment-1299401226
cc @LuciferYang, maybe you'll have a fix so we won't need to revert it.
linhongliu-db opened a new pull request, #38476:
URL: https://github.com/apache/spark/pull/38476
### What changes were proposed in this pull request?
This reverts commit 9fc3aa0b1c092ab1f13b26582e3ece7440fbfc3b.
### Why are the changes needed?
The upgrade breaks `
github-actions[bot] commented on PR #37259:
URL: https://github.com/apache/spark/pull/37259#issuecomment-1299388827
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.
AmplabJenkins commented on PR #38452:
URL: https://github.com/apache/spark/pull/38452#issuecomment-1299377919
Can one of the admins verify this patch?
AmplabJenkins commented on PR #38453:
URL: https://github.com/apache/spark/pull/38453#issuecomment-1299377890
Can one of the admins verify this patch?
amaliujia commented on PR #38475:
URL: https://github.com/apache/spark/pull/38475#issuecomment-1299371686
@cloud-fan
This is a good example that one API can be implemented with or without a
plan.
Basically if we don't add a new plan to the proto, clients can still
implement `
amaliujia opened a new pull request, #38475:
URL: https://github.com/apache/spark/pull/38475
### What changes were proposed in this pull request?
Add `RenameColumns` to proto to support the implementation for
`toDF(columnNames: String*)` which renames the input relation to a d
leewyang commented on code in PR #37734:
URL: https://github.com/apache/spark/pull/37734#discussion_r1010991074
##
python/pyspark/ml/functions.py:
##
@@ -106,6 +117,474 @@ def array_to_vector(col: Column) -> Column:
return
Column(sc._jvm.org.apache.spark.ml.functions.array
srowen commented on PR #38469:
URL: https://github.com/apache/spark/pull/38469#issuecomment-1299341398
Merged to master/3.3/3.2
srowen closed pull request #38469: [MINOR][BUILD] Correct the `files` content
in `checkstyle-suppressions.xml`
URL: https://github.com/apache/spark/pull/38469
dongjoon-hyun opened a new pull request, #38474:
URL: https://github.com/apache/spark/pull/38474
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### H
xinrong-meng opened a new pull request, #38473:
URL: https://github.com/apache/spark/pull/38473
### What changes were proposed in this pull request?
Support DataFrame creation from 2d NumPy array with arbitrary columns.
### Why are the changes needed?
Currently, DataFrame creatio
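Conceptually, creating a DataFrame from a 2d array with arbitrary columns pairs each row of the array with the given column names. A minimal stdlib sketch of that mapping (the helper `rows_from_2d` is hypothetical, not the PySpark API):

```python
def rows_from_2d(array_2d, columns):
    """Pair each row of a 2d sequence with the given column names,
    mimicking what DataFrame creation from a 2d array does conceptually."""
    if any(len(row) != len(columns) for row in array_2d):
        raise ValueError("column count must match row width")
    return [dict(zip(columns, row)) for row in array_2d]

# Two rows, two named columns.
print(rows_from_2d([[1, 2], [3, 4]], ["a", "b"]))
# -> [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}]
```

In PySpark itself the equivalent call would be along the lines of `spark.createDataFrame(ndarray, ["a", "b"])`, which is the capability this PR adds for 2d NumPy input.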
AmplabJenkins commented on PR #38462:
URL: https://github.com/apache/spark/pull/38462#issuecomment-1299294257
Can one of the admins verify this patch?
AmplabJenkins commented on PR #38463:
URL: https://github.com/apache/spark/pull/38463#issuecomment-1299294202
Can one of the admins verify this patch?
amaliujia commented on code in PR #38409:
URL: https://github.com/apache/spark/pull/38409#discussion_r1010931805
##
python/pyspark/sql/connect/dataframe.py:
##
@@ -305,8 +308,12 @@ def _print_plan(self) -> str:
return self._plan.print()
return ""
-def
dtenedor commented on code in PR #38418:
URL: https://github.com/apache/spark/pull/38418#discussion_r1010897010
##
sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4:
##
@@ -1001,7 +1001,13 @@ createOrReplaceTableColTypeList
;
createOrRepl
amaliujia commented on code in PR #38418:
URL: https://github.com/apache/spark/pull/38418#discussion_r1010890792
##
sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4:
##
@@ -1001,7 +1001,13 @@ createOrReplaceTableColTypeList
;
createOrRep
kristopherkane commented on PR #38358:
URL: https://github.com/apache/spark/pull/38358#issuecomment-1299119179
Thanks for the fix! Is it possible this could land in 3.1 as well?
grundprinzip commented on PR #38470:
URL: https://github.com/apache/spark/pull/38470#issuecomment-1299084859
Good point, I will incorporate that into the doc.
anchovYu commented on PR #38169:
URL: https://github.com/apache/spark/pull/38169#issuecomment-1299073908
the title needs to be updated from 2220 to 2200 :)
amaliujia commented on PR #38470:
URL: https://github.com/apache/spark/pull/38470#issuecomment-1299054036
Overall LGTM
Is the `user_id` (or the user session token) relevant to this doc?
https://github.com/apache/spark/blob/8f6b18536e44ffd36656ceb56a434e399ad6d1b8/python/pyspark/sql/
amaliujia commented on PR #38472:
URL: https://github.com/apache/spark/pull/38472#issuecomment-1299035641
R: @zhengruifeng
amaliujia opened a new pull request, #38472:
URL: https://github.com/apache/spark/pull/38472
### What changes were proposed in this pull request?
This PR tests `session.sql` in Python client both in `toProto` path and the
data collection path.
### Why are the change
gengliangwang commented on PR #36698:
URL: https://github.com/apache/spark/pull/36698#issuecomment-1299022752
@ulysses-you Is the following query an actual bug before the refactor? Or
did the refactor just remove the redundant cast?
```
SELECT CAST(1 AS DECIMAL(28, 2))
UNION ALL
```
amaliujia opened a new pull request, #38471:
URL: https://github.com/apache/spark/pull/38471
### What changes were proposed in this pull request?
To match the existing Python DataFrame API, this PR changes `Range.step` to
required, and the Python client keeps `1` as a default value
amaliujia commented on PR #38471:
URL: https://github.com/apache/spark/pull/38471#issuecomment-1299015533
R: @zhengruifeng
I sent out this PR based on your suggestion.
carlfu-db commented on PR #38404:
URL: https://github.com/apache/spark/pull/38404#issuecomment-1298960790
https://user-images.githubusercontent.com/114777395/199313517-3122d622-ba62-4ac5-8fbf-d01b4e59c394.png
I have rebased the PR onto the latest apache/master, not sure how to trigg
SandishKumarHN commented on code in PR #38344:
URL: https://github.com/apache/spark/pull/38344#discussion_r1010721467
##
connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/utils/ProtobufUtils.scala:
##
@@ -178,46 +176,73 @@ private[sql] object ProtobufUtils extends
jerrypeng commented on code in PR #38430:
URL: https://github.com/apache/spark/pull/38430#discussion_r1010692304
##
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala:
##
@@ -277,10 +295,34 @@ class HDFSMetadataLog[T <: AnyRef :
ClassTag](spa
leewyang commented on code in PR #37734:
URL: https://github.com/apache/spark/pull/37734#discussion_r1010663824
##
python/pyspark/ml/functions.py:
##
@@ -106,6 +117,474 @@ def array_to_vector(col: Column) -> Column:
return
Column(sc._jvm.org.apache.spark.ml.functions.array
leewyang commented on code in PR #37734:
URL: https://github.com/apache/spark/pull/37734#discussion_r1010663824
##
python/pyspark/ml/functions.py:
##
@@ -106,6 +117,474 @@ def array_to_vector(col: Column) -> Column:
return
Column(sc._jvm.org.apache.spark.ml.functions.array
MaxGekk commented on code in PR #38438:
URL: https://github.com/apache/spark/pull/38438#discussion_r1010683264
##
sql/core/src/test/java/test/org/apache/spark/sql/JavaColumnExpressionSuite.java:
##
@@ -79,12 +83,16 @@ public void isInCollectionCheckExceptionMessage() {
cr
leewyang commented on code in PR #37734:
URL: https://github.com/apache/spark/pull/37734#discussion_r1010681773
##
python/pyspark/ml/functions.py:
##
@@ -106,6 +117,474 @@ def array_to_vector(col: Column) -> Column:
return
Column(sc._jvm.org.apache.spark.ml.functions.array