pan3793 opened a new pull request, #40920:
URL: https://github.com/apache/spark/pull/40920
### What changes were proposed in this pull request?
Remove unnecessary serialization/deserialization of `Path` when gathering
partition stats in parallel.
### Why are the changes needed?
LuciferYang commented on code in PR #40898:
URL: https://github.com/apache/spark/pull/40898#discussion_r1174911075
##
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/client/CheckConnectJvmClientCompatibility.scala:
##
@@ -145,6 +145,7 @@ object CheckConn
vicennial commented on code in PR #40675:
URL: https://github.com/apache/spark/pull/40675#discussion_r1174917132
##
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/application/ReplE2ESuite.scala:
##
@@ -0,0 +1,128 @@
+/*
+ * Licensed to the Apache Software Found
CavemanIV opened a new pull request, #40921:
URL: https://github.com/apache/spark/pull/40921
### What changes were proposed in this pull request?
A minor bugfix in `ShuffleBlockFetcherIterator.diagnose`, which does not
handle the `ShuffleBlockBatchId` type properly.
### Why are the changes
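The kind of fix described above can be sketched in a few lines. This is a minimal illustration, not Spark's actual implementation: the case classes below are simplified stand-ins for the block IDs in `org.apache.spark.storage`, and `diagnoseKey` is a hypothetical helper showing why the batch variant needs its own match arm.

```scala
// Simplified stand-ins for Spark's shuffle block IDs (not the real classes).
sealed trait BlockId
case class ShuffleBlockId(shuffleId: Int, mapId: Long, reduceId: Int) extends BlockId
case class ShuffleBlockBatchId(
    shuffleId: Int, mapId: Long, startReduceId: Int, endReduceId: Int) extends BlockId

// Hypothetical diagnosis helper: extract the (shuffleId, mapId) pair needed
// to look up the corrupted map output.
def diagnoseKey(blockId: BlockId): Option[(Int, Long)] = blockId match {
  case ShuffleBlockId(shuffleId, mapId, _) => Some((shuffleId, mapId))
  // Without this case, batch-fetched blocks fall through unmatched and the
  // corruption cannot be diagnosed.
  case ShuffleBlockBatchId(shuffleId, mapId, _, _) => Some((shuffleId, mapId))
  case _ => None
}
```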
zhengruifeng closed pull request #40862: [SPARK-43169][INFRA][FOLLOWUP] Add
more memory for mima check
URL: https://github.com/apache/spark/pull/40862
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
zhengruifeng commented on PR #40862:
URL: https://github.com/apache/spark/pull/40862#issuecomment-1519595951
mima test OOM again
https://github.com/apache/spark/actions/runs/4783257117/jobs/8503361722
zhengruifeng commented on PR #40862:
URL: https://github.com/apache/spark/pull/40862#issuecomment-1519597372
merged to master
LuciferYang commented on PR #40862:
URL: https://github.com/apache/spark/pull/40862#issuecomment-1519601176
Let's merge this one to avoid oom :)
LuciferYang commented on PR #40862:
URL: https://github.com/apache/spark/pull/40862#issuecomment-1519602330
Thanks @zhengruifeng @HyukjinKwon @pan3793 @Hisoka-X
kori73 commented on PR #40810:
URL: https://github.com/apache/spark/pull/40810#issuecomment-1519612722
> @kori73 Could you update the example (output) according to the recent
commit, please.
Updated the example according to the recent commit.
cloud-fan opened a new pull request, #40922:
URL: https://github.com/apache/spark/pull/40922
### What changes were proposed in this pull request?
This is a followup of https://github.com/apache/spark/pull/40699 to avoid
changing the Cast behavior. It pulls out the cast-to-stri
cloud-fan commented on PR #40922:
URL: https://github.com/apache/spark/pull/40922#issuecomment-1519616957
cc @Yikf @sadikovi @gengliangwang
MaxGekk commented on PR #40810:
URL: https://github.com/apache/spark/pull/40810#issuecomment-1519620141
+1, LGTM. Merging to master.
Thank you, @kori73.
MaxGekk closed pull request #40810: [SPARK-42317][SQL] Assign name to
_LEGACY_ERROR_TEMP_2247: CANNOT_MERGE_SCHEMAS
URL: https://github.com/apache/spark/pull/40810
MaxGekk commented on PR #40810:
URL: https://github.com/apache/spark/pull/40810#issuecomment-1519623179
@kori73 Congratulations on your first contribution to Apache Spark!
zhengruifeng closed pull request #40899: [SPARK-43249][CONNECT] Fix missing
stats for SQL Command
URL: https://github.com/apache/spark/pull/40899
zhengruifeng commented on PR #40899:
URL: https://github.com/apache/spark/pull/40899#issuecomment-1519628355
merged to master and branch-3.4, thanks
lyy-pineapple commented on code in PR #38171:
URL: https://github.com/apache/spark/pull/38171#discussion_r1174987899
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressionsJoni.scala:
##
@@ -0,0 +1,481 @@
+/*
+ * Licensed to the Apache Software
lyy-pineapple commented on code in PR #38171:
URL: https://github.com/apache/spark/pull/38171#discussion_r1174997469
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressionsJoni.scala:
##
@@ -0,0 +1,481 @@
+/*
+ * Licensed to the Apache Software
bjornjorgensen commented on PR #40658:
URL: https://github.com/apache/spark/pull/40658#issuecomment-1519710374
https://pandas.pydata.org/pandas-docs/version/2.0.1/whatsnew/v2.0.1.html
lyy-pineapple commented on code in PR #38171:
URL: https://github.com/apache/spark/pull/38171#discussion_r1175016808
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressionsJoni.scala:
##
@@ -0,0 +1,481 @@
+/*
+ * Licensed to the Apache Software
bogao007 opened a new pull request, #40923:
URL: https://github.com/apache/spark/pull/40923
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How wa
cloud-fan commented on code in PR #40885:
URL: https://github.com/apache/spark/pull/40885#discussion_r1175031042
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormat.scala:
##
@@ -203,6 +203,21 @@ trait FileFormat {
* method. Technically, a file f
cloud-fan commented on code in PR #40885:
URL: https://github.com/apache/spark/pull/40885#discussion_r1175034679
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormat.scala:
##
@@ -203,6 +203,21 @@ trait FileFormat {
* method. Technically, a file f
bogao007 commented on code in PR #40923:
URL: https://github.com/apache/spark/pull/40923#discussion_r1175032518
##
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Dataset.scala:
##
@@ -1275,6 +1276,24 @@ class Dataset[T] private[sql] (
proto.Aggregate.Gro
bogao007 commented on code in PR #40923:
URL: https://github.com/apache/spark/pull/40923#discussion_r1175036078
##
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala:
##
@@ -616,6 +618,38 @@ class SparkConnectPlanner(val sessio
rshkv commented on code in PR #40902:
URL: https://github.com/apache/spark/pull/40902#discussion_r1175069446
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala:
##
@@ -272,7 +272,7 @@ package object dsl {
def attr: UnresolvedAttribute = analysi
rshkv commented on code in PR #40902:
URL: https://github.com/apache/spark/pull/40902#discussion_r1175068610
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala:
##
@@ -281,89 +281,108 @@ package object dsl {
def attr: UnresolvedAttribute = anal
zhengruifeng closed pull request #40866: [SPARK-43178][CONNECT][PYTHON] Migrate
UDF errors into PySpark error framework
URL: https://github.com/apache/spark/pull/40866
zhengruifeng commented on PR #40866:
URL: https://github.com/apache/spark/pull/40866#issuecomment-1519852696
merged to master
cloud-fan commented on PR #40563:
URL: https://github.com/apache/spark/pull/40563#issuecomment-1519867307
thanks, merging to master!
cloud-fan closed pull request #40563: [SPARK-41233][FOLLOWUP] Refactor
`array_prepend` with `RuntimeReplaceable`
URL: https://github.com/apache/spark/pull/40563
cloud-fan commented on code in PR #40915:
URL: https://github.com/apache/spark/pull/40915#discussion_r1175103099
##
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/ObjectAggregationIterator.scala:
##
@@ -111,25 +111,17 @@ class ObjectAggregationIterator(
}
cloud-fan commented on code in PR #40915:
URL: https://github.com/apache/spark/pull/40915#discussion_r1175104259
##
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/ObjectAggregationIterator.scala:
##
@@ -231,11 +224,15 @@ class SortBasedAggregator(
grouping
cloud-fan commented on code in PR #40915:
URL: https://github.com/apache/spark/pull/40915#discussion_r1175105524
##
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/ObjectAggregationIterator.scala:
##
@@ -252,6 +249,7 @@ class SortBasedAggregator(
var hasN
cloud-fan commented on PR #40875:
URL: https://github.com/apache/spark/pull/40875#issuecomment-1519891829
thanks, merging to master!
cloud-fan closed pull request #40875: [SPARK-43214][SQL] Post driver-side
metrics for LocalTableScanExec/CommandResultExec
URL: https://github.com/apache/spark/pull/40875
itholic commented on PR #40658:
URL: https://github.com/apache/spark/pull/40658#issuecomment-1519900926
Thanks, @bjornjorgensen !
itholic opened a new pull request, #40924:
URL: https://github.com/apache/spark/pull/40924
### What changes were proposed in this pull request?
This PR proposes to migrate the Spark SQL pandas arrow type errors into
error class.
### Why are the changes needed?
Leveraging
LuciferYang opened a new pull request, #40925:
URL: https://github.com/apache/spark/pull/40925
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How
ulysses-you commented on code in PR #40915:
URL: https://github.com/apache/spark/pull/40915#discussion_r1175192930
##
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/ObjectAggregationIterator.scala:
##
@@ -111,25 +111,17 @@ class ObjectAggregationIterator(
LuciferYang commented on code in PR #40898:
URL: https://github.com/apache/spark/pull/40898#discussion_r1175199390
##
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/client/CheckConnectJvmClientCompatibility.scala:
##
@@ -145,6 +145,7 @@ object CheckConn
ulysses-you commented on code in PR #40915:
URL: https://github.com/apache/spark/pull/40915#discussion_r1175202831
##
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/ObjectAggregationIterator.scala:
##
@@ -252,6 +249,7 @@ class SortBasedAggregator(
var ha
itholic opened a new pull request, #40926:
URL: https://github.com/apache/spark/pull/40926
### What changes were proposed in this pull request?
This PR proposes to migrate `TypeError` from Spark SQL types into error
class.
### Why are the changes needed?
To improve P
justaparth commented on code in PR #40686:
URL: https://github.com/apache/spark/pull/40686#discussion_r1175212228
##
connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/ProtobufDeserializer.scala:
##
@@ -288,7 +289,21 @@ private[sql] class ProtobufDeserializer(
itholic opened a new pull request, #40927:
URL: https://github.com/apache/spark/pull/40927
### What changes were proposed in this pull request?
This is follow-up for https://github.com/apache/spark/pull/39991 to remove
unused exception.
### Why are the changes neede
itholic opened a new pull request, #40928:
URL: https://github.com/apache/spark/pull/40928
### What changes were proposed in this pull request?
This PR proposes to migrate built-in `TypeError` and `ValueError` from Spark
Connect Structured Streaming into PySpark error framework.
ryan-johnson-databricks commented on code in PR #40885:
URL: https://github.com/apache/spark/pull/40885#discussion_r1175271997
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormat.scala:
##
@@ -203,6 +203,21 @@ trait FileFormat {
* method. Technic
ryan-johnson-databricks commented on code in PR #40885:
URL: https://github.com/apache/spark/pull/40885#discussion_r1175286323
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormat.scala:
##
@@ -203,6 +203,21 @@ trait FileFormat {
* method. Technic
ryan-johnson-databricks commented on code in PR #40885:
URL: https://github.com/apache/spark/pull/40885#discussion_r1175275233
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormat.scala:
##
@@ -203,6 +203,21 @@ trait FileFormat {
* method. Technic
LuciferYang commented on PR #40847:
URL: https://github.com/apache/spark/pull/40847#issuecomment-1520218583
@xkrogen @sunchao @pan3793 Synchronizing my experimental results:
1. Before building, we need to add the following content to
`resource-managers/yarn/pom.xml`; refer to
https://git
ryan-johnson-databricks commented on code in PR #40885:
URL: https://github.com/apache/spark/pull/40885#discussion_r1175364533
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormat.scala:
##
@@ -203,6 +203,21 @@ trait FileFormat {
* method. Technic
majdyz opened a new pull request, #40929:
URL: https://github.com/apache/spark/pull/40929
### What changes were proposed in this pull request?
This PR adds lazy allocation support for the backing array of ColumnVector
used in Spark VectorizedReader. This is added as a memory o
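The memory optimization described above can be sketched as follows. `LazyIntVector` is an illustrative stand-in, not Spark's actual `OnHeapColumnVector`: the idea is to defer allocating the backing array until the first write, so a vector that is never materialized costs no heap.

```scala
// Minimal sketch of lazy backing-array allocation for a column vector
// (illustrative only; Spark's real vectors handle nulls, resizing, and
// many more types).
final class LazyIntVector(capacity: Int) {
  private var data: Array[Int] = _ // allocated on first write

  private def ensureAllocated(): Unit = {
    if (data == null) data = new Array[Int](capacity)
  }

  def putInt(rowId: Int, value: Int): Unit = {
    ensureAllocated()
    data(rowId) = value
  }

  def getInt(rowId: Int): Int = {
    // A vector that was never written behaves as all-default (0) values.
    if (data == null) 0 else data(rowId)
  }
}
```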
LuciferYang commented on PR #40929:
URL: https://github.com/apache/spark/pull/40929#issuecomment-1520312702
@majdyz Can you enable GA first? Refer to
https://user-images.githubusercontent.com/1475305/234031906-ad7fa49e-209b-4369-888a-e81a1299943d.png
https://github.com/apache/spark/p
cloud-fan commented on code in PR #40885:
URL: https://github.com/apache/spark/pull/40885#discussion_r1175418393
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormat.scala:
##
@@ -203,6 +203,21 @@ trait FileFormat {
* method. Technically, a file f
LuciferYang commented on code in PR #40920:
URL: https://github.com/apache/spark/pull/40920#discussion_r1175420930
##
sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala:
##
@@ -789,22 +789,22 @@ case class RepairTableCommand(
if (partitionSpecsAndLocs.
LuciferYang commented on code in PR #40920:
URL: https://github.com/apache/spark/pull/40920#discussion_r1175422370
##
sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala:
##
@@ -789,22 +789,22 @@ case class RepairTableCommand(
if (partitionSpecsAndLocs.
LuciferYang commented on code in PR #40920:
URL: https://github.com/apache/spark/pull/40920#discussion_r1175425968
##
sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala:
##
@@ -789,22 +789,22 @@ case class RepairTableCommand(
if (partitionSpecsAndLocs.
ryan-johnson-databricks opened a new pull request, #40930:
URL: https://github.com/apache/spark/pull/40930
### What changes were proposed in this pull request?
Experimental PR in response to
https://github.com/apache/spark/pull/40885#discussion_r1174277575, so that
reviewers
LuciferYang commented on PR #40847:
URL: https://github.com/apache/spark/pull/40847#issuecomment-1520363712
More:
1. The conclusion using hadoop 3.0.x and hadoop 3.1.x is the same
2. Using hadoop 3.2.x can't build the `hadoop-cloud` module either
3. Currently, only hadoop 3.3.x can build all
ryan-johnson-databricks commented on code in PR #40885:
URL: https://github.com/apache/spark/pull/40885#discussion_r1175437133
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormat.scala:
##
@@ -203,6 +203,21 @@ trait FileFormat {
* method. Technic
majdyz closed pull request #40929: [SPARK-43264][SQL] Avoid allocation of
unwritten ColumnVector in Spark Vectorized Reader
URL: https://github.com/apache/spark/pull/40929
majdyz commented on PR #40929:
URL: https://github.com/apache/spark/pull/40929#issuecomment-1520371824
@LuciferYang Thanks, I think it's already been enabled now
pan3793 commented on code in PR #40920:
URL: https://github.com/apache/spark/pull/40920#discussion_r1175442719
##
sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala:
##
@@ -789,22 +789,22 @@ case class RepairTableCommand(
if (partitionSpecsAndLocs.leng
pan3793 commented on code in PR #40920:
URL: https://github.com/apache/spark/pull/40920#discussion_r1175443729
##
sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala:
##
@@ -789,22 +789,22 @@ case class RepairTableCommand(
if (partitionSpecsAndLocs.leng
pan3793 commented on code in PR #40920:
URL: https://github.com/apache/spark/pull/40920#discussion_r1175444671
##
sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala:
##
@@ -789,22 +789,22 @@ case class RepairTableCommand(
if (partitionSpecsAndLocs.leng
ryan-johnson-databricks commented on code in PR #40885:
URL: https://github.com/apache/spark/pull/40885#discussion_r1175449282
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormat.scala:
##
@@ -203,6 +203,21 @@ trait FileFormat {
* method. Technic
ryan-johnson-databricks commented on code in PR #40885:
URL: https://github.com/apache/spark/pull/40885#discussion_r1175452917
##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormat.scala:
##
@@ -203,6 +203,21 @@ trait FileFormat {
* method. Technic
amaliujia opened a new pull request, #40931:
URL: https://github.com/apache/spark/pull/40931
### What changes were proposed in this pull request?
Move Error framework to a common utils module so that we can share it
between Spark and Spark Connect without introducing heavy dep
amaliujia commented on PR #40931:
URL: https://github.com/apache/spark/pull/40931#issuecomment-1520403075
@cloud-fan @hvanhovell
cloud-fan commented on PR #40879:
URL: https://github.com/apache/spark/pull/40879#issuecomment-1520408175
thanks, merging to master!
cloud-fan closed pull request #40879: [SPARK-43217] Correctly recurse in nested
maps/arrays in findNestedField
URL: https://github.com/apache/spark/pull/40879
peter-toth opened a new pull request, #40932:
URL: https://github.com/apache/spark/pull/40932
### What changes were proposed in this pull request?
This PR moves `MergeScalarSubqueries` from `catalyst` to `spark-sql`
### Why are the changes needed?
Make SPARK-40193 / https://gith
peter-toth commented on PR #37630:
URL: https://github.com/apache/spark/pull/37630#issuecomment-1520435320
I extracted the first commit of this PR, which just moves
`MergeScalarSubqueries` from `spark-catalyst` to `spark-sql` to
https://github.com/apache/spark/pull/40932 to make the actual c
hvanhovell commented on code in PR #40931:
URL: https://github.com/apache/spark/pull/40931#discussion_r1175496791
##
common/utils/src/main/scala/org/apache/spark/SparkThrowableHelper.scala:
##
@@ -34,7 +33,7 @@ private[spark] object ErrorMessageFormat extends Enumeration {
*/
hvanhovell commented on code in PR #40931:
URL: https://github.com/apache/spark/pull/40931#discussion_r1175497831
##
common/utils/src/main/scala/org/apache/spark/ErrorClassesJSONReader.scala:
##
@@ -30,6 +29,7 @@ import org.apache.commons.text.StringSubstitutor
import org.apa
pan3793 commented on PR #40920:
URL: https://github.com/apache/spark/pull/40920#issuecomment-1520446235
cc @sunchao
amaliujia commented on code in PR #40931:
URL: https://github.com/apache/spark/pull/40931#discussion_r1175520012
##
common/utils/src/main/scala/org/apache/spark/SparkThrowableHelper.scala:
##
@@ -34,7 +33,7 @@ private[spark] object ErrorMessageFormat extends Enumeration {
*/
RyanBerti commented on code in PR #40615:
URL: https://github.com/apache/spark/pull/40615#discussion_r1175541542
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/datasketchesAggregates.scala:
##
@@ -0,0 +1,336 @@
+/*
+ * Licensed to the Apache S
aokolnychyi commented on PR #40919:
URL: https://github.com/apache/spark/pull/40919#issuecomment-1520498405
@cloud-fan @sunchao @viirya @huaxingao @dongjoon-hyun @gengliangwang, this
is a follow-up to PR #40308.
RyanBerti commented on code in PR #40615:
URL: https://github.com/apache/spark/pull/40615#discussion_r1175544253
##
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/aggregate/DatasketchesHllSketchSuite.scala:
##
@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apac
RyanBerti commented on PR #40615:
URL: https://github.com/apache/spark/pull/40615#issuecomment-1520502797
> about adding a third boolean argument, with the default value being false
dzhigimont commented on code in PR #40370:
URL: https://github.com/apache/spark/pull/40370#discussion_r1175551612
##
python/pyspark/pandas/frame.py:
##
@@ -3519,16 +3516,8 @@ def between_time(
Initial time as a time filter limit.
end_time : datetime.time or
dzhigimont commented on code in PR #40370:
URL: https://github.com/apache/spark/pull/40370#discussion_r1175552369
##
python/pyspark/pandas/frame.py:
##
@@ -3582,14 +3571,18 @@ def between_time(
if not isinstance(self.index, ps.DatetimeIndex):
raise TypeErro
DerekTBrown commented on PR #40798:
URL: https://github.com/apache/spark/pull/40798#issuecomment-1520512047
Looks good. Closing in favor of #40831
DerekTBrown closed pull request #40798: SPARK-43166: name docker users
URL: https://github.com/apache/spark/pull/40798
dzhigimont commented on code in PR #40665:
URL: https://github.com/apache/spark/pull/40665#discussion_r1175559912
##
python/pyspark/pandas/namespace.py:
##
@@ -1782,12 +1780,8 @@ def date_range(
Normalize start/end dates to midnight before generating date range.
na
sunchao commented on PR #39950:
URL: https://github.com/apache/spark/pull/39950#issuecomment-1520563772
Yeah, @yabola is correct: if we have 100 row groups in a file and there are
100 tasks to read them, each task will only be assigned a range (e.g., a single
row group) in the file to read, s
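The per-task assignment sunchao describes can be sketched roughly as follows, assuming each task owns a byte range `[start, start + length)` of the file and a row group belongs to the task whose range contains the group's midpoint (the heuristic Parquet split planning uses). `RowGroup` and `groupsForTask` are illustrative names, not Spark or Parquet API.

```scala
// A row group's position in the file: starting byte offset and total size.
case class RowGroup(startOffset: Long, totalByteSize: Long)

// A row group is assigned to the task whose byte range contains the
// group's midpoint, so each group is read by exactly one task.
def groupsForTask(groups: Seq[RowGroup], start: Long, length: Long): Seq[RowGroup] =
  groups.filter { g =>
    val mid = g.startOffset + g.totalByteSize / 2
    mid >= start && mid < start + length
  }
```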
sunchao commented on PR #40893:
URL: https://github.com/apache/spark/pull/40893#issuecomment-1520573329
@pan3793 AFAIK the development efforts in Hive community are only in Hive
3.x/4.x at the moment, and the 2.x branch is barely maintained. I can try to
start a conversation in the Hive com
amaliujia commented on PR #40899:
URL: https://github.com/apache/spark/pull/40899#issuecomment-1520596794
Thanks for adding the JIRA!