yaooqinn opened a new pull request, #40602:
URL: https://github.com/apache/spark/pull/40602
### What changes were proposed in this pull request?
Fix `rename a table` in Derby and PostgreSQL, where the schema name is not allowed to
qualify the new table name.
### Why are
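For context, both engines accept a schema-qualified old name but reject a qualified new name, so the generated SQL has to keep the new name bare. A minimal sketch of the dialect-specific statements (an illustrative helper, not Spark's actual `JdbcDialect` code):
```scala
// Hypothetical helper assuming the behavior described above; the new table
// name must stay unqualified on both engines.
def renameTableSql(dialect: String, schema: String, oldName: String, newName: String): String =
  dialect match {
    // PostgreSQL: the name after RENAME TO must be unqualified.
    case "postgresql" => s"""ALTER TABLE "$schema"."$oldName" RENAME TO "$newName""""
    // Derby: RENAME TABLE also requires a bare new name.
    case "derby" => s"""RENAME TABLE "$schema"."$oldName" TO "$newName""""
    case other => throw new IllegalArgumentException(s"unsupported dialect: $other")
  }
```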
yaooqinn commented on code in PR #40602:
URL: https://github.com/apache/spark/pull/40602#discussion_r1152824490
##
core/src/main/resources/error/error-classes.json:
##
@@ -129,6 +129,12 @@
],
"sqlState" : "429BB"
},
+ "CANNOT_RENAME_ACROSS_SCHEMA" : {
+"message
ScrapCodes commented on PR #40553:
URL: https://github.com/apache/spark/pull/40553#issuecomment-1489811022
Hi @VindhyaG, this might be useful - maybe we can benefit from the use case
you have for this. Is it just for logging?
Not sure what others think; it might be good to limit the API
HeartSaVioR commented on code in PR #40561:
URL: https://github.com/apache/spark/pull/40561#discussion_r1152828935
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala:
##
@@ -679,6 +679,8 @@ object RemoveNoopUnion extends Rule[LogicalPlan] {
grundprinzip commented on code in PR #40586:
URL: https://github.com/apache/spark/pull/40586#discussion_r1152826039
##
connector/connect/common/src/main/protobuf/spark/connect/commands.proto:
##
@@ -177,3 +179,97 @@ message WriteOperationV2 {
// (Optional) A condition for ove
MaxGekk commented on code in PR #40593:
URL: https://github.com/apache/spark/pull/40593#discussion_r1152878072
##
sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4:
##
@@ -928,11 +928,19 @@ primaryExpression
(FILTER LEFT_PAREN WHERE wher
yaooqinn commented on PR #40601:
URL: https://github.com/apache/spark/pull/40601#issuecomment-1489866478
This change makes sense to me. This is a breaking change, so shall we add
a migration guide for it?
grundprinzip opened a new pull request, #40603:
URL: https://github.com/apache/spark/pull/40603
### What changes were proposed in this pull request?
Instead of just showing the Scala call site, show the abbreviated version of
the proto message in the Spark UI.
### Why are the changes
cloud-fan commented on PR #40437:
URL: https://github.com/apache/spark/pull/40437#issuecomment-1489876875
@yaooqinn this is a good point. If we are sure this is only for CLI display,
not the Thrift server protocol, I agree we don't need to follow Hive.
cloud-fan commented on code in PR #40593:
URL: https://github.com/apache/spark/pull/40593#discussion_r1152908374
##
sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4:
##
@@ -928,11 +928,19 @@ primaryExpression
(FILTER LEFT_PAREN WHERE wh
cloud-fan commented on code in PR #40593:
URL: https://github.com/apache/spark/pull/40593#discussion_r1152908724
##
sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4:
##
@@ -928,11 +928,19 @@ primaryExpression
(FILTER LEFT_PAREN WHERE wh
cloud-fan commented on code in PR #40593:
URL: https://github.com/apache/spark/pull/40593#discussion_r1152910161
##
sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4:
##
@@ -928,11 +928,19 @@ primaryExpression
(FILTER LEFT_PAREN WHERE wh
yaooqinn commented on PR #40437:
URL: https://github.com/apache/spark/pull/40437#issuecomment-1489909241
> If we are sure this is only for CLI display,
Yes. `hiveResultString` is only used in the spark-sql CLI. The Thrift server side
always uses the command output schema. Maybe this is the inco
cloud-fan commented on code in PR #40437:
URL: https://github.com/apache/spark/pull/40437#discussion_r1152923808
##
sql/core/src/main/scala/org/apache/spark/sql/execution/HiveResult.scala:
##
@@ -59,18 +59,6 @@ object HiveResult {
formatDescribeTableOutput(executedPlan.
cloud-fan opened a new pull request, #40604:
URL: https://github.com/apache/spark/pull/40604
This reverts commit a111a02de1a814c5f335e0bcac4cffb0515557dc.
### What changes were proposed in this pull request?
SQLMetrics is not only used in the UI, but is also a programmin
cloud-fan commented on PR #40604:
URL: https://github.com/apache/spark/pull/40604#issuecomment-1489923055
cc @ulysses-you
cloud-fan commented on PR #40604:
URL: https://github.com/apache/spark/pull/40604#issuecomment-1489923733
also cc @xinrong-meng, this is not a blocker, but it would be better if we can make
it into 3.4.0.
Yikf commented on PR #40437:
URL: https://github.com/apache/spark/pull/40437#issuecomment-1489933393
Yes. `hiveResultString` was added to ensure compatibility with Hive output.
`hiveResultString` is only used by the spark-sql CLI, purely as the
CLI display.
`thriftServe
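For reference, a rough sketch of the split described here, assuming Spark's `HiveResult.hiveResultString` helper and an active `spark` session:
```scala
import org.apache.spark.sql.execution.HiveResult

val df = spark.sql("DESCRIBE TABLE t")
// spark-sql CLI path: Hive-compatible display strings.
val cliLines: Seq[String] = HiveResult.hiveResultString(df.queryExecution.executedPlan)
// Thrift server path: the command's own output schema and rows, no Hive formatting.
val serverRows = df.collect()
```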
LuciferYang commented on code in PR #40598:
URL: https://github.com/apache/spark/pull/40598#discussion_r1152946827
##
common/network-common/src/main/java/org/apache/spark/network/util/JavaUtils.java:
##
@@ -373,18 +373,22 @@ public static byte[] bufferToArray(ByteBuffer buffer)
yaooqinn commented on code in PR #38732:
URL: https://github.com/apache/spark/pull/38732#discussion_r1152951801
##
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala:
##
@@ -494,10 +525,46 @@ class ExecutorPodsAllo
yaooqinn commented on code in PR #38732:
URL: https://github.com/apache/spark/pull/38732#discussion_r1152954378
##
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala:
##
@@ -520,10 +552,46 @@ class ExecutorPodsAllo
yaooqinn commented on code in PR #38732:
URL: https://github.com/apache/spark/pull/38732#discussion_r1152957287
##
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala:
##
@@ -750,6 +750,26 @@ private[spark] object Config extends Logging {
pan3793 opened a new pull request, #38732:
URL: https://github.com/apache/spark/pull/38732
### What changes were proposed in this pull request?
Fail the Spark application when the number of executor failures reaches a threshold.
### Why are the changes needed?
Sometimes,
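A minimal sketch of the failure-threshold idea (names and config wiring are hypothetical, not the PR's actual code):
```scala
import scala.collection.mutable

// Count executor failures inside a sliding validity window and fail the
// application once a threshold is crossed.
class ExecutorFailureTracker(maxFailures: Int, validityWindowMs: Long) {
  private val failureTimes = mutable.Queue.empty[Long]

  def registerFailure(nowMs: Long): Unit = {
    failureTimes.enqueue(nowMs)
    // Forget failures that fell out of the validity window.
    while (failureTimes.nonEmpty && nowMs - failureTimes.head > validityWindowMs) {
      failureTimes.dequeue()
    }
    if (failureTimes.size >= maxFailures) {
      throw new IllegalStateException(
        s"Max number of executor failures ($maxFailures) reached within ${validityWindowMs}ms")
    }
  }
}
```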
yaooqinn commented on code in PR #38732:
URL: https://github.com/apache/spark/pull/38732#discussion_r1152960657
##
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala:
##
@@ -117,6 +120,12 @@ class ExecutorPodsAlloc
yaooqinn commented on code in PR #38732:
URL: https://github.com/apache/spark/pull/38732#discussion_r1152961738
##
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala:
##
@@ -148,6 +163,10 @@ class ExecutorPodsAlloc
LuciferYang opened a new pull request, #40605:
URL: https://github.com/apache/spark/pull/40605
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How
cloud-fan commented on PR #40300:
URL: https://github.com/apache/spark/pull/40300#issuecomment-1489959075
thanks, merging to master!
pan3793 commented on code in PR #38732:
URL: https://github.com/apache/spark/pull/38732#discussion_r1152964045
##
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala:
##
@@ -148,6 +163,10 @@ class ExecutorPodsAlloca
cloud-fan closed pull request #40300: [SPARK-42683] Automatically rename
conflicting metadata columns
URL: https://github.com/apache/spark/pull/40300
cloud-fan commented on PR #40437:
URL: https://github.com/apache/spark/pull/40437#issuecomment-1489962469
> I'm not sure why the spark-sql CLI has to be compatible with Hive output;
personally, I don't think it's necessary. Maybe we should display Spark's
schema as is, just like the Thrift server?
pan3793 commented on code in PR #38732:
URL: https://github.com/apache/spark/pull/38732#discussion_r1152967630
##
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala:
##
@@ -750,6 +750,26 @@ private[spark] object Config extends Logging {
pan3793 commented on code in PR #38732:
URL: https://github.com/apache/spark/pull/38732#discussion_r1038973719
##
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala:
##
@@ -136,6 +151,10 @@ class ExecutorPodsAlloca
lyy-pineapple commented on PR #38171:
URL: https://github.com/apache/spark/pull/38171#issuecomment-1489985307
> `joni` seems to be used in the HBase client only, not in the HBase server or
HBase common.
>
> * https://mvnrepository.com/artifact/org.apache.hbase/hbase-client/2.5.3
>
>
lyy-pineapple commented on PR #38171:
URL: https://github.com/apache/spark/pull/38171#issuecomment-1489987177
> https://user-images.githubusercontent.com/8748814/204439049-53f0bd4f-9ea0-4289-8268-d16aef5b4334.png
>
> @lyy-pineapple Would you share the test sql pattern? I test some c
grundprinzip opened a new pull request, #40606:
URL: https://github.com/apache/spark/pull/40606
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### Ho
LuciferYang commented on code in PR #40605:
URL: https://github.com/apache/spark/pull/40605#discussion_r1153013745
##
dev/connect-jvm-client-mima-check:
##
@@ -34,20 +34,18 @@ fi
rm -f .connect-mima-check-result
-echo "Build sql module, connect-client-jvm module and connect
huangxiaopingRD commented on code in PR #40232:
URL: https://github.com/apache/spark/pull/40232#discussion_r1153014538
##
docs/sql-ref-syntax-ddl-create-table-datasource.md:
##
@@ -118,7 +118,7 @@ CREATE TABLE student (id INT, name STRING, age INT) USING CSV;
CREATE TABLE stud
yaooqinn commented on code in PR #38732:
URL: https://github.com/apache/spark/pull/38732#discussion_r1153042664
##
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala:
##
@@ -750,6 +750,26 @@ private[spark] object Config extends Logging {
zhengruifeng opened a new pull request, #40607:
URL: https://github.com/apache/spark/pull/40607
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How was this p
yaooqinn commented on PR #38732:
URL: https://github.com/apache/spark/pull/38732#issuecomment-1490050958
Does Kubernetes support other mechanisms to add a timeout during
pod/container/app initialization? If not, we should bring this feature in at the
Spark layer. Also cc @Yikun
WeichenXu123 commented on code in PR #40607:
URL: https://github.com/apache/spark/pull/40607#discussion_r1153066202
##
python/pyspark/ml/torch/distributor.py:
##
@@ -581,11 +593,11 @@ def _run_distributed_training(
f"Started distributed training with {self.num_proce
WeichenXu123 commented on code in PR #40607:
URL: https://github.com/apache/spark/pull/40607#discussion_r1153067103
##
python/pyspark/ml/torch/distributor.py:
##
@@ -330,6 +340,7 @@ def __init__(
num_processes: int = 1,
local_mode: bool = True,
use_gpu
WeichenXu123 commented on code in PR #40607:
URL: https://github.com/apache/spark/pull/40607#discussion_r1153067929
##
python/pyspark/ml/torch/distributor.py:
##
@@ -144,15 +145,21 @@ def __init__(
num_processes: int = 1,
local_mode: bool = True,
use_g
zhengruifeng commented on code in PR #40607:
URL: https://github.com/apache/spark/pull/40607#discussion_r1153069439
##
python/pyspark/ml/torch/distributor.py:
##
@@ -330,6 +340,7 @@ def __init__(
num_processes: int = 1,
local_mode: bool = True,
use_gpu
WeichenXu123 commented on code in PR #40607:
URL: https://github.com/apache/spark/pull/40607#discussion_r1153069493
##
python/pyspark/ml/tests/connect/test_parity_torch_distributor.py:
##
@@ -0,0 +1,511 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
zhengruifeng commented on code in PR #40607:
URL: https://github.com/apache/spark/pull/40607#discussion_r1153071026
##
python/pyspark/ml/tests/connect/test_parity_torch_distributor.py:
##
@@ -0,0 +1,511 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
zhengruifeng commented on code in PR #40607:
URL: https://github.com/apache/spark/pull/40607#discussion_r1153072179
##
python/pyspark/ml/torch/distributor.py:
##
@@ -581,11 +593,11 @@ def _run_distributed_training(
f"Started distributed training with {self.num_proce
infoankitp commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1153083574
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##
@@ -5056,128 +4950,45 @@ case class ArrayCompact(child: Expre
infoankitp commented on code in PR #40563:
URL: https://github.com/apache/spark/pull/40563#discussion_r1153083910
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala:
##
@@ -1400,120 +1400,24 @@ case class ArrayContains(left: Expre
HeartSaVioR commented on code in PR #40561:
URL: https://github.com/apache/spark/pull/40561#discussion_r1153114879
##
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/statefulOperators.scala:
##
@@ -980,3 +1022,65 @@ object StreamingDeduplicateExec {
private v
HeartSaVioR commented on code in PR #40561:
URL: https://github.com/apache/spark/pull/40561#discussion_r1153116775
##
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/statefulOperators.scala:
##
@@ -980,3 +1022,65 @@ object StreamingDeduplicateExec {
private v
cloud-fan commented on PR #40604:
URL: https://github.com/apache/spark/pull/40604#issuecomment-1490156737
thanks for review, merging to master/3.4!
cloud-fan closed pull request #40604: Revert "[SPARK-41765][SQL] Pull out v1
write metrics to WriteFiles"
URL: https://github.com/apache/spark/pull/40604
VindhyaG commented on code in PR #40553:
URL: https://github.com/apache/spark/pull/40553#discussion_r1153193529
##
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Dataset.scala:
##
@@ -535,6 +535,159 @@ class Dataset[T] private[sql] (
}
}
+ /**
+ *
VindhyaG commented on PR #40553:
URL: https://github.com/apache/spark/pull/40553#issuecomment-1490227613
> Hi @VindhyaG, this might be useful - maybe we can benefit from the
use case you have for this. Is it just for logging? Not sure what others think;
it might be good to limit the API sur
martin-kokos closed pull request #39941: [MINOR][DOCS] Add link to Hadoop docs
URL: https://github.com/apache/spark/pull/39941
martin-kokos commented on PR #39941:
URL: https://github.com/apache/spark/pull/39941#issuecomment-1490231287
Fixed by
https://github.com/apache/spark/commit/c9c3880e3ad6f57a359f1de05b7e772c06660d0b
HeartSaVioR commented on PR #40600:
URL: https://github.com/apache/spark/pull/40600#issuecomment-1490244143
Thanks! Merging to master.
HeartSaVioR closed pull request #40600: [SPARK-42968][SS] Add option to skip
commit coordinator as part of StreamingWrite API for DSv2 sources/sinks
URL: https://github.com/apache/spark/pull/40600
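A hedged sketch of what the new option looks like for a DSv2 sink author, modeled on the analogous `BatchWrite.useCommitCoordinator` flag (the exact method placement follows the PR):
```scala
import org.apache.spark.sql.connector.write.{PhysicalWriteInfo, WriterCommitMessage}
import org.apache.spark.sql.connector.write.streaming.{StreamingDataWriterFactory, StreamingWrite}

class MySink extends StreamingWrite {
  // Opt out of the driver-side commit coordinator: this sink tolerates
  // more than one task attempt committing the same partition.
  override def useCommitCoordinator(): Boolean = false

  override def createStreamingWriterFactory(info: PhysicalWriteInfo): StreamingDataWriterFactory = ???
  override def commit(epochId: Long, messages: Array[WriterCommitMessage]): Unit = ()
  override def abort(epochId: Long, messages: Array[WriterCommitMessage]): Unit = ()
}
```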
jaceklaskowski commented on code in PR #40567:
URL: https://github.com/apache/spark/pull/40567#discussion_r1153247899
##
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##
@@ -4195,6 +4195,15 @@ object SQLConf {
.booleanConf
.createWithDefa
MaxGekk commented on code in PR #40126:
URL: https://github.com/apache/spark/pull/40126#discussion_r1153316438
##
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ResolveAliasesSuite.scala:
##
@@ -88,4 +94,46 @@ class ResolveAliasesSuite extends AnalysisTest {
juanvisoler opened a new pull request, #40608:
URL: https://github.com/apache/spark/pull/40608
Add support for calling debugCodegen from Python & Java
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does thi
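The Scala-side entry point the PR wants to expose already exists in Spark's `debug` package; illustrative usage, assuming an active `spark` session:
```scala
import org.apache.spark.sql.execution.debug._

val df = spark.range(10).selectExpr("id + 1 AS x")
// Prints the generated Java code for each whole-stage-codegen subtree.
df.debugCodegen()
```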
dongjoon-hyun commented on PR #40604:
URL: https://github.com/apache/spark/pull/40604#issuecomment-1490405810
+1 for reverting decision. Thank you, @cloud-fan and all.
VindhyaG commented on code in PR #40553:
URL: https://github.com/apache/spark/pull/40553#discussion_r1151950076
##
sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala:
##
@@ -883,6 +883,129 @@ class Dataset[T] private[sql](
println(showString(numRows, truncate, verti
MaxGekk commented on PR #40593:
URL: https://github.com/apache/spark/pull/40593#issuecomment-1490430902
Merging to master. Thank you, @cloud-fan for review.
MaxGekk closed pull request #40593: [SPARK-42979][SQL] Define literal
constructors as keywords
URL: https://github.com/apache/spark/pull/40593
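For context, the literal constructors concerned are standard Spark SQL typed literals, e.g. (assuming an active `spark` session):
```scala
spark.sql("SELECT DATE '2020-12-31'").show()
spark.sql("SELECT TIMESTAMP '2020-12-31 23:59:59'").show()
spark.sql("SELECT INTERVAL '1' DAY").show()
spark.sql("SELECT X'1C'").show() // binary literal
```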
yabola commented on code in PR #39950:
URL: https://github.com/apache/spark/pull/39950#discussion_r1153375489
##
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetFooterReader.java:
##
@@ -17,23 +17,53 @@
package org.apache.spark.sql.execution.d
yabola commented on code in PR #39950:
URL: https://github.com/apache/spark/pull/39950#discussion_r1153376111
##
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetFooterReader.java:
##
@@ -17,23 +17,53 @@
package org.apache.spark.sql.execution.d
yabola commented on code in PR #39950:
URL: https://github.com/apache/spark/pull/39950#discussion_r1153376539
##
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetFooterReader.java:
##
@@ -17,23 +17,53 @@
package org.apache.spark.sql.execution.d
yabola commented on code in PR #39950:
URL: https://github.com/apache/spark/pull/39950#discussion_r1153377375
##
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetFooterReader.java:
##
@@ -17,23 +17,53 @@
package org.apache.spark.sql.execution.d
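For orientation, reading just a Parquet footer with plain parquet-mr, the operation this reader wraps (a sketch assuming a local file path):
```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.parquet.hadoop.ParquetFileReader
import org.apache.parquet.hadoop.util.HadoopInputFile

val conf = new Configuration()
val reader = ParquetFileReader.open(HadoopInputFile.fromPath(new Path("/tmp/example.parquet"), conf))
try {
  val footer = reader.getFooter      // file- and row-group-level metadata only
  println(footer.getBlocks.size())   // number of row groups
} finally reader.close()
```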
ScrapCodes commented on PR #40553:
URL: https://github.com/apache/spark/pull/40553#issuecomment-1490453383
I see this as a developer-facing API, so just having
```
def getString(numRows: Int, truncate: Int): String =
getString(numRows, truncate, vertical = false)
```
would
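Hypothetical usage if that overload lands as proposed (the API is still under review):
```scala
val table = df.getString(20, 20) // the same text show() would print, as a value
println(s"Top rows:\n$table")
```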
ScrapCodes commented on PR #40553:
URL: https://github.com/apache/spark/pull/40553#issuecomment-1490457386
Do you think a more interesting way could be to return a JSON representation?
juanvisoler commented on PR #40608:
URL: https://github.com/apache/spark/pull/40608#issuecomment-1490470454
@holdenk @MaxGekk
Hisoka-X opened a new pull request, #40609:
URL: https://github.com/apache/spark/pull/40609
### What changes were proposed in this pull request?
This PR proposes to assign the name "BINARY_ARITHMETIC_CAUSE_OVERFLOW" to
_LEGACY_ERROR_TEMP_2044.
### Why are the changes n
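A hedged illustration of the error being renamed: under ANSI mode, byte/short arithmetic that overflows raises this error class (exact message text follows the error template):
```scala
spark.conf.set("spark.sql.ansi.enabled", "true")
spark.sql("SELECT 127Y + 1Y").collect()
// Expected: SparkArithmeticException with error class
// BINARY_ARITHMETIC_CAUSE_OVERFLOW (previously _LEGACY_ERROR_TEMP_2044).
```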
VindhyaG commented on code in PR #40553:
URL: https://github.com/apache/spark/pull/40553#discussion_r1153437032
##
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Dataset.scala:
##
@@ -535,6 +535,159 @@ class Dataset[T] private[sql] (
}
}
+ /**
+ *
ivoson opened a new pull request, #40610:
URL: https://github.com/apache/spark/pull/40610
### What changes were proposed in this pull request?
Add a destructive iterator to SparkResult and change
`Dataset.toLocalIterator` to use the destructive iterator.
With the destructive iterator
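A minimal sketch of the destructive-iterator idea (illustrative, not the PR's actual `SparkResult` internals): each element's reference is dropped as soon as it is handed out, so consumed batches can be released.
```scala
import scala.collection.mutable.ArrayBuffer

class DestructiveIterator[T >: Null <: AnyRef](buffers: ArrayBuffer[T]) extends Iterator[T] {
  private var i = 0
  def hasNext: Boolean = i < buffers.length
  def next(): T = {
    val elem = buffers(i)
    buffers(i) = null // drop the reference so the consumed batch can be freed
    i += 1
    elem
  }
}
```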
yabola commented on code in PR #39950:
URL: https://github.com/apache/spark/pull/39950#discussion_r1153492439
##
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/SpecificParquetRecordReaderBase.java:
##
@@ -89,17 +90,28 @@
@Override
public void ini
rangadi commented on code in PR #40561:
URL: https://github.com/apache/spark/pull/40561#discussion_r1153505179
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala:
##
@@ -679,6 +679,8 @@ object RemoveNoopUnion extends Rule[LogicalPlan] {
dongjoon-hyun commented on PR #40587:
URL: https://github.com/apache/spark/pull/40587#issuecomment-1490660361
I verified that Apache Spark 3.4.0 RC5 successfully has SBOM artifacts.
- https://repository.apache.org/content/repositories/orgapachespark-1439/org/apache/spark/spark-core_2.12/3
arturobernalg commented on PR #40608:
URL: https://github.com/apache/spark/pull/40608#issuecomment-1490659794
LGTM +1
hvanhovell opened a new pull request, #40611:
URL: https://github.com/apache/spark/pull/40611
### What changes were proposed in this pull request?
This PR adds direct serialization from user domain objects to Arrow batches.
This removes the need to go through Catalyst.
### Why are
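A toy sketch of the direction, using the plain Arrow Java API (the PR's `ArrowSerializer` is far more general): write user values straight into Arrow vectors with no Catalyst `InternalRow` in between.
```scala
import org.apache.arrow.memory.RootAllocator
import org.apache.arrow.vector.IntVector

val allocator = new RootAllocator(Long.MaxValue)
val vector = new IntVector("value", allocator)
val data = Seq(1, 2, 3)
vector.allocateNew(data.length)
data.zipWithIndex.foreach { case (v, i) => vector.setSafe(i, v) }
vector.setValueCount(data.length)
// `vector` now holds one Arrow column of the batch, ready to ship over the wire.
vector.close(); allocator.close()
```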
hvanhovell commented on code in PR #40611:
URL: https://github.com/apache/spark/pull/40611#discussion_r1153602634
##
connector/connect/client/jvm/pom.xml:
##
@@ -120,6 +120,19 @@
+
Review Comment:
Needed for a couple of classes used during tests.
amaliujia commented on code in PR #40611:
URL: https://github.com/apache/spark/pull/40611#discussion_r1153614619
##
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/connect/client/arrow/ArrowSerializer.scala:
##
@@ -0,0 +1,529 @@
+/*
+ * Licensed to the Apache So
amaliujia commented on code in PR #40581:
URL: https://github.com/apache/spark/pull/40581#discussion_r1152606377
##
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala:
##
@@ -482,27 +482,66 @@ class SparkConnectPlanner(val sess
MaxGekk commented on code in PR #40609:
URL: https://github.com/apache/spark/pull/40609#discussion_r1153634383
##
sql/core/src/test/scala/org/apache/spark/sql/errors/QueryExecutionErrorsSuite.scala:
##
@@ -625,6 +625,20 @@ class QueryExecutionErrorsSuite
}
}
+ test("B
viirya commented on PR #40587:
URL: https://github.com/apache/spark/pull/40587#issuecomment-1490740514
Cool. Thanks @dongjoon-hyun
MaxGekk commented on code in PR #40609:
URL: https://github.com/apache/spark/pull/40609#discussion_r1153638873
##
sql/core/src/test/scala/org/apache/spark/sql/errors/QueryExecutionErrorsSuite.scala:
##
@@ -625,6 +625,20 @@ class QueryExecutionErrorsSuite
}
}
+ test("B
amaliujia commented on code in PR #40611:
URL: https://github.com/apache/spark/pull/40611#discussion_r1153615770
##
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/connect/client/arrow/ArrowSerializer.scala:
##
@@ -0,0 +1,529 @@
+/*
+ * Licensed to the Apache So
rangadi commented on code in PR #40561:
URL: https://github.com/apache/spark/pull/40561#discussion_r1153675952
##
sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala:
##
@@ -1742,6 +1742,8 @@ class DataFrameSuite extends QueryTest
Seq(Row(2, 1, 2), Row(1, 2,
VindhyaG commented on PR #40553:
URL: https://github.com/apache/spark/pull/40553#issuecomment-1490821588
> I see this as a developer-facing API, so just having
>
> ```
> def getString(numRows: Int, truncate: Int): String =
> getString(numRows, truncate, vertical = false)
>
VindhyaG commented on PR #40553:
URL: https://github.com/apache/spark/pull/40553#issuecomment-1490823634
> Do you think a more interesting way could be to return a JSON
representation?
For a REST API, yes, JSON would make more sense, but for logging I suppose a string
in tabular form is more useful
shrprasa commented on PR #40363:
URL: https://github.com/apache/spark/pull/40363#issuecomment-1490831779
@thousandhu @dongjoon-hyun @holdenk
The approach in this PR only handles the cleanup on the driver side. It won't
clean up the files if the files were uploaded during job submission but then