beliefer commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r889863328
##
sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala:
##
@@ -121,4 +124,34 @@ private[sql] object H2Dialect extends JdbcDialect {
}
super.clas
beliefer opened a new pull request, #36773:
URL: https://github.com/apache/spark/pull/36773
### What changes were proposed in this pull request?
Spark now supports many linear regression aggregate functions.
Because `REGR_AVGX`, `REGR_AVGY`, `REGR_COUNT`, `REGR_SXX` and `REGR_SXY`
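The truncated description above refers to the standard SQL linear-regression aggregates. As a hedged, minimal pure-Python sketch of what these functions compute (the usual SQL definitions over (y, x) pairs where both sides are non-null — not Spark's actual implementation):

```python
# Sketch of the SQL linear-regression aggregates over (y, x) pairs.
# Pairs with a NULL on either side are excluded, per standard SQL semantics.
def regr_aggregates(pairs):
    valid = [(y, x) for y, x in pairs if y is not None and x is not None]
    n = len(valid)  # REGR_COUNT
    if n == 0:
        return {"count": 0, "avgx": None, "avgy": None, "sxx": None, "sxy": None}
    avg_x = sum(x for _, x in valid) / n  # REGR_AVGX
    avg_y = sum(y for y, _ in valid) / n  # REGR_AVGY
    sxx = sum((x - avg_x) ** 2 for _, x in valid)           # REGR_SXX
    sxy = sum((y - avg_y) * (x - avg_x) for y, x in valid)  # REGR_SXY
    return {"count": n, "avgx": avg_x, "avgy": avg_y, "sxx": sxx, "sxy": sxy}

# The NULL pair is dropped: count=2, avgx=1.5, sxx=0.5
print(regr_aggregates([(1.0, 1.0), (2.0, 2.0), (None, 3.0)]))
```

Spark's real implementations live in the Catalyst aggregate expressions; this only illustrates the arithmetic the PR is pushing down.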
AngersZh commented on PR #36723:
URL: https://github.com/apache/spark/pull/36723#issuecomment-1147096805
> Gentle ping, @AngersZh .
Can't find an efficient way to keep the order; will close this.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
AngersZh closed pull request #36723: [SPARK-39337][SQL] Refactor
DescribeTableExec to remove duplicate filters
URL: https://github.com/apache/spark/pull/36723
cxzl25 opened a new pull request, #36772:
URL: https://github.com/apache/spark/pull/36772
### What changes were proposed in this pull request?
This PR aims to upgrade Apache Hive `hive-storage-api` library from 2.7.2 to
2.7.3.
### Why are the changes needed?
[HIVE-25190](htt
Eugene-Mark commented on PR #36499:
URL: https://github.com/apache/spark/pull/36499#issuecomment-1147065740
We have "maximum scale" [defined in
Spark](https://github.com/apache/spark/blob/bab70b1ef24a2461395b32f609a9274269cb000e/sql/catalyst/src/main/scala/org/apache/spark/sql/types/DecimalT
LuciferYang commented on PR #36616:
URL: https://github.com/apache/spark/pull/36616#issuecomment-1147061886
cc @dongjoon-hyun @sunchao
dongjoon-hyun commented on code in PR #35789:
URL: https://github.com/apache/spark/pull/35789#discussion_r889832491
##
sql/core/src/test/scala/org/apache/spark/sql/BloomFilterAggregateQuerySuite.scala:
##
@@ -0,0 +1,215 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF)
WangGuangxin commented on PR #36626:
URL: https://github.com/apache/spark/pull/36626#issuecomment-1147034877
@viirya @cloud-fan Could you please help review this?
wangyum commented on PR #36766:
URL: https://github.com/apache/spark/pull/36766#issuecomment-1147023754
@cloud-fan @sigmod
wangyum commented on PR #36764:
URL: https://github.com/apache/spark/pull/36764#issuecomment-1147022373
Merged to master.
wangyum commented on PR #36764:
URL: https://github.com/apache/spark/pull/36764#issuecomment-1147022276
Thank you @MaxGekk
wangyum closed pull request #36764: [SPARK-39377][SQL][TESTS] Normalize expr
ids in ListQuery and Exists expressions
URL: https://github.com/apache/spark/pull/36764
cxzl25 commented on PR #36740:
URL: https://github.com/apache/spark/pull/36740#issuecomment-1147014470
@sarutak @cloud-fan @dongjoon-hyun
After the introduction of SPARK-34636, some SQL will fail to parse.
beliefer commented on code in PR #36663:
URL: https://github.com/apache/spark/pull/36663#discussion_r889799255
##
sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala:
##
@@ -259,6 +259,49 @@ class V2ExpressionBuilder(
} else {
Non
LuciferYang commented on PR #36732:
URL: https://github.com/apache/spark/pull/36732#issuecomment-1146991307
> The change is OK; I don't think it adds any performance. The only
hesitation here would be code churn and possible merge conflicts due to this.
I'm kind of neutral on it, I think it
wangyum commented on PR #36696:
URL: https://github.com/apache/spark/pull/36696#issuecomment-1146989466
Please give me a few days.
LuciferYang commented on PR #36573:
URL: https://github.com/apache/spark/pull/36573#issuecomment-1146988756
Closing this first; will re-open it once we can write Parquet data
through the V2 API.
LuciferYang closed pull request #36573: [SPARK-38829][SQL][FOLLOWUP] Add
`PARQUET_TIMESTAMP_NTZ_ENABLED` configuration for `ParquetWrite.prepareWrite`
URL: https://github.com/apache/spark/pull/36573
beliefer commented on code in PR #36662:
URL: https://github.com/apache/spark/pull/36662#discussion_r889782405
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala:
##
@@ -2504,9 +2504,10 @@ object Decode {
usage = """
_FUNC_(b
beliefer commented on PR #36295:
URL: https://github.com/apache/spark/pull/36295#issuecomment-1146967033
> Can we have some kind of performance numbers for "push down OFFSET could
improves the performance."?
For most JDBC data sources, pushing down OFFSET improves performance.
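To illustrate the claim above with a hedged toy model (not Spark's JDBC code): when OFFSET is pushed down, the source discards the skipped rows itself, so far fewer rows cross the wire to Spark.

```python
# Toy model of OFFSET pushdown: the return value is (visible rows,
# number of rows transferred from the source).
def fetch_without_pushdown(rows, offset):
    transferred = list(rows)            # every row crosses the wire
    return transferred[offset:], len(transferred)

def fetch_with_pushdown(rows, offset):
    transferred = list(rows)[offset:]   # source applies OFFSET first
    return transferred, len(transferred)

data = list(range(1000))
_, moved_a = fetch_without_pushdown(data, 900)
_, moved_b = fetch_with_pushdown(data, 900)
print(moved_a, moved_b)  # 1000 vs 100 rows transferred
```

Both variants return the same visible rows; only the transfer cost differs, which is the gain the PR description is pointing at.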
github-actions[bot] commented on PR #32397:
URL: https://github.com/apache/spark/pull/32397#issuecomment-1146915375
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.
github-actions[bot] closed pull request #35250: [SPARK-37961][SQL] Override
maxRows/maxRowsPerPartition for some logical operators
URL: https://github.com/apache/spark/pull/35250
github-actions[bot] commented on PR #35363:
URL: https://github.com/apache/spark/pull/35363#issuecomment-1146915363
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.
github-actions[bot] commented on PR #35612:
URL: https://github.com/apache/spark/pull/35612#issuecomment-1146915346
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.
github-actions[bot] commented on PR #35417:
URL: https://github.com/apache/spark/pull/35417#issuecomment-1146915359
We're closing this PR because it hasn't been updated in a while. This isn't
a judgement on the merit of the PR in any way. It's just a way of keeping the
PR queue manageable.
github-actions[bot] closed pull request #35460: [SPARK-38160][SQL] Shuffle by
rand could lead to incorrect answers when ShuffleFetchFailed happened
URL: https://github.com/apache/spark/pull/35460
github-actions[bot] closed pull request #35620: [SPARK-38294][SQL]
DDLUtils.verifyNotReadPath should check target is subDir
URL: https://github.com/apache/spark/pull/35620
huaxingao commented on PR #36696:
URL: https://github.com/apache/spark/pull/36696#issuecomment-1146911640
@wangyum is helping me test this because he has lots of `in/notIn` test
cases. I will change this PR to a draft for now.
dongjoon-hyun commented on PR #36696:
URL: https://github.com/apache/spark/pull/36696#issuecomment-1146909697
cc @sunchao
dongjoon-hyun commented on PR #36723:
URL: https://github.com/apache/spark/pull/36723#issuecomment-1146909417
Gentle ping, @AngersZh .
dongjoon-hyun commented on PR #36701:
URL: https://github.com/apache/spark/pull/36701#issuecomment-1146909153
Thank you, @pralabhkumar and @HyukjinKwon . Merged to master for Apache
Spark 3.4.
dongjoon-hyun closed pull request #36701: [SPARK-39179][PYTHON][TESTS] Improve
the test coverage for pyspark/shuffle.py
URL: https://github.com/apache/spark/pull/36701
AmplabJenkins commented on PR #36771:
URL: https://github.com/apache/spark/pull/36771#issuecomment-1146896095
Can one of the admins verify this patch?
sadikovi commented on code in PR #36726:
URL: https://github.com/apache/spark/pull/36726#discussion_r889742470
##
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala:
##
@@ -1879,5 +1880,53 @@ class JDBCSuite extends QueryTest
val fields = schema.fields
a
dtenedor opened a new pull request, #36771:
URL: https://github.com/apache/spark/pull/36771
### What changes were proposed in this pull request?
Extend DEFAULT column support in ALTER TABLE ADD COLUMNS commands to include
V2 data sources.
Example:
```
> create or repl
```
sadikovi commented on PR #36295:
URL: https://github.com/apache/spark/pull/36295#issuecomment-1146883974
Can we have some kind of performance numbers for "push down OFFSET could
improves the performance."?
Kimahriman commented on PR #36767:
URL: https://github.com/apache/spark/pull/36767#issuecomment-1146853120
> Could you update the PR description with the actual logs (BEFORE and
AFTER)?
I'm not really sure how to trigger the old deprecation warning; I don't
actually use k8s.
Kimahriman commented on code in PR #36767:
URL: https://github.com/apache/spark/pull/36767#discussion_r889715991
##
core/src/main/scala/org/apache/spark/SparkConf.scala:
##
@@ -638,7 +638,9 @@ private[spark] object SparkConf extends Logging {
DeprecatedConfig("spark.black
srowen commented on PR #36499:
URL: https://github.com/apache/spark/pull/36499#issuecomment-1146838926
I see, so we should interpret this as "maximum scale" or something in Spark?
That seems OK, and if we're only confident about Teradata, that's fine. Let's
add a note in the release notes
Eugene-Mark commented on PR #36499:
URL: https://github.com/apache/spark/pull/36499#issuecomment-1146822523
For NUMBER(*) on Teradata, the scale is not fixed but can adapt to
different values; as they said, it is only constrained by the `system limit`. So the
issue for Teradata is about how
AmplabJenkins commented on PR #36769:
URL: https://github.com/apache/spark/pull/36769#issuecomment-1146784621
Can one of the admins verify this patch?
AmplabJenkins commented on PR #36770:
URL: https://github.com/apache/spark/pull/36770#issuecomment-1146784611
Can one of the admins verify this patch?
cxzl25 commented on PR #36770:
URL: https://github.com/apache/spark/pull/36770#issuecomment-1146770734
## Current
![fail_task_current](https://user-images.githubusercontent.com/3898450/172043757-c9b8bf54-1c80-4a0e-b67e-a2a4d79e93b0.png)
## Fix
![fail_task_duration](https://
cxzl25 opened a new pull request, #36770:
URL: https://github.com/apache/spark/pull/36770
### What changes were proposed in this pull request?
When task status is failed and `executorRunTime` has no value, try to use
`duration`.
### Why are the changes needed?
When the execu
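The fallback described in the PR above can be sketched as follows; the function and field names here are illustrative, not Spark's actual UI data model:

```python
# Hedged sketch: when a failed task is missing its executorRunTime
# metric, fall back to the task's overall duration so the UI still
# shows a meaningful run time instead of an empty value.
def effective_run_time(status, executor_run_time, duration):
    if status == "FAILED" and executor_run_time is None:
        return duration
    return executor_run_time

print(effective_run_time("FAILED", None, 1234))   # falls back to duration
print(effective_run_time("SUCCESS", 500, 1234))   # metric present, used as-is
```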
MaxGekk commented on PR #36662:
URL: https://github.com/apache/spark/pull/36662#issuecomment-1146765074
@beliefer @cloud-fan Please, take a look at this PR.
MaxGekk closed pull request #36760: [SPARK-39374][SQL] Improve error message
for user specified column list
URL: https://github.com/apache/spark/pull/36760
MaxGekk commented on PR #36760:
URL: https://github.com/apache/spark/pull/36760#issuecomment-1146764006
+1, LGTM. Merging to master.
Thank you, @wangyum and @singhpk234 for review.
cxzl25 opened a new pull request, #36769:
URL: https://github.com/apache/spark/pull/36769
### What changes were proposed in this pull request?
Introduce a configuration item and set the batch size when constructing the ORC
writer.
### Why are the changes needed?
Now vectorized columnar or