[spark] branch master updated: [SPARK-39368][SQL] Move `RewritePredicateSubquery` into `InjectRuntimeFilter`

2022-06-03 Thread yumwang
This is an automated email from the ASF dual-hosted git repository.

yumwang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new bb51add5c79 [SPARK-39368][SQL] Move `RewritePredicateSubquery` into 
`InjectRuntimeFilter`
bb51add5c79 is described below

commit bb51add5c79558df863d37965603387d40cc4387
Author: Yuming Wang 
AuthorDate: Sat Jun 4 08:30:49 2022 +0800

[SPARK-39368][SQL] Move `RewritePredicateSubquery` into 
`InjectRuntimeFilter`

### What changes were proposed in this pull request?

This PR moves `RewritePredicateSubquery` into `InjectRuntimeFilter`.
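
A self-contained sketch (Scala, with hypothetical stand-ins for the Spark internals shown in the diff further down) of the control flow this PR introduces: `RewritePredicateSubquery` is applied only when semi-join reduction is enabled and runtime-filter injection actually changed the plan.

```
// Illustration only; the real rule is org.apache.spark.sql.catalyst.optimizer.InjectRuntimeFilter.
object InjectRuntimeFilterSketch {
  final case class Plan(id: Int)                          // stand-in for LogicalPlan

  def tryInjectRuntimeFilter(plan: Plan): Plan = plan     // stand-in for the real injection
  def rewritePredicateSubquery(plan: Plan): Plan = plan   // stand-in for the real rewrite
  val semiJoinReductionEnabled = false                    // the config's default value

  def applyRule(plan: Plan): Plan = {
    val newPlan = tryInjectRuntimeFilter(plan)
    if (semiJoinReductionEnabled && plan != newPlan) {
      rewritePredicateSubquery(newPlan)  // only rewrite when injection actually changed the plan
    } else {
      newPlan                            // common case with default configs: skip the rewrite
    }
  }
}
```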

### Why are the changes needed?

Reduce the number of `RewritePredicateSubquery` runs, since 
`spark.sql.optimizer.runtimeFilter.semiJoinReduction.enabled` is disabled by 
default. For example:
```
build/sbt "sql/testOnly *TPCDSQuerySuite"
```
Before this PR:
```
...
org.apache.spark.sql.catalyst.optimizer.RewritePredicateSubquery
  17978319 / 31026106 26 / 624
...
```
After this PR (the rule's total run count drops from 624 to 312):
```
...
org.apache.spark.sql.catalyst.optimizer.RewritePredicateSubquery
  16680901 / 18994542 26 / 312
...
```

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing test.

Closes #36755 from wangyum/RewritePredicateSubquery.

Authored-by: Yuming Wang 
Signed-off-by: Yuming Wang 
---
 .../apache/spark/sql/catalyst/optimizer/InjectRuntimeFilter.scala | 8 +++-
 .../scala/org/apache/spark/sql/execution/SparkOptimizer.scala | 3 +--
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/InjectRuntimeFilter.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/InjectRuntimeFilter.scala
index 01c1786e05a..baaf82c00db 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/InjectRuntimeFilter.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/InjectRuntimeFilter.scala
@@ -288,7 +288,13 @@ object InjectRuntimeFilter extends Rule[LogicalPlan] with 
PredicateHelper with J
 case s: Subquery if s.correlated => plan
 case _ if !conf.runtimeFilterSemiJoinReductionEnabled &&
   !conf.runtimeFilterBloomFilterEnabled => plan
-case _ => tryInjectRuntimeFilter(plan)
+case _ =>
+  val newPlan = tryInjectRuntimeFilter(plan)
+  if (conf.runtimeFilterSemiJoinReductionEnabled && !plan.fastEquals(newPlan)) {
+RewritePredicateSubquery(newPlan)
+  } else {
+newPlan
+  }
   }
 
 }
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkOptimizer.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkOptimizer.scala
index 84e5975189b..0e7455009c5 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkOptimizer.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkOptimizer.scala
@@ -52,8 +52,7 @@ class SparkOptimizer(
 Batch("PartitionPruning", Once,
   PartitionPruning) :+
 Batch("InjectRuntimeFilter", FixedPoint(1),
-  InjectRuntimeFilter,
-  RewritePredicateSubquery) :+
+  InjectRuntimeFilter) :+
 Batch("MergeScalarSubqueries", Once,
   MergeScalarSubqueries) :+
 Batch("Pushdown Filters from PartitionPruning", fixedPoint,





[spark] branch master updated: [SPARK-39294][SQL] Support vectorized Orc scans with DEFAULT values

2022-06-03 Thread gengliang
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 8e76c2142b3 [SPARK-39294][SQL] Support vectorized Orc scans with 
DEFAULT values
8e76c2142b3 is described below

commit 8e76c2142b382410f1c0091d873b2ee84e9cbd62
Author: Daniel Tenedorio 
AuthorDate: Fri Jun 3 13:48:27 2022 -0700

[SPARK-39294][SQL] Support vectorized Orc scans with DEFAULT values

### What changes were proposed in this pull request?

Support vectorized Orc scans when the table schema has associated DEFAULT 
column values.

(Note: this PR depends on https://github.com/apache/spark/pull/36672, which adds the same support for Parquet files.)

Example:

```
create table t(i int) using orc;
insert into t values(42);
alter table t add column s string default concat('abc', 'def');
select * from t;
> 42, 'abcdef'
```
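
As a conceptual model only (plain Scala, not Spark's actual vectorized reader classes), the decision added by the diff below looks like this: when a requested column is absent from the ORC file, the reader fills a constant column with the schema's existence default if one is defined, and with nulls otherwise.

```
// Conceptual model of the missing-column fill; the real code uses OnHeapColumnVector.
def fillMissingColumn(capacity: Int, existenceDefault: Option[Any]): Array[Any] =
  existenceDefault match {
    case Some(value) => Array.fill[Any](capacity)(value) // constant vector holding the default
    case None        => Array.fill[Any](capacity)(null)  // previous behaviour: all nulls
  }
```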

### Why are the changes needed?

This change makes it easier to build, query, and maintain tables backed by 
Orc data.

### Does this PR introduce _any_ user-facing change?

Yes.

### How was this patch tested?

This PR includes new test coverage.

Closes #36675 from dtenedor/default-orc-vectorized.

Authored-by: Daniel Tenedorio 
Signed-off-by: Gengliang Wang 
---
 .../execution/datasources/orc/OrcColumnarBatchReader.java| 12 +++-
 .../scala/org/apache/spark/sql/sources/InsertSuite.scala |  2 ++
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git 
a/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/orc/OrcColumnarBatchReader.java
 
b/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/orc/OrcColumnarBatchReader.java
index 40ed0b2454c..175ad37aace 100644
--- 
a/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/orc/OrcColumnarBatchReader.java
+++ 
b/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/orc/OrcColumnarBatchReader.java
@@ -164,6 +164,7 @@ public class OrcColumnarBatchReader extends 
RecordReader {
 // Just wrap the ORC column vector instead of copying it to Spark column 
vector.
 orcVectorWrappers = new 
org.apache.spark.sql.vectorized.ColumnVector[resultSchema.length()];
 
+StructType requiredSchema = new StructType(requiredFields);
 for (int i = 0; i < requiredFields.length; i++) {
   DataType dt = requiredFields[i].dataType();
   if (requestedPartitionColIds[i] != -1) {
@@ -176,7 +177,16 @@ public class OrcColumnarBatchReader extends 
RecordReader {
 // Initialize the missing columns once.
 if (colId == -1) {
   OnHeapColumnVector missingCol = new OnHeapColumnVector(capacity, dt);
-  missingCol.putNulls(0, capacity);
+  // Check if the missing column has an associated default value in the schema metadata.
+  // If so, fill the corresponding column vector with the value.
+  Object defaultValue = requiredSchema.existenceDefaultValues()[i];
+  if (defaultValue == null) {
+missingCol.putNulls(0, capacity);
+  } else if (!missingCol.appendObjects(capacity, defaultValue).isPresent()) {
+throw new IllegalArgumentException("Cannot assign default column value to result " +
+  "column batch in vectorized Orc reader because the data type is not supported: " +
+  defaultValue);
+  }
   missingCol.setIsConstant();
   orcVectorWrappers[i] = missingCol;
 } else {
diff --git 
a/sql/core/src/test/scala/org/apache/spark/sql/sources/InsertSuite.scala 
b/sql/core/src/test/scala/org/apache/spark/sql/sources/InsertSuite.scala
index 35a6f8f8a0b..1b70998c642 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/sources/InsertSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/sources/InsertSuite.scala
@@ -1608,6 +1608,8 @@ class InsertSuite extends DataSourceTest with 
SharedSparkSession {
   TestCase(
 dataSource = "orc",
 Seq(
+  Config(
+None),
   Config(
 Some(SQLConf.ORC_VECTORIZED_READER_ENABLED.key -> "false"),
 insertNullsToStorage = false))),





[spark] branch branch-3.3 updated: [SPARK-39259][SQL][TEST][FOLLOWUP] Fix Scala 2.13 `ClassCastException` in `ComputeCurrentTimeSuite`

2022-06-03 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.3 by this push:
 new b2046c282a5 [SPARK-39259][SQL][TEST][FOLLOWUP] Fix Scala 2.13 
`ClassCastException` in `ComputeCurrentTimeSuite`
b2046c282a5 is described below

commit b2046c282a5be8ade421db61b583a6738f0e9ed6
Author: Dongjoon Hyun 
AuthorDate: Fri Jun 3 12:52:16 2022 -0700

[SPARK-39259][SQL][TEST][FOLLOWUP] Fix Scala 2.13 `ClassCastException` in 
`ComputeCurrentTimeSuite`

### What changes were proposed in this pull request?

Unfortunately, #36654 causes seven Scala 2.13 test failures in master/3.3 
and Apache Spark 3.3 RC4.
This PR aims to fix Scala 2.13 ClassCastException in the test code.
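
A minimal stand-alone illustration (not code from this PR) of the underlying Scala 2.13 behaviour that triggers the failure:

```
import scala.collection.mutable.ArrayBuffer

object CastRepro {
  def main(args: Array[String]): Unit = {
    val buf = ArrayBuffer(1, 2, 3)
    // On Scala 2.13, scala.Seq aliases scala.collection.immutable.Seq, which ArrayBuffer
    // does not implement, so this cast throws ClassCastException at runtime.
    // On Scala 2.12, scala.Seq is scala.collection.Seq and the same cast succeeds.
    val s = buf.asInstanceOf[Seq[Int]]
    println(s)
  }
}
```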

### Why are the changes needed?

```
$ dev/change-scala-version.sh 2.13
$ build/sbt "catalyst/testOnly *.ComputeCurrentTimeSuite" -Pscala-2.13
...
[info] ComputeCurrentTimeSuite:
[info] - analyzer should replace current_timestamp with literals *** FAILED 
*** (1 second, 189 milliseconds)
[info]   java.lang.ClassCastException: scala.collection.mutable.ArrayBuffer 
cannot be cast to scala.collection.immutable.Seq
[info]   at 
org.apache.spark.sql.catalyst.optimizer.ComputeCurrentTimeSuite.literals(ComputeCurrentTimeSuite.scala:146)
[info]   at 
org.apache.spark.sql.catalyst.optimizer.ComputeCurrentTimeSuite.$anonfun$new$1(ComputeCurrentTimeSuite.scala:47)
...
[info] *** 7 TESTS FAILED ***
[error] Failed tests:
[error] 
org.apache.spark.sql.catalyst.optimizer.ComputeCurrentTimeSuite
[error] (catalyst / Test / testOnly) sbt.TestsFailedException: Tests 
unsuccessful
[error] Total time: 189 s (03:09), completed Jun 3, 2022 10:29:39 AM
```

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs and manually test with Scala 2.13.

```
$ dev/change-scala-version.sh 2.13
$ build/sbt "catalyst/testOnly *.ComputeCurrentTimeSuite" -Pscala-2.13
...
[info] ComputeCurrentTimeSuite:
[info] - analyzer should replace current_timestamp with literals (545 
milliseconds)
[info] - analyzer should replace current_date with literals (11 
milliseconds)
[info] - SPARK-33469: Add current_timezone function (3 milliseconds)
[info] - analyzer should replace localtimestamp with literals (4 
milliseconds)
[info] - analyzer should use equal timestamps across subqueries (182 
milliseconds)
[info] - analyzer should use consistent timestamps for different timezones 
(13 milliseconds)
[info] - analyzer should use consistent timestamps for different timestamp 
functions (2 milliseconds)
[info] Run completed in 1 second, 579 milliseconds.
[info] Total number of tests run: 7
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 7, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[success] Total time: 12 s, completed Jun 3, 2022, 10:54:03 AM
```

Closes #36762 from dongjoon-hyun/SPARK-39259.

Authored-by: Dongjoon Hyun 
Signed-off-by: Dongjoon Hyun 
(cherry picked from commit d79aa36b12d9d6816679ba6348705fdd3bd0061e)
Signed-off-by: Dongjoon Hyun 
---
 .../apache/spark/sql/catalyst/optimizer/ComputeCurrentTimeSuite.scala | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ComputeCurrentTimeSuite.scala
 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ComputeCurrentTimeSuite.scala
index c034906c09b..86461522f74 100644
--- 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ComputeCurrentTimeSuite.scala
+++ 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ComputeCurrentTimeSuite.scala
@@ -135,7 +135,7 @@ class ComputeCurrentTimeSuite extends PlanTest {
 assert(offsetsFromQuarterHour.size == 1)
   }
 
-  private def literals[T](plan: LogicalPlan): Seq[T] = {
+  private def literals[T](plan: LogicalPlan): scala.collection.mutable.ArrayBuffer[T] = {
 val literals = new scala.collection.mutable.ArrayBuffer[T]
 plan.transformWithSubqueries { case subQuery =>
   subQuery.transformAllExpressions { case expression: Literal =>
@@ -143,6 +143,6 @@ class ComputeCurrentTimeSuite extends PlanTest {
 expression
   }
 }
-literals.asInstanceOf[Seq[T]]
+literals
   }
 }





[spark] branch master updated: [SPARK-39259][SQL][TEST][FOLLOWUP] Fix Scala 2.13 `ClassCastException` in `ComputeCurrentTimeSuite`

2022-06-03 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new d79aa36b12d [SPARK-39259][SQL][TEST][FOLLOWUP] Fix Scala 2.13 
`ClassCastException` in `ComputeCurrentTimeSuite`
d79aa36b12d is described below

commit d79aa36b12d9d6816679ba6348705fdd3bd0061e
Author: Dongjoon Hyun 
AuthorDate: Fri Jun 3 12:52:16 2022 -0700

[SPARK-39259][SQL][TEST][FOLLOWUP] Fix Scala 2.13 `ClassCastException` in 
`ComputeCurrentTimeSuite`

### What changes were proposed in this pull request?

Unfortunately, #36654 causes seven Scala 2.13 test failures in master/3.3 
and Apache Spark 3.3 RC4.
This PR aims to fix Scala 2.13 ClassCastException in the test code.

### Why are the changes needed?

```
$ dev/change-scala-version.sh 2.13
$ build/sbt "catalyst/testOnly *.ComputeCurrentTimeSuite" -Pscala-2.13
...
[info] ComputeCurrentTimeSuite:
[info] - analyzer should replace current_timestamp with literals *** FAILED 
*** (1 second, 189 milliseconds)
[info]   java.lang.ClassCastException: scala.collection.mutable.ArrayBuffer 
cannot be cast to scala.collection.immutable.Seq
[info]   at 
org.apache.spark.sql.catalyst.optimizer.ComputeCurrentTimeSuite.literals(ComputeCurrentTimeSuite.scala:146)
[info]   at 
org.apache.spark.sql.catalyst.optimizer.ComputeCurrentTimeSuite.$anonfun$new$1(ComputeCurrentTimeSuite.scala:47)
...
[info] *** 7 TESTS FAILED ***
[error] Failed tests:
[error] 
org.apache.spark.sql.catalyst.optimizer.ComputeCurrentTimeSuite
[error] (catalyst / Test / testOnly) sbt.TestsFailedException: Tests 
unsuccessful
[error] Total time: 189 s (03:09), completed Jun 3, 2022 10:29:39 AM
```

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs and manually test with Scala 2.13.

```
$ dev/change-scala-version.sh 2.13
$ build/sbt "catalyst/testOnly *.ComputeCurrentTimeSuite" -Pscala-2.13
...
[info] ComputeCurrentTimeSuite:
[info] - analyzer should replace current_timestamp with literals (545 
milliseconds)
[info] - analyzer should replace current_date with literals (11 
milliseconds)
[info] - SPARK-33469: Add current_timezone function (3 milliseconds)
[info] - analyzer should replace localtimestamp with literals (4 
milliseconds)
[info] - analyzer should use equal timestamps across subqueries (182 
milliseconds)
[info] - analyzer should use consistent timestamps for different timezones 
(13 milliseconds)
[info] - analyzer should use consistent timestamps for different timestamp 
functions (2 milliseconds)
[info] Run completed in 1 second, 579 milliseconds.
[info] Total number of tests run: 7
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 7, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[success] Total time: 12 s, completed Jun 3, 2022, 10:54:03 AM
```

Closes #36762 from dongjoon-hyun/SPARK-39259.

Authored-by: Dongjoon Hyun 
Signed-off-by: Dongjoon Hyun 
---
 .../apache/spark/sql/catalyst/optimizer/ComputeCurrentTimeSuite.scala | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ComputeCurrentTimeSuite.scala
 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ComputeCurrentTimeSuite.scala
index c034906c09b..86461522f74 100644
--- 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ComputeCurrentTimeSuite.scala
+++ 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ComputeCurrentTimeSuite.scala
@@ -135,7 +135,7 @@ class ComputeCurrentTimeSuite extends PlanTest {
 assert(offsetsFromQuarterHour.size == 1)
   }
 
-  private def literals[T](plan: LogicalPlan): Seq[T] = {
+  private def literals[T](plan: LogicalPlan): scala.collection.mutable.ArrayBuffer[T] = {
 val literals = new scala.collection.mutable.ArrayBuffer[T]
 plan.transformWithSubqueries { case subQuery =>
   subQuery.transformAllExpressions { case expression: Literal =>
@@ -143,6 +143,6 @@ class ComputeCurrentTimeSuite extends PlanTest {
 expression
   }
 }
-literals.asInstanceOf[Seq[T]]
+literals
   }
 }





[spark] branch master updated: [SPARK-39265][SQL] Support vectorized Parquet scans with DEFAULT values

2022-06-03 Thread gengliang
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 3e6598e8d4f [SPARK-39265][SQL] Support vectorized Parquet scans with 
DEFAULT values
3e6598e8d4f is described below

commit 3e6598e8d4fbfd7db595d991f6ebad92eb2fa33f
Author: Daniel Tenedorio 
AuthorDate: Fri Jun 3 11:44:55 2022 -0700

[SPARK-39265][SQL] Support vectorized Parquet scans with DEFAULT values

### What changes were proposed in this pull request?

Support vectorized Parquet scans when the table schema has associated 
DEFAULT column values.

Example:

```
create table t(i int) using parquet;
insert into t values(42);
alter table t add column s string default concat('abc', 'def');
select * from t;
> 42, 'abcdef'
```
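
A hedged sketch (plain Scala, not the actual WritableColumnVector API) of the dispatch described in the diff's comment further down: appendObjects picks a type-specific append path based on the runtime class of the default value and signals unsupported types by returning an empty result, which the caller turns into an error.

```
// Conceptual dispatch only; the real logic lives in WritableColumnVector.appendObjects.
def appendDefault(value: Any, count: Int): Option[Vector[Any]] = value match {
  case f: Float => Some(Vector.fill[Any](count)(f)) // e.g. delegates to appendFloats in the real code
  case i: Int   => Some(Vector.fill[Any](count)(i)) // other supported types take their own append path
  case _        => None                             // unsupported type: caller raises IllegalArgumentException
}
```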

### Why are the changes needed?

This change makes it easier to build, query, and maintain tables backed by 
Parquet data.

### Does this PR introduce _any_ user-facing change?

Yes.

### How was this patch tested?

This PR includes new test coverage.

Closes #36672 from dtenedor/default-parquet-vectorized.

Authored-by: Daniel Tenedorio 
Signed-off-by: Gengliang Wang 
---
 .../datasources/parquet/ParquetColumnVector.java   | 31 +++--
 .../parquet/SpecificParquetRecordReaderBase.java   |  5 ++-
 .../parquet/VectorizedParquetRecordReader.java |  6 ++-
 .../execution/vectorized/WritableColumnVector.java | 52 ++
 .../execution/datasources/DataSourceStrategy.scala | 10 +++--
 .../sql/internal/BaseSessionStateBuilder.scala |  2 +-
 .../sql/sources/DataSourceAnalysisSuite.scala  |  3 +-
 .../org/apache/spark/sql/sources/InsertSuite.scala | 17 ---
 .../spark/sql/hive/HiveSessionStateBuilder.scala   |  2 +-
 9 files changed, 97 insertions(+), 31 deletions(-)

diff --git 
a/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetColumnVector.java
 
b/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetColumnVector.java
index c8399d9137f..2ad8cdfcca6 100644
--- 
a/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetColumnVector.java
+++ 
b/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetColumnVector.java
@@ -55,22 +55,14 @@ final class ParquetColumnVector {
   /** Reader for this column - only set if 'isPrimitive' is true */
   private VectorizedColumnReader columnReader;
 
-  ParquetColumnVector(
-  ParquetColumn column,
-  WritableColumnVector vector,
-  int capacity,
-  MemoryMode memoryMode,
-  Set missingColumns) {
-this(column, vector, capacity, memoryMode, missingColumns, true);
-  }
-
   ParquetColumnVector(
   ParquetColumn column,
   WritableColumnVector vector,
   int capacity,
   MemoryMode memoryMode,
   Set missingColumns,
-  boolean isTopLevel) {
+  boolean isTopLevel,
+  Object defaultValue) {
 DataType sparkType = column.sparkType();
 if (!sparkType.sameType(vector.dataType())) {
   throw new IllegalArgumentException("Spark type: " + sparkType +
@@ -83,8 +75,21 @@ final class ParquetColumnVector {
 this.isPrimitive = column.isPrimitive();
 
 if (missingColumns.contains(column)) {
-  vector.setAllNull();
-  return;
+  if (defaultValue == null) {
+vector.setAllNull();
+return;
+  }
+  // For Parquet tables whose columns have associated DEFAULT values, this reader must return
+  // those values instead of NULL when the corresponding columns are not present in storage.
+  // Here we write the 'defaultValue' to each element in the new WritableColumnVector using
+  // the appendObjects method. This delegates to some specific append* method depending on the
+  // type of 'defaultValue'; for example, if 'defaultValue' is a Float, then we call the
+  // appendFloats method.
+  if (!vector.appendObjects(capacity, defaultValue).isPresent()) {
+throw new IllegalArgumentException("Cannot assign default column value to result " +
+  "column batch in vectorized Parquet reader because the data type is not supported: " +
+  defaultValue);
+  }
 }
 
 if (isPrimitive) {
@@ -101,7 +106,7 @@ final class ParquetColumnVector {
 
   for (int i = 0; i < column.children().size(); i++) {
+  ParquetColumnVector childCv = new ParquetColumnVector(column.children().apply(i),
-  vector.getChild(i), capacity, memoryMode, missingColumns, false);
+  vector.getChild(i), capacity, memoryMode, missingColumns, false, null);
 children.add(childCv);
 
 
diff --git 
a/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/SpecificParquetRecordReaderBase.java
 

[spark] branch master updated: [SPARK-39372][R] Support R 4.2.0

2022-06-03 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new c63e37ee645 [SPARK-39372][R] Support R 4.2.0
c63e37ee645 is described below

commit c63e37ee64581abe5d3c639508627233acb2fd70
Author: Hyukjin Kwon 
AuthorDate: Fri Jun 3 10:25:40 2022 -0700

[SPARK-39372][R] Support R 4.2.0

### What changes were proposed in this pull request?

This PR proposes:

- Updates AppVeyor to use the latest R version 4.2.0.
- Uses the correct way of checking whether an object is a matrix: `is.matrix`.
After R 4.2.0, `class(upperBoundsOnCoefficients) != "matrix"` fails:
```
-- 1. Error (test_mllib_classification.R:245:3): spark.logit 
---
Error in `if (class(upperBoundsOnCoefficients) != "matrix") {
stop("upperBoundsOnCoefficients must be a matrix.")
}`: the condition has length > 1
```

This fixes `spark.logit` when `lowerBoundsOnCoefficients` or 
`upperBoundsOnCoefficients` is specified.

- Explicitly use the first element in the `is.na` comparison. From R 4.2.0, this throws an error as shown below:
```
Error in if (is.na(c(1, 2))) print("abc") : the condition has length > 1
```
Previously it was a warning.

This fixes `createDataFrame` or `as.DataFrame` when the data type is a 
nested complex type.

### Why are the changes needed?

To support and test the latest R release; the R community tends to adopt the latest versions aggressively.

### Does this PR introduce _any_ user-facing change?

Yes, after this PR, we officially support R 4.2.0 in SparkR.

### How was this patch tested?

CI in this PR should test it out.

Closes #36758 from HyukjinKwon/upgrade-r-appveyor.

Lead-authored-by: Hyukjin Kwon 
Co-authored-by: Hyukjin Kwon 
Signed-off-by: Dongjoon Hyun 
---
 R/pkg/R/mllib_classification.R| 4 ++--
 R/pkg/R/serialize.R   | 7 ++-
 dev/appveyor-install-dependencies.ps1 | 2 +-
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/R/pkg/R/mllib_classification.R b/R/pkg/R/mllib_classification.R
index 093467ecf7d..7204f8bb7df 100644
--- a/R/pkg/R/mllib_classification.R
+++ b/R/pkg/R/mllib_classification.R
@@ -322,7 +322,7 @@ setMethod("spark.logit", signature(data = "SparkDataFrame", 
formula = "formula")
 }
 
 if (!is.null(lowerBoundsOnCoefficients)) {
-  if (class(lowerBoundsOnCoefficients) != "matrix") {
+  if (!is.matrix(lowerBoundsOnCoefficients)) {
 stop("lowerBoundsOnCoefficients must be a matrix.")
   }
   row <- nrow(lowerBoundsOnCoefficients)
@@ -331,7 +331,7 @@ setMethod("spark.logit", signature(data = "SparkDataFrame", 
formula = "formula")
 }
 
 if (!is.null(upperBoundsOnCoefficients)) {
-  if (class(upperBoundsOnCoefficients) != "matrix") {
+  if (!is.matrix(upperBoundsOnCoefficients)) {
 stop("upperBoundsOnCoefficients must be a matrix.")
   }
 
diff --git a/R/pkg/R/serialize.R b/R/pkg/R/serialize.R
index 7760d9be16f..85c318f30c3 100644
--- a/R/pkg/R/serialize.R
+++ b/R/pkg/R/serialize.R
@@ -58,7 +58,12 @@ writeObject <- function(con, object, writeType = TRUE) {
   # Checking types is needed here, since 'is.na' only handles atomic vectors,
   # lists and pairlists
   if (type %in% c("integer", "character", "logical", "double", "numeric")) {
-if (is.na(object)) {
+if (is.na(object[[1]])) {
+  # Uses the first element for now to keep the behavior the same as R before
+  # 4.2.0. This is wrong because we should differentiate c(NA) from a
+  # single NA as the former means array(null) and the latter means null
+  # in Spark SQL. However, it requires a non-trivial comparison to distinguish
+  # both in R. We should ideally fix this.
   object <- NULL
   type <- "NULL"
 }
diff --git a/dev/appveyor-install-dependencies.ps1 
b/dev/appveyor-install-dependencies.ps1
index d469c98fdb3..19b49b90b38 100644
--- a/dev/appveyor-install-dependencies.ps1
+++ b/dev/appveyor-install-dependencies.ps1
@@ -129,7 +129,7 @@ $env:PATH = "$env:HADOOP_HOME\bin;" + $env:PATH
 Pop-Location
 
 # == R
-$rVer = "4.0.2"
+$rVer = "4.2.0"
 $rToolsVer = "4.0.2"
 
 InstallR





[spark] branch branch-3.2 updated: [SPARK-39367][DOCS][SQL][3.2] Make SQL error objects private

2022-06-03 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.2 by this push:
 new 4a529a06414 [SPARK-39367][DOCS][SQL][3.2] Make SQL error objects 
private
4a529a06414 is described below

commit 4a529a06414c0997aadde9169d1e791ed7675d3c
Author: Gengliang Wang 
AuthorDate: Fri Jun 3 21:53:39 2022 +0900

[SPARK-39367][DOCS][SQL][3.2] Make SQL error objects private

### What changes were proposed in this pull request?

Partially backport https://github.com/apache/spark/pull/36754 to branch-3.2: make the following objects private so that they no longer show up in the API doc (see the sketch after this list):
* QueryCompilationErrors
* QueryExecutionErrors
* QueryParsingErrors
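
A minimal, self-contained illustration (hypothetical object name, not one of the objects above) of the Scala access modifier this backport applies: `private[sql]` keeps the object usable anywhere under `org.apache.spark.sql` while excluding it from the public API, so the generated API docs omit it.

```
package org.apache.spark.sql.errors

// Visible throughout org.apache.spark.sql.*, but not part of the public API surface.
private[sql] object ExampleErrors {
  def exampleError(): Throwable = new IllegalStateException("illustration only")
}
```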

### Why are the changes needed?

Fix a bug in the API doc

### Does this PR introduce _any_ user-facing change?

No
### How was this patch tested?

GA tests

Closes #36761 from gengliangwang/fix3.2Doc.

Authored-by: Gengliang Wang 
Signed-off-by: Hyukjin Kwon 
---
 .../main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala | 2 +-
 .../main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala   | 2 +-
 .../src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
index 7c2780abf7f..e3102aa1e02 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
@@ -46,7 +46,7 @@ import org.apache.spark.sql.types._
  * As commands are executed eagerly, this also includes errors thrown during 
the execution of
  * commands, which users can see immediately.
  */
-object QueryCompilationErrors {
+private[sql] object QueryCompilationErrors {
 
   def groupingIDMismatchError(groupingID: GroupingID, groupByExprs: 
Seq[Expression]): Throwable = {
 new AnalysisException(
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala
index 11dc57cb781..c5ac476aa31 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala
@@ -64,7 +64,7 @@ import org.apache.spark.util.CircularBuffer
  * This does not include exceptions thrown during the eager execution of 
commands, which are
  * grouped into [[QueryCompilationErrors]].
  */
-object QueryExecutionErrors {
+private[sql] object QueryExecutionErrors {
 
   def columnChangeUnsupportedError(): Throwable = {
 new UnsupportedOperationException("Please add an implementation for a 
column change here")
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala
index b6a8163d503..56855e1e533 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala
@@ -27,7 +27,7 @@ import org.apache.spark.sql.catalyst.trees.Origin
  * Object for grouping all error messages of the query parsing.
  * Currently it includes all ParseException.
  */
-object QueryParsingErrors {
+private[sql] object QueryParsingErrors {
 
   def invalidInsertIntoError(ctx: InsertIntoContext): Throwable = {
 new ParseException("Invalid InsertIntoContext", ctx)





[spark] branch branch-3.2 updated (19ce3e67a89 -> 5f571a23c0d)

2022-06-03 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git


from 19ce3e67a89 [SPARK-38807][CORE] Fix the startup error of spark shell 
on Windows
 add 5f571a23c0d [SPARK-39373][SPARK-39273][SPARK-39252][PYTHON][3.2] 
Recover branch-3.2 build broken by and

No new revisions were added by this update.

Summary of changes:
 python/pyspark/pandas/tests/test_series.py | 2 +-
 python/pyspark/sql/tests/test_dataframe.py | 1 -
 2 files changed, 1 insertion(+), 2 deletions(-)





svn commit: r54845 - in /dev/spark/v3.3.0-rc4-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/R/articles/ _site/api/R/deps/ _site/api/R/deps/bootstrap-5.1.0/ _site/api/R/deps/jquery-3.6.0/ _site/api

2022-06-03 Thread maxgekk
Author: maxgekk
Date: Fri Jun  3 12:28:47 2022
New Revision: 54845

Log:
Apache Spark v3.3.0-rc4 docs


[This commit notification would consist of 2665 parts, which exceeds the limit of 50, so it was shortened to this summary.]




svn commit: r54843 - /dev/spark/v3.3.0-rc4-bin/

2022-06-03 Thread maxgekk
Author: maxgekk
Date: Fri Jun  3 11:54:40 2022
New Revision: 54843

Log:
Apache Spark v3.3.0-rc4

Added:
dev/spark/v3.3.0-rc4-bin/
dev/spark/v3.3.0-rc4-bin/SparkR_3.3.0.tar.gz   (with props)
dev/spark/v3.3.0-rc4-bin/SparkR_3.3.0.tar.gz.asc
dev/spark/v3.3.0-rc4-bin/SparkR_3.3.0.tar.gz.sha512
dev/spark/v3.3.0-rc4-bin/pyspark-3.3.0.tar.gz   (with props)
dev/spark/v3.3.0-rc4-bin/pyspark-3.3.0.tar.gz.asc
dev/spark/v3.3.0-rc4-bin/pyspark-3.3.0.tar.gz.sha512
dev/spark/v3.3.0-rc4-bin/spark-3.3.0-bin-hadoop2.tgz   (with props)
dev/spark/v3.3.0-rc4-bin/spark-3.3.0-bin-hadoop2.tgz.asc
dev/spark/v3.3.0-rc4-bin/spark-3.3.0-bin-hadoop2.tgz.sha512
dev/spark/v3.3.0-rc4-bin/spark-3.3.0-bin-hadoop3-scala2.13.tgz   (with 
props)
dev/spark/v3.3.0-rc4-bin/spark-3.3.0-bin-hadoop3-scala2.13.tgz.asc
dev/spark/v3.3.0-rc4-bin/spark-3.3.0-bin-hadoop3-scala2.13.tgz.sha512
dev/spark/v3.3.0-rc4-bin/spark-3.3.0-bin-hadoop3.tgz   (with props)
dev/spark/v3.3.0-rc4-bin/spark-3.3.0-bin-hadoop3.tgz.asc
dev/spark/v3.3.0-rc4-bin/spark-3.3.0-bin-hadoop3.tgz.sha512
dev/spark/v3.3.0-rc4-bin/spark-3.3.0-bin-without-hadoop.tgz   (with props)
dev/spark/v3.3.0-rc4-bin/spark-3.3.0-bin-without-hadoop.tgz.asc
dev/spark/v3.3.0-rc4-bin/spark-3.3.0-bin-without-hadoop.tgz.sha512
dev/spark/v3.3.0-rc4-bin/spark-3.3.0.tgz   (with props)
dev/spark/v3.3.0-rc4-bin/spark-3.3.0.tgz.asc
dev/spark/v3.3.0-rc4-bin/spark-3.3.0.tgz.sha512

Added: dev/spark/v3.3.0-rc4-bin/SparkR_3.3.0.tar.gz
==
Binary file - no diff available.

Propchange: dev/spark/v3.3.0-rc4-bin/SparkR_3.3.0.tar.gz
--
svn:mime-type = application/octet-stream

Added: dev/spark/v3.3.0-rc4-bin/SparkR_3.3.0.tar.gz.asc
==
--- dev/spark/v3.3.0-rc4-bin/SparkR_3.3.0.tar.gz.asc (added)
+++ dev/spark/v3.3.0-rc4-bin/SparkR_3.3.0.tar.gz.asc Fri Jun  3 11:54:40 2022
@@ -0,0 +1,17 @@
+-BEGIN PGP SIGNATURE-
+
+iQJHBAABCgAxFiEEgPuOvo66aFBJiXA0kbXcgV2/ENMFAmKZ9hwTHG1heGdla2tA
+YXBhY2hlLm9yZwAKCRCRtdyBXb8Q02g6EADly9nJXABQs9frXWjgUexvm5TY6+lY
+mbUg3K+faPfljt1NKRjqzkue5ePMm6zm2x2Sj33Rco9iIGQk8H3BKc+6IIOreknJ
+bgGBmZ/ffo7NM2RlReVTKUuVllrFtmXECznG+o4K2w8HrOr498KtXQ2eE33XKG2h
+SzDhMyn6VIIal2FDwc63Edyh2CV89wQpHOFhrhMQbhBziV/IQ5d4ggrbMB+WOVQi
+IK5l0PqUEB+8LYODMC2F5OVt8p0VRr8OOv5YzA6/3Dca5hKHElbDqDgU0KVFQR2d
+03CHh3DmQP7QDfsGN4z+w/VbXu9oBLPeCd4N8mxIRwReqJUuGYrkpgOa1X+5wPKN
+NfR4LBnde7MiBWaonKl/UtvyuYqjA1bxIi/Ff0juhzpWkffLz/dB434HqJe2wArA
+B/wjzcYKkcMt+402si0/B00rjGS2bC8tuTnQbppr1Ln+7i9qDrX0WBzaqSeHAR2l
+J9dwPrGf0w0XPni0fqM3+tZyIkIxWCjhBT4OgBYX/yT3EyBj3KRTjVkpJ3In/fpe
+YD90gZGKR8/YdU0cbnKA6oV9vC3aH8fXUC8gM74cot9OLvczBTYG1GwLVh86e7VG
+qMBcNSxJabiK0uEI2mt09eXrAINxAlw+1vi2NM0ZuAZ0j5pi/SZu23QIiSu8FiIt
+AaoHVlpVgkCL+g==
+=tqAA
+-END PGP SIGNATURE-

Added: dev/spark/v3.3.0-rc4-bin/SparkR_3.3.0.tar.gz.sha512
==
--- dev/spark/v3.3.0-rc4-bin/SparkR_3.3.0.tar.gz.sha512 (added)
+++ dev/spark/v3.3.0-rc4-bin/SparkR_3.3.0.tar.gz.sha512 Fri Jun  3 11:54:40 2022
@@ -0,0 +1 @@
+c53dcb750d9c7ace040b9c6a11661aaea3bdd0500b0da688521fb6a0989ad95dba82655b2c523fbcb6ded11f9c2c81542263fff4d7e28f1e06e7e697c0299bc4
  SparkR_3.3.0.tar.gz

Added: dev/spark/v3.3.0-rc4-bin/pyspark-3.3.0.tar.gz
==
Binary file - no diff available.

Propchange: dev/spark/v3.3.0-rc4-bin/pyspark-3.3.0.tar.gz
--
svn:mime-type = application/octet-stream

Added: dev/spark/v3.3.0-rc4-bin/pyspark-3.3.0.tar.gz.asc
==
--- dev/spark/v3.3.0-rc4-bin/pyspark-3.3.0.tar.gz.asc (added)
+++ dev/spark/v3.3.0-rc4-bin/pyspark-3.3.0.tar.gz.asc Fri Jun  3 11:54:40 2022
@@ -0,0 +1,17 @@
+-BEGIN PGP SIGNATURE-
+
+iQJHBAABCgAxFiEEgPuOvo66aFBJiXA0kbXcgV2/ENMFAmKZ9h4THG1heGdla2tA
+YXBhY2hlLm9yZwAKCRCRtdyBXb8Q02biEACsBNascO1EuZR/a4+rjpSP5RVUC6KD
+8GL8oXtB6KKMy4wUlYPj3xODg3AI7L/9+OQ3lAUpSTgUBr3RvzKEgyhxUYSyTdx4
+CIv7r1ft1NDgYA59sreFu2YuKMY6CsyP9Ze6KSHG2zWxAps9VPN/Ar9dzGUFFC22
+0MdZVXmnl3Ea2KXrxCPINH6p1xANbmQA+G3gLX73oT3z1jCzwbSxubWhj6Yw55YQ
+sMIvWT/4IIkYldEDaGVmZWCAQ/UyCXiLRraymmG2DQVhAeoHxGo5jxdggnRLlSqW
+0J5PWmtNUHjj9g9pFjbm76x4BJLUGuLptnumvbkqYgh5X6h+OKBWMw5ceIpMR2/f
+vPRGa9y1Bk0WluNeN3IIsMe7UuFoJBIuCeOi8UmTbVGoV+naY5psSMtJPylQ8mJR
+c8nY8gXCWeMCWxokNQQIWxXZpRMwWlojoV2AmRUR+nYG+roebyhI3H4rU6SiVXlP
+vae+kIjPQCILPqEwRlCa+vfqj9ukfE0AmusnGhN3/Mc0qOTtkOqRVd2+KHpF+i4C
+JnXqqJhtg4KUCsLqey3gUJsjXgTAHIXxISYWzPWQYBrKBnXBA0/GP1+cow9vTeuB
+TzmirWfaVBv4DkSoWzQ0q8ils3aKsiML07VSyhcVCTWQcoLJ+WR8z3kV+a0vYr5j
+oY4OgV1u6UmElA==
+=n+my
+-END PGP SIGNATURE-

Added: 

[spark] 01/01: Preparing development version 3.3.1-SNAPSHOT

2022-06-03 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git

commit 03012f432ac24049291c71415a32677f612a7afd
Author: Maxim Gekk 
AuthorDate: Fri Jun 3 09:20:38 2022 +

Preparing development version 3.3.1-SNAPSHOT
---
 R/pkg/DESCRIPTION  | 2 +-
 assembly/pom.xml   | 2 +-
 common/kvstore/pom.xml | 2 +-
 common/network-common/pom.xml  | 2 +-
 common/network-shuffle/pom.xml | 2 +-
 common/network-yarn/pom.xml| 2 +-
 common/sketch/pom.xml  | 2 +-
 common/tags/pom.xml| 2 +-
 common/unsafe/pom.xml  | 2 +-
 core/pom.xml   | 2 +-
 docs/_config.yml   | 6 +++---
 examples/pom.xml   | 2 +-
 external/avro/pom.xml  | 2 +-
 external/docker-integration-tests/pom.xml  | 2 +-
 external/kafka-0-10-assembly/pom.xml   | 2 +-
 external/kafka-0-10-sql/pom.xml| 2 +-
 external/kafka-0-10-token-provider/pom.xml | 2 +-
 external/kafka-0-10/pom.xml| 2 +-
 external/kinesis-asl-assembly/pom.xml  | 2 +-
 external/kinesis-asl/pom.xml   | 2 +-
 external/spark-ganglia-lgpl/pom.xml| 2 +-
 graphx/pom.xml | 2 +-
 hadoop-cloud/pom.xml   | 2 +-
 launcher/pom.xml   | 2 +-
 mllib-local/pom.xml| 2 +-
 mllib/pom.xml  | 2 +-
 pom.xml| 2 +-
 repl/pom.xml   | 2 +-
 resource-managers/kubernetes/core/pom.xml  | 2 +-
 resource-managers/kubernetes/integration-tests/pom.xml | 2 +-
 resource-managers/mesos/pom.xml| 2 +-
 resource-managers/yarn/pom.xml | 2 +-
 sql/catalyst/pom.xml   | 2 +-
 sql/core/pom.xml   | 2 +-
 sql/hive-thriftserver/pom.xml  | 2 +-
 sql/hive/pom.xml   | 2 +-
 streaming/pom.xml  | 2 +-
 tools/pom.xml  | 2 +-
 38 files changed, 40 insertions(+), 40 deletions(-)

diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION
index 9479bb3bf87..0e449e841cf 100644
--- a/R/pkg/DESCRIPTION
+++ b/R/pkg/DESCRIPTION
@@ -1,6 +1,6 @@
 Package: SparkR
 Type: Package
-Version: 3.3.0
+Version: 3.3.1
 Title: R Front End for 'Apache Spark'
 Description: Provides an R Front end for 'Apache Spark' 
.
 Authors@R:
diff --git a/assembly/pom.xml b/assembly/pom.xml
index 2e9c4d9960b..d12f2ad73fa 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   
 org.apache.spark
 spark-parent_2.12
-3.3.0
+3.3.1-SNAPSHOT
 ../pom.xml
   
 
diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml
index 2a9acfa335e..842d63f5d38 100644
--- a/common/kvstore/pom.xml
+++ b/common/kvstore/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.spark
 spark-parent_2.12
-3.3.0
+3.3.1-SNAPSHOT
 ../../pom.xml
   
 
diff --git a/common/network-common/pom.xml b/common/network-common/pom.xml
index 7b17e625d75..f7d187bf952 100644
--- a/common/network-common/pom.xml
+++ b/common/network-common/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.spark
 spark-parent_2.12
-3.3.0
+3.3.1-SNAPSHOT
 ../../pom.xml
   
 
diff --git a/common/network-shuffle/pom.xml b/common/network-shuffle/pom.xml
index c5c920e7747..53f38df8851 100644
--- a/common/network-shuffle/pom.xml
+++ b/common/network-shuffle/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.spark
 spark-parent_2.12
-3.3.0
+3.3.1-SNAPSHOT
 ../../pom.xml
   
 
diff --git a/common/network-yarn/pom.xml b/common/network-yarn/pom.xml
index 697b5a3928e..845f6659407 100644
--- a/common/network-yarn/pom.xml
+++ b/common/network-yarn/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.spark
 spark-parent_2.12
-3.3.0
+3.3.1-SNAPSHOT
 ../../pom.xml
   
 
diff --git a/common/sketch/pom.xml b/common/sketch/pom.xml
index ad2db11370a..8e159089193 100644
--- a/common/sketch/pom.xml
+++ b/common/sketch/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.spark
 spark-parent_2.12
-3.3.0
+3.3.1-SNAPSHOT
 ../../pom.xml
   
 
diff --git a/common/tags/pom.xml b/common/tags/pom.xml
index 1a7bdee70f3..1987c133285 100644
--- a/common/tags/pom.xml
+++ 

[spark] branch branch-3.3 updated (61d22b6f313 -> 03012f432ac)

2022-06-03 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git


from 61d22b6f313 [SPARK-39371][DOCS][CORE] Review and fix issues in 
Scala/Java API docs of Core module
 add 4e3599bc11a Preparing Spark release v3.3.0-rc4
 new 03012f432ac Preparing development version 3.3.1-SNAPSHOT

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:





[spark] tag v3.3.0-rc4 created (now 4e3599bc11a)

2022-06-03 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to tag v3.3.0-rc4
in repository https://gitbox.apache.org/repos/asf/spark.git


  at 4e3599bc11a (commit)
This tag includes the following new commits:

 new 4e3599bc11a Preparing Spark release v3.3.0-rc4

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.






[spark] 01/01: Preparing Spark release v3.3.0-rc4

2022-06-03 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to tag v3.3.0-rc4
in repository https://gitbox.apache.org/repos/asf/spark.git

commit 4e3599bc11a1cb0ea9fc819e7f752d2228e54baf
Author: Maxim Gekk 
AuthorDate: Fri Jun 3 09:20:31 2022 +

Preparing Spark release v3.3.0-rc4
---
 R/pkg/DESCRIPTION  | 2 +-
 assembly/pom.xml   | 2 +-
 common/kvstore/pom.xml | 2 +-
 common/network-common/pom.xml  | 2 +-
 common/network-shuffle/pom.xml | 2 +-
 common/network-yarn/pom.xml| 2 +-
 common/sketch/pom.xml  | 2 +-
 common/tags/pom.xml| 2 +-
 common/unsafe/pom.xml  | 2 +-
 core/pom.xml   | 2 +-
 docs/_config.yml   | 6 +++---
 examples/pom.xml   | 2 +-
 external/avro/pom.xml  | 2 +-
 external/docker-integration-tests/pom.xml  | 2 +-
 external/kafka-0-10-assembly/pom.xml   | 2 +-
 external/kafka-0-10-sql/pom.xml| 2 +-
 external/kafka-0-10-token-provider/pom.xml | 2 +-
 external/kafka-0-10/pom.xml| 2 +-
 external/kinesis-asl-assembly/pom.xml  | 2 +-
 external/kinesis-asl/pom.xml   | 2 +-
 external/spark-ganglia-lgpl/pom.xml| 2 +-
 graphx/pom.xml | 2 +-
 hadoop-cloud/pom.xml   | 2 +-
 launcher/pom.xml   | 2 +-
 mllib-local/pom.xml| 2 +-
 mllib/pom.xml  | 2 +-
 pom.xml| 2 +-
 repl/pom.xml   | 2 +-
 resource-managers/kubernetes/core/pom.xml  | 2 +-
 resource-managers/kubernetes/integration-tests/pom.xml | 2 +-
 resource-managers/mesos/pom.xml| 2 +-
 resource-managers/yarn/pom.xml | 2 +-
 sql/catalyst/pom.xml   | 2 +-
 sql/core/pom.xml   | 2 +-
 sql/hive-thriftserver/pom.xml  | 2 +-
 sql/hive/pom.xml   | 2 +-
 streaming/pom.xml  | 2 +-
 tools/pom.xml  | 2 +-
 38 files changed, 40 insertions(+), 40 deletions(-)

diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION
index 0e449e841cf..9479bb3bf87 100644
--- a/R/pkg/DESCRIPTION
+++ b/R/pkg/DESCRIPTION
@@ -1,6 +1,6 @@
 Package: SparkR
 Type: Package
-Version: 3.3.1
+Version: 3.3.0
 Title: R Front End for 'Apache Spark'
 Description: Provides an R Front end for 'Apache Spark' 
.
 Authors@R:
diff --git a/assembly/pom.xml b/assembly/pom.xml
index d12f2ad73fa..2e9c4d9960b 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   
 org.apache.spark
 spark-parent_2.12
-3.3.1-SNAPSHOT
+3.3.0
 ../pom.xml
   
 
diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml
index 842d63f5d38..2a9acfa335e 100644
--- a/common/kvstore/pom.xml
+++ b/common/kvstore/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.spark
 spark-parent_2.12
-3.3.1-SNAPSHOT
+3.3.0
 ../../pom.xml
   
 
diff --git a/common/network-common/pom.xml b/common/network-common/pom.xml
index f7d187bf952..7b17e625d75 100644
--- a/common/network-common/pom.xml
+++ b/common/network-common/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.spark
 spark-parent_2.12
-3.3.1-SNAPSHOT
+3.3.0
 ../../pom.xml
   
 
diff --git a/common/network-shuffle/pom.xml b/common/network-shuffle/pom.xml
index 53f38df8851..c5c920e7747 100644
--- a/common/network-shuffle/pom.xml
+++ b/common/network-shuffle/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.spark
 spark-parent_2.12
-3.3.1-SNAPSHOT
+3.3.0
 ../../pom.xml
   
 
diff --git a/common/network-yarn/pom.xml b/common/network-yarn/pom.xml
index 845f6659407..697b5a3928e 100644
--- a/common/network-yarn/pom.xml
+++ b/common/network-yarn/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.spark
 spark-parent_2.12
-3.3.1-SNAPSHOT
+3.3.0
 ../../pom.xml
   
 
diff --git a/common/sketch/pom.xml b/common/sketch/pom.xml
index 8e159089193..ad2db11370a 100644
--- a/common/sketch/pom.xml
+++ b/common/sketch/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.spark
 spark-parent_2.12
-3.3.1-SNAPSHOT
+3.3.0
 ../../pom.xml
   
 
diff --git a/common/tags/pom.xml b/common/tags/pom.xml
index 1987c133285..1a7bdee70f3 100644
--- a/common/tags/pom.xml
+++ 

[spark] branch branch-3.3 updated: [SPARK-39371][DOCS][CORE] Review and fix issues in Scala/Java API docs of Core module

2022-06-03 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.3 by this push:
 new 61d22b6f313 [SPARK-39371][DOCS][CORE] Review and fix issues in 
Scala/Java API docs of Core module
61d22b6f313 is described below

commit 61d22b6f313c20de1b65a595e88b6f5bd9595299
Author: Yuanjian Li 
AuthorDate: Fri Jun 3 17:49:01 2022 +0900

[SPARK-39371][DOCS][CORE] Review and fix issues in Scala/Java API docs of 
Core module

Compare the 3.3.0 API doc with the latest release version 3.2.1. Fix the 
following issues:

* Add missing Since annotation for new APIs
* Remove the leaking class/object in API doc

Improve API docs

No

Existing UT

Closes #36757 from xuanyuanking/doc.

Authored-by: Yuanjian Li 
Signed-off-by: Hyukjin Kwon 
(cherry picked from commit 1fbb1d46feb992c3441f2a4f2c5d5179da465d4b)
Signed-off-by: Hyukjin Kwon 
---
 core/src/main/scala/org/apache/spark/errors/SparkCoreErrors.scala   | 2 +-
 .../spark/storage/BlockSavedOnDecommissionedBlockManagerException.scala | 2 +-
 launcher/src/main/java/org/apache/spark/launcher/AbstractLauncher.java  | 2 +-
 launcher/src/main/java/org/apache/spark/launcher/InProcessLauncher.java | 2 +-
 launcher/src/main/java/org/apache/spark/launcher/JavaModuleOptions.java | 2 ++
 5 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/errors/SparkCoreErrors.scala 
b/core/src/main/scala/org/apache/spark/errors/SparkCoreErrors.scala
index aecef8ed2d6..1da02884462 100644
--- a/core/src/main/scala/org/apache/spark/errors/SparkCoreErrors.scala
+++ b/core/src/main/scala/org/apache/spark/errors/SparkCoreErrors.scala
@@ -30,7 +30,7 @@ import org.apache.spark.storage.{BlockId, BlockManagerId, 
BlockNotFoundException
 /**
  * Object for grouping error messages from (most) exceptions thrown during 
query execution.
  */
-object SparkCoreErrors {
+private[spark] object SparkCoreErrors {
   def unexpectedPy4JServerError(other: Object): Throwable = {
 new RuntimeException(s"Unexpected Py4J server ${other.getClass}")
   }
diff --git 
a/core/src/main/scala/org/apache/spark/storage/BlockSavedOnDecommissionedBlockManagerException.scala
 
b/core/src/main/scala/org/apache/spark/storage/BlockSavedOnDecommissionedBlockManagerException.scala
index 4684d9c6775..21a022864bb 100644
--- 
a/core/src/main/scala/org/apache/spark/storage/BlockSavedOnDecommissionedBlockManagerException.scala
+++ 
b/core/src/main/scala/org/apache/spark/storage/BlockSavedOnDecommissionedBlockManagerException.scala
@@ -17,5 +17,5 @@
 
 package org.apache.spark.storage
 
-class BlockSavedOnDecommissionedBlockManagerException(blockId: BlockId)
+private[spark] class BlockSavedOnDecommissionedBlockManagerException(blockId: 
BlockId)
   extends Exception(s"Block $blockId cannot be saved on decommissioned 
executor")
diff --git 
a/launcher/src/main/java/org/apache/spark/launcher/AbstractLauncher.java 
b/launcher/src/main/java/org/apache/spark/launcher/AbstractLauncher.java
index 8a1256f7341..80b71e53075 100644
--- a/launcher/src/main/java/org/apache/spark/launcher/AbstractLauncher.java
+++ b/launcher/src/main/java/org/apache/spark/launcher/AbstractLauncher.java
@@ -26,7 +26,7 @@ import static org.apache.spark.launcher.CommandBuilderUtils.*;
 /**
  * Base class for launcher implementations.
  *
- * @since Spark 2.3.0
+ * @since 2.3.0
  */
 public abstract class AbstractLauncher> {
 
diff --git 
a/launcher/src/main/java/org/apache/spark/launcher/InProcessLauncher.java 
b/launcher/src/main/java/org/apache/spark/launcher/InProcessLauncher.java
index 688e1f763c2..6867518b321 100644
--- a/launcher/src/main/java/org/apache/spark/launcher/InProcessLauncher.java
+++ b/launcher/src/main/java/org/apache/spark/launcher/InProcessLauncher.java
@@ -37,7 +37,7 @@ import java.util.logging.Logger;
  * driver memory or configs which modify the driver's class path) do not take 
effect. Logging
  * configuration is also inherited from the parent application.
  *
- * @since Spark 2.3.0
+ * @since 2.3.0
  */
 public class InProcessLauncher extends AbstractLauncher {
 
diff --git 
a/launcher/src/main/java/org/apache/spark/launcher/JavaModuleOptions.java 
b/launcher/src/main/java/org/apache/spark/launcher/JavaModuleOptions.java
index c7d3df99c6e..978466cd77c 100644
--- a/launcher/src/main/java/org/apache/spark/launcher/JavaModuleOptions.java
+++ b/launcher/src/main/java/org/apache/spark/launcher/JavaModuleOptions.java
@@ -21,6 +21,8 @@ package org.apache.spark.launcher;
  * This helper class is used to place the all `--add-opens` options
  * required by Spark when using Java 17. `DEFAULT_MODULE_OPTIONS` has added
  * `-XX:+IgnoreUnrecognizedVMOptions` to be compatible with Java 8 and Java 11.
+ *
+ * @since 3.3.0
  */
 public class JavaModuleOptions {
 

[spark] branch master updated: [SPARK-39371][DOCS][CORE] Review and fix issues in Scala/Java API docs of Core module

2022-06-03 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 1fbb1d46feb [SPARK-39371][DOCS][CORE] Review and fix issues in 
Scala/Java API docs of Core module
1fbb1d46feb is described below

commit 1fbb1d46feb992c3441f2a4f2c5d5179da465d4b
Author: Yuanjian Li 
AuthorDate: Fri Jun 3 17:49:01 2022 +0900

[SPARK-39371][DOCS][CORE] Review and fix issues in Scala/Java API docs of 
Core module

### What changes were proposed in this pull request?

Compare the 3.3.0 API doc with the latest release version 3.2.1. Fix the 
following issues:

* Add missing Since annotation for new APIs
* Remove the leaking class/object in API doc

### Why are the changes needed?

Improve API docs

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Existing UT

Closes #36757 from xuanyuanking/doc.

Authored-by: Yuanjian Li 
Signed-off-by: Hyukjin Kwon 
---
 core/src/main/scala/org/apache/spark/errors/SparkCoreErrors.scala   | 2 +-
 .../storage/BlockSavedOnDecommissionedBlockManagerException.scala   | 2 +-
 .../src/main/java/org/apache/spark/launcher/AbstractLauncher.java   | 2 +-
 .../src/main/java/org/apache/spark/launcher/InProcessLauncher.java  | 2 +-
 .../src/main/java/org/apache/spark/launcher/JavaModuleOptions.java  | 2 ++
 .../scala/org/apache/spark/sql/diagnostic/DiagnosticListener.scala  | 4 ++--
 .../scala/org/apache/spark/sql/diagnostic/DiagnosticStore.scala | 6 +++---
 7 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/errors/SparkCoreErrors.scala 
b/core/src/main/scala/org/apache/spark/errors/SparkCoreErrors.scala
index aecef8ed2d6..1da02884462 100644
--- a/core/src/main/scala/org/apache/spark/errors/SparkCoreErrors.scala
+++ b/core/src/main/scala/org/apache/spark/errors/SparkCoreErrors.scala
@@ -30,7 +30,7 @@ import org.apache.spark.storage.{BlockId, BlockManagerId, 
BlockNotFoundException
 /**
  * Object for grouping error messages from (most) exceptions thrown during 
query execution.
  */
-object SparkCoreErrors {
+private[spark] object SparkCoreErrors {
   def unexpectedPy4JServerError(other: Object): Throwable = {
 new RuntimeException(s"Unexpected Py4J server ${other.getClass}")
   }
diff --git 
a/core/src/main/scala/org/apache/spark/storage/BlockSavedOnDecommissionedBlockManagerException.scala
 
b/core/src/main/scala/org/apache/spark/storage/BlockSavedOnDecommissionedBlockManagerException.scala
index 4684d9c6775..21a022864bb 100644
--- 
a/core/src/main/scala/org/apache/spark/storage/BlockSavedOnDecommissionedBlockManagerException.scala
+++ 
b/core/src/main/scala/org/apache/spark/storage/BlockSavedOnDecommissionedBlockManagerException.scala
@@ -17,5 +17,5 @@
 
 package org.apache.spark.storage
 
-class BlockSavedOnDecommissionedBlockManagerException(blockId: BlockId)
+private[spark] class BlockSavedOnDecommissionedBlockManagerException(blockId: 
BlockId)
   extends Exception(s"Block $blockId cannot be saved on decommissioned 
executor")
diff --git 
a/launcher/src/main/java/org/apache/spark/launcher/AbstractLauncher.java 
b/launcher/src/main/java/org/apache/spark/launcher/AbstractLauncher.java
index eee15419209..a944950cf15 100644
--- a/launcher/src/main/java/org/apache/spark/launcher/AbstractLauncher.java
+++ b/launcher/src/main/java/org/apache/spark/launcher/AbstractLauncher.java
@@ -26,7 +26,7 @@ import static org.apache.spark.launcher.CommandBuilderUtils.*;
 /**
  * Base class for launcher implementations.
  *
- * @since Spark 2.3.0
+ * @since 2.3.0
  */
 public abstract class AbstractLauncher> {
 
diff --git 
a/launcher/src/main/java/org/apache/spark/launcher/InProcessLauncher.java 
b/launcher/src/main/java/org/apache/spark/launcher/InProcessLauncher.java
index 688e1f763c2..6867518b321 100644
--- a/launcher/src/main/java/org/apache/spark/launcher/InProcessLauncher.java
+++ b/launcher/src/main/java/org/apache/spark/launcher/InProcessLauncher.java
@@ -37,7 +37,7 @@ import java.util.logging.Logger;
  * driver memory or configs which modify the driver's class path) do not take 
effect. Logging
  * configuration is also inherited from the parent application.
  *
- * @since Spark 2.3.0
+ * @since 2.3.0
  */
 public class InProcessLauncher extends AbstractLauncher {
 
diff --git 
a/launcher/src/main/java/org/apache/spark/launcher/JavaModuleOptions.java 
b/launcher/src/main/java/org/apache/spark/launcher/JavaModuleOptions.java
index c7d3df99c6e..978466cd77c 100644
--- a/launcher/src/main/java/org/apache/spark/launcher/JavaModuleOptions.java
+++ b/launcher/src/main/java/org/apache/spark/launcher/JavaModuleOptions.java
@@ -21,6 +21,8 @@ package org.apache.spark.launcher;
  * This helper class is used to place the all `--add-opens` 

[spark] branch master updated (873ad5596b5 -> bed0b907f8e)

2022-06-03 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from 873ad5596b5 [SPARK-37623][SQL] Support ANSI Aggregate Function: 
regr_intercept
 add bed0b907f8e [SPARK-39369][INFRA] Use JAVA_OPTS for AppVeyor build to increase the memory properly

No new revisions were added by this update.

Summary of changes:
 appveyor.yml | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)





[spark] branch master updated (9e6f2dd7268 -> 873ad5596b5)

2022-06-03 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from 9e6f2dd7268 [SPARK-39320][SQL] Support aggregate function `MEDIAN`
 add 873ad5596b5 [SPARK-37623][SQL] Support ANSI Aggregate Function: 
regr_intercept

No new revisions were added by this update.

Summary of changes:
 .../sql/catalyst/analysis/FunctionRegistry.scala   |  1 +
 .../expressions/aggregate/Covariance.scala |  4 +-
 .../expressions/aggregate/linearRegression.scala   | 57 +-
 .../aggregate/AggregateExpressionSuite.scala   | 17 +++
 .../sql-functions/sql-expression-schema.md |  1 +
 .../sql-tests/inputs/linear-regression.sql |  6 +++
 .../inputs/postgreSQL/aggregates_part1.sql |  2 +-
 .../inputs/udf/postgreSQL/udf-aggregates_part1.sql |  2 +-
 .../sql-tests/results/linear-regression.sql.out| 35 -
 .../results/postgreSQL/aggregates_part1.sql.out| 10 +++-
 .../udf/postgreSQL/udf-aggregates_part1.sql.out| 10 +++-
 11 files changed, 136 insertions(+), 9 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.3 updated: [SPARK-39259][SQL][3.3] Evaluate timestamps consistently in subqueries

2022-06-03 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.3 by this push:
 new 4a0f0ff6c22 [SPARK-39259][SQL][3.3] Evaluate timestamps consistently 
in subqueries
4a0f0ff6c22 is described below

commit 4a0f0ff6c22b85cb0fc1eef842da8dbe4c90543a
Author: Ole Sasse 
AuthorDate: Fri Jun 3 09:12:26 2022 +0300

[SPARK-39259][SQL][3.3] Evaluate timestamps consistently in subqueries

### What changes were proposed in this pull request?

Apply the optimizer rule ComputeCurrentTime consistently across subqueries.

This is a backport of https://github.com/apache/spark/pull/36654.

### Why are the changes needed?

At the moment, timestamp functions like now() can return different values within a query if subqueries are involved.
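
An illustrative query (assuming a SparkSession named `spark`; not taken from this PR's tests): before this change, the two now() calls below could be folded to different literals because the subquery was rewritten in a separate traversal, whereas with ComputeCurrentTime applied consistently a single instant is stamped across the whole plan and the comparison is expected to be true.

```
// Both occurrences of now() should now be replaced with the same literal timestamp.
val sameInstant = spark.sql("SELECT now() = (SELECT now()) AS same_instant")
sameInstant.show()
```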

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

A new unit test was added

Closes #36752 from olaky/SPARK-39259-spark_3_3.

Authored-by: Ole Sasse 
Signed-off-by: Max Gekk 
---
 .../sql/catalyst/optimizer/finishAnalysis.scala| 41 +-
 .../spark/sql/catalyst/plans/QueryPlan.scala   | 11 ++-
 .../optimizer/ComputeCurrentTimeSuite.scala| 89 --
 3 files changed, 95 insertions(+), 46 deletions(-)

diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/finishAnalysis.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/finishAnalysis.scala
index ef9c4b9af40..242c799dd22 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/finishAnalysis.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/finishAnalysis.scala
@@ -17,14 +17,16 @@
 
 package org.apache.spark.sql.catalyst.optimizer
 
-import scala.collection.mutable
+import java.time.{Instant, LocalDateTime}
 
 import org.apache.spark.sql.catalyst.CurrentUserContext.CURRENT_USER
 import org.apache.spark.sql.catalyst.expressions._
 import org.apache.spark.sql.catalyst.plans.logical._
 import org.apache.spark.sql.catalyst.rules._
 import org.apache.spark.sql.catalyst.trees.TreePattern._
-import org.apache.spark.sql.catalyst.util.DateTimeUtils.{convertSpecialDate, convertSpecialTimestamp, convertSpecialTimestampNTZ}
+import org.apache.spark.sql.catalyst.trees.TreePatternBits
+import org.apache.spark.sql.catalyst.util.DateTimeUtils
+import org.apache.spark.sql.catalyst.util.DateTimeUtils.{convertSpecialDate, convertSpecialTimestamp, convertSpecialTimestampNTZ, instantToMicros, localDateTimeToMicros}
 import org.apache.spark.sql.connector.catalog.CatalogManager
 import org.apache.spark.sql.types._
 import org.apache.spark.util.Utils
@@ -73,29 +75,30 @@ object RewriteNonCorrelatedExists extends Rule[LogicalPlan] 
{
  */
 object ComputeCurrentTime extends Rule[LogicalPlan] {
   def apply(plan: LogicalPlan): LogicalPlan = {
-val currentDates = mutable.Map.empty[String, Literal]
-val timeExpr = CurrentTimestamp()
-val timestamp = timeExpr.eval(EmptyRow).asInstanceOf[Long]
-val currentTime = Literal.create(timestamp, timeExpr.dataType)
+val instant = Instant.now()
+val currentTimestampMicros = instantToMicros(instant)
+val currentTime = Literal.create(currentTimestampMicros, TimestampType)
 val timezone = Literal.create(conf.sessionLocalTimeZone, StringType)
-val localTimestamps = mutable.Map.empty[String, Literal]
 
-plan.transformAllExpressionsWithPruning(_.containsPattern(CURRENT_LIKE)) {
-  case currentDate @ CurrentDate(Some(timeZoneId)) =>
-currentDates.getOrElseUpdate(timeZoneId, {
-  Literal.create(currentDate.eval().asInstanceOf[Int], DateType)
-})
-  case CurrentTimestamp() | Now() => currentTime
-  case CurrentTimeZone() => timezone
-  case localTimestamp @ LocalTimestamp(Some(timeZoneId)) =>
-localTimestamps.getOrElseUpdate(timeZoneId, {
-  Literal.create(localTimestamp.eval().asInstanceOf[Long], 
TimestampNTZType)
-})
+def transformCondition(treePatternbits: TreePatternBits): Boolean = {
+  treePatternbits.containsPattern(CURRENT_LIKE)
+}
+
+plan.transformDownWithSubqueries(transformCondition) {
+  case subQuery =>
+subQuery.transformAllExpressionsWithPruning(transformCondition) {
+  case cd: CurrentDate =>
+Literal.create(DateTimeUtils.microsToDays(currentTimestampMicros, cd.zoneId), DateType)
+  case CurrentTimestamp() | Now() => currentTime
+  case CurrentTimeZone() => timezone
+  case localTimestamp: LocalTimestamp =>
+val asDateTime = LocalDateTime.ofInstant(instant, localTimestamp.zoneId)
+Literal.create(localDateTimeToMicros(asDateTime), TimestampNTZType)
+}
 }
   }
 }