[spark] branch master updated (678592a -> 5c8a141)

2021-05-28 Thread jiangxb1987
This is an automated email from the ASF dual-hosted git repository.

jiangxb1987 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 678592a  [SPARK-35559][TEST] Speed up one test in AdaptiveQueryExecSuite
 add 5c8a141  [SPARK-35538][SQL] Migrate transformAllExpressions call sites to use transformAllExpressionsWithPruning

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/sql/catalyst/expressions/DynamicPruning.scala| 3 ++-
 .../spark/sql/catalyst/expressions/complexTypeCreator.scala   | 4 +++-
 .../spark/sql/catalyst/expressions/datetimeExpressions.scala  | 2 ++
 .../org/apache/spark/sql/catalyst/expressions/subquery.scala  | 4 ++--
 .../org/apache/spark/sql/catalyst/optimizer/UpdateFields.scala| 7 ---
 .../scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala | 4 +++-
 .../scala/org/apache/spark/sql/catalyst/trees/TreePatterns.scala  | 8 
 .../main/scala/org/apache/spark/sql/execution/CacheManager.scala  | 3 ++-
 .../execution/adaptive/PlanAdaptiveDynamicPruningFilters.scala| 4 +++-
 .../spark/sql/execution/adaptive/ReuseAdaptiveSubquery.scala  | 3 ++-
 .../scala/org/apache/spark/sql/execution/exchange/Exchange.scala  | 8 +---
 .../spark/sql/execution/streaming/IncrementalExecution.scala  | 4 +++-
 .../spark/sql/execution/streaming/MicroBatchExecution.scala   | 4 +++-
 .../sql/execution/streaming/continuous/ContinuousExecution.scala  | 3 ++-
 .../src/main/scala/org/apache/spark/sql/execution/subquery.scala  | 5 +++--
 15 files changed, 47 insertions(+), 19 deletions(-)
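
For context: `transformAllExpressionsWithPruning` takes a predicate over a plan's tree-pattern bits and skips whole subtrees that cannot contain the targeted pattern, instead of visiting every node as `transformAllExpressions` does. A minimal sketch of the call shape, assuming a `DYNAMIC_PRUNING_SUBQUERY` entry in the `TreePattern` enum (TreePatterns.scala is among the files touched); the rule body is illustrative, not the PR's code:

```scala
import org.apache.spark.sql.catalyst.expressions.{DynamicPruningSubquery, Literal}
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.catalyst.trees.TreePattern.DYNAMIC_PRUNING_SUBQUERY

// Only subtrees whose pattern bits say DYNAMIC_PRUNING_SUBQUERY may occur are
// traversed; all other subtrees are skipped without invoking the partial function.
def removeDynamicFilters(plan: LogicalPlan): LogicalPlan =
  plan.transformAllExpressionsWithPruning(_.containsPattern(DYNAMIC_PRUNING_SUBQUERY)) {
    case _: DynamicPruningSubquery => Literal.TrueLiteral
  }
```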




[spark] branch branch-3.1 updated: [SPARK-35559][TEST] Speed up one test in AdaptiveQueryExecSuite

2021-05-28 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.1 by this push:
 new c0ad339  [SPARK-35559][TEST] Speed up one test in AdaptiveQueryExecSuite
c0ad339 is described below

commit c0ad339f7a6520a7840a933977dca8670c8bf83a
Author: Wenchen Fan 
AuthorDate: Fri May 28 12:39:34 2021 -0700

[SPARK-35559][TEST] Speed up one test in AdaptiveQueryExecSuite

### What changes were proposed in this pull request?

I just noticed that `AdaptiveQueryExecSuite.SPARK-34091: Batch shuffle fetch in AQE partition coalescing` takes more than 10 minutes to finish, which is unacceptable.

This PR sets the shuffle partitions to 10 in that test, so that the test can finish within 5 seconds.

### Why are the changes needed?

Speed up the test.

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

N/A

Closes #32695 from cloud-fan/test.

Authored-by: Wenchen Fan 
Signed-off-by: Dongjoon Hyun 
(cherry picked from commit 678592a6121a5237f05956e7d9f0565d82d1860a)
Signed-off-by: Dongjoon Hyun 
---
 .../apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala| 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala
index f7570c0..c8c4f97 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala
@@ -1457,7 +1457,7 @@ class AdaptiveQueryExecSuite
   test("SPARK-34091: Batch shuffle fetch in AQE partition coalescing") {
 withSQLConf(
   SQLConf.ADAPTIVE_EXECUTION_ENABLED.key -> "true",
-  SQLConf.SHUFFLE_PARTITIONS.key -> "1",
+  SQLConf.SHUFFLE_PARTITIONS.key -> "10",
   SQLConf.FETCH_SHUFFLE_BLOCKS_IN_BATCH.key -> "true") {
   withTable("t1") {
 spark.range(100).selectExpr("id + 1 as a").write.format("parquet").saveAsTable("t1")




[spark] branch master updated (b763db3 -> 678592a)

2021-05-28 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from b763db3  [SPARK-35194][SQL][FOLLOWUP] Recover build error with Scala 2.13 on GA
 add 678592a  [SPARK-35559][TEST] Speed up one test in AdaptiveQueryExecSuite

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala| 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)




[GitHub] [spark-website] dongjoon-hyun commented on pull request #345: Add Apache Spark 3.1.2 doc

2021-05-28 Thread GitBox


dongjoon-hyun commented on pull request #345:
URL: https://github.com/apache/spark-website/pull/345#issuecomment-850536782


   Thank you, @HyukjinKwon and @gengliangwang !





[spark] branch master updated: [SPARK-35194][SQL][FOLLOWUP] Recover build error with Scala 2.13 on GA

2021-05-28 Thread sarutak
This is an automated email from the ASF dual-hosted git repository.

sarutak pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new b763db3  [SPARK-35194][SQL][FOLLOWUP] Recover build error with Scala 2.13 on GA
b763db3 is described below

commit b763db3efdd6a58e34c136b03426371400afefd1
Author: Kousuke Saruta 
AuthorDate: Sat May 29 00:11:16 2021 +0900

[SPARK-35194][SQL][FOLLOWUP] Recover build error with Scala 2.13 on GA

### What changes were proposed in this pull request?

This PR fixes a build error with Scala 2.13 on GA; #32301 appears to have introduced it.

### Why are the changes needed?

To recover CI.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

GA

Closes #32696 from sarutak/followup-SPARK-35194.

Authored-by: Kousuke Saruta 
Signed-off-by: Kousuke Saruta 
---
 .../org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasing.scala | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
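
For context, the Scala 2.13 incompatibility comes from `Map.mapValues`, which returns a strict `Map` in 2.12 but a lazy `MapView` in 2.13; a minimal generic illustration (not the PR's code):

```scala
val groups: Map[Int, Seq[String]] = Map(1 -> Seq("a", "b"), 2 -> Seq("c"))

// Scala 2.12: mapValues yields a Map[Int, Int], which satisfies APIs expecting
// a strict collection of pairs. Scala 2.13: mapValues yields a MapView[Int, Int],
// so an explicit .toSeq (or .toMap) is required, mirroring the fix below.
val sizes: Seq[(Int, Int)] = groups.mapValues(_.size).toSeq
```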

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasing.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasing.scala
index cd7032d..e0e8f92 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasing.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasing.scala
@@ -146,7 +146,8 @@ object NestedColumnAliasing {
 val nestedFieldToAlias = attributeToExtractValuesAndAliases.values.flatten.toMap
 
 // A reference attribute can have multiple aliases for nested fields.
-val attrToAliases = AttributeMap(attributeToExtractValuesAndAliases.mapValues(_.map(_._2)))
+val attrToAliases =
+  AttributeMap(attributeToExtractValuesAndAliases.mapValues(_.map(_._2)).toSeq)
 
 plan match {
   case Project(projectList, child) =>




[spark] branch master updated: [SPARK-35194][SQL] Refactor nested column aliasing for readability

2021-05-28 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new e863166  [SPARK-35194][SQL] Refactor nested column aliasing for readability
e863166 is described below

commit e8631660ecf316e4333210650d1f40b5912fb11b
Author: Karen Feng 
AuthorDate: Fri May 28 13:18:44 2021 +

[SPARK-35194][SQL] Refactor nested column aliasing for readability

### What changes were proposed in this pull request?

Refactors `NestedColumnAliasing` and `GeneratorNestedColumnAliasing` for readability.

### Why are the changes needed?

Improves readability for future maintenance.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing tests.

Closes #32301 from karenfeng/refactor-nested-column-aliasing.

Authored-by: Karen Feng 
Signed-off-by: Wenchen Fan 
---
 .../scala-2.12/.../sql/catalyst/expressions/AttributeMap.scala|   6 +
 .../scala-2.13/.../sql/catalyst/expressions/AttributeMap.scala|   6 +
 .../catalyst/optimizer/NestedColumnAliasing.scala  | 426 -
 .../spark/sql/catalyst/optimizer/Optimizer.scala   |   4 +-
 .../optimizer/NestedColumnAliasingSuite.scala  |   2 +-
 5 files changed, 250 insertions(+), 194 deletions(-)

diff --git a/sql/catalyst/src/main/scala-2.12/org/apache/spark/sql/catalyst/expressions/AttributeMap.scala b/sql/catalyst/src/main/scala-2.12/org/apache/spark/sql/catalyst/expressions/AttributeMap.scala
index 42b92d4..189318a 100644
--- a/sql/catalyst/src/main/scala-2.12/org/apache/spark/sql/catalyst/expressions/AttributeMap.scala
+++ b/sql/catalyst/src/main/scala-2.12/org/apache/spark/sql/catalyst/expressions/AttributeMap.scala
@@ -23,6 +23,10 @@ package org.apache.spark.sql.catalyst.expressions
  * of the name, or the expected nullability).
  */
 object AttributeMap {
+  def apply[A](kvs: Map[Attribute, A]): AttributeMap[A] = {
+new AttributeMap(kvs.map(kv => (kv._1.exprId, kv)))
+  }
+
   def apply[A](kvs: Seq[(Attribute, A)]): AttributeMap[A] = {
 new AttributeMap(kvs.map(kv => (kv._1.exprId, kv)).toMap)
   }
@@ -37,6 +41,8 @@ class AttributeMap[A](val baseMap: Map[ExprId, (Attribute, A)])

   override def get(k: Attribute): Option[A] = baseMap.get(k.exprId).map(_._2)

+  override def getOrElse[B1 >: A](k: Attribute, default: => B1): B1 = get(k).getOrElse(default)
+
   override def contains(k: Attribute): Boolean = get(k).isDefined

   override def + [B1 >: A](kv: (Attribute, B1)): Map[Attribute, B1] = baseMap.values.toMap + kv
diff --git a/sql/catalyst/src/main/scala-2.13/org/apache/spark/sql/catalyst/expressions/AttributeMap.scala b/sql/catalyst/src/main/scala-2.13/org/apache/spark/sql/catalyst/expressions/AttributeMap.scala
index e6b53e3..7715291 100644
--- a/sql/catalyst/src/main/scala-2.13/org/apache/spark/sql/catalyst/expressions/AttributeMap.scala
+++ b/sql/catalyst/src/main/scala-2.13/org/apache/spark/sql/catalyst/expressions/AttributeMap.scala
@@ -23,6 +23,10 @@ package org.apache.spark.sql.catalyst.expressions
  * of the name, or the expected nullability).
  */
 object AttributeMap {
+  def apply[A](kvs: Map[Attribute, A]): AttributeMap[A] = {
+new AttributeMap(kvs.map(kv => (kv._1.exprId, kv)))
+  }
+
   def apply[A](kvs: Seq[(Attribute, A)]): AttributeMap[A] = {
 new AttributeMap(kvs.map(kv => (kv._1.exprId, kv)).toMap)
   }
@@ -37,6 +41,8 @@ class AttributeMap[A](val baseMap: Map[ExprId, (Attribute, A)])

   override def get(k: Attribute): Option[A] = baseMap.get(k.exprId).map(_._2)

+  override def getOrElse[B1 >: A](k: Attribute, default: => B1): B1 = get(k).getOrElse(default)
+
   override def contains(k: Attribute): Boolean = get(k).isDefined

   override def updated[B1 >: A](key: Attribute, value: B1): Map[Attribute, B1] =
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasing.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasing.scala
index 5b12667..cd7032d 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasing.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasing.scala
@@ -17,71 +17,151 @@
 
 package org.apache.spark.sql.catalyst.optimizer
 
+import scala.collection.mutable
+
 import org.apache.spark.sql.catalyst.expressions._
 import org.apache.spark.sql.catalyst.plans.logical._
 import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.types._
 
 /**
- * This aims to handle a nested column aliasing pattern inside the `ColumnPruning` optimizer rule.
- * If a project or its child references to nested fields, and not all the fields
- * in a nested attribute are used, we can substitute them by alias attributes; then a pro

[spark] branch branch-3.1 updated: [SPARK-35454][SQL][3.1] One LogicalPlan can match multiple dataset ids

2021-05-28 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.1 by this push:
 new 36ab645  [SPARK-35454][SQL][3.1] One LogicalPlan can match multiple dataset ids
36ab645 is described below

commit 36ab645f2a02961f6f248bfc1250594dcc252b03
Author: yi.wu 
AuthorDate: Fri May 28 13:04:55 2021 +

[SPARK-35454][SQL][3.1] One LogicalPlan can match multiple dataset ids

### What changes were proposed in this pull request?

Change the type of `DATASET_ID_TAG` from `Long` to `HashSet[Long]` to allow the logical plan to match multiple datasets.

### Why are the changes needed?

During the transformation from one Dataset to another Dataset, the `DATASET_ID_TAG` of the logical plan won't change if the plan itself doesn't change:


https://github.com/apache/spark/blob/b5241c97b17a1139a4ff719bfce7f68aef094d95/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala#L234-L237

However, the dataset id always changes even if the logical plan doesn't change:

https://github.com/apache/spark/blob/b5241c97b17a1139a4ff719bfce7f68aef094d95/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala#L207-L208

And this can lead to a mismatch between the dataset's id and the column's `__dataset_id`. E.g.,

```scala
  test("SPARK-28344: fail ambiguous self join - Dataset.colRegex as column 
ref") {
// The test can fail if we change it to:
// val df1 = spark.range(3).toDF()
// val df2 = df1.filter($"id" > 0).toDF()
val df1 = spark.range(3)
val df2 = df1.filter($"id" > 0)

withSQLConf(
  SQLConf.FAIL_AMBIGUOUS_SELF_JOIN_ENABLED.key -> "true",
  SQLConf.CROSS_JOINS_ENABLED.key -> "true") {
  assertAmbiguousSelfJoin(df1.join(df2, df1.colRegex("id") > df2.colRegex("id")))
}
  }
```
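
A rough sketch of the core idea (hypothetical helper, not the PR's exact code): the tag now accumulates every dataset id a plan has belonged to instead of keeping only the first:

```scala
import scala.collection.mutable.HashSet

import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.catalyst.trees.TreeNodeTag

val DATASET_ID_TAG = TreeNodeTag[HashSet[Long]]("dataset_id")

// Hypothetical helper: add this dataset's id without discarding the ids
// recorded by the datasets that previously wrapped the same plan.
def recordDatasetId(plan: LogicalPlan, id: Long): Unit = {
  val ids = plan.getTagValue(DATASET_ID_TAG).getOrElse(new HashSet[Long])
  ids += id
  plan.setTagValue(DATASET_ID_TAG, ids)
}
```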

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Added unit tests.

Closes #32692 from Ngone51/spark-35454-3.1.

Authored-by: yi.wu 
Signed-off-by: Wenchen Fan 
---
 .../catalyst/plans/logical/AnalysisHelper.scala|  4 +-
 .../main/scala/org/apache/spark/sql/Dataset.scala  | 11 +--
 .../analysis/DetectAmbiguousSelfJoin.scala |  8 +-
 .../apache/spark/sql/DataFrameSelfJoinSuite.scala  | 87 ++
 4 files changed, 100 insertions(+), 10 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/AnalysisHelper.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/AnalysisHelper.scala
index 54b0141..b31b3e6 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/AnalysisHelper.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/AnalysisHelper.scala
@@ -91,7 +91,9 @@ trait AnalysisHelper extends QueryPlan[LogicalPlan] { self: LogicalPlan =>
   }
 } else {
   CurrentOrigin.withOrigin(origin) {
-rule.applyOrElse(afterRuleOnChildren, identity[LogicalPlan])
+val afterRule = rule.applyOrElse(afterRuleOnChildren, identity[LogicalPlan])
+afterRule.copyTagsFrom(self)
+afterRule
   }
 }
   }
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala b/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
index 6afbbce..1c76f4c 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
@@ -20,7 +20,7 @@ package org.apache.spark.sql
 import java.io.{ByteArrayOutputStream, CharArrayWriter, DataOutputStream}
 
 import scala.collection.JavaConverters._
-import scala.collection.mutable.ArrayBuffer
+import scala.collection.mutable.{ArrayBuffer, HashSet}
 import scala.reflect.runtime.universe.TypeTag
 import scala.util.control.NonFatal
 
@@ -69,7 +69,7 @@ private[sql] object Dataset {
   val curId = new java.util.concurrent.atomic.AtomicLong()
   val DATASET_ID_KEY = "__dataset_id"
   val COL_POS_KEY = "__col_position"
-  val DATASET_ID_TAG = TreeNodeTag[Long]("dataset_id")
+  val DATASET_ID_TAG = TreeNodeTag[HashSet[Long]]("dataset_id")
 
  def apply[T: Encoder](sparkSession: SparkSession, logicalPlan: LogicalPlan): Dataset[T] = {
 val dataset = new Dataset(sparkSession, logicalPlan, implicitly[Encoder[T]])
@@ -231,9 +231,10 @@ class Dataset[T] private[sql](
   case _ =>
 queryExecution.analyzed
 }
-if (sparkSession.sessionState.conf.getConf(SQLConf.FAIL_AMBIGUOUS_SELF_JOIN_ENABLED) &&
-plan.getTagValue(Dataset.DATASET_ID_TAG).isEmpty) {
-  plan.setTagValue(Dataset.DATASET_ID_TAG, id)
+if (sparkSession.sessionState.conf.getConf(SQLConf.FAIL_AMBIGUOUS_SELF_JOIN_ENABLED))

[spark] branch master updated: [SPARK-35552][SQL] Make query stage materialized more readable

2021-05-28 Thread gengliang
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 3b94aad  [SPARK-35552][SQL] Make query stage materialized more readable
3b94aad is described below

commit 3b94aad5e72a6b96e4a8f517ac60e0a2fed2590b
Author: ulysses-you 
AuthorDate: Fri May 28 20:42:11 2021 +0800

[SPARK-35552][SQL] Make query stage materialized more readable

### What changes were proposed in this pull request?

Add a new method `isMaterialized` in `QueryStageExec`.

### Why are the changes needed?

Currently, we use `resultOption().get.isDefined` to check whether a query stage has materialized. That code is not readable at a glance; it's better to capture the check in a dedicated method like `isMaterialized`.
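
A minimal sketch of such a helper (assumed shape; the diff below shows only the call sites):

```scala
import java.util.concurrent.atomic.AtomicReference

// Sketch, assuming the stage exposes its result via an AtomicReference that is
// set exactly once when the stage finishes materializing.
abstract class QueryStageSketch {
  val resultOption: AtomicReference[Option[Any]] = new AtomicReference[Option[Any]](None)

  // Readable replacement for inlining `resultOption.get().isDefined` at call sites.
  def isMaterialized: Boolean = resultOption.get().isDefined
}
```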

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass CI.

Closes #32689 from ulysses-you/SPARK-35552.

Authored-by: ulysses-you 
Signed-off-by: Gengliang Wang 
---
 .../spark/sql/execution/adaptive/AQEPropagateEmptyRelation.scala   | 5 ++---
 .../spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala   | 6 +++---
 .../apache/spark/sql/execution/adaptive/DynamicJoinSelection.scala | 2 +-
 .../org/apache/spark/sql/execution/adaptive/QueryStageExec.scala   | 7 +--
 4 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AQEPropagateEmptyRelation.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AQEPropagateEmptyRelation.scala
index 614fc78..648d2e7 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AQEPropagateEmptyRelation.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AQEPropagateEmptyRelation.scala
@@ -37,14 +37,13 @@ object AQEPropagateEmptyRelation extends PropagateEmptyRelationBase {
 super.nonEmpty(plan) || getRowCount(plan).exists(_ > 0)
 
   private def getRowCount(plan: LogicalPlan): Option[BigInt] = plan match {
-case LogicalQueryStage(_, stage: QueryStageExec) if stage.resultOption.get().isDefined =>
+case LogicalQueryStage(_, stage: QueryStageExec) if stage.isMaterialized =>
   stage.getRuntimeStatistics.rowCount
 case _ => None
   }
 
  private def isRelationWithAllNullKeys(plan: LogicalPlan): Boolean = plan match {
-case LogicalQueryStage(_, stage: BroadcastQueryStageExec)
-  if stage.resultOption.get().isDefined =>
+case LogicalQueryStage(_, stage: BroadcastQueryStageExec) if stage.isMaterialized =>
   stage.broadcast.relationFuture.get().value == HashedRelationWithAllNullKeys
 case _ => false
   }
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala
index 556c036..ebff790 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala
@@ -420,7 +420,7 @@ case class AdaptiveSparkPlanExec(
   context.stageCache.get(e.canonicalized) match {
 case Some(existingStage) if conf.exchangeReuseEnabled =>
   val stage = reuseQueryStage(existingStage, e)
-  val isMaterialized = stage.resultOption.get().isDefined
+  val isMaterialized = stage.isMaterialized
   CreateStageResult(
 newPlan = stage,
 allChildStagesMaterialized = isMaterialized,
@@ -442,7 +442,7 @@ case class AdaptiveSparkPlanExec(
 newStage = reuseQueryStage(queryStage, e)
   }
 }
-val isMaterialized = newStage.resultOption.get().isDefined
+val isMaterialized = newStage.isMaterialized
 CreateStageResult(
   newPlan = newStage,
   allChildStagesMaterialized = isMaterialized,
@@ -455,7 +455,7 @@ case class AdaptiveSparkPlanExec(
 
 case q: QueryStageExec =>
   CreateStageResult(newPlan = q,
-allChildStagesMaterialized = q.resultOption.get().isDefined, newStages = Seq.empty)
+allChildStagesMaterialized = q.isMaterialized, newStages = Seq.empty)
 
 case _ =>
   if (plan.children.isEmpty) {
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/DynamicJoinSelection.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/DynamicJoinSelection.scala
index 61124f0..a8c74b5 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/DynamicJoinSelection.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/DynamicJoinSelection.scala
@@ -53,7 +53,7 @@ object DynamicJoinSelection extends Rule[Logica

[spark] branch master updated (2de19e4 -> 7eb7448)

2021-05-28 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 2de19e4  [SPARK-35483][INFRA] Add docker-integration-tests to run-tests.py and GA
 add 7eb7448  [SPARK-35510][PYTHON] Fix and reenable test_stats_on_non_numeric_columns_should_be_discarded_if_numeric_only_is_true

No new revisions were added by this update.

Summary of changes:
 python/pyspark/pandas/tests/test_stats.py | 24 +++-
 1 file changed, 15 insertions(+), 9 deletions(-)




[spark] branch master updated (d189cf7 -> 2de19e4)

2021-05-28 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from d189cf7  Revert "[SPARK-35483][INFRA] Add docker-integration-tests to run-tests.py and GA"
 add 2de19e4  [SPARK-35483][INFRA] Add docker-integration-tests to run-tests.py and GA

No new revisions were added by this update.

Summary of changes:
 .github/workflows/build_and_test.yml   | 82 ++
 dev/run-tests.py   | 16 +++--
 dev/sparktestsupport/modules.py| 15 
 .../spark/sql/jdbc/DB2IntegrationSuite.scala   |  2 +-
 .../spark/sql/jdbc/DB2KrbIntegrationSuite.scala|  2 +-
 .../sql/jdbc/DockerIntegrationFunSuite.scala}  | 22 --
 .../sql/jdbc/DockerJDBCIntegrationSuite.scala  |  5 +-
 .../sql/jdbc/DockerKrbJDBCIntegrationSuite.scala   |  2 +-
 .../sql/jdbc/MariaDBKrbIntegrationSuite.scala  |  2 +-
 .../sql/jdbc/MsSqlServerIntegrationSuite.scala |  2 +-
 .../spark/sql/jdbc/MySQLIntegrationSuite.scala |  2 +-
 .../spark/sql/jdbc/OracleIntegrationSuite.scala|  3 +-
 .../spark/sql/jdbc/PostgresIntegrationSuite.scala  |  2 +-
 .../sql/jdbc/PostgresKrbIntegrationSuite.scala |  2 +-
 .../spark/sql/jdbc/v2/DB2IntegrationSuite.scala|  2 +-
 .../sql/jdbc/v2/MsSqlServerIntegrationSuite.scala  |  2 +-
 .../spark/sql/jdbc/v2/MySQLIntegrationSuite.scala  |  4 +-
 .../spark/sql/jdbc/v2/OracleIntegrationSuite.scala |  3 +-
 .../sql/jdbc/v2/PostgresIntegrationSuite.scala |  2 +-
 .../spark/sql/jdbc/v2/PostgresNamespaceSuite.scala |  2 +-
 .../spark/sql/jdbc/v2/V2JDBCNamespaceTest.scala|  3 +-
 .../org/apache/spark/sql/jdbc/v2/V2JDBCTest.scala  |  3 +-
 22 files changed, 146 insertions(+), 34 deletions(-)
 copy external/{kinesis-asl/src/test/scala/org/apache/spark/streaming/kinesis/KinesisFunSuite.scala => docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/DockerIntegrationFunSuite.scala} (68%)
