This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.5
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.5 by this push:
     new c61d89aa9485 [SPARK-45357][CONNECT][TESTS][3.5] Normalize 
`dataframeId` when comparing `CollectMetrics` in `SparkConnectProtoSuite`
c61d89aa9485 is described below

commit c61d89aa94859e3b75409a71d48d4f1a023eceac
Author: yangjie01 <yangji...@baidu.com>
AuthorDate: Fri Feb 16 09:20:11 2024 -0800

    [SPARK-45357][CONNECT][TESTS][3.5] Normalize `dataframeId` when comparing 
`CollectMetrics` in `SparkConnectProtoSuite`
    
    ### What changes were proposed in this pull request?
    This PR adds a new function `normalizeDataframeId` that sets the `dataframeId` 
of `CollectMetrics` to the constant 0 before comparing `LogicalPlan`s in the 
test cases of `SparkConnectProtoSuite`.
    
    ### Why are the changes needed?
    The test scenarios in `SparkConnectProtoSuite` do not need to compare the 
`dataframeId` in `CollectMetrics`.
    
    ### Does this PR introduce _any_ user-facing change?
    No
    
    ### How was this patch tested?
    - Manually check
    
    run
    
    ```
    build/mvn clean install -pl connector/connect/server -am -DskipTests
    build/mvn test -pl connector/connect/server
    ```
    
    **Before**
    
    ```
    - Test observe *** FAILED ***
      == FAIL: Plans do not match ===
      !CollectMetrics my_metric, [min(id#0) AS min_val#0, max(id#0) AS 
max_val#0, sum(id#0) AS sum(id)#0L], 0   CollectMetrics my_metric, [min(id#0) 
AS min_val#0, max(id#0) AS max_val#0, sum(id#0) AS sum(id)#0L], 53
       +- LocalRelation <empty>, [id#0, name#0]                                 
                                +- LocalRelation <empty>, [id#0, name#0] 
(PlanTest.scala:179)
    ```
    
    **After**
    
    ```
    Run completed in 41 seconds, 631 milliseconds.
    Total number of tests run: 882
    Suites: completed 24, aborted 0
    Tests: succeeded 882, failed 0, canceled 0, ignored 0, pending 0
    All tests passed.
    ```
    
    ### Was this patch authored or co-authored using generative AI tooling?
    No
    
    Closes #45141 from LuciferYang/SPARK-45357-35.
    
    Authored-by: yangjie01 <yangji...@baidu.com>
    Signed-off-by: Dongjoon Hyun <dh...@apple.com>
---
 .../apache/spark/sql/connect/planner/SparkConnectProtoSuite.scala  | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git 
a/connector/connect/server/src/test/scala/org/apache/spark/sql/connect/planner/SparkConnectProtoSuite.scala
 
b/connector/connect/server/src/test/scala/org/apache/spark/sql/connect/planner/SparkConnectProtoSuite.scala
index 0c12bf5e625a..8bc4de835124 100644
--- 
a/connector/connect/server/src/test/scala/org/apache/spark/sql/connect/planner/SparkConnectProtoSuite.scala
+++ 
b/connector/connect/server/src/test/scala/org/apache/spark/sql/connect/planner/SparkConnectProtoSuite.scala
@@ -30,7 +30,7 @@ import org.apache.spark.sql.{AnalysisException, Column, 
DataFrame, Observation,
 import org.apache.spark.sql.catalyst.analysis
 import org.apache.spark.sql.catalyst.expressions.{AttributeReference, 
GenericInternalRow, UnsafeProjection}
 import org.apache.spark.sql.catalyst.plans.{FullOuter, Inner, LeftAnti, 
LeftOuter, LeftSemi, PlanTest, RightOuter}
-import org.apache.spark.sql.catalyst.plans.logical.{Distinct, LocalRelation, 
LogicalPlan}
+import org.apache.spark.sql.catalyst.plans.logical.{CollectMetrics, Distinct, 
LocalRelation, LogicalPlan}
 import org.apache.spark.sql.catalyst.types.DataTypeUtils
 import org.apache.spark.sql.connect.common.InvalidPlanInput
 import 
org.apache.spark.sql.connect.common.LiteralValueProtoConverter.toLiteralProto
@@ -1067,7 +1067,10 @@ class SparkConnectProtoSuite extends PlanTest with 
SparkConnectPlanTest {
 
   // Compares proto plan with LogicalPlan.
   private def comparePlans(connectPlan: proto.Relation, sparkPlan: 
LogicalPlan): Unit = {
+    def normalizeDataframeId(plan: LogicalPlan): LogicalPlan = plan transform {
+      case cm: CollectMetrics => cm.copy(dataframeId = 0)
+    }
     val connectAnalyzed = analyzePlan(transform(connectPlan))
-    comparePlans(connectAnalyzed, sparkPlan, false)
+    comparePlans(normalizeDataframeId(connectAnalyzed), 
normalizeDataframeId(sparkPlan), false)
   }
 }


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org

Reply via email to