This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new efc1aedce6d7 [SPARK-46331][SQL][FOLLOWUP] Time travel should support 
current datetime functions
efc1aedce6d7 is described below

commit efc1aedce6d767b40862a8bbade3e250fe7b5637
Author: Wenchen Fan <wenc...@databricks.com>
AuthorDate: Tue Mar 12 21:52:32 2024 +0500

    [SPARK-46331][SQL][FOLLOWUP] Time travel should support current datetime 
functions
    
    ### What changes were proposed in this pull request?
    
    This is a followup of https://github.com/apache/spark/pull/44261 to fix one 
regression: we missed the fact that time travel needs to evaluate current 
datetime functions before the analysis phase completes. This PR fixes it by 
explicitly computing current datetime values in the time travel handling.
    
    ### Why are the changes needed?
    
    Re-enables support for current datetime functions as the time travel timestamp.
    
    ### Does this PR introduce _any_ user-facing change?
    
    no, the regression has not been released yet.
    
    ### How was this patch tested?
    
    new test
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    no
    
    Closes #45476 from cloud-fan/time-travel.
    
    Authored-by: Wenchen Fan <wenc...@databricks.com>
    Signed-off-by: Max Gekk <max.g...@gmail.com>
---
 .../apache/spark/sql/catalyst/analysis/TimeTravelSpec.scala | 13 ++++++++++---
 .../apache/spark/sql/connector/DataSourceV2SQLSuite.scala   | 13 +++++++++++++
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TimeTravelSpec.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TimeTravelSpec.scala
index 8bfcd955497b..fecec238145e 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TimeTravelSpec.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TimeTravelSpec.scala
@@ -18,7 +18,9 @@
 package org.apache.spark.sql.catalyst.analysis
 
 import org.apache.spark.sql.AnalysisException
-import org.apache.spark.sql.catalyst.expressions.{Cast, Expression, Literal, 
RuntimeReplaceable, SubqueryExpression, Unevaluable}
+import org.apache.spark.sql.catalyst.expressions.{Alias, Cast, Expression, 
Literal, SubqueryExpression, Unevaluable}
+import org.apache.spark.sql.catalyst.optimizer.{ComputeCurrentTime, 
ReplaceExpressions}
+import org.apache.spark.sql.catalyst.plans.logical.{OneRowRelation, Project}
 import org.apache.spark.sql.errors.QueryCompilationErrors
 import org.apache.spark.sql.types.TimestampType
 import org.apache.spark.sql.util.CaseInsensitiveStringMap
@@ -42,14 +44,19 @@ object TimeTravelSpec {
         throw QueryCompilationErrors.invalidTimestampExprForTimeTravel(
           "INVALID_TIME_TRAVEL_TIMESTAMP_EXPR.INPUT", ts)
       }
-      val tsToEval = ts.transform {
-        case r: RuntimeReplaceable => r.replacement
+      val tsToEval = {
+        val fakeProject = Project(Seq(Alias(ts, "ts")()), OneRowRelation())
+        
ComputeCurrentTime(ReplaceExpressions(fakeProject)).asInstanceOf[Project]
+          .expressions.head.asInstanceOf[Alias].child
+      }
+      tsToEval.foreach {
         case _: Unevaluable =>
           throw QueryCompilationErrors.invalidTimestampExprForTimeTravel(
             "INVALID_TIME_TRAVEL_TIMESTAMP_EXPR.UNEVALUABLE", ts)
         case e if !e.deterministic =>
           throw QueryCompilationErrors.invalidTimestampExprForTimeTravel(
             "INVALID_TIME_TRAVEL_TIMESTAMP_EXPR.NON_DETERMINISTIC", ts)
+        case _ =>
       }
       val tz = Some(sessionLocalTimeZone)
       // Set `ansiEnabled` to false, so that it can return null for invalid 
input and we can provide
diff --git 
a/sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala
 
b/sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala
index f21c0c2b52fa..93f199dfd585 100644
--- 
a/sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala
+++ 
b/sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala
@@ -3003,6 +3003,19 @@ class DataSourceV2SQLSuiteV1Filter
         .collect()
       assert(res10 === Array(Row(7), Row(8)))
 
+      checkError(
+        exception = intercept[AnalysisException] {
+          // `current_date()` is a valid expression for time travel timestamp, 
but the test uses
+          // a fake time travel implementation that only supports two 
hardcoded timestamp values.
+          sql("SELECT * FROM t TIMESTAMP AS OF current_date()")
+        },
+        errorClass = "TABLE_OR_VIEW_NOT_FOUND",
+        parameters = Map("relationName" -> "`t`"),
+        context = ExpectedContext(
+          fragment = "t",
+          start = 14,
+          stop = 14))
+
       checkError(
         exception = intercept[AnalysisException] {
           sql("SELECT * FROM t TIMESTAMP AS OF INTERVAL 1 DAY").collect()


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org

Reply via email to