Bhuwan Sahni created SPARK-45655:
------------------------------------

             Summary: current_date() not supported in Streaming Query Observed metrics
                 Key: SPARK-45655
                 URL: https://issues.apache.org/jira/browse/SPARK-45655
             Project: Spark
          Issue Type: Bug
          Components: Structured Streaming
    Affects Versions: 3.5.0, 3.4.1
            Reporter: Bhuwan Sahni


Streaming queries do not support current_date() inside CollectMetrics. The 
primary reason is that, in a streaming query, current_date() resolves to 
CurrentBatchTimestamp, which is marked as non-deterministic. However, 
{{current_date}} and {{current_timestamp}} are both deterministic today, and 
{{current_batch_timestamp}} should be treated the same way.
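
The reasoning above can be illustrated with a small toy model. This is only a sketch: the types below (Expr, Column, BatchTimestampLeaf, GreaterOrEqual, Count) and the validateMetric helper are hypothetical names made up for illustration, not Spark's Catalyst classes or its actual validation code. It assumes that an expression is deterministic only when all of its children are, and that observed metrics reject non-deterministic expressions.

```scala
// Toy model only: hypothetical classes, NOT Spark's Catalyst API.
object ObservedMetricsSketch {
  sealed trait Expr {
    def children: Seq[Expr]
    // A parent is deterministic only if every child is deterministic.
    def deterministic: Boolean = children.forall(_.deterministic)
  }

  case class Column(name: String) extends Expr {
    val children: Seq[Expr] = Nil
  }

  // Stand-in for a per-batch timestamp leaf; the flag models how it is marked.
  case class BatchTimestampLeaf(markedDeterministic: Boolean) extends Expr {
    val children: Seq[Expr] = Nil
    override def deterministic: Boolean = markedDeterministic
  }

  case class GreaterOrEqual(left: Expr, right: Expr) extends Expr {
    val children: Seq[Expr] = Seq(left, right)
  }

  case class Count(child: Expr) extends Expr {
    val children: Seq[Expr] = Seq(child)
  }

  // Simplified stand-in for a metrics check: reject non-deterministic metrics.
  def validateMetric(e: Expr): Either[String, Expr] =
    if (e.deterministic) Right(e)
    else Left("non-deterministic expression in observed metric")
}
```

In this model, a single leaf marked as non-deterministic makes the whole count(...) metric non-deterministic, so the check rejects it. Marking the leaf as deterministic is enough for the same metric to pass.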

 

As an example, the query below fails because of the observe call on the DataFrame.

 
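{quote}import java.sql.Timestamp
import org.apache.spark.sql.execution.streaming.MemoryStream
import org.apache.spark.sql.functions.{count, expr}

// Assumes an implicit Encoder[Timestamp] and SQLContext are in scope.
val inputData = MemoryStream[Timestamp]

inputData.toDF()
      .filter("value < current_date()")
      .observe("metrics", count(expr("value >= current_date()")).alias("dropped"))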
      .writeStream
      .queryName("ts_metrics_test")
      .format("memory")
      .outputMode("append")
      .start()
{quote}
 


