GitHub user tdas opened a pull request:

    https://github.com/apache/spark/pull/19495

    [SPARK-22278][SS] Expose current event time watermark and current 
processing time in GroupState

    ## What changes were proposed in this pull request?
    
    Complex state-updating and/or timeout-handling logic in mapGroupsWithState 
functions may need to make decisions based on the current event-time 
watermark and/or processing time. Currently, you can use the SQL function 
`current_timestamp` to get the current processing time, but it has to be 
inserted into every row with a select and then passed through the 
encoder, which is inefficient. Furthermore, there is no way to get the current 
watermark.
    
    This PR exposes both of them through the GroupState API. 
    It also cleans up some of the GroupState docs. 
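
A minimal sketch of how a state-update function might read both values from GroupState. The accessor names `getCurrentWatermarkMs()` and `getCurrentProcessingTimeMs()` are taken from the Spark `GroupState` API as released, not from this message; the case classes, the `WatermarkAwareUpdate` object, and the lag calculation are hypothetical and exist only for illustration. It assumes Spark SQL is on the classpath.

```scala
import org.apache.spark.sql.streaming.GroupState

// Hypothetical input, state, and output types (not part of the PR).
case class Event(user: String, eventTimeMs: Long)
case class SessionState(lastEventMs: Long, count: Long)
case class SessionReport(user: String, count: Long, lagBehindWatermarkMs: Long, observedAtMs: Long)

object WatermarkAwareUpdate {
  // Pure helper: how far the newest event seen for a key trails the watermark.
  // Returns 0 when the event is at or ahead of the watermark.
  def lagBehindWatermark(lastEventMs: Long, watermarkMs: Long): Long =
    math.max(0L, watermarkMs - lastEventMs)

  // Function with the shape expected by mapGroupsWithState:
  // (key, values for the key in this trigger, per-key state) => output.
  def update(
      user: String,
      events: Iterator[Event],
      state: GroupState[SessionState]): SessionReport = {
    // Read both values directly from the state handle instead of carrying
    // a current_timestamp column through every row and the encoder.
    val watermarkMs = state.getCurrentWatermarkMs()
    val nowMs = state.getCurrentProcessingTimeMs()

    val previous = state.getOption.getOrElse(SessionState(0L, 0L))
    val batch = events.toSeq
    // Newest event time seen so far, including earlier triggers.
    val newest = (batch.map(_.eventTimeMs) :+ previous.lastEventMs).max
    val updated = SessionState(newest, previous.count + batch.size)
    state.update(updated)

    SessionReport(user, updated.count, lagBehindWatermark(newest, watermarkMs), nowMs)
  }
}
```

Such a function would typically be passed as `events.groupByKey(_.user).mapGroupsWithState(WatermarkAwareUpdate.update _)` on a streaming Dataset, with `spark.implicits._` in scope for the encoders. Reading the watermark in a streaming query is expected to require that the query defines one with `withWatermark` upstream.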
    
    ## How was this patch tested?
    
    New unit tests

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tdas/spark SPARK-22278

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19495.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19495
    
----
commit c9a042e2f0228584f6a3f643cfac412c73ed98d7
Author: Tathagata Das <tathagata.das1...@gmail.com>
Date:   2017-10-10T00:01:02Z

    Expose event time watermark in the GroupState

commit 67114ab59f5a8d79fbe66b7deb93869f656346b9
Author: Tathagata Das <tathagata.das1...@gmail.com>
Date:   2017-10-14T00:16:08Z

    Exposed processing time

----