GitHub user tdas opened a pull request: https://github.com/apache/spark/pull/21739

    [SPARK-22187][SS] Update unsaferow format for saved state such that we can set timeouts when state is null

    ## What changes were proposed in this pull request?

    Currently, the group state of a user-defined type is encoded as top-level columns in the UnsafeRows stored in the state store. The timeout timestamp is also saved (when needed) as the last top-level column. Since the group state is serialized to top-level columns, you cannot save "null" as the value of the state (setting null in all the top-level columns is not equivalent). So we don't let the user set the timeout without initializing the state for a key. Based on user experience, this leads to confusion.

    This PR changes the row format such that the state is saved as nested columns. This allows the state to be set to null and avoids these confusing corner cases. However, queries recovering from an existing checkpoint will use the previous format to maintain compatibility with existing production queries.

    ## How was this patch tested?

    Refactored existing end-to-end tests and added new tests for explicitly testing obj-to-row conversion for both state formats.
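
For readers following the thread, the difference between the two row layouts described above can be sketched in plain Python. This is only an illustrative model, not Spark's UnsafeRow code: the `RunningCount` type, the tuple-based "rows", and the `encode_*`/`decode_*` helpers are all hypothetical names invented for this sketch. It shows why a null state collides with an all-null-fields state in the flat layout but stays distinguishable in the nested layout.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class RunningCount:
    """Hypothetical user-defined group state with two nullable fields."""
    count: Optional[int]
    label: Optional[str]


def encode_flat(state: Optional[RunningCount], timeout_ms: int) -> Tuple:
    # Flat layout: each state field is a top-level column, timeout is the last column.
    if state is None:
        # The only way to write "no state" is to null out every state column.
        return (None, None, timeout_ms)
    return (state.count, state.label, timeout_ms)


def encode_nested(state: Optional[RunningCount], timeout_ms: int) -> Tuple:
    # Nested layout: the whole state is one struct-like column that may itself be null.
    nested = None if state is None else (state.count, state.label)
    return (nested, timeout_ms)


def decode_nested(row: Tuple) -> Tuple[Optional[RunningCount], int]:
    # A null nested column round-trips back to a null state; the timeout is kept.
    nested, timeout_ms = row
    state = None if nested is None else RunningCount(*nested)
    return state, timeout_ms


if __name__ == "__main__":
    empty_fields = RunningCount(None, None)
    # Flat layout: null state and all-null-fields state produce the same row.
    print(encode_flat(None, 5000) == encode_flat(empty_fields, 5000))
    # Nested layout: the two cases produce different rows.
    print(encode_nested(None, 5000) == encode_nested(empty_fields, 5000))
```

In this model the nested layout lets a timeout be stored next to a null state without ambiguity, which is the property the PR description is after; the actual encoding in Spark may differ in detail.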

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tdas/spark SPARK-22187-1

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21739.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21739

----
commit ef509c8986dbcc9b37387b0bde56c3d71abb7602
Author: Tathagata Das <tathagata.das1565@...>
Date:   2017-10-05T02:25:22Z

    Partial implementation

commit 976a7ea3d5d528e6f1091c696c7f6e865027ee23
Author: Tathagata Das <tathagata.das1565@...>
Date:   2018-07-09T11:05:10Z

    Fixed and added tests

commit cfc3f68aabeb4e83bfe8131e93e5f0133fba4869
Author: Tathagata Das <tathagata.das1565@...>
Date:   2018-07-09T11:19:01Z

    Refactored

commit 9525484a444ce231ff366bc556fe5a1d46ac4d4f
Author: Tathagata Das <tathagata.das1565@...>
Date:   2018-07-09T17:38:43Z

    Minor refactoring

----