[ https://issues.apache.org/jira/browse/SPARK-22187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537330#comment-16537330 ]
Apache Spark commented on SPARK-22187:
--------------------------------------

User 'tdas' has created a pull request for this issue:
https://github.com/apache/spark/pull/21739

> Update unsaferow format for saved state such that we can set timeouts when state is null
> ----------------------------------------------------------------------------------------
>
>                 Key: SPARK-22187
>                 URL: https://issues.apache.org/jira/browse/SPARK-22187
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Structured Streaming
>    Affects Versions: 2.2.0
>            Reporter: Tathagata Das
>            Assignee: Tathagata Das
>            Priority: Major
>              Labels: release-notes, releasenotes
>
> Currently the group state of a user-defined type is encoded as top-level
> columns in the UnsafeRows stored in the state store. The timeout timestamp
> is also saved (when needed) as the last top-level column. Since the group
> state is serialized to top-level columns, you cannot save "null" as the
> value of the state (setting null in all the top-level columns is not
> equivalent). So we do not let the user set the timeout without initializing
> the state for a key. Based on user experience, this leads to confusion.
> This JIRA is to change the row format such that the state is saved as
> nested columns. This would allow the state to be set to null, and avoid
> these confusing corner cases.
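
The difference between the two row layouts described in the issue can be sketched in plain Python. This is only an illustration of the ambiguity, not Spark's actual encoder code: the SessionState type, the tuple-based "rows", and the encode/decode helper names are all hypothetical stand-ins for the user-defined state type and the UnsafeRow format.

```python
from dataclasses import dataclass, astuple
from typing import Optional, Tuple


@dataclass
class SessionState:
    """Hypothetical user-defined state type with nullable fields."""
    last_event: Optional[str]
    count: Optional[int]


def encode_flat(state: Optional[SessionState], timeout_ms: int) -> Tuple:
    # Flat layout: state fields are top-level columns, timeout is the last one.
    # A missing state can only be written as "all state columns are null".
    fields = astuple(state) if state is not None else (None, None)
    return fields + (timeout_ms,)


def decode_flat(row: Tuple) -> Tuple[SessionState, int]:
    # The reader cannot tell "no state" apart from "state with null fields",
    # so in this sketch it always rebuilds a state object.
    *fields, timeout_ms = row
    return SessionState(*fields), timeout_ms


def encode_nested(state: Optional[SessionState], timeout_ms: int) -> Tuple:
    # Nested layout: the whole state is one nullable struct column,
    # followed by the timeout column.
    struct = astuple(state) if state is not None else None
    return (struct, timeout_ms)


def decode_nested(row: Tuple) -> Tuple[Optional[SessionState], int]:
    struct, timeout_ms = row
    state = SessionState(*struct) if struct is not None else None
    return state, timeout_ms


if __name__ == "__main__":
    empty_fields = SessionState(None, None)

    # Flat: a null state and a state with all-null fields produce the same row.
    print(encode_flat(None, 5000) == encode_flat(empty_fields, 5000))

    # Nested: the two cases stay distinct, so a timeout can be stored
    # alongside a null state.
    print(encode_nested(None, 5000) == encode_nested(empty_fields, 5000))
    print(decode_nested(encode_nested(None, 5000)))
```

In the flat layout the null state is lost on a round trip, which is why a timeout could not be stored for a key whose state was never initialized. In the nested layout the struct column itself carries the null, so the timeout column remains usable on its own.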