[ 
https://issues.apache.org/jira/browse/SPARK-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16622623#comment-16622623
 ] 

Jungtaek Lim commented on SPARK-10816:
--------------------------------------

I'm aware of the need for more advanced cases (like dynamic-gap session 
windows), but for the simple case of a session window we still have a chance to 
keep it pretty simple. For the DSL we may want to support advanced (complicated) 
cases, but for SQL, why not support the basic case, which can be expressed as a 
SQL statement?

map/flatMapGroupsWithState is something reserved for experts: end users have to 
understand how the typed API works, the limitation of defining the watermark in 
column metadata (the information is lost when a row is serialized to an object; 
there's a relevant issue in Spark JIRA, but we just identify it as a limitation 
and end users are dealing with it), and how to craft the state function 
correctly.

Moreover, as I mentioned in the doc, map/flatMapGroupsWithState doesn't handle 
multiple sessions for the same key, so it is not enough even for a fixed-gap 
session window. Event time and watermarks require us to deal with arbitrary 
changes to sessions, such as multiple sessions that are not yet eligible for 
eviction, as well as multiple sessions being merged into one due to a late 
event. The current mechanism of map/flatMapGroupsWithState doesn't handle this, 
and at the least requires end users to deal with it on their own.
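
To make the point about multiple sessions per key concrete, here is a minimal 
sketch in plain Python (not Spark code, and not the design in the attached 
doc). It assumes a fixed gap, treats each session as a half-open interval 
[start, end) where end is the latest event time plus the gap, and assumes a 
session can be evicted once its end is at or before the watermark. The names 
GAP, Session, add_event and evict are hypothetical and exist only for this 
illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

GAP = 10  # hypothetical fixed session gap, in the same unit as event time


@dataclass
class Session:
    start: int  # event time of the earliest event in the session
    end: int    # event time of the latest event plus GAP (exclusive)
    count: int  # number of events in the session


def add_event(sessions: List[Session], event_time: int) -> List[Session]:
    """Add one event for a key and merge every session it overlaps."""
    merged = Session(event_time, event_time + GAP, 1)
    untouched = []
    for s in sessions:
        # Half-open intervals overlap when each starts before the other ends.
        if s.start < merged.end and merged.start < s.end:
            merged = Session(min(s.start, merged.start),
                             max(s.end, merged.end),
                             s.count + merged.count)
        else:
            untouched.append(s)
    return sorted(untouched + [merged], key=lambda s: s.start)


def evict(sessions: List[Session],
          watermark: int) -> Tuple[List[Session], List[Session]]:
    """Split sessions into (closed, still open) for a given watermark."""
    closed = [s for s in sessions if s.end <= watermark]
    still_open = [s for s in sessions if s.end > watermark]
    return closed, still_open


# Events at 0, 5 and 22 leave two open sessions for the same key.
state: List[Session] = []
for t in (0, 5, 22):
    state = add_event(state, t)

# A late event at 13 overlaps both sessions and merges them into one.
state_after_late_event = add_event(state, 13)
```

With these assumptions, the state after the first three events holds two 
sessions, [0, 15) and [22, 32), and the late event at 13 collapses them into a 
single session [0, 32). A state function that keeps only one session per key 
would have to implement this bookkeeping itself.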

> EventTime based sessionization
> ------------------------------
>
>                 Key: SPARK-10816
>                 URL: https://issues.apache.org/jira/browse/SPARK-10816
>             Project: Spark
>          Issue Type: New Feature
>          Components: Structured Streaming
>            Reporter: Reynold Xin
>            Priority: Major
>         Attachments: SPARK-10816 Support session window natively.pdf
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
