Small correction: "timeout" in map/flatMapGroupsWithState does not behave
like State TTL when event time and a watermark are set. In that case the
timeout is there to guarantee that state is removed once it can no longer
be used, similar to what we do for streaming aggregation, whereas State TTL
works exactly as its name suggests (self-explanatory). Hence State TTL
looks valid for all of the cases.
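
To make the distinction concrete, here is a toy Python model, not Spark
code. The names Entry, expire_by_event_time_timeout and expire_by_ttl are
made up for this sketch, and the TTL rule (expire a fixed duration after
the last write, measured in processing time) is an assumption about what
State TTL would mean, not an existing Spark API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Entry:
    value: int
    last_update: int            # processing time of the last write (hypothetical field)
    timeout_ts: Optional[int]   # event-time timeout timestamp, if one was set


def expire_by_event_time_timeout(store: Dict[str, Entry], watermark: int) -> List[str]:
    """Drop keys whose timeout timestamp has fallen behind the watermark.

    Simplification: real Spark calls the user function with a timed-out
    flag and lets it remove the state; this sketch removes it directly.
    """
    expired = sorted(k for k, e in store.items()
                     if e.timeout_ts is not None and e.timeout_ts < watermark)
    for k in expired:
        del store[k]
    return expired


def expire_by_ttl(store: Dict[str, Entry], now: int, ttl: int) -> List[str]:
    """Drop keys not written for at least `ttl` units of processing time (assumed rule)."""
    expired = sorted(k for k, e in store.items() if now - e.last_update >= ttl)
    for k in expired:
        del store[k]
    return expired
```

For a key written at processing time 0 with timeout timestamp 100, a
watermark stuck at 50 keeps the state alive under the event-time timeout
no matter how much wall-clock time passes, while a TTL of 60 removes it at
processing time 70.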

On Fri, Oct 19, 2018 at 12:20 PM, Jungtaek Lim <kabh...@gmail.com> wrote:

> Hi devs,
>
> While Spark 2.4.0 is still going through release votes, I see that some
> pull requests unrelated to SS are being reviewed and merged into the master
> branch, so I guess it is OK to start discussing the next release.
>
> It looks like there is one major TODO left for structured streaming:
> allowing stateful operations in continuous mode (watermark, stateful
> exactly-once), and no other major milestone has been shared. (Please let me
> know if I'm missing something here!) From a structured streaming
> contributor's point of view, there are other features we could discuss to
> see which are good to have, and prioritize if possible (NOTE: this is just
> brainstorming, and some items might not be valid for structured streaming):
>
> * Native support for session windows (SPARK-10816 [1])
>   ** patch available
> * Delegation token support for Kafka (SPARK-25501 [2])
>   ** patch available
> * Queryable State (SPARK-16738 [3])
>   ** some discussion took place, but no action has been taken yet
> * End-to-end exactly-once with the Kafka sink
>   ** given that Kafka is a first-class streaming source/sink nowadays
> * Custom window / custom watermark
> * Physically scale (up/down) streaming state
> * State TTL (especially for non-watermark state)
>   ** "timeout" in map/flatMapGroupsWithState fits this, but just to check
> whether we want to have it for normal streaming aggregation as well
> * Expose events discarded for being late via a side output or a similar
> feature
>   ** this looks tricky to me, since Spark's RDD and SQL semantics each
> provide a single output
> * more?
>
> I would like to hear others' opinions about this. Please also share if
> there are ongoing efforts on other items for structured streaming. Happy to
> help out if another hand is needed.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> 1. https://issues.apache.org/jira/browse/SPARK-10816
> 2. https://issues.apache.org/jira/browse/SPARK-25501
> 3. https://issues.apache.org/jira/browse/SPARK-16738
>
>
