Hi Jungtaek,
Thank you very much for the clarification.

> On Oct 5, 2020, at 15:17, Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
> 
> 
> Hi,
> 
> That's not explained in the Structured Streaming (SS) guide doc, but it is explained in the Scala API doc:
> http://spark.apache.org/docs/latest/api/scala/org/apache/spark/sql/streaming/GroupState.html
> 
> The following statement, quoted from the Scala API doc, answers your question:
> 
>> The timeout is reset every time the function is called on a group, that is, 
>> when the group has new data, or the group has timed out. So the user has to 
>> set the timeout duration every time the function is called, otherwise there 
>> will not be any timeout set.
> 
> Simply put, you'd want to always set the timeout unless you remove the state
> for the group (key).
> 
> Hope this helps.
> 
> Thanks,
> Jungtaek Lim (HeartSaVioR)
> 
> On Mon, Oct 5, 2020 at 6:16 PM Yuri Oleynikov (יורי אולייניקוב)
> <yur...@gmail.com> wrote:
>> Hi all, I have the following question:
>> What happens to the state (in terms of expiration) if I update the state
>> without setting a timeout?
>> 
>> E.g., in a FlatMapGroupsWithStateFunction:
>> first batch:
>> state.update(myObj)
>> state.setTimeoutDuration(timeout)
>> second batch:
>> state.update(myObj)
>> third batch (no data for a long time):
>> ???? Does the state time out once the initial timeout has expired, or does it never time out?
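
For readers of the archive, the three-batch scenario in the question can be sketched as a toy model in plain Java. This is not Spark code and does not use Spark's classes: ToyState, ToyStore, onData, wouldTimeOut and timesOutInThirdBatch are hypothetical names invented for illustration, and the only rule modelled is the one quoted from the GroupState API doc above (the timeout is reset on every function call and stays unset unless the function sets it again). The timestamps and the ">=" expiry check are assumptions as well.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: mimics the documented "timeout is reset on every call" rule.
// None of these classes exist in Spark.
public class TimeoutResetDemo {

    /** Hypothetical per-key state holder. */
    static final class ToyState {
        String value;              // the stored state
        long timeoutAtMs = -1L;    // -1 means "no timeout armed"
    }

    /** Hypothetical state store keyed by group key. */
    static final class ToyStore {
        private final Map<String, ToyState> states = new HashMap<>();

        /**
         * Simulates one function call for a key that received data.
         * timeoutMs == null models a call that updates the state but
         * does not set a timeout duration.
         */
        void onData(String key, String newValue, long nowMs, Long timeoutMs) {
            ToyState s = states.computeIfAbsent(key, k -> new ToyState());
            s.timeoutAtMs = -1L;   // any previous timeout is discarded on each call
            s.value = newValue;
            if (timeoutMs != null) {
                s.timeoutAtMs = nowMs + timeoutMs;
            }
        }

        /** True if the key has an armed timeout that has expired at nowMs. */
        boolean wouldTimeOut(String key, long nowMs) {
            ToyState s = states.get(key);
            return s != null && s.timeoutAtMs >= 0 && nowMs >= s.timeoutAtMs;
        }
    }

    /** Runs the scenario from the question; rearm controls the second batch. */
    static boolean timesOutInThirdBatch(boolean rearm) {
        ToyStore store = new ToyStore();
        store.onData("k", "v1", 0L, 10_000L);                    // first batch: update + set timeout
        store.onData("k", "v2", 5_000L, rearm ? 10_000L : null); // second batch: update, maybe set timeout
        return store.wouldTimeOut("k", 60_000L);                 // third batch: no data for a long time
    }

    public static void main(String[] args) {
        System.out.println(timesOutInThirdBatch(false)); // prints false
        System.out.println(timesOutInThirdBatch(true));  // prints true
    }
}
```

Under this model, the state from the question never times out after the second batch, because that call discarded the first timeout and did not set a new one. Setting the timeout again in the second batch restores the expiry.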
